Mistral 7B v0.2 has embedded ethical guidelines

sapient [they/them]@infosec.pub · edit-2 6 months ago

Mistral 7B v0.2 has embedded ethical guidelines

rufus@discuss.tchncs.de · edit-2 6 months ago

Mistral don’t publish their datasets, so no, it can’t be done that way. But this is an (instruct) fine-tune. We can take their base model (which isn’t aligned to some ethics) and do a fine-tune ourselves. Or take the v0.2 fine-tune and tune it some more to guide it into another direction after the fact.

This all happens constantly and with varying success. There are lots of ‘uncensored’ versions of several models where people have taken one of the mentioned approaches and done ‘uncensoring’ on top of a model or done their own fine-tune of a base model. There is no single place to meet all the people who tinker with the models. But most of them end up on Huggingface.

So your idea already occured to the community and they’re doing their best. I’m not sure if it already happened for this specific model. But I read people disapproving of those constrained models all the time. They call them ‘lobotomized’ and some people really get off on companies doing it. And I’m somewhat in the same boat. I’ve triggered those guidelines many times and had the LLM lecture me about ethics and refuse to help. Ultimately the ‘correct’ ethics alignment is something a user of AI has to choose for the specific use-case.