failspy/Codestral-22B-v0.1-abliterated-v3-GGUF-torrent

This model card describes Codestral-22B-v0.1-abliterated-v3, a variant of the original Codestral-22B-v0.1 model. The original model was trained on a diverse dataset of 80+ programming languages and can be queried for tasks such as answering questions about code, generating code, and fill-in-the-middle (FIM) completion, i.e. predicting the tokens between a given prefix and suffix.

The Codestral-22B-v0.1-abliterated-v3 is a refined version of the original model, with a focus on reducing refusal behavior. This was achieved by applying a technique called orthogonalization, which inhibits the model's ability to express refusal. The methodology is based on the paper "Refusal in LLMs is mediated by a single direction," which suggests that specific features can be induced or removed using a small amount of data.
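
For intuition, here is a minimal sketch of the orthogonalization step, assuming a "refusal direction" has already been extracted (e.g. from contrastive harmful/harmless prompt activations). The function name and the refusal_dir tensor are illustrative; the actual abliteration pipeline edits several weight matrices and involves more steps than shown here.

import torch

def orthogonalize_weights(W: torch.Tensor, refusal_dir: torch.Tensor) -> torch.Tensor:
    # W: a (d_model, d_in) matrix that writes into the residual stream.
    # refusal_dir: a (d_model,) direction associated with refusal behavior.
    r = refusal_dir / refusal_dir.norm()  # unit-normalize the direction
    # Subtract the component along r: W' = (I - r r^T) W, so this layer
    # can no longer write anything along the refusal direction.
    return W - torch.outer(r, r) @ W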

The Codestral-22B-v0.1-abliterated-v3 is available for use with the mistral-inference library. To install, run the following command:

pip install mistral_inference

To download the model, use the following code snippet:

from huggingface_hub import snapshot_download
from pathlib import Path

# Create a local directory for the model weights
mistral_models_path = Path.home().joinpath('mistral_models', 'Codestral-22B-v0.1')
mistral_models_path.mkdir(parents=True, exist_ok=True)

# Fetch only the files needed by mistral-inference
snapshot_download(repo_id="mistralai/Codestral-22B-v0.1", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"], local_dir=mistral_models_path)

To chat with the model, use the mistral-chat CLI command:

mistral-chat $HOME/mistral_models/Codestral-22B-v0.1 --instruct --max_tokens 256
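
If you prefer to drive instruct generation from Python rather than the CLI, a sketch along the following lines should work (it mirrors the FIM example below; the prompt text is only an illustration):

from pathlib import Path

from mistral_inference.model import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

tokenizer = MistralTokenizer.v3()
model = Transformer.from_folder(str(Path.home() / "mistral_models" / "Codestral-22B-v0.1"))

# Build a chat-style request and encode it with the instruct template
completion_request = ChatCompletionRequest(messages=[UserMessage(content="Write a function that sums a list in Python.")])
tokens = tokenizer.encode_chat_completion(completion_request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
print(tokenizer.decode(out_tokens[0]))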

For fill-in-the-middle (FIM) tasks, use the following Python code:

from pathlib import Path

from mistral_inference.model import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.instruct.request import FIMRequest

tokenizer = MistralTokenizer.v3()
# Load the weights downloaded above; from_folder does not expand "~", so build the full path
model = Transformer.from_folder(str(Path.home() / "mistral_models" / "Codestral-22B-v0.1"))

prefix = """def add("""
suffix = """    return sum"""

# Encode the prefix/suffix pair with the FIM special tokens
request = FIMRequest(prompt=prefix, suffix=suffix)
tokens = tokenizer.encode_fim(request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.decode(out_tokens[0])

# Keep only the infilled middle, dropping the suffix and anything after it
middle = result.split(suffix)[0].strip()
print(middle)

Limitations: The Codestral-22B-v0.1-abliterated-v3 has no built-in moderation mechanisms. The community is encouraged to engage and contribute to the development of guardrails for the model.

License: The model is released under the MNPL-0.1 license (Mistral AI Non-Production License).

Original work by the Mistral AI Team: Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Antoine Roux, Arthur Mensch, Audrey Herblin-Stoop, Baptiste Bout, Baudouin de Monicault, Blanche Savary, Bam4d, Caroline Feldman, Devendra Singh Chaplot, Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona, Henri Roussez, Jean-Malo Delignon, Jia Li, Justus Murke, Kartik Khandelwal, Lawrence Stewart, Louis Martin, Louis Ternon, Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat, Marie Torelli, Marie-Anne Lachaux, Marjorie Janiewicz, Mickael Seznec, Nicolas Schuhl, Patrick von Platen, Romain Sauvestre, Pierre Stock, Sandeep Subramanian, Saurabh Garg, Sophia Yang, Szymon Antoniak, Teven Le Scao, Thibaut Lavril, Thibault Schueller, Timothée Lacroix, Théophile Gervet, Thomas Wang, Valera Nemychnikova, Wendy Shang, William El Sayed, and William Marshall.