Mistral 7B is a 7.3B parameter model released under the Apache 2.0 license. It is available in both instruct (instruction-following) and text completion variants.
The Mistral AI team has noted that Mistral 7B:
- Outperforms Llama 2 13B on all benchmarks
- Outperforms Llama 1 34B on many benchmarks
- Approaches CodeLlama 7B performance on code, while remaining good at English tasks
- Uses Grouped-query attention (GQA) for faster inference
- Uses Sliding Window Attention (SWA) to handle longer sequences at smaller cost
For benchmark data, please visit the Mistral AI news release under References.
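To make the Sliding Window Attention point concrete, here is a minimal sketch of the masking idea (illustrative only, not Mistral's implementation): with window size W, each token attends only to the previous W positions, so per-token attention cost stays O(W) rather than growing with sequence length.

```python
# Sliding-window attention mask sketch (illustrative, not Mistral's code).
# With window size W, token i may attend to positions j where i - W < j <= i.

def sliding_window_mask(seq_len, window):
    """Return a seq_len x seq_len boolean mask: True = position may be attended."""
    return [
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(6, 3)
# Each row allows at most `window` positions, always including the token itself,
# and never any future position.
```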
CLI
Instruct:
ollama run mistral
Text completion:
ollama run mistral:text
API
Example:
curl -X POST http://localhost:11434/api/generate -d '{
"model": "mistral",
"prompt":"Here is a story about llamas eating grass"
}'
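The same request can be made from Python. This is a minimal sketch assuming a local Ollama server on the default port: the generate endpoint streams newline-delimited JSON objects, each carrying a `"response"` text fragment until a final object with `"done": true`. The `generate` and `join_stream` helper names here are illustrative, not part of any library.

```python
import json
import urllib.request

def join_stream(lines):
    """Concatenate the "response" fragments from streamed JSON lines."""
    out = []
    for line in lines:
        obj = json.loads(line)
        out.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(out)

def generate(prompt, model="mistral", url="http://localhost:11434/api/generate"):
    """Send a prompt to a locally running Ollama server and return the full text."""
    req = urllib.request.Request(
        url,
        data=json.dumps({"model": model, "prompt": prompt}).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Each line of the streamed response body is one JSON object.
    with urllib.request.urlopen(req) as resp:
        return join_stream(resp)

# Example (requires `ollama serve` to be running):
# print(generate("Here is a story about llamas eating grass"))
```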
Variations
| Variation | Description |
| --- | --- |
| instruct | Instruct models follow instructions |
| text | Text models are the base foundation model without any fine-tuning for conversations, and are best used for simple text completion. |
Memory requirements
- 7B models generally require at least 8 GB of RAM
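A back-of-envelope calculation shows where that figure comes from. The precisions below are common quantization levels, not a statement about which one `ollama run mistral` uses; actual usage is higher than the weights alone because of the KV cache, activations, and runtime overhead.

```python
# Approximate weight memory for a 7.3B parameter model at common precisions.

PARAMS = 7.3e9

def weight_gb(bits_per_param):
    """Memory for the weights alone, in GB, at the given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

for name, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{name}: ~{weight_gb(bits):.1f} GB")
# fp16 ≈ 14.6 GB, 8-bit ≈ 7.3 GB, 4-bit ≈ 3.7 GB of weights alone,
# which is why ~8 GB of RAM is a practical floor for 7B models.
```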