Mistral / Mixtral
Also known as: Mistral models, Mixtral, Mistral Large, Mistral Small
Mistral AI released its first model, Mistral 7B, in September 2023 and drew immediate developer attention by outperforming much larger models on benchmarks. Mixtral 8x7B, released December 2023, was one of the first widely used open mixture-of-experts (MoE) models: it has 46.7B total parameters but activates only 12.9B per token, so it matched or beat GPT-3.5 at roughly the per-token inference compute of a 12.9B dense model. Mixtral set the template for efficient MoE design that many later models (including Llama 4 and Mistral Large 3) followed.
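The gap between total and active parameters can be made concrete with a little arithmetic. The sketch below is a simplified model, not Mistral's published breakdown: it assumes 8 experts per MoE layer with 2 routed per token, and it lumps all non-expert weights (attention, embeddings, routers) into a single "shared" figure. Only the 46.7B and 12.9B totals come from the text above; the derived split is an estimate.

```python
# Figures quoted in the text for Mixtral 8x7B, in billions of parameters.
TOTAL_B = 46.7
ACTIVE_B = 12.9

# Assumption for this sketch: 8 experts per MoE layer, 2 routed per token.
NUM_EXPERTS = 8
EXPERTS_PER_TOKEN = 2


def split_params(total_b, active_b, num_experts, experts_per_token):
    """Solve the two-equation system
         total  = shared + num_experts       * expert
         active = shared + experts_per_token * expert
    for the shared weights and the weights of one expert (summed over layers).
    """
    expert_b = (total_b - active_b) / (num_experts - experts_per_token)
    shared_b = active_b - experts_per_token * expert_b
    return shared_b, expert_b


shared_b, expert_b = split_params(TOTAL_B, ACTIVE_B, NUM_EXPERTS, EXPERTS_PER_TOKEN)
active_fraction = ACTIVE_B / TOTAL_B

print(f"per-expert weights: {expert_b:.2f}B")
print(f"shared weights:     {shared_b:.2f}B")
print(f"active per token:   {active_fraction:.1%} of total")
```

Under these assumptions each expert holds about 5.6B parameters and about 1.6B are shared, which is why "8x7B" totals 46.7B rather than 56B. Note that the savings apply to compute per token only: all 46.7B parameters still have to be held in memory to serve the model.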
The current generation as of May 2026 is branded Mistral 3 and includes three main models. Mistral Large 3 (released December 2025) is a 675B total / 41B active sparse MoE, the largest open-weight MoE from a major lab, and is Apache 2.0 licensed. Mistral Small 4 (released March 2026) is a 119B total / 6B active model that combines reasoning, vision, and coding in one configurable model at $0.15/M input tokens. Mistral Medium 3.5 (released April 2026) is a frontier-class multimodal model for agentic and coding use cases. Codestral remains the dedicated coding model, with Codestral 25.08 and Devstral 2 also active. Magistral (launched June 2025) was Mistral's first reasoning model family.
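To put the quoted per-token price in context, a rough input-cost estimate is a single multiplication. The sketch below uses the $0.15/M input-token figure from the text; the request volume and prompt size are hypothetical, and output-token pricing (not given here) is deliberately left out, so a real bill would be higher.

```python
def input_cost_usd(input_tokens: int, price_per_million_usd: float) -> float:
    """Cost of the input side only: tokens / 1M * price per million tokens."""
    return input_tokens / 1_000_000 * price_per_million_usd


# Price quoted in the text for Mistral Small 4 input tokens.
SMALL_4_INPUT_PRICE_USD = 0.15

# Hypothetical workload, chosen only for illustration.
requests_per_day = 10_000
input_tokens_per_request = 2_000

daily_input_tokens = requests_per_day * input_tokens_per_request
daily_cost = input_cost_usd(daily_input_tokens, SMALL_4_INPUT_PRICE_USD)

print(f"{daily_input_tokens:,} input tokens/day costs about ${daily_cost:.2f}")
```

With these made-up numbers, 20 million input tokens per day comes to about $3 per day before output tokens are counted.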
The practical reasons builders reach for Mistral are cost efficiency and open licensing. Mistral models consistently score above what their parameter count would suggest on benchmarks, and for the models released under Apache 2.0, teams can self-host, fine-tune, and build commercial products without royalty concerns. The tradeoff versus frontier closed models is a capability gap on the hardest reasoning benchmarks and a smaller tooling ecosystem than OpenAI and Anthropic offer.