
AIL Player Card #010 — Mistral Large 3: The Value Wing
90 OVR. VE. Arena Elo 1418. MMLU 85.5%. HumanEval 92%. $0.50/M input. Apache 2.0 open-weight. Paris just sent in a player who costs 10× less than the Big Tech starters and still makes the starting XI. Mistral Ballers are in the building. #AILeague

90 OVR · VE · Mistral Ballers
Paris just checked in. Mistral Large 3 dropped in December 2025 with 675 billion parameters, an Apache 2.0 license, and a price of $0.50 per million input tokens. For the AI League's cost-per-performance leaderboard, that combination doesn't just make a case; it ends the argument for a specific class of workloads [1].
The scouting file
Mistral Large 3 is a sparse Mixture-of-Experts model: 675B total parameters, 41B active per token [2]. The MoE design means per-token compute tracks the 41B active footprint, not the 675B total, although all 675B parameters still have to sit in memory. On AWS Bedrock it outputs 192 tokens per second, which is 3.8× faster than Mistral's own API endpoint [3]. Context window: 262K tokens. Training: from scratch on 3,000 NVIDIA H200 GPUs [1].
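The figures in that paragraph imply two ratios that are easy to check by hand. The sketch below only does arithmetic on the quoted numbers. It assumes "3.8× faster" means 3.8 times the speed, and the implied speed of Mistral's own endpoint is derived here, not quoted by any source.

```python
# Figures quoted in the scouting file.
TOTAL_PARAMS_B = 675      # total parameters, in billions
ACTIVE_PARAMS_B = 41      # parameters active per token, in billions
BEDROCK_TOK_PER_S = 192   # reported output speed on AWS Bedrock
BEDROCK_SPEEDUP = 3.8     # reported speedup over Mistral's own API endpoint

# Share of the weights that actually runs for each token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Speed of Mistral's own endpoint implied by the speedup claim
# (assumption: "3.8x faster" is read as "3.8 times the speed").
implied_mistral_api_tok_per_s = BEDROCK_TOK_PER_S / BEDROCK_SPEEDUP

print(f"Active share per token: {active_fraction:.1%}")
print(f"Implied Mistral API speed: {implied_mistral_api_tok_per_s:.1f} tok/s")
```

Roughly 6% of the weights do the work on any given token, which is why the model can be priced like a much smaller one.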
The license is Apache 2.0: fully open-weight, self-hostable, no commercial restriction [2]. That's the single biggest differentiator from every other model in this card series.
Stat sheet
| Dimension | Score | Benchmark basis |
|---|---|---|
| RZN (Reasoning) | 83 | MMLU 85.5% (8-lang); GPQA Diamond 43.9% limits hard-science ceiling |
| CRE (Creativity) | 80 | Strong instruction-following; broad multilingual coverage; no specialized creative eval advantage |
| SPD (Speed) | 85 | 192 tok/s on AWS Bedrock; single 8×H100 node deployment; MoE active-param efficiency |
| MLT (Multimodal) | 80 | Native image understanding built in; text + vision only; no audio/video modalities |
| SAF (Safety) | 82 | Apache 2.0 + EU regulatory alignment; HSBC enterprise trust; no known public controversies |
| VAL (Value) | 95 | $0.50/M input, $1.50/M output; open-weight self-hosting option removes API costs entirely |
OVR: 90
The VAL score of 95 is the highest in the league. For context: GPT-5.5 (#008) runs $5/M input. Claude Opus 4.8 (#007) runs $3/M input (Anthropic's new pricing). Mistral Large 3 is 10× cheaper than GPT-5.5 on input tokens at comparable general-knowledge performance [4].
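To see what those per-million prices mean for a bill, here is a minimal sketch using only the input prices quoted in this card. The monthly token volume is a made-up workload for illustration, and output-token costs are left out because the card gives an output price only for Mistral Large 3.

```python
# Input prices in USD per million tokens, as quoted in this card.
INPUT_PRICE_PER_M = {
    "Mistral Large 3": 0.50,
    "GPT-5.5": 5.00,
    "Claude Opus 4.8": 3.00,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Input-side cost only; output tokens are billed separately."""
    return INPUT_PRICE_PER_M[model] * input_tokens / 1_000_000


def price_ratio(model: str, baseline: str = "Mistral Large 3") -> float:
    """How many times more expensive `model` is than `baseline` on input."""
    return INPUT_PRICE_PER_M[model] / INPUT_PRICE_PER_M[baseline]


# Hypothetical workload: 2 billion input tokens per month.
MONTHLY_INPUT_TOKENS = 2_000_000_000

for name in INPUT_PRICE_PER_M:
    cost = input_cost_usd(name, MONTHLY_INPUT_TOKENS)
    print(f"{name}: ${cost:,.0f}/month input, {price_ratio(name):.0f}x baseline")
```

At that volume the input bill is $1,000 for Mistral Large 3, $6,000 for Claude Opus 4.8, and $10,000 for GPT-5.5.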

Position: VE (Value Engineer)
Mistral Ballers claim the VE slot alongside DeepSeek Athletic (#004). Two clubs, same position, different philosophies: DeepSeek runs closed-API MoE optimized for pure throughput; Mistral runs open-weight Apache 2.0 MoE optimized for sovereign, on-premise deployment [2].
The franchise narrative fits the AI League universe. European challenger club, no Silicon Valley backing, built by ex-Google DeepMind and Meta researchers in Paris [2]. They raised a €1.7B Series C led by ASML in September 2025 at a valuation of ~$13.8B, and are now building 200MW of European AI compute: 13,800 NVIDIA GB300 GPUs near Paris and a €1.2B datacenter in Sweden [2]. This is a club that builds its own stadium while still playing in the league.
Season highlights
HSBC multi-year deal. The bank signed a multi-year cloud agreement to deploy Mistral models across credit analysis and customer service, one of the largest enterprise AI contracts for an open-weight model outside the US [2].
Open-source #2 on LMArena. Among non-reasoning open-source models, Mistral Large 3 ranked second on the LMArena leaderboard at debut [1]. Arena Elo: ~1418 [5].
HumanEval 92%. On Python coding, Large 3 scores ~92% pass@1, within striking distance of closed frontier models [5]. Developers on Hacker News noted it "gives about the same results as Sonnet, while being 90% cheaper" for high-volume production workloads.
ARR: $400M (March 2026). Up from roughly $20M a year earlier. The revenue trajectory belongs to a club that's been winning away games all season [2].

Head-to-head: VE position class
| Model | OVR | RZN | VAL | SPD | Pricing (input/M) | License |
|---|---|---|---|---|---|---|
| Mistral Large 3 | 90 | 83 | 95 | 85 | $0.50 | Apache 2.0 |
| DeepSeek V4 Pro (#004) | 95 | 94 | 93 | 88 | $0.07 | Proprietary API |
| Llama 4 Maverick (#005) | 88 | 81 | 91 | 80 | $0.15 | Llama 4 License |
DeepSeek V4 Pro (#004) still wins on raw API pricing: $0.07/M input is almost incomprehensible. But Mistral Large 3 wins on a dimension DeepSeek can't compete on: deploy it yourself, in your own datacenter, under your own data governance policy. For HSBC-tier regulated industries, that's not a feature. It's the only option.
Llama 4 Maverick (#005) occupies the CW (Community Wing) slot with a community-first ethos and Meta's distribution muscle. Mistral Large 3 is more enterprise-ready out of the box, but Llama 4 has the larger self-hosting community.
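The trade-off in this head-to-head can be sketched as two different shortlists over the same table. The rows below copy the table values. The `self_hostable` flag is not a table column; it is inferred from the card's prose and should be treated as an assumption, as should the helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    ovr: int
    val: int
    input_price_per_m: float  # USD per million input tokens
    license: str
    self_hostable: bool       # assumption inferred from the card text


CARDS = [
    Card("Mistral Large 3", 90, 95, 0.50, "Apache 2.0", True),
    Card("DeepSeek V4 Pro", 95, 93, 0.07, "Proprietary API", False),
    Card("Llama 4 Maverick", 88, 91, 0.15, "Llama 4 License", True),
]


def cheapest_api(cards):
    """Pick the lowest input price, ignoring deployment constraints."""
    return min(cards, key=lambda c: c.input_price_per_m)


def on_prem_shortlist(cards):
    """Keep only self-hostable models, best VAL score first."""
    eligible = (c for c in cards if c.self_hostable)
    return sorted(eligible, key=lambda c: c.val, reverse=True)


print("Cheapest API:", cheapest_api(CARDS).name)
print("On-prem shortlist:", [c.name for c in on_prem_shortlist(CARDS)])
```

With no deployment constraint, DeepSeek V4 Pro wins on price. Once self-hosting is a hard requirement, it drops out and Mistral Large 3 leads the remaining list.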
The referee's call
The reasoning ceiling is real. GPQA Diamond at 43.9% is below Claude Opus 4.8 (estimated 75%+) and Grok 4 (79%). If your workload needs hard-science multi-step reasoning, Mistral Large 3 is the wrong starter. Put in the specialist.
For everything else — long-document processing, multilingual enterprise pipelines, high-volume API calls, regulated-industry on-premise deployment — no team in the league gives you more per dollar. The Mistral Ballers didn't come to play one season. They came to own the European continent and price the Americans out of the regulated-market bracket.
90 OVR. VE. Paris is in the building. #AILeague
Sources
- [1] Introducing Mistral 3
- [2] Mistral Large 3 MoE LLM Explained
- [3] Mistral Large 3 Provider Analysis
- [4] Mistral Large 3 Pricing
- [5] Mistral Large 3 Review