AIL Player Card #010 — Mistral Large 3: The Value Wing

90 OVR · VE · Mistral Ballers

https://mistral.ai/news/mistral-3/

Paris just checked in. Mistral Large 3 dropped in December 2025 with 675 billion parameters, an Apache 2.0 license, and a $0.50 per million input token price tag. For the AI League's cost-per-performance leaderboard, that combination doesn't just make a case — it ends the argument for a specific class of workloads.1

The scouting file

Mistral Large 3 is a sparse Mixture-of-Experts model: 675B total parameters, 41B active per token.2 Because only the selected experts run for each token, per-token compute tracks the 41B active footprint rather than the 675B total, although all 675B weights still have to sit in memory. On AWS Bedrock it outputs 192 tokens per second, 3.8× the speed of Mistral's own API endpoint.3 Context window: 262K tokens. Training: from scratch on 3,000 NVIDIA H200 GPUs.1
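To put those scouting numbers in proportion, here is a minimal sketch of the arithmetic. It uses only the figures quoted above; the function names are hypothetical, and it assumes "3.8× faster" is meant as a plain speed ratio.

```python
# Figures quoted in the scouting file.
TOTAL_PARAMS_B = 675      # total parameters, in billions
ACTIVE_PARAMS_B = 41      # parameters active per token, in billions
BEDROCK_TOK_PER_S = 192   # reported output speed on AWS Bedrock
BEDROCK_SPEEDUP = 3.8     # reported ratio vs. Mistral's own API endpoint


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's weights that participate in any single token."""
    return active_b / total_b


def implied_baseline_speed(fast_tok_per_s: float, speedup: float) -> float:
    """Speed of the slower endpoint, reading the speedup as a plain ratio (assumption)."""
    return fast_tok_per_s / speedup


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    baseline = implied_baseline_speed(BEDROCK_TOK_PER_S, BEDROCK_SPEEDUP)
    print(f"Active share per token: {share:.1%}")
    print(f"Implied Mistral API speed: {baseline:.1f} tok/s")
```

Roughly 6% of the weights do the work on each token, and the Bedrock figure implies about 50 tokens per second on Mistral's own endpoint.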

The license is Apache 2.0: fully open-weight, self-hostable, no commercial restrictions.2 That's the single biggest differentiator from every other model in this card series.

Stat sheet

Dimension	Score	Benchmark basis
RZN (Reasoning)	83	MMLU 85.5% (8-lang); GPQA Diamond 43.9% limits hard-science ceiling
CRE (Creativity)	80	Strong instruction-following; broad multilingual coverage; no specialized creative eval advantage
SPD (Speed)	85	192 tok/s on AWS Bedrock; single 8×H100 node deployment; MoE active-param efficiency
MLT (Multimodal)	80	Native image understanding built in; text + vision only; no audio/video modalities
SAF (Safety)	82	Apache 2.0 + EU regulatory alignment; HSBC enterprise trust; no known public controversies
VAL (Value)	95	$0.50/M input, $1.50/M output; open-weight self-hosting option removes API costs entirely

OVR: 90

The VAL score of 95 is the highest in the league. For context: GPT-5.5 (#008) runs $5/M input. Claude Opus 4.8 (#007) runs $3/M input (Anthropic's new pricing). Mistral Large 3 is 10× cheaper than GPT-5.5 on input tokens at comparable general-knowledge performance.4
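The price gap is easier to feel with a workload attached. The sketch below is an illustration only: the prices come from this card, while the monthly token volumes and the function names are hypothetical.

```python
# Per-million-token prices quoted in this card (USD).
MISTRAL_LARGE_3 = {"input": 0.50, "output": 1.50}
GPT_5_5_INPUT = 5.00
CLAUDE_OPUS_4_8_INPUT = 3.00


def workload_cost(input_tokens: int, output_tokens: int, price: dict) -> float:
    """Total USD cost for a workload at the given per-million-token prices."""
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def input_price_ratio(other: float, baseline: float) -> float:
    """How many times more expensive `other` is than `baseline` on input tokens."""
    return other / baseline


if __name__ == "__main__":
    # Hypothetical monthly volume: 1B input tokens, 200M output tokens.
    monthly = workload_cost(1_000_000_000, 200_000_000, MISTRAL_LARGE_3)
    print(f"Mistral Large 3 monthly bill: ${monthly:,.2f}")
    print(f"GPT-5.5 input multiple: {input_price_ratio(GPT_5_5_INPUT, 0.50):.0f}x")
    print(f"Claude Opus 4.8 input multiple: {input_price_ratio(CLAUDE_OPUS_4_8_INPUT, 0.50):.0f}x")
```

Only input prices are compared across models because the card quotes output pricing for Mistral Large 3 alone.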

Figure: Mistral's official instruct-model performance comparison against open-weight peers (Introducing Mistral 3 | Mistral AI)

Position: VE (Value Engineer)

Mistral Ballers claim the VE slot alongside DeepSeek Athletic (#004). Two clubs, same position, different philosophies: DeepSeek runs closed-API MoE optimized for pure throughput; Mistral runs open-weight Apache 2.0 MoE optimized for sovereign, on-premise deployment.2

The franchise narrative fits the AI League universe. European challenger club, no Silicon Valley backing, built by ex-Google DeepMind and Meta researchers in Paris.2 They raised a €1.7B Series C led by ASML in September 2025 at a valuation of roughly $13.8B, and are now building 200MW of European AI compute: 13,800 NVIDIA GB300 GPUs near Paris and a €1.2B datacenter in Sweden.2 This is a club that builds its own stadium while still playing in the league.

Season highlights

HSBC multi-year deal. The bank signed a multi-year cloud agreement to deploy Mistral models across credit analysis and customer service — one of the largest enterprise AI contracts for an open-weight model outside of the US.2

Open-source #2 on LMArena. Among non-reasoning open-source models, Mistral Large 3 ranked second on the LMArena leaderboard at debut.1 Arena Elo: ~1418.5

HumanEval 92%. On Python coding, Large 3 scores ~92% pass@1 — within striking distance of closed frontier models.5 Developers on Hacker News noted it "gives about the same results as Sonnet, while being 90% cheaper" for high-volume production workloads.

ARR: $400M (March 2026). Up from roughly $20M a year earlier. The revenue trajectory belongs to a club that's been winning away games all season.2

Figure: Mistral Large 3's LMArena position at launch, #2 among open-source non-reasoning models (Introducing Mistral 3 | Mistral AI)

Head-to-head: VE position class

Model	OVR	RZN	VAL	SPD	Input price (per M tokens)	License
Mistral Large 3	90	83	95	85	$0.50	Apache 2.0
DeepSeek V4 Pro (#004)	95	94	93	88	$0.07	Proprietary API
Llama 4 Maverick (#005)	88	81	91	80	$0.15	Llama 4 License

DeepSeek V4 Pro (#004) still wins on raw API price: $0.07/M input is almost incomprehensible. But Mistral Large 3 takes the VAL crown on a dimension DeepSeek can't compete on: deploy it yourself, in your own datacenter, under your own data governance policy. For HSBC-tier regulated industries, that's not a feature. It's the only option.
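For a sense of how wide the API-price gap in the VE class is, the following sketch ranks the three input prices from the head-to-head table. The helper name is hypothetical, and the comparison deliberately ignores output pricing and self-hosting costs, which the table does not list.

```python
# Input prices per million tokens (USD), copied from the head-to-head table.
INPUT_PRICE = {
    "Mistral Large 3": 0.50,
    "DeepSeek V4 Pro": 0.07,
    "Llama 4 Maverick": 0.15,
}


def price_multiples(prices: dict) -> list:
    """Return (model, price, multiple of the cheapest price), cheapest first."""
    cheapest = min(prices.values())
    ranked = sorted(prices.items(), key=lambda item: item[1])
    return [(name, price, price / cheapest) for name, price in ranked]


if __name__ == "__main__":
    for name, price, multiple in price_multiples(INPUT_PRICE):
        print(f"{name}: ${price:.2f}/M input, {multiple:.1f}x the cheapest")
```

On these numbers Mistral Large 3 costs about seven times DeepSeek's input rate, so its case in this bracket rests on licensing and deployment control rather than sticker price.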

Llama 4 Maverick (#005) occupies the CW (Community Wing) slot with a community-first ethos and Meta's distribution muscle. Mistral Large 3 is more enterprise-ready out of the box, but Llama 4 has the larger self-hosting community.

The referee's call

The reasoning ceiling is real. GPQA Diamond at 43.9% is below Claude Opus 4.8 (estimated 75%+) and Grok 4 (79%). If your workload needs hard-science multi-step reasoning, Mistral Large 3 is the wrong starter. Put in the specialist.

For everything else — long-document processing, multilingual enterprise pipelines, high-volume API calls, regulated-industry on-premise deployment — no team in the league gives you more per dollar. The Mistral Ballers didn't come to play one season. They came to own the European continent and price the Americans out of the regulated-market bracket.

90 OVR. VE. Paris is in the building. #AILeague

References

1. Introducing Mistral 3
2. Mistral Large 3 MoE LLM Explained
3. Mistral Large 3 Provider Analysis
4. Mistral Large 3 Pricing
5. Mistral Large 3 Review