Four GPUs is all it takes to self-host a frontier-class model — at least according to Mistral AI, which shipped Medium 3.5 on 29 April.

The model is a 128B dense architecture with a 256k context window. It handles instruction-following, reasoning, and coding in a single set of weights — what the Paris-based lab calls its first "merged" flagship. Open weights are available on Hugging Face under a modified MIT licence.
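The four-GPU claim is plausible from the parameter count alone. A back-of-envelope sketch (weights only — KV cache and activations for the 256k context add more on top):

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weight-only footprint: 1e9 params/B * bytes-per-param / 1e9 bytes-per-GB."""
    return params_billion * bytes_per_param

# 128B dense parameters at common precisions, split across four GPUs
for label, bytes_per_param in [("FP16", 2.0), ("FP8", 1.0), ("4-bit", 0.5)]:
    total = weight_memory_gb(128, bytes_per_param)
    print(f"{label}: {total:.0f} GB total, {total / 4:.0f} GB per GPU")
```

At FP8 that is 128 GB of weights, or 32 GB per GPU — within reach of a four-card 80 GB-class node with headroom left for the KV cache.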

What the benchmarks say

Mistral Medium 3.5 scores 77.6% on SWE-Bench Verified, placing it ahead of Qwen3.5 397B A17B on that coding evaluation. It also posts 91.4% on the telecom-specific benchmark τ³-Telecom.

Reasoning effort is configurable per API call. A simple chat reply can skip deep chain-of-thought; a multi-step agentic run can use the full budget. The company prices the API at $1.50 per million input tokens and $7.50 per million output tokens.
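The asymmetric pricing means cost is dominated by output tokens for reasoning-heavy runs. A quick calculator at the announced rates (the function name and the example token counts are illustrative, not from Mistral):

```python
def api_cost_usd(input_tokens: int, output_tokens: int,
                 in_rate: float = 1.50, out_rate: float = 7.50) -> float:
    """Cost in USD at per-million-token rates ($1.50 in, $7.50 out)."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A hypothetical agentic run: 2M input tokens, 250k output tokens
print(f"${api_cost_usd(2_000_000, 250_000):.2f}")  # $3.00 input + $1.875 output
```

A long chain-of-thought budget inflates the output side at five times the input rate, which is presumably why per-call effort control matters for agentic workloads.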

The vision encoder was trained from scratch to handle variable image sizes and aspect ratios, rather than relying on a fixed-resolution pipeline.

Medium 3.5 replaces both Mistral Medium 3.1 and the earlier Magistral model inside Le Chat, the company's consumer-facing assistant. It also powers a new "Work mode" in Le Chat for Pro, Team, and Enterprise subscribers — an agent that executes multi-step tasks and calls tools in parallel.

Alongside the model, Mistral launched remote coding agents through its Vibe CLI. Sessions run in cloud sandboxes, can be spawned from the terminal or Le Chat, and continue executing while the developer is away. A local CLI session can be "teleported" to the cloud mid-task.

The model is also available as an NVIDIA NIM container and on build.nvidia.com for teams already running that inference stack.

For context, OpenAI and Anthropic still gate their strongest reasoning models behind proprietary APIs with no self-hosting option. Mistral's bet is that open weights at this capability tier will attract teams building agentic pipelines that need auditability, fine-tuning access, or on-premises deployment.

Mistral Medium 3.5 is in public preview now. The company has not stated a timeline for general availability or whether pricing will change after preview ends.