Mistral AI has just made formal verification accessible to every development team. Leanstral 1.5, an open-source Lean 4 code agent, solves 587 of 672 PutnamBench problems at an estimated $4 per problem—roughly 75 times cheaper than the nearest competitor, Seed-Prover 1.5, which costs over $300 per problem. This cost collapse, combined with Apache 2.0 licensing, signals a structural shift in how software correctness will be achieved.
For executives, the bottom line is clear: the barrier to mathematically proving code correct has dropped from a specialized, high-cost endeavor to a commodity service. Teams can now integrate proof automation into CI/CD pipelines without dedicated PhDs or six-figure cloud bills.
The Architecture Behind the Breakthrough
Leanstral 1.5 is built on a mixture-of-experts (MoE) architecture with 128 experts, activating only 4 per token. Total parameters reach 119B, but only 6.5B are active per inference, keeping compute costs low. The model supports a 256k-token context window and accepts multimodal input (text and images), outputting text only. This design allows it to handle long, multi-step proofs—such as the AVL tree time complexity proof that consumed 2.7 million tokens across 22 context compactions.
Benchmark Dominance at a Fraction of the Cost
Leanstral 1.5 saturates miniF2F (100% on both validation and test sets) and sets new state-of-the-art results on FATE-H (87%) and FATE-X (34%). On FLTEval, pass@1 improved from 21.9 to 28.9, and pass@8 from 31.9 to 43.2, surpassing Opus 4.6’s 39.6 at one-seventh the cost. The model’s test-time scaling behavior is particularly notable: increasing the token budget per attempt from 50k to 4M lifts PutnamBench Pass@8 from 44 to 587.
Real-World Bug Detection: From Theory to Practice
Beyond benchmarks, Leanstral 1.5 demonstrated practical utility by finding 11 genuine bugs across 57 open-source Rust repositories, five of which were previously unreported. One critical bug involved an integer overflow in the sign function for zigzag decoding in the datrs/varinteger crate, causing crashes in debug mode and silent corruption in release. This capability transforms Leanstral from a research tool into a production asset for any organization shipping Rust code.
Strategic Winners and Losers
Winners: Mistral AI cements its position as a leader in specialized AI models. The Lean 4 community gains a powerful, free tool that accelerates proof development. Open-source projects can now afford formal verification for critical components. Academic researchers have a state-of-the-art model for experimentation.
Losers: Seed-Prover and Aleph Prover face existential pricing pressure—their cost structures are orders of magnitude higher. Proprietary formal verification tools (e.g., from AWS, Microsoft) may see reduced adoption as open-source alternatives match or exceed their capabilities.
Market Implications: The Commoditization of Correctness
Leanstral 1.5 accelerates the commoditization of AI-assisted formal verification. The market is shifting from high-cost, expert-only services to low-cost, widely accessible tools. This expansion will likely increase demand for Lean 4 expertise and drive integration into standard developer workflows. Over the next 12 months, expect IDE plugins, CI/CD integrations, and managed API services to emerge, further lowering the barrier to entry.
Technical Deep Dive: Training and Agentic Behavior
Mistral trained Leanstral in three stages: mid-training, supervised fine-tuning, and reinforcement learning with CISPO. Two RL environments shaped agentic behavior: a multiturn environment where the model refines proofs based on Lean compiler feedback, and a code agent environment where it edits files, runs bash commands, and uses the Lean language server. This dual training enables both interactive proof assistance and autonomous bug hunting.
Deployment Options and Accessibility
Leanstral 1.5 is available via a free API endpoint (leanstral-1-5), Hugging Face weights, or local deployment with vLLM 0.24.0+. The Mistral Vibe CLI provides the simplest path: vibe --agent lean. For self-hosting, a single command serves the model: vllm serve mistralai/Leanstral-1.5-119B-A6B. The OpenAI-compatible API supports tool calling and reasoning effort settings, making integration straightforward.
Outlook: What to Watch in the Next 30 Days
Monitor adoption rates in open-source projects, particularly those with critical correctness requirements (e.g., blockchain, cryptography, safety-critical systems). Watch for announcements of IDE plugins or CI/CD integrations. Competitors may respond with price cuts or open-source releases of their own. Finally, track the Lean 4 ecosystem’s growth as Leanstral lowers the barrier to entry for new users.
Rate the Intelligence Signal
Intelligence FAQ
Leanstral solves PutnamBench problems at ~$4 each, while Seed-Prover 1.5 costs over $300 per problem—a 75x cost advantage.
Yes. In testing across 57 Rust repositories, it found 11 genuine bugs, 5 previously unreported, including an integer overflow in a decoding function.


