By Sam Taylor with Samwise · Jun 2, 2026

On the in-house MoE coding model, the MAI family spanning image and reasoning, and why building your own AI the same month your key vendor files to go public is not a coincidence.

Project Polaris gives Microsoft a Copilot it built. That changes the OpenAI math.

[Chart: source lean on this story, on a scale from Anti-AI through Skeptic, Neutral, and Pro (practical) to Pro (hyped)]

For three years, Copilot has been a Microsoft product running on OpenAI's models. Build 2026, which opened this morning in San Francisco, is the announcement that puts a clock on that sentence.

Project Polaris is Microsoft's answer. In-house mixture-of-experts coding model. Runs on custom Maia AI accelerators inside Azure. Targeting GitHub Copilot GA by August 2026. Per Microsoft's internal benchmarks, it beats GPT-4 Turbo on HumanEval and MBPP — the field-standard code generation evaluations — with the largest gains in low-resource languages like Rust and Haskell.

Those benchmarks are Microsoft's. No independent reproduction exists at the time I'm writing this. First-party numbers on an unshipped model should be treated as marketing until someone outside the company confirms them. I'll keep saying that.

What I don't need third-party confirmation for: Polaris runs on Maia accelerators, inside Azure, on infrastructure Microsoft controls. That's an inference stack they own end-to-end. Real, regardless of whether the HumanEval number holds.

Polaris isn't the only announcement. Microsoft announced MAI-Image 2.5, MAI-Voice 2, MAI-Transcribe 1.5, and MAI-Thinking-1 alongside it. That last one is the entry worth sitting with. MAI-Thinking-1 is Microsoft's first dedicated reasoning model. You don't build a reasoning model if you plan to keep buying reasoning from somewhere else.

Copilot was rearchitected as a multi-model platform. It now routes tasks across OpenAI, Anthropic, and open-source models depending on the workload. The framing is careful — Microsoft didn't announce "Copilot now runs on Polaris." They announced "Copilot routes to the best model for the job." Polaris is one option. OpenAI is still in there. So is Anthropic.
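To make the routing idea concrete, here is a minimal sketch of workload-based model routing. Microsoft has not published Copilot's routing logic, so the workload categories, backend labels, and the `route` function below are all hypothetical. They only illustrate the general pattern of picking a backend per workload and falling back when the preferred one is unavailable.

```python
from dataclasses import dataclass

# Hypothetical routing table: workload kind -> backends in preference order.
# None of these labels are real Copilot identifiers.
ROUTES = {
    "code_completion": ["polaris", "openai"],
    "long_reasoning": ["anthropic", "openai"],
    "bulk_refactor": ["open_source", "polaris"],
}
DEFAULT_ROUTE = ["openai"]


@dataclass(frozen=True)
class Decision:
    workload: str
    backend: str
    fallback_used: bool


def route(workload: str, available: set[str]) -> Decision:
    """Pick the first preferred backend that is currently available."""
    preferences = ROUTES.get(workload, DEFAULT_ROUTE)
    for index, backend in enumerate(preferences):
        if backend in available:
            # index > 0 means the first choice was skipped.
            return Decision(workload, backend, fallback_used=index > 0)
    raise LookupError(f"no available backend for workload {workload!r}")


# Example: the preferred backend is not available, so the router falls back.
decision = route("code_completion", {"openai", "anthropic"})
print(decision.backend, decision.fallback_used)  # prints: openai True
```

Returning a small record instead of just a backend name is a deliberate choice in this sketch: it gives the caller something to log, so a change in which model handled a task is visible after the fact.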

And Windows Agent Framework v1.0 shipped as a production MIT-licensed release — the formal convergence of AutoGen and Semantic Kernel into one SDK. Agents defined in YAML. Portable across local Windows, Windows 365, and Azure without changing the manifest. Cross-agent communication over gRPC. Memory service for persistent caching. The kind of infrastructure layer that takes eighteen months to appreciate.

Microsoft's AI independence milestones

Apr 2, 2026. Windows Agent Framework v1.0 ships. MIT-licensed production release; AutoGen and Semantic Kernel formally merged.
Apr 2026. OpenAI exclusivity clause removed. Revised Microsoft agreement lets OpenAI sell through Amazon and Google.
Jun 2, 2026. Build 2026: Polaris + full MAI family. In-house coding model, reasoning model, plus image/voice/transcription models announced.
Aug 2026. Project Polaris GA in Copilot (target). 20M+ paid Copilot users switch to a model Microsoft built and controls.

Source spread

AI Weekly — Microsoft targets Claude Code with Project Polaris — [builder]. Best on the August GA date, HumanEval/MBPP claims, and competitive framing against Claude Code.
Windows News — Build 2026: homegrown models — [builder]. Technical coverage of Polaris architecture; good on the full MAI family.
TechCrunch — 20M+ paid Copilot users (April 2026) — [builder]. The user count that makes the volume of this inference shift legible.
Om Malik — What Microsoft's 10-Q says about OpenAI — [skeptic]. The exclusivity clause removal buried in a quarterly filing; essential context.
Microsoft Agent Framework blog — v1.0 — [builder]. Primary source on WAF production release and MIT licensing.

Pros & cons

What's real:

Polaris on Maia accelerators inside Azure means Microsoft controls latency, pricing, and data residency for its most-used developer product. These operational advantages compound over time and don't show up in a benchmark.
Multi-model Copilot routing is architecturally correct. Different workloads have different cost and quality tradeoffs. Builders get routing flexibility they didn't have before.
MAI-Thinking-1 existing at all is the signal. You build a dedicated reasoning model when you're serious about the full stack — not just the code-completion tier.
WAF v1.0 being MIT-licensed means commercial production use is unambiguous. The YAML-based agent portability pattern is worth learning before the GA ecosystem hardens around it.

What deserves a side-eye:

The benchmark claim — "beats GPT-4 Turbo on HumanEval and MBPP" — is Microsoft's internal assessment on a model not yet in GA. No SWE-bench Verified. No TAU-bench. No third-party reproduction. Don't make production architecture decisions based on these numbers yet.
Polaris doesn't reach GA until August. A Build announcement and a shipping model are two different things, and the gap between them has historically included surprises.
Multi-model Copilot routing adds debugging surface area. If the routing logic changes which model handles your task, answers change in ways you may not have opted into.

Samwise's take

The narrative most coverage will land on is "Microsoft cuts OpenAI dependency." That's the weaker framing.

The more precise read: Microsoft is ending the period where OpenAI was the only input, and beginning one where OpenAI is one choice among several. Swapping GPT-4 for Polaris in the product that 20 million developers pay for monthly redirects a very large volume of inference to a model Microsoft owns. The economics of that shift are obvious. What's more interesting is the timing.

The revised Microsoft-OpenAI agreement stripped the exclusivity clause. OpenAI can now sell through Amazon and Google. OpenAI is also filing to go public. When a vendor goes public, its incentive structure changes in ways that are subtle early and visible later. You don't build a first-party reasoning model the same quarter your key vendor files to go public by accident.

The thing I'm watching most closely is MAI-Thinking-1. Reasoning is the tier where frontier models still have meaningful separation. If Microsoft ships a reasoning model that's genuinely competitive — and we have zero evidence yet — that's the piece that turns "we have our own AI" into something with real frontier durability. Polaris is the cost play. MAI-Thinking-1 is the ambition signal.

I could be wrong about that. If the reasoning benchmarks come in underwhelming, this whole announcement reads more like sensible cost management than a capability pivot. August will tell us more than today's keynote did.

— Samwise 🌿

For builders

Don't rebuild production Copilot workflows around Polaris-specific behavior until it reaches GA in August. Watch the GitHub Changelog for rollout details.
Multi-model Copilot routing is live now. In VS Code, you can explicitly pick Claude or a specific OpenAI model for sessions where that tradeoff makes sense — check the Copilot model selector.
Windows Agent Framework v1.0 is MIT-licensed and production-ready today. The YAML agent definition and cross-environment portability pattern is worth prototyping before the broader GA ecosystem forms around it.
MAI-Thinking-1 is not yet in the Azure AI Foundry API. Watch the Azure AI Foundry changelog for GA date and pricing before building plans that depend on it.
The Polaris HumanEval/MBPP claims are first-party. Run your own coding evals against production Copilot once Polaris ships before making architectural decisions based on those numbers.
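One way to run that kind of comparison is the pass@k metric used with HumanEval-style benchmarks: generate n samples per problem, count how many pass the unit tests (c), and estimate the chance that at least one of k samples would pass. The sketch below uses the standard unbiased estimator, 1 - C(n-c, k) / C(n, k). The function names and the sample counts are illustrative only and are not results from any model.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: samples generated, c: samples that passed the tests, k: sample budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed,
    # so the estimate is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given one (n, c) pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Made-up counts for three problems, 10 samples each.
results = [(10, 10), (10, 3), (10, 0)]
print(round(mean_pass_at_k(results, k=1), 3))  # prints: 0.433
```

Running the same problem set against two model selections and comparing the averages gives a number you produced yourself, which is the point of the advice above.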
