On Sundar Pichai's I/O commitment, what Flash can do right now, and whether building while you wait is the right call.
Google told developers to wait until June. June is almost over.
Sundar Pichai told a room of developers on May 19 to give Google until next month. The room reportedly groaned. That was thirty-three days ago.
Gemini 3.5 Pro is not out yet.
As of June 21, it's in limited preview for a handful of Vertex AI enterprise customers. It's not in Google AI Studio. Not in the public Gemini app. Not in any consumer subscription. Prediction markets are pricing a pre-June 30 GA at roughly 50-55% odds, which is a coin flip. Nine days remain.
What Pro is supposed to be
The announced spec: a 2-million-token context window (double what Flash carries), a "Deep Think" reasoning mode, and pricing of approximately $15 per million input tokens and $60 per million output tokens.
If those numbers hold at launch, Pro is genuinely compelling. A 2M-token context would be the largest of any production frontier model. For builders doing large-document RAG, extended agent sessions, or legal document workflows, that context size changes the architecture. Deep Think is the reasoning mode that, in Flash, drives the stronger coding and agentic benchmark numbers.
The question is whether the spec will hold when it actually ships. Flash hit spec. But Flash had a shorter gap between announcement and delivery.
What you have right now
Gemini 3.5 Flash is genuinely good. This is worth saying plainly because it gets lost in "but Pro is coming" coverage.
Flash hit GA on May 19 — the same day as I/O. It has a 1-million-token context window, runs at $1.50 per million input tokens and $9 per million output, and posted 76.2% on Terminal-Bench 2.1 and 83.6% on MCP Atlas. It runs 4× faster than comparable frontier models. For a lot of production workloads, this is the right model. Not a placeholder for when the real thing arrives.
| Spec | Gemini 3.5 Flash | Gemini 3.5 Pro |
|---|---|---|
| Context window | 1M tokens | 2M tokens (expected) |
| Input price | $1.50/M | ~$15/M (expected) |
| Output price | $9.00/M | ~$60/M (expected) |
| Deep Think reasoning | Yes | Yes (expected) |
| General availability | May 19, 2026 | Targeted for June 2026 (limited preview as of June 21) |
| Terminal-Bench 2.1 | 76.2% | Not published |
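The price rows in the table are easier to reason about against a concrete workload. The sketch below is illustrative only: it uses Flash's listed prices and Pro's expected (not final) prices, and the monthly token volumes in the usage example are hypothetical. The names `Pricing`, `monthly_cost`, and `pro_to_flash_ratio` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens."""
    input_per_m: float
    output_per_m: float


# Prices taken from the table above. Pro's figures are the announced,
# expected numbers and may change at launch.
FLASH = Pricing(input_per_m=1.50, output_per_m=9.00)
PRO_EXPECTED = Pricing(input_per_m=15.00, output_per_m=60.00)


def monthly_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a given monthly token volume."""
    input_cost = (input_tokens / 1_000_000) * pricing.input_per_m
    output_cost = (output_tokens / 1_000_000) * pricing.output_per_m
    return input_cost + output_cost


def pro_to_flash_ratio(input_tokens: int, output_tokens: int) -> float:
    """Return how many times more the same workload would cost on Pro."""
    return monthly_cost(PRO_EXPECTED, input_tokens, output_tokens) / monthly_cost(
        FLASH, input_tokens, output_tokens
    )


if __name__ == "__main__":
    # Hypothetical workload: 500M input tokens and 50M output tokens per month.
    inp, out = 500_000_000, 50_000_000
    print(f"Flash: ${monthly_cost(FLASH, inp, out):,.2f}")          # Flash: $1,200.00
    print(f"Pro:   ${monthly_cost(PRO_EXPECTED, inp, out):,.2f}")   # Pro:   $10,500.00
    print(f"Ratio: {pro_to_flash_ratio(inp, out):.2f}x")            # Ratio: 8.75x
```

The ratio depends on the input/output mix: an input-only workload comes out at 10×, an output-only workload at about 6.7×, and most real workloads land somewhere between.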
Source spread
- Google I/O — Gemini 3.5 announcement — hype. The official spec claims and the "give us until next month" commitment. Worth reading for the promised numbers.
- TechTimes — Gemini 3.5 Pro nears June launch — hype. June 6 coverage framing Pro as "nearing" launch. Model hasn't shipped since.
- WaveSpeed — What Flash tells us about Pro — builder. Good analysis of what Flash's benchmark profile predicts about Pro's capabilities.
- Artificial Analysis — Flash benchmarks — builder. Independent evaluation of Flash's actual performance. Most useful source if you're evaluating for production today.
Pros & cons
The case for waiting:
- 2M tokens is a real capability gap. If your use case needs a full codebase in context, a long-form document pipeline, or a multi-hour agent session, Flash's 1M limit is an architectural constraint, and Pro would be the only production model that removes it.
- Nine days is not nothing. If Pro ships June 28, waiting was worth it.
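- The price math works for certain workloads. At the expected prices, Pro costs 10× Flash on input ($15 vs. $1.50 per million tokens) and about 6.7× on output ($60 vs. $9). For workloads that genuinely require the extended context or the added reasoning depth, paying several times more for the right model beats paying less for the wrong one.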
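Whether the 1M limit is actually a constraint for a given workload can be checked roughly before deciding to wait. The sketch below is a back-of-envelope estimate, not a tokenizer: the 4-characters-per-token ratio and the 50,000-token reserve for instructions and output are assumptions, and the function names are made up for illustration. A real token count from the provider's tooling should replace the heuristic before any decision is final.

```python
FLASH_CONTEXT_TOKENS = 1_000_000          # Flash's context window
PRO_CONTEXT_TOKENS_EXPECTED = 2_000_000   # Pro's announced, not yet shipped, window

# Assumption: roughly 4 characters per token for English text and code.
CHARS_PER_TOKEN = 4


def estimate_tokens(total_chars: int) -> int:
    """Estimate a token count from a character count (rounded up)."""
    return -(-total_chars // CHARS_PER_TOKEN)  # ceiling division


def smallest_fitting_model(total_chars: int, reserve_tokens: int = 50_000) -> str:
    """Return 'flash', 'pro', or 'neither' for a payload of the given size.

    reserve_tokens is headroom for the prompt instructions and the response.
    """
    needed = estimate_tokens(total_chars) + reserve_tokens
    if needed <= FLASH_CONTEXT_TOKENS:
        return "flash"
    if needed <= PRO_CONTEXT_TOKENS_EXPECTED:
        return "pro"
    return "neither"


if __name__ == "__main__":
    # Hypothetical payload sizes, in characters.
    for chars in (2_000_000, 5_000_000, 9_000_000):
        print(chars, smallest_fitting_model(chars))
```

If the answer is "flash" with comfortable headroom, the context argument for waiting does not apply to that workload.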
The case for building on Flash now:
- Coin-flip prediction-market odds on an end-of-month release are not a product plan. If Pro slips to July, you've lost a month of shipping.
- Flash is actually good. The 76.2% Terminal-Bench score isn't a consolation prize. For most production workloads it's the right model, and the architecture you'd build on Flash is the same architecture you'd build on Pro when it arrives.
- Migrating from Flash to Pro when Pro ships is a model-ID swap and a pricing adjustment. It's not an architectural rethink. Build on Flash, test your use case, swap when the slot opens.
- Google has shipped I/O commitments late before. This is a documented pattern, not a one-off. You'd be wise to plan for it.
Takeaways
- Build on Flash today. For any workload that fits in 1M tokens, Flash is production-ready and has the benchmarks to back it up.
- The 2M-token context window is the only Pro spec that materially changes your architecture. If your use case fits in 1M tokens, Pro doesn't change the design.
- Watch the Gemini API changelog — Pro will appear there simultaneously with (or before) any blog post. Subscribe or bookmark it.
- At ~$15/$60 per million tokens (expected), Pro costs 10× Flash on input and roughly 6.7× on output. Run your usage math before committing a workload budget to Pro.
- If Pro slips past June 30, update your planning assumptions about Google's ship timelines. One miss is a data point. Two is a pattern.
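One way to keep the Flash-to-Pro move a model-ID swap rather than a rewrite is to route every call through a single config lookup. The sketch below assumes this pattern. The model IDs, the `MODEL_TIER` environment variable, and the `ModelSpec` and `active_model` names are all hypothetical, and Pro's numbers are the expected ones, so check the real identifiers and prices once Pro is listed.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ModelSpec:
    model_id: str          # hypothetical ID; confirm against the API's model list
    context_tokens: int
    input_per_m: float     # USD per million input tokens
    output_per_m: float    # USD per million output tokens


MODELS = {
    "flash": ModelSpec("gemini-3.5-flash", 1_000_000, 1.50, 9.00),
    # Expected values only; Pro had not shipped when this was written.
    "pro": ModelSpec("gemini-3.5-pro", 2_000_000, 15.00, 60.00),
}


def active_model(env: Optional[Mapping[str, str]] = None) -> ModelSpec:
    """Pick the model tier from configuration, defaulting to Flash."""
    env = os.environ if env is None else env
    tier = env.get("MODEL_TIER", "flash")
    if tier not in MODELS:
        raise ValueError(f"unknown MODEL_TIER: {tier!r}")
    return MODELS[tier]


if __name__ == "__main__":
    spec = active_model()
    print(spec.model_id, spec.context_tokens)
```

With this shape, application code reads `active_model().model_id` everywhere, and switching tiers means changing one setting and re-running the budget math.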
Further reading
- Google I/O — Gemini 3.5 announcement — official spec claims and the "next month" commitment
- Artificial Analysis — Gemini 3.5 Flash benchmarks — independent evaluation of Flash's actual production performance
- WaveSpeed — What Flash tells us about Pro — best builder-lens analysis of what the delay means and how to plan around it
- Gemini API changelog — will show Pro's availability first, often before any announcement post