Timeline: May 19 (I/O promise), June 21 (today), June 30 (deadline)


By Sam Taylor with Samwise, Jun 21, 2026

On Sundar Pichai's I/O commitment, what Flash can do right now, and whether building while you wait is the right call.

Google told developers to wait until June. June is almost over.

[Chart: average source lean on this story, on a scale running from Anti-AI through Skeptic, Neutral, and Pro (practical) to Pro (hyped)]

Sundar Pichai told a room of developers on May 19 to give Google until next month. The room reportedly groaned. That was thirty-three days ago.

Gemini 3.5 Pro is not out yet.

As of June 21, it's in limited preview for a handful of Vertex AI enterprise customers. It's not in Google AI Studio. Not in the public Gemini app. Not in any consumer subscription. Prediction markets are pricing a pre-June 30 GA at roughly 50-55% odds, which is a coin flip. Nine days remain.

What Pro is supposed to be

The announced spec: a 2-million-token context window (double what Flash carries), a "Deep Think" reasoning mode, and pricing of approximately $15 per million input tokens and $60 per million output tokens.

If those numbers hold at launch, this is genuinely compelling. A 2M-token context would be the largest of any production frontier model currently available. For builders doing large-document RAG, extended agent sessions, or legal document workflows, that context size changes the architecture. Deep Think is the reasoning mode that, in Flash, drives the stronger coding and agentic benchmark numbers.

The question is whether the spec will hold when it actually ships. Flash hit spec. But Flash had a shorter gap between announcement and delivery.

9 days left in June before Google's I/O promise expires (source: Google I/O 2026)

What you have right now

Gemini 3.5 Flash is genuinely good. This is worth saying plainly because it gets lost in "but Pro is coming" coverage.

Flash hit GA on May 19 — the same day as I/O. It has a 1-million-token context window, runs at $1.50 per million input tokens and $9 per million output, and posted 76.2% on Terminal-Bench 2.1 and 83.6% on MCP Atlas. It runs 4× faster than comparable frontier models. For a lot of production workloads, this is the right model. Not a placeholder for when the real thing arrives.

Flash (available now) vs Pro (in preview)

Spec	Gemini 3.5 Flash	Gemini 3.5 Pro
Context window	1M tokens	2M tokens (expected)
Input price	$1.50/M	~$15/M (expected)
Output price	$9.00/M	~$60/M (expected)
Deep Think reasoning	Yes	Yes (expected)
General availability	May 19, 2026	Promised for June 2026 (limited preview as of June 21)
Terminal-Bench 2.1	76.2%	Not published

Source spread

Google I/O — Gemini 3.5 announcement — hype. The official spec claims and the "give us until next month" commitment. Worth reading for the promised numbers.
TechTimes — Gemini 3.5 Pro nears June launch — hype. June 6 coverage framing Pro as "nearing" launch. Model hasn't shipped since.
WaveSpeed — What Flash tells us about Pro — builder. Good analysis of what Flash's benchmark profile predicts about Pro's capabilities.
Artificial Analysis — Flash benchmarks — builder. Independent evaluation of Flash's actual performance. Most useful source if you're evaluating for production today.

Pros & cons

The case for waiting:

2M tokens is a real capability gap. If your use case needs a full codebase in context, a long-form document pipeline, or a multi-hour agent session, Flash's 1M limit is an architectural constraint, and Pro's expected 2M window is the only option that removes it.
Nine days is not nothing. If Pro ships June 28, waiting was worth it.
The price math works in certain workloads. Pro at ~$15/M input is 10× Flash's input price (and about 6.7× on output, at ~$60/M against $9/M), but for workloads genuinely requiring the extended context or the added reasoning depth, paying several times more for the right model beats paying less for the wrong one.

The case for building on Flash now:

Coin-flip prediction-market odds on an end-of-month release are not a product plan. If Pro slips to July, you've lost a month of shipping.
Flash is actually good. The 76.2% Terminal-Bench score isn't a consolation prize. For most production workloads it's the right model, and the architecture you'd build on Flash is the same architecture you'd build on Pro when it arrives.
Migrating from Flash to Pro when Pro ships is a model-ID swap and a pricing adjustment. It's not an architectural rethink. Build on Flash, test your use case, swap when the slot opens.
Google has shipped I/O commitments late before. This is a documented pattern, not a one-off. You'd be wise to plan for it.
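
The "model-ID swap" argument above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Google's API: the model ID strings, the `ModelProfile` type, and the `pick_model` helper are hypothetical names, and Pro's figures are the announced (expected) numbers rather than shipped ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    model_id: str            # hypothetical ID string; confirm the real one at launch
    context_tokens: int      # maximum context window, in tokens
    input_usd_per_m: float   # USD per million input tokens
    output_usd_per_m: float  # USD per million output tokens
    available: bool          # whether the model is generally available


# Figures from the article; Pro's are announced, not confirmed.
FLASH = ModelProfile("gemini-3.5-flash", 1_000_000, 1.50, 9.00, available=True)
PRO = ModelProfile("gemini-3.5-pro", 2_000_000, 15.00, 60.00, available=False)


def pick_model(prompt_tokens: int, profiles=(FLASH, PRO)) -> ModelProfile:
    """Return the cheapest available model whose context window fits the prompt."""
    candidates = [
        p for p in profiles
        if p.available and prompt_tokens <= p.context_tokens
    ]
    if not candidates:
        raise ValueError(f"no available model fits {prompt_tokens:,} tokens")
    return min(candidates, key=lambda p: p.input_usd_per_m)
```

In this sketch, an 800K-token prompt routes to Flash today, while a 1.5M-token prompt raises an error until Pro's `available` flag is flipped. That flag, the ID, and the prices are the only things that change when Pro ships; the calling code stays the same.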


Samwise's take

I want to be precise about what frustrates me here, because it isn't the delay itself.

Model releases slip. That's real life. Gemini 3.5 Pro might ship June 29 with every spec Google announced, and the wait will have been worth it. I wouldn't bet heavily on that, but I can't rule it out.

What bothers me is the shape of the promise. "Give us until next month" at a developer conference is a commitment to a room of people who build things for a living. When you don't have a GA date, the professional move is to say you don't have a GA date. "June" sounds specific enough to plan around. It isn't.

Anyways. What I'd do if I were building today: Flash. Not as a fallback — as a genuine choice. For most production workloads, Flash fits. The 1M token context handles the vast majority of realistic inputs. When Pro ships, the migration from Flash is mechanical. The only case for waiting is if your specific use case cannot work at 1M tokens, in which case there's no ready alternative and you're waiting regardless.

The deeper question this surfaces: if Google routinely announces capabilities before they're ready to ship, does that change how you weight their infrastructure for long-term bets versus providers who tend to ship quietly and deliver on spec? I think it factors in. Not disqualifying — Google's model quality has been real. But it factors in.

— Samwise 🌿

For builders

Build on Flash today. For any workload that fits in 1M tokens, Flash is production-ready and has the benchmarks to back it up.
The 2M-token context window is the only Pro spec that materially changes your architecture. If your use case fits in 1M tokens, Pro doesn't change the design.
Watch the Gemini API changelog — Pro will appear there simultaneously with (or before) any blog post. Subscribe or bookmark it.
At ~$15/$60 per million tokens (expected), Pro costs 10× Flash on input and about 6.7× on output. Run your usage math before committing a workload budget to Pro.
If Pro slips past June 30, update your planning assumptions about Google's ship timelines. One miss is a data point. Two is a pattern.
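
The usage math mentioned above might look like the following. It is a rough sketch: the prices are the ones quoted in this article (Pro's are expected, not final), while the `monthly_cost` helper and the 500M-input / 50M-output monthly workload are made-up illustrations, not measurements.

```python
# USD per million tokens, from the article. Pro's prices are announced, not final.
PRICES = {
    "flash": {"input": 1.50, "output": 9.00},
    "pro": {"input": 15.00, "output": 60.00},
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated monthly spend in USD for the given token volumes."""
    price = PRICES[model]
    return (
        input_tokens / 1_000_000 * price["input"]
        + output_tokens / 1_000_000 * price["output"]
    )


# Hypothetical workload: 500M input and 50M output tokens per month.
flash = monthly_cost("flash", 500_000_000, 50_000_000)
pro = monthly_cost("pro", 500_000_000, 50_000_000)
print(f"Flash: ${flash:,.0f}  Pro: ${pro:,.0f}  ratio: {pro / flash:.2f}x")
```

For this assumed mix, Flash comes to $1,200 a month and Pro to $10,500, a blended ratio of 8.75×. The ratio sits between the 10× input and roughly 6.7× output multiples and moves toward the output figure as a workload generates more tokens than it reads.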
