On what went GA versus what's still in preview, how the final pricing landed against Flash, and whether the 2M context window changes the architecture decision you've been deferring
Gemini 3.5 Pro made its deadline. The reasoning mode it's named for didn't.
Google made the deadline. Barely.
Gemini 3.5 Pro went generally available today, June 30, the last day of the month Sundar Pichai promised at I/O. Vertex AI first, Google AI Studio rolling in through the afternoon. The 2-million-token context window is real, accessible, and working. Pricing is confirmed at $14 per million input tokens and $56 per million output tokens: a dollar cheaper on input and four dollars cheaper on output than the $15/$60 announced at I/O, which is the right kind of surprise.
The catch: Deep Think isn't there.
Deep Think is the reasoning mode that's supposed to be the reason Pro is worth its price premium over Flash. It's still in limited enterprise preview. Not in AI Studio. Not in the standard Vertex tier. Prediction markets had an on-time GA at 50-55% odds. The deadline held; the full product didn't.
What's actually in the GA launch
Precise accounting of what you have as of today:
- 2-million-token context window. Live, in production. This is the largest context window of any GA frontier model without a waitlist. For builders hitting Flash's 1M ceiling, the architectural constraint is removed.
- Standard generation. Text, code, multimodal input. Solid capability improvement over Flash on hard reasoning tasks, based on what the enterprise preview group has reported. No independent benchmark to cite yet.
- Vertex AI and AI Studio access. Both confirmed live.
- $14/$56 per million tokens. The input price came in a dollar under spec. Not material to most budgets, but the right direction.
What's not there yet:
- Deep Think reasoning mode. Limited enterprise preview only. Expected full GA "in July" — no specific date. Deep Think is what drove Flash's 76.2% Terminal-Bench 2.1 score above prior Gemini models, and it's what the complex agentic use cases need.
- Benchmark numbers. No SWE-bench Verified. No Terminal-Bench 2.1. No MMMU. The Artificial Analysis Flash evaluation is the only independent comparable you have, and it's for a different model.
The Deep Think problem
This is the one worth slowing down on.
Deep Think is the primary differentiator between Pro and Flash. Without it, you're paying $14/M input instead of Flash's $1.50/M mainly for the context window upgrade. The 1M → 2M jump is real and useful, but it's not what most builders have been waiting on Pro for. The agentic long-horizon use cases (multi-step planning, extended session reasoning, document analysis that requires inference across many parts of a large corpus) get meaningfully better with Deep Think. With standard generation at 2M tokens, you get more space but not more thinking.
Two things make this defensible and one makes it frustrating.
Defensible: The 2M window is genuinely useful on its own. Context length and reasoning depth are separable capabilities. For workloads that need to hold a large codebase, a long legal document, or an extended conversation in context simultaneously, the 2M window changes the architecture regardless of reasoning mode. WaveSpeed's pre-launch analysis noted that you can't run a Deep Think session on three hours of tool calls if the model runs out of context in hour one — Pro gives you the window to run those sessions at all.
Also defensible: the deadline held. Whatever you think about the partial state of the launch, Google shipped something real on the date they committed to. The June 21 article on this site gave the GA odds at coin-flip. The flip came up right.
Frustrating: "In July" is not a date. July has 31 days. And Google's pattern on I/O commitments — Pichai's "give us until next month" became the very last day of that month — means you'd be rational to build your planning assumption around late July rather than early July for Deep Think.
| Spec | Gemini 3.5 Flash (GA) | Gemini 3.5 Pro (GA June 30) |
|---|---|---|
| Context window | 1M tokens | 2M tokens |
| Input price | $1.50/M | $14.00/M |
| Output price | $9.00/M | $56.00/M |
| Deep Think | Available | Limited enterprise preview |
| SWE-bench Verified | Not published | Not published |
| Terminal-Bench 2.1 | 76.2% | Not published |
| GA date | May 19, 2026 | June 30, 2026 |
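To make the price gap in the table concrete, here is a minimal sketch of a per-request cost comparison. The prices come from the table above; the workload (800,000 input tokens, 4,000 output tokens) and the names `Pricing` and `request_cost` are illustrative assumptions, not anything Google publishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens, taken from the comparison table."""
    input_per_million: float
    output_per_million: float


FLASH = Pricing(input_per_million=1.50, output_per_million=9.00)
PRO = Pricing(input_per_million=14.00, output_per_million=56.00)


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the given per-million prices."""
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical long-context request: large prompt, short answer.
    input_tokens, output_tokens = 800_000, 4_000
    flash_cost = request_cost(FLASH, input_tokens, output_tokens)
    pro_cost = request_cost(PRO, input_tokens, output_tokens)
    print(f"Flash: ${flash_cost:.2f}")
    print(f"Pro:   ${pro_cost:.2f}")
    print(f"Pro / Flash: {pro_cost / flash_cost:.1f}x")
```

For that assumed workload the request costs about $1.24 on Flash and about $11.42 on Pro. Input-heavy requests track the input price ratio closely, so the premium only narrows for workloads dominated by output tokens.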
Source spread
- Google AI changelog — builder. The GA announcement as it appeared. This is where Deep Think status updates will land first.
- Google Cloud Vertex AI pricing — builder. Confirmed $14/$56. The dollar-under-spec is here.
- Artificial Analysis — Flash evaluation — builder. The only independent benchmark data for the Gemini 3.5 family. For Flash, not Pro, but it's the reference you have.
- WaveSpeed — Flash as the signal for Pro — builder. Pre-launch analysis that remains the best builder-lens perspective on the Flash-to-Pro capability gap.
Pros & cons
What's real:
- The 2M context window is not vaporware. It works. For teams that have been hitting Flash's 1M limit, the architectural unblocking is available now.
- Flash → Pro migration is a model-ID swap. Same API shape, no client-side changes needed. When Deep Think ships, enabling it is a parameter addition, not a rewrite.
- The GA landed on the committed date. One data point in Google's favor on shipping I/O promises.
- Pricing came in under spec. Minor, but "cheaper than announced" beats the alternative.
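If the migration really is a model-ID swap, client code that keeps the model ID in one place should need a one-field change. The sketch below shows that shape without calling any SDK. The model ID strings, the `GenerationRequest` type, and the `deep_think` parameter name are all hypothetical placeholders; check the official API reference for the real identifiers once they are published.

```python
from dataclasses import dataclass, field, replace
from typing import Any

# Hypothetical model IDs, used only for illustration.
FLASH_MODEL_ID = "gemini-3.5-flash"
PRO_MODEL_ID = "gemini-3.5-pro"


@dataclass(frozen=True)
class GenerationRequest:
    """A client-side request description, independent of any SDK."""
    model_id: str
    prompt: str
    extra_params: dict[str, Any] = field(default_factory=dict)


def migrate_to_pro(request: GenerationRequest) -> GenerationRequest:
    """Return a copy of the request that targets Pro; nothing else changes."""
    return replace(request, model_id=PRO_MODEL_ID)


def enable_reasoning_mode(
    request: GenerationRequest, param_name: str = "deep_think"
) -> GenerationRequest:
    """Return a copy with one extra parameter set.

    `param_name` is a placeholder: the real parameter name is not public.
    """
    return replace(
        request, extra_params={**request.extra_params, param_name: True}
    )


if __name__ == "__main__":
    flash_request = GenerationRequest(FLASH_MODEL_ID, "Summarize this repo.")
    pro_request = migrate_to_pro(flash_request)
    print(pro_request)
    print(enable_reasoning_mode(pro_request))
```

Keeping the request immutable and deriving variants from it makes it easy to run the same prompt against both models while validating quality at the new price point.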
What deserves a side-eye:
- No independent benchmark numbers means you're making production routing decisions without the evidence base you'd want. The Flash comparison is useful but it's for a different model.
- At $14/M without Deep Think, the economic case for Pro over Flash is narrow. You're paying a roughly 9.3× input premium ($14.00 vs. $1.50 per million tokens) for the context window alone, not the full reasoning tier the price implies.
- "In July" for Deep Think is a commitment without a date. Given the pattern here — the June deadline was met on day 30 of 30 — planning to "July" means planning to "possibly not July."
- Google still hasn't published SWE-bench Verified or Terminal-Bench 2.1 numbers for Pro. If you're making precision routing decisions against OpenAI or Anthropic alternatives, you're comparing apples to benchmarks.
What builders need to know
- The 2M context window is live. If this was your blocker, you can build now. Flash → Pro is a model-ID swap. Test your workload, validate quality at the new price point, then decide.
- At $14/$56 without Deep Think, run the cost math carefully. For most workloads, Flash at $1.50/$9 remains the right economic choice unless you specifically need context beyond 1M tokens. Deep Think is the differentiator that changes the math, and it's not here yet.
- Deep Think ETA is "July" — build for August. Google's "June" commitment ran to the last day of June. Buffer their "July" commitment by a month in your planning.
- No independent SWE-bench Verified or Terminal-Bench 2.1 yet. Don't make precision routing decisions against GPT-5.5 or Opus 4.8 based on Flash's benchmark numbers as a Pro proxy. Wait for Artificial Analysis or LiveCodeBench to publish Pro numbers.
- Watch the Gemini API changelog. Deep Think will show up there first, before any announcement post. Subscribe or bookmark it.
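The routing advice above (stay on Flash unless the context exceeds 1M tokens) can be sketched as a small function. The context limits are the ones reported in this article; the model ID strings and the `choose_model` name are hypothetical, and how you estimate token counts is left to your own tokenizer or API.

```python
# Context limits as reported in this article.
FLASH_CONTEXT_LIMIT = 1_000_000
PRO_CONTEXT_LIMIT = 2_000_000

# Hypothetical model IDs, used only for illustration.
FLASH_MODEL_ID = "gemini-3.5-flash"
PRO_MODEL_ID = "gemini-3.5-pro"


def choose_model(estimated_context_tokens: int) -> str:
    """Pick the cheapest model whose context window fits the request."""
    if estimated_context_tokens < 0:
        raise ValueError("token estimate must be non-negative")
    if estimated_context_tokens <= FLASH_CONTEXT_LIMIT:
        return FLASH_MODEL_ID
    if estimated_context_tokens <= PRO_CONTEXT_LIMIT:
        return PRO_MODEL_ID
    raise ValueError(
        f"{estimated_context_tokens} tokens exceeds the largest "
        f"available window ({PRO_CONTEXT_LIMIT})"
    )


if __name__ == "__main__":
    for tokens in (200_000, 1_000_000, 1_400_000):
        print(tokens, "->", choose_model(tokens))
```

This routes on context size alone. Once Deep Think is generally available on Pro, a second input such as task type would be a reasonable addition, but that decision needs the Pro benchmark numbers that are not yet published.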
Further reading
- Google AI — Gemini API changelog — the GA announcement and where Deep Think status updates will land
- Google Cloud — Vertex AI generative AI pricing — confirmed $14/$56 pricing
- Artificial Analysis — Gemini 3.5 Flash independent evaluation — the only independent benchmark data in the Gemini 3.5 family currently published
- WaveSpeed — What Flash tells us about Pro — builder analysis of the Flash-to-Pro capability relationship
- Google I/O 2026 — Gemini 3.5 announcement — the original I/O commitment and announced specifications