Gemini 3.5 Flash

Gemini 3.5: frontier intelligence with action
Today, we’re introducing Gemini 3.5, our latest family of models combining frontier intelligence with action. This represents a major leap forward in building more capable, intelligent agents. We’re kicking off the series by releasing 3.5 Flash. It delivers frontier performance for agents and coding, excelling at complex long-horizon tasks that deliver real-world utility.
3.5 Flash is available today to billions of people globally:
- For everyone via the Gemini app and AI Mode in Google Search
- For developers in our agent-first development platform, Google Antigravity, and the Gemini API in Google AI Studio and Android Studio
- For enterprises in Gemini Enterprise Agent Platform and Gemini Enterprise
We’re also hard at work on 3.5 Pro. It's already being used internally, and we look forward to rolling it out next month.
3.5 Flash: frontier performance for agents and coding
Gemini 3.5 Flash delivers intelligence that rivals large flagship models on multiple dimensions, at the speeds you have come to expect from the Flash series. It’s our strongest agentic and coding model yet, outperforming Gemini 3.1 Pro on challenging coding and agentic benchmarks like Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%), and leading in multimodal understanding (84.2% on CharXiv Reasoning). Measured in output tokens per second, it is 4 times faster than other frontier models.
Landing in the top-right quadrant of the Artificial Analysis index, 3.5 Flash delivers frontier-level intelligence at exceptional speed — proving you no longer have to trade quality for latency.
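For readers who want to sanity-check a speed comparison like this on their own workloads, here is a minimal sketch of how an output-tokens-per-second ratio can be computed. The `GenerationRun` type, the function names and the numbers are hypothetical illustrations, not measurements or APIs from this announcement.

```python
from dataclasses import dataclass


@dataclass
class GenerationRun:
    """One timed generation (hypothetical record, not an official API type)."""
    output_tokens: int  # tokens the model emitted
    seconds: float      # wall-clock time spent generating them


def tokens_per_second(run: GenerationRun) -> float:
    """Output throughput of a single run."""
    if run.seconds <= 0:
        raise ValueError("seconds must be positive")
    return run.output_tokens / run.seconds


def speedup(candidate: GenerationRun, baseline: GenerationRun) -> float:
    """How many times faster the candidate is than the baseline."""
    return tokens_per_second(candidate) / tokens_per_second(baseline)


# Illustrative numbers only: the same 2,000-token answer, timed twice.
fast = GenerationRun(output_tokens=2000, seconds=5.0)
slow = GenerationRun(output_tokens=2000, seconds=20.0)

print(tokens_per_second(fast))  # 400.0
print(speedup(fast, slow))      # 4.0
```

A ratio like this only describes decoding throughput. Time to first token and total task time can differ, so they are worth measuring separately.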
3.5 Flash: agentic tasks at scale
This balance of speed and performance makes 3.5 Flash ideal for tackling long-horizon agentic tasks. What used to take a developer days or an auditor weeks, 3.5 Flash can now help complete in a fraction of the time, often at less than half the cost of other frontier models. It rapidly plans, builds and iterates to solve real-world problems, whether it’s developing new applications, maintaining codebases or helping to prepare financial documents.
When coupled with the updated Antigravity harness, 3.5 Flash becomes a powerful engine for deploying collaborative subagents to tackle problems at scale for the most demanding use cases. Under supervision, it can reliably execute multi-step workflows and coding tasks while sustaining frontier performance.
Powered by Antigravity, 3.5 Flash executes multi-step workflows to automatically rename and categorize unstructured assets based on dynamic criteria.
Leveraging Antigravity, 3.5 Flash uses two agents to synthesize the AlphaZero paper and code a fully playable game in six hours.
3.5 Flash uses the Antigravity harness to transform a messy legacy codebase to Next.js.
3.5 Flash uses subagents to create new city landscapes in Antigravity.
3.5 Flash uses two agents, a builder and a player, working in a rapid self-improvement loop to develop a game in Antigravity.
Building on the strong multimodal foundation of Gemini 3, 3.5 Flash generates richer, more interactive web UIs and graphics.
3.5 Flash creates interactive animations for a research paper on AI Studio.
3.5 Flash turns a plain text description into interactive hardware on AI Studio.
3.5 Flash executes multiple concepts in parallel to build a full branding concept for a school fundraiser on AI Studio.
3.5 Flash generates different UX approaches for a checkout flow in just 60 seconds on AI Studio.
3.5 Flash: real-world impact
3.5 Flash’s real-world agentic capabilities are already driving meaningful progress for our developers and enterprises alike. In developing the 3.5 model series, we worked closely with industry partners to understand where toil and complexity arose in their workflows. Partners are seeing meaningful impact — from banks and fintechs automating multi-week workflows to data science teams unearthing insights amidst complex data environments.
Shopify is running subagents in parallel to analyze complex data over a long horizon for more accurate merchant growth forecasts at a global scale.
Macquarie Bank is piloting how 3.5 Flash can accelerate customer onboarding by reasoning over complex 100+ page documents, retrieving relevant information and making reliable recommendations with low latency.
Salesforce is integrating 3.5 Flash into Agentforce to reliably automate complicated enterprise tasks by deploying multiple subagents that retain context and execute complex, multi-turn tool calling.
3.5 Flash is helping Ramp enable smarter, more reliable OCR through multimodal understanding of complex invoices combined with reasoning over historical patterns.
Xero is deploying agents to autonomously manage complex, multi-week workflows, such as identifying suppliers and gathering information for 1099 tax forms, enabling small businesses to automate tedious admin tasks.
Databricks is using agentic workflows to monitor and retrieve real-time information, reason across massive datasets to diagnose issues, identify fixes and propose solutions for data scientists.
Personal AI agents: built with 3.5 Flash
3.5 Flash is now the default model for the Gemini app and AI Mode in Search globally. At I/O today, we showed how its agentic capabilities are powering new features to bring frontier-level intelligence to your daily life.
The new Gemini Spark, your personal AI agent, uses 3.5 Flash. It runs 24/7, helping you navigate your digital life, taking action on your behalf while under your direction. We’re starting to roll out Gemini Spark to trusted testers today, and we’re planning on bringing the Beta to Google AI Ultra subscribers in the US next week.
Gemini Spark uses 3.5 Flash to help accomplish these tasks
The enhanced agentic coding capabilities of 3.5 Flash are also delivering even more intelligent experiences across Search, from introducing new information agents that work for you 24/7 to unlocking more dynamic generative UI experiences. Learn more in our blog post.
Search leverages 3.5 Flash to build an interactive visual explaining Gyroid patterns.
Gemini 3.5: built with frontier safeguards
Gemini 3.5 was developed in accordance with our Frontier Safety Framework. We have strengthened our cyber and CBRN safeguards, which means it is less likely to generate harmful content and less likely to mistakenly refuse safe queries. We achieve this with new, more advanced safety training and mitigations, including interpretability tools that help check and understand the AI's inner reasoning before it provides a response.
3.5 Flash is available today
Gemini 3.5 Flash is generally available via Google Antigravity, the Gemini API in Google AI Studio and Android Studio, Gemini Enterprise Agent Platform and Gemini Enterprise. It’s also now available to everyone in the Gemini app and AI Mode in Search. On behalf of the entire Gemini team, we can’t wait to see what you build.

Facts Only

Google has launched Gemini 3.5, a new AI model family, starting with Gemini 3.5 Flash.
Gemini 3.5 Flash is available globally via the Gemini app, AI Mode in Google Search, Google Antigravity, the Gemini API, and enterprise platforms.
The model excels in coding and agentic tasks, outperforming Gemini 3.1 Pro on benchmarks like Terminal-Bench 2.1 (76.2%) and GDPval-AA (1656 Elo).
It is four times faster than other frontier models in output tokens per second.
Enterprises like Shopify, Macquarie Bank, Salesforce, Ramp, Xero, and Databricks are using Gemini 3.5 Flash for automation and complex workflows.
Gemini Spark, a personal AI agent powered by Gemini 3.5 Flash, is rolling out to trusted testers, and Google plans to bring the beta to Google AI Ultra subscribers in the US next week.
Gemini 3.5 was developed with enhanced safety measures, including cyber and CBRN safeguards.
Gemini 3.5 Pro is expected to be released next month.

Executive Summary

Google has introduced Gemini 3.5, a new family of AI models designed to combine advanced intelligence with actionable capabilities. The first release, Gemini 3.5 Flash, is now available globally through the Gemini app, AI Mode in Google Search, developer platforms like Google Antigravity and the Gemini API, and enterprise solutions. According to Google, it outperforms Gemini 3.1 Pro on coding and agentic benchmarks such as Terminal-Bench 2.1 (76.2%) and GDPval-AA (1656 Elo), and it generates output tokens four times faster than other frontier models. Enterprises like Shopify, Macquarie Bank, and Salesforce are using or piloting the model to automate complex workflows, and it powers new personal AI features like Gemini Spark. Google emphasizes that Gemini 3.5 was developed with enhanced safety measures to reduce both harmful outputs and mistaken refusals of safe queries. The next model in the series, Gemini 3.5 Pro, is expected to roll out next month.

Full Take

Google’s announcement of Gemini 3.5 Flash presents a compelling narrative of AI advancement, emphasizing speed, capability, and real-world utility. The strongest version of this narrative highlights tangible improvements in performance, cost efficiency, and enterprise adoption, with concrete examples of automation in banking, e-commerce, and data science. However, the pattern scan reveals potential elements of **ARC-0024 Ambiguity**—the claims of "frontier intelligence" and "real-world impact" are broad and lack specific, measurable outcomes beyond benchmark scores. The emphasis on speed and cost reduction could also align with **ARC-0043 Motte-and-Bailey**, where the "motte" (technical benchmarks) is defensible, but the "bailey" (transformative real-world impact) remains speculative.
The root cause of this narrative is the tech industry’s paradigm of rapid iteration and scalability, where AI models are framed as universal solutions to complex human and organizational problems. The unstated assumption is that automation inherently improves efficiency and reduces toil, but the second-order consequences—such as job displacement, dependency on proprietary systems, or the ethical implications of autonomous agents—are not addressed. The focus on enterprise adoption also raises questions about who benefits most: large corporations with the resources to integrate these tools, or end-users who may see their workflows reshaped by AI-driven decisions.
Bridge questions: What independent verification exists for the claimed enterprise productivity gains? How does Google’s safety framework address the risks of autonomous agents making high-stakes decisions? If the Gemini 3.5 Flash announcement were part of a coordinated influence campaign, the playbook would involve leveraging benchmark superiority to establish dominance in the AI market while downplaying potential risks. However, the content does not exhibit overt manipulation, as the claims are tied to observable technical metrics and real-world pilot programs.
Patterns detected: ARC-0024 Ambiguity, ARC-0043 Motte-and-Bailey

Sentinel — Human

Confidence

The text exhibits the high structural polish and integration of proprietary performance data typical of high-level corporate communication, suggesting a human-authored or heavily human-edited source.

Signals Detected

Sentence length variance is managed well, adhering to a clear, informational rhythm. The tone shifts effectively between high-level vision and concrete benchmarks.

The text is highly coherent, maintaining a consistent, promotional voice across technical and business applications. It lacks the erratic flow or personal emphasis typical of purely synthetic generation.

The structure closely follows a press release template, using clear topic sentences and transitioning smoothly between model specs, agent capabilities, and enterprise applications. This indicates a high degree of coordination, likely from an internal editorial process.

The claims are specific and benchmark-heavy (e.g., Terminal-Bench 2.1 (76.2%), 4x speed), which suggests these figures are sourced from internal testing/benchmarks rather than general LLM hallucination. The claims align with standard industry marketing practices.

Human Indicators

Specific, proprietary-sounding performance benchmarks (Terminal-Bench, GDPval-AA) suggest internal, verifiable data attribution.

The narrative successfully weaves together technical features (agents, coding) with practical, verified external use cases (Shopify, Macquarie Bank, Salesforce), indicating human integration of market knowledge.

The use of technical jargon is precise and applied contextually, demonstrating expertise beyond general LLM fluency.