xiji2646-netizen

Gemini 3.5 Flash launched today - quick breakdown for anyone running agent workloads

Google shipped 3.5 Flash at I/O 2026. The “budget” Flash model now beats 3.1 Pro on coding and tool-calling benchmarks.

Key numbers (from Google):

  • MCP Atlas (tool calling): 83.6% vs 3.1 Pro’s 78.2%
  • Terminal-Bench (coding): 76.2% vs 70.3%
  • Finance Agent v2: 57.9% vs 43.0%
  • 4x faster and ~40% cheaper than 3.1 Pro
  • Pricing per 1M tokens: $1.50 input, $9 output, $0.15 cached input
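
To make the pricing concrete, here is a rough cost sketch using the three rates quoted above. The token counts are made up for illustration, and I'm assuming cached tokens are billed at the cached rate instead of the input rate, so check the actual billing rules before relying on it.

```python
# USD per 1M tokens, as quoted in the post.
PRICE_PER_M = {"input": 1.50, "output": 9.00, "cached": 0.15}


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    Assumption: cached_tokens are billed at the cached rate and are
    NOT also counted in input_tokens.
    """
    total = (
        input_tokens * PRICE_PER_M["input"]
        + output_tokens * PRICE_PER_M["output"]
        + cached_tokens * PRICE_PER_M["cached"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical agent loop: 20 steps, each with 8k fresh input tokens,
    # 500 output tokens, and 30k tokens served from cache.
    per_step = estimate_cost(8_000, 500, cached_tokens=30_000)
    print(f"per step: ${per_step:.4f}, 20 steps: ${per_step * 20:.2f}")
```

With those made-up numbers the output tokens and the cached prefix cost about the same per step, so for long agent loops the cache hit rate matters as much as the headline input price.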

Where it does NOT win:

  • Computer Use: not supported (GPT-5.5 only)
  • SWE-Bench Pro: Opus 4.7 still leads
  • Abstract reasoning: 3.1 Pro still edges it

My quick take on model routing:

  • Multi-tool agent loops → Flash
  • Heavy code refactoring → Opus 4.7
  • GUI automation → GPT-5.5
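
The routing above could be sketched as a simple lookup table. The task categories and the fallback choice are my own assumptions, and the strings are just display labels, not real API model IDs.

```python
from enum import Enum


class Task(Enum):
    MULTI_TOOL_AGENT = "multi_tool_agent"
    HEAVY_REFACTOR = "heavy_refactor"
    GUI_AUTOMATION = "gui_automation"
    OTHER = "other"


# Labels only; swap in whatever model identifiers your provider uses.
ROUTES = {
    Task.MULTI_TOOL_AGENT: "Gemini 3.5 Flash",
    Task.HEAVY_REFACTOR: "Opus 4.7",
    Task.GUI_AUTOMATION: "GPT-5.5",
}


def pick_model(task, default="Gemini 3.5 Flash"):
    """Return the model label for a task, falling back to `default`.

    The fallback is an assumption (cheapest option first), not
    something the benchmarks prescribe.
    """
    return ROUTES.get(task, default)


if __name__ == "__main__":
    for task in Task:
        print(f"{task.value} -> {pick_model(task)}")
```

Keeping the table as plain data makes it easy to change a route once real-world results come in.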

Anyone tested it on real agent workflows yet? Curious how the 4x speed claim holds up in practice.
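
If you want to check the speed claim yourself, a minimal timing harness might look like this. `call` is a placeholder for whatever function sends one request to a model. The demo uses `time.sleep` stand-ins, not real API calls, so the numbers it prints mean nothing about the actual models.

```python
import statistics
import time


def time_calls(call, runs=5):
    """Run `call()` `runs` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        durations.append(time.perf_counter() - start)
    return durations


def speedup(baseline, candidate):
    """Ratio of median latencies; a value above 1 means candidate is faster."""
    return statistics.median(baseline) / statistics.median(candidate)


if __name__ == "__main__":
    # Stand-ins for real requests (hypothetical latencies).
    def slow_model():
        time.sleep(0.04)

    def fast_model():
        time.sleep(0.01)

    ratio = speedup(time_calls(slow_model), time_calls(fast_model))
    print(f"median speedup: {ratio:.1f}x")
```

Medians are used instead of means so a single slow request does not skew the ratio. For agent loops it is worth timing the whole multi-step run, not just one call.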
