xiji2646-netizen

Gemini 3.5 Flash launched today - quick breakdown for anyone running agent workloads

Google shipped Gemini 3.5 Flash at I/O 2026. By Google's own numbers, the “budget” Flash model now beats 3.1 Pro on coding and tool-calling benchmarks.

Key numbers (from Google):

  • MCP Atlas (tool calling): 83.6% vs 3.1 Pro’s 78.2%
  • Terminal-Bench (coding): 76.2% vs 70.3%
  • Finance Agent v2: 57.9% vs 43.0%
  • 4x faster and ~40% cheaper than 3.1 Pro
  • Pricing: $1.50/M input tokens, $9/M output tokens, $0.15/M cached tokens
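To get a feel for what those prices mean for an agent loop, here is a rough cost sketch in Python. The three rates come from the list above; the function name, the token counts in the usage line, and the assumption that cached tokens are billed at the cached rate instead of the input rate are mine, so check the official pricing page before relying on it.

```python
# USD per million tokens, copied from the numbers quoted in this post.
PRICE_PER_M = {"input": 1.50, "output": 9.00, "cached": 0.15}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption (not confirmed by the post): cached tokens are a subset of
    the input tokens and are billed at the cached rate instead of the
    regular input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * PRICE_PER_M["input"]
        + cached_tokens * PRICE_PER_M["cached"]
        + output_tokens * PRICE_PER_M["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical agent loop: 20 steps, 8k input tokens per step
    # (6k of them cached), 500 output tokens per step.
    per_step = estimate_cost(8_000, 500, cached_tokens=6_000)
    print(f"per step: ${per_step:.4f}, 20 steps: ${20 * per_step:.4f}")
```

With long, mostly cached prompts, the output rate dominates the bill, so trimming verbose tool responses probably matters more than shaving the prompt.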

Where it does NOT win:

  • Computer Use: not supported (GPT-5.5 only)
  • SWE-Bench Pro: Opus 4.7 still leads
  • Abstract reasoning: 3.1 Pro still edges it

My quick take on model routing:

  • Multi-tool agent loops → Flash
  • Heavy code refactoring → Opus 4.7
  • GUI automation → GPT-5.5
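The routing above can be sketched as a simple lookup table. This is only an illustration of the idea: the task categories, the model identifier strings, and the fallback choice are placeholders I made up, not real API model names.

```python
from enum import Enum


class Task(Enum):
    MULTI_TOOL_AGENT = "multi_tool_agent"
    HEAVY_REFACTOR = "heavy_refactor"
    GUI_AUTOMATION = "gui_automation"
    OTHER = "other"


# Placeholder identifiers; substitute whatever names your providers expose.
ROUTES = {
    Task.MULTI_TOOL_AGENT: "gemini-3.5-flash",
    Task.HEAVY_REFACTOR: "opus-4.7",
    Task.GUI_AUTOMATION: "gpt-5.5",
}

# Assumption: fall back to the cheapest, fastest option for anything unlisted.
DEFAULT_MODEL = "gemini-3.5-flash"


def pick_model(task: Task) -> str:
    """Return the model identifier for a task, or the default if unrouted."""
    return ROUTES.get(task, DEFAULT_MODEL)


if __name__ == "__main__":
    for task in Task:
        print(f"{task.value} -> {pick_model(task)}")
```

Keeping the table as data rather than if/else branches makes it easy to re-point a category once real-world results come in.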

Anyone tested it on real agent workflows yet? Curious how the 4x speed claim holds up in practice.
