xiji2646-netizen

Has anyone else noticed Opus 4.6 getting worse at coding tasks?

I’ve been tracking this for the past two weeks and wanted to see if others are experiencing the same thing.

BridgeBench (an independent hallucination benchmark) now ranks Opus 4.6 at #10 with a 33% fabrication rate, down from #2 with 83.3% accuracy just weeks ago. If the two figures are complementary, that is a drop from about 83% to about 67% accuracy, or roughly one in three responses containing fabricated information.

The root cause appears to be two default changes:

  • Effort level default dropped from “high” to “medium” (March 3, 2026)

  • Adaptive thinking introduced (Feb 9, 2026) — under medium effort, some turns get zero reasoning tokens

An AMD exec analyzed 6,852 sessions and measured a 67% reasoning depth drop. @om_patel5’s A/B test (same prompt, 4.6 vs 4.5) showed 4.6 failing 5/5 while 4.5 passed 5/5.

What’s working for me:


export CLAUDE_CODE_EFFORT_LEVEL=max

export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1

Or just run `/effort max` per session.
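Exports like these are easy to lose when a new terminal or tmux pane starts, so a quick check before launching a session can help. Below is a minimal sketch, assuming only the two variable names quoted in this post (I have not verified them against the official docs); the function name `check_claude_env` is hypothetical and not part of any tool.

```shell
# Sketch: report whether the two variables from this post are set in the
# current shell. Variable names are taken from the post as-is;
# check_claude_env is a made-up helper name.
check_claude_env() {
  missing=0
  for name in CLAUDE_CODE_EFFORT_LEVEL CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING; do
    # Indirect lookup via eval keeps this POSIX sh compatible.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      missing=1
    else
      echo "$name=$value"
    fi
  done
  # Exit status 0 only when both variables are present and non-empty.
  return "$missing"
}
```

Running `check_claude_env` prints one line per variable and returns non-zero if either is missing, so it can gate a launch, e.g. `check_claude_env && claude`.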

Some devs are switching back to Opus 4.5 entirely (`claude-opus-4-5-20251101`).

Curious: are you seeing the same patterns? Have the env vars helped? Anyone found other workarounds?

References: BridgeBench (bridgebench.ai/hallucination), GitHub Issue #42796

peterchancc

Saw on X/Twitter that some people are also experiencing the same issue.
