xiji2646-netizen

Has anyone else noticed Opus 4.6 getting worse at coding tasks?

I’ve been tracking this for the past two weeks and wanted to see if others are experiencing the same thing.

BridgeBench (an independent hallucination benchmark) now ranks Opus 4.6 at #10 with a 33% fabrication rate, down from #2 with 83.3% accuracy just weeks ago. That's roughly one in three responses containing fabricated information.

The root cause appears to be two default changes:

Effort level default dropped from “high” to “medium” (March 3, 2026)
Adaptive thinking introduced (Feb 9, 2026) — under medium effort, some turns get zero reasoning tokens

An AMD exec analyzed 6,852 sessions and measured a 67% reasoning depth drop. @om_patel5’s A/B test (same prompt, 4.6 vs 4.5) showed 4.6 failing 5/5 while 4.5 passed 5/5.

What’s working for me:


export CLAUDE_CODE_EFFORT_LEVEL=max

export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1

Or just run `/effort max` in each session.
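If you'd rather toggle both overrides together, here is a minimal sketch. It assumes the two variable names above are honored by your Claude Code version, which I haven't verified across releases. `apply_opus_workaround` is a made-up helper name, not part of any tool.

```shell
# Hypothetical helper: set both overrides from this post in the current shell.
# Whether your Claude Code version reads these variables is an assumption.
apply_opus_workaround() {
  export CLAUDE_CODE_EFFORT_LEVEL=max
  export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1
}

apply_opus_workaround

# Show what child processes started from this shell will inherit.
env | grep '^CLAUDE_CODE_'
```

Putting the function in your shell profile and calling it only when needed keeps the overrides out of sessions where you want the defaults.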

Some devs are switching back to Opus 4.5 entirely (`claude-opus-4-5-20251101`).

Curious: are you seeing the same patterns? Have the env vars helped? Anyone found other workarounds?

References: BridgeBench (bridgebench.ai/hallucination), GitHub Issue #42796

1 comment

#ai #opus

2026-04-16 14:09:33 UTC

Most Liked

peterchancc

Saw on X/Twitter that some people are also experiencing the same issue.
