xiji2646-netizen

Anthropic's agents now review their own past sessions and self-improve. Thoughts?

Anthropic shipped something called Dreaming for Managed Agents this week. It’s a scheduled background process that runs between sessions — the agent reviews its own past conversation transcripts, extracts patterns, and writes learnings into memory. No human in the loop unless you want one.

The framing that stuck with me: individual sessions are blind to cross-session patterns. A support agent won’t notice it made the same classification error 12 times this month. Dreaming is designed to surface exactly that kind of signal.
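To make the cross-session point concrete, here is a toy sketch of my own (not Anthropic's implementation, and the `SessionError` record and `recurring_errors` helper are hypothetical names). It assumes each session has already been reduced to a list of labelled mistakes, and it only counts how many distinct sessions each mistake shows up in, which is something no single session can see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionError:
    """One labelled mistake pulled from a session transcript (hypothetical shape)."""
    session_id: str
    kind: str


def recurring_errors(errors, min_sessions=3):
    """Return {error kind: session count} for kinds seen in >= min_sessions sessions."""
    sessions_per_kind = {}
    for err in errors:
        # Use a set so repeats inside one session count only once.
        sessions_per_kind.setdefault(err.kind, set()).add(err.session_id)
    return {
        kind: len(sessions)
        for kind, sessions in sessions_per_kind.items()
        if len(sessions) >= min_sessions
    }


# Made-up data: the same classification error in 12 sessions, plus a one-off.
errors = [SessionError(f"s{i}", "refund_tagged_as_billing") for i in range(12)]
errors.append(SessionError("s3", "wrong_language_reply"))

print(recurring_errors(errors))
```

Each session on its own sees one mistake at most; only the aggregate shows that one of them is a pattern and the other is noise.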

It ships alongside Outcomes (automated output grading against developer-defined rubrics) and multi-agent orchestration (coordinator + up to 20 parallel subagents, now in public beta). The three are meant to work as a loop: orchestration decomposes work, Outcomes grades it, Dreaming remembers the failures.

Dreaming itself is still in research preview, not GA.

A few things I’m genuinely uncertain about:

The “automatic” mode lets the agent write directly to its own memory without approval. That’s a meaningful amount of autonomy over its own behavior. How do you audit what it’s actually learning? If it develops a subtly wrong heuristic over three months of self-reinforcement, how do you catch that before it’s deeply embedded?
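One pattern that would at least make this auditable, assuming you can intercept or mirror the agent's memory writes (I don't know whether Managed Agents exposes that), is an append-only log where every write records which background run produced it. The sketch below is purely illustrative: `MemoryLog`, its methods, and the `source` labels are hypothetical and not part of any Anthropic API.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryLog:
    """Append-only record of memory writes (hypothetical, for illustration only)."""
    entries: list = field(default_factory=list)  # (version, key, value, source)

    def write(self, key, value, source):
        """Append a write and return its version number (1-based)."""
        version = len(self.entries) + 1
        self.entries.append((version, key, value, source))
        return version

    def snapshot(self, at_version=None):
        """Rebuild memory by replaying writes, optionally stopping at a version."""
        state = {}
        for version, key, value, _source in self.entries:
            if at_version is not None and version > at_version:
                break
            state[key] = value
        return state

    def history(self, key):
        """List every value a key has held, with the run that wrote it."""
        return [(v, val, src) for v, k, val, src in self.entries if k == key]


log = MemoryLog()
v1 = log.write("refund_rule", "escalate disputed refunds", source="dream-run-1")
v2 = log.write("refund_rule", "auto-approve disputed refunds", source="dream-run-2")

print(log.history("refund_rule"))      # audit: who changed what, in order
print(log.snapshot(at_version=v1))     # rollback: memory as of the earlier write
```

Nothing is ever overwritten, so a bad heuristic can be traced to the run that introduced it and rolled back by replaying the log up to an earlier version. It does not solve the harder problem of noticing that a heuristic is wrong in the first place.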

Also curious about the human-review mode in practice — if you’re approving every proposed memory update, does that scale? Or does it become a bottleneck that defeats the purpose?

For those building on Managed Agents or similar systems: are you thinking about self-improvement loops as a feature you want, or a risk you’d rather control tightly? And does the “agent with three months of experience vs. freshly deployed agent” framing change how you think about agent versioning and rollbacks?
