xiji2646-netizen

Anthropic's agents now review their own past sessions and self-improve. Thoughts?

Anthropic shipped something called Dreaming for Managed Agents this week. It’s a scheduled background process that runs between sessions — the agent reviews its own past conversation transcripts, extracts patterns, and writes learnings into memory. No human in the loop unless you want one.

The framing that stuck with me: individual sessions are blind to cross-session patterns. A support agent won’t notice it made the same classification error 12 times this month. Dreaming is designed to surface exactly that kind of signal.
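To make the cross-session point concrete, here is a minimal sketch of the kind of aggregation a single session can't do on its own. It is not how Dreaming is implemented (I don't know that). The `Correction` schema, the `recurring_errors` helper, and the labels are all hypothetical, and it assumes you already log cases where a human corrected the agent's classification.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """One logged case where a human corrected the agent's label (hypothetical schema)."""
    session_id: str
    predicted: str
    actual: str


def recurring_errors(corrections, min_count=3):
    """Return (predicted, actual) pairs that recur across sessions, most frequent first."""
    counts = Counter(
        (c.predicted, c.actual) for c in corrections if c.predicted != c.actual
    )
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


# Twelve sessions with the same mistake, plus one unrelated one-off error.
log = [Correction(f"s{i}", "billing", "refund") for i in range(12)]
log.append(Correction("s12", "shipping", "returns"))

print(recurring_errors(log))  # [(('billing', 'refund'), 12)]
```

Each session only ever sees one of those rows. The pattern only shows up once something reads across all of them.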

It ships alongside Outcomes (automated output grading against developer-defined rubrics) and multi-agent orchestration (coordinator + up to 20 parallel subagents, now in public beta). The three are meant to work as a loop: orchestration decomposes work, Outcomes grades it, Dreaming remembers the failures.

Still in research preview, not GA.

A few things I’m genuinely uncertain about:

The “automatic” mode lets the agent write directly to its own memory without approval. That’s a meaningful amount of autonomy over its own behavior. How do you audit what it’s actually learning? If it develops a subtly wrong heuristic over three months of self-reinforcement, how do you catch that before it’s deeply embedded?
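One way I could imagine auditing this, sketched under my own assumptions rather than anything the product documents: wrap memory in an append-only log so every write records which sessions it was derived from. `AuditedMemory`, `MemoryWrite`, and the example keys are hypothetical names for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class MemoryWrite:
    """An immutable record of one memory update and the sessions that motivated it."""
    key: str
    value: str
    source_sessions: tuple
    written_at: str


@dataclass
class AuditedMemory:
    entries: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, key, value, source_sessions):
        """Apply the update and append a provenance record; nothing is ever overwritten in the log."""
        record = MemoryWrite(
            key, value, tuple(source_sessions), datetime.now(timezone.utc).isoformat()
        )
        self.log.append(record)
        self.entries[key] = value
        return record

    def history(self, key):
        """Every value this key has held, oldest first."""
        return [r for r in self.log if r.key == key]

    def weakly_supported(self, min_sessions=3):
        """Writes backed by fewer than min_sessions distinct sessions."""
        return [r for r in self.log if len(set(r.source_sessions)) < min_sessions]


memory = AuditedMemory()
memory.write("refund_routing", "Route refund requests to billing", ["s1", "s2", "s3", "s4"])
memory.write("tone", "Skip greetings for repeat customers", ["s9"])

for record in memory.weakly_supported():
    print(record.key, record.source_sessions)  # tone ('s9',)
```

That doesn't tell you whether a heuristic is wrong, but it does give a reviewer a short list of learnings that rest on thin evidence, which seems like the place to start looking.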

Also curious about the human-review mode in practice — if you’re approving every proposed memory update, does that scale? Or does it become a bottleneck that defeats the purpose?

For those building on Managed Agents or similar systems: are you thinking about self-improvement loops as a feature you want, or a risk you’d rather control tightly? And does the “agent with three months of experience vs. freshly deployed agent” framing change how you think about agent versioning and rollbacks?
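On the versioning question, the simplest model I can think of is treating learned memory like any other deployable artifact: snapshot it, label it, and keep the ability to go back. A rough sketch, where `VersionedMemory` and its methods are hypothetical and not part of any Managed Agents API:

```python
import copy


class VersionedMemory:
    """Agent memory with labeled snapshots that can be restored later."""

    def __init__(self):
        self._current = {}
        self._snapshots = []  # list of (label, deep copy of memory at that point)

    def update(self, key, value):
        self._current[key] = value

    def get(self, key, default=None):
        return self._current.get(key, default)

    def snapshot(self, label):
        """Freeze the current state and return its version number."""
        self._snapshots.append((label, copy.deepcopy(self._current)))
        return len(self._snapshots) - 1

    def rollback(self, version):
        """Restore a previous snapshot; later snapshots are kept for comparison."""
        label, state = self._snapshots[version]
        self._current = copy.deepcopy(state)
        return label


agent_memory = VersionedMemory()
agent_memory.update("refund_routing", "billing")
week_one = agent_memory.snapshot("week-1")

# Three months later, a self-reinforced heuristic has drifted.
agent_memory.update("refund_routing", "escalate everything")
agent_memory.snapshot("month-3")

restored = agent_memory.rollback(week_one)
print(restored, agent_memory.get("refund_routing"))  # week-1 billing
```

The open question for me is granularity: rolling back the whole memory throws away three months of good learnings along with the bad one.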
