xiji2646-netizen

Anthropic's agents now review their own past sessions and self-improve. Thoughts?

Anthropic shipped something called Dreaming for Managed Agents this week. It’s a scheduled background process that runs between sessions — the agent reviews its own past conversation transcripts, extracts patterns, and writes learnings into memory. No human in the loop unless you want one.

The framing that stuck with me: individual sessions are blind to cross-session patterns. A support agent won’t notice it made the same classification error 12 times this month. Dreaming is designed to surface exactly that kind of signal.
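To make the cross-session point concrete, here is a minimal sketch of the kind of aggregation I imagine a pass like this doing. It is not Anthropic's implementation: the `Correction` record, the `recurring_errors` function, and the `min_sessions` threshold are all names I made up for illustration, and I'm assuming you already have a log of human corrections per session.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """Hypothetical record: one classification a human later corrected."""
    session_id: str
    predicted: str  # label the agent chose during the session
    actual: str     # label it was corrected to afterwards


def recurring_errors(corrections, min_sessions=3):
    """Return {(predicted, actual): n_sessions} for mistakes seen in many sessions.

    Each session only ever sees its own rows; the pattern only shows up
    once the rows from all sessions are grouped together.
    """
    sessions_by_error = defaultdict(set)
    for c in corrections:
        if c.predicted != c.actual:
            sessions_by_error[(c.predicted, c.actual)].add(c.session_id)
    return {
        error: len(sessions)
        for error, sessions in sessions_by_error.items()
        if len(sessions) >= min_sessions
    }


# Toy data: the same billing/refund mix-up in 12 sessions, plus one-off noise.
log = [Correction(f"s{i}", "billing", "refund") for i in range(12)]
log.append(Correction("s99", "shipping", "returns"))
log.append(Correction("s100", "refund", "refund"))  # correct, so ignored

print(recurring_errors(log))
```

The one-off shipping/returns mistake falls below the threshold and is dropped, while the repeated billing/refund confusion is surfaced with its session count. Whatever Dreaming does internally is presumably far richer than label counting, but this is the shape of signal a single session cannot see.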

It ships alongside Outcomes (automated output grading against developer-defined rubrics) and multi-agent orchestration (coordinator + up to 20 parallel subagents, now in public beta). The three are meant to work as a loop: orchestration decomposes work, Outcomes grades it, Dreaming remembers the failures.

Dreaming is still in research preview, not GA.

A few things I’m genuinely uncertain about:

The “automatic” mode lets the agent write directly to its own memory without approval. That’s a meaningful amount of autonomy over its own behavior. How do you audit what it’s actually learning? If it develops a subtly wrong heuristic over three months of self-reinforcement, how do you catch that before it’s deeply embedded?

Also curious about the human-review mode in practice — if you’re approving every proposed memory update, does that scale? Or does it become a bottleneck that defeats the purpose?

For those building on Managed Agents or similar systems: are you thinking about self-improvement loops as a feature you want, or a risk you’d rather control tightly? And does the “agent with three months of experience vs. freshly deployed agent” framing change how you think about agent versioning and rollbacks?
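For what it's worth, here is a rough sketch of how I'd want the audit and rollback side to look if I were building this myself. Everything in it is hypothetical (the `MemoryStore` class, its method names, the `require_approval` flag); it does not reflect the Managed Agents API. The assumption is simply that memory is a key-value map and that every write is snapshotted and logged.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical versioned agent memory with an optional review gate."""
    require_approval: bool = True
    # versions[0] is the empty, freshly deployed state.
    versions: list = field(default_factory=lambda: [{}])
    pending: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def propose(self, key, value, evidence):
        """Queue an update for review, or commit it directly in automatic mode."""
        update = {"key": key, "value": value, "evidence": evidence}
        if self.require_approval:
            self.pending.append(update)
        else:
            self._commit(update, approved_by="auto")

    def approve_all(self, reviewer):
        """Commit every queued update on behalf of a human reviewer."""
        while self.pending:
            self._commit(self.pending.pop(0), approved_by=reviewer)

    def _commit(self, update, approved_by):
        snapshot = dict(self.versions[-1])
        snapshot[update["key"]] = update["value"]
        self.versions.append(snapshot)
        self.audit_log.append(
            {**update, "approved_by": approved_by, "version": len(self.versions) - 1}
        )

    def current(self):
        return self.versions[-1]

    def rollback(self, version):
        """Restore an earlier snapshot as a new version so history is never lost."""
        self.versions.append(dict(self.versions[version]))
        self.audit_log.append(
            {"key": None, "value": None, "evidence": f"rollback to v{version}",
             "approved_by": "operator", "version": len(self.versions) - 1}
        )


# Automatic mode: the write lands immediately but is still logged.
store = MemoryStore(require_approval=False)
store.propose("refunds", "escalate above $100", evidence=["s3", "s7"])
store.rollback(0)  # back to the freshly deployed state

# Review mode: nothing changes until someone approves.
gated = MemoryStore()
gated.propose("tone", "be concise", evidence=["s1"])
gated.approve_all("alice")
```

The design choice I care about is that rollback appends a new version instead of truncating history, so "three months of experience" stays inspectable even after you revert it. Whether approving updates one by one scales is still the open question; batching by evidence count might help.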
