xiji2646-netizen

Anyone else hitting Claude Code rate limits way too fast?

Been using Claude Code on Max for a few weeks and kept running into rate limits by early afternoon. Same tasks as colleagues who weren’t hitting limits at all. Figured it was just quota differences, but it turns out the issue was entirely on my end.

Anthropic just published an engineering post explaining how Claude Code’s cost structure actually works, and it changed how I use the tool.

The core mechanic: every request is built as a prefix chain (system prompt → tools → project docs → messages). The API caches that chain. If the prefix matches on the next request, the matching tokens are billed at about 1/10 the normal input price. If anything in the prefix changes, the cache is invalidated from that point forward, and everything after the change is processed at full price again.
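To make the arithmetic concrete, here is a rough sketch of that pricing rule. It is a toy model, not Anthropic's billing logic: the `Block` representation, the function names, and the token counts are all made up for illustration, and it ignores cache expiry and any extra cost for writing to the cache. The only number taken from the post is the 1/10 rate for cached tokens.

```python
from typing import List, Tuple

# From the post: tokens in a matching prefix cost 1/10 the normal price.
CACHE_DISCOUNT_DIVISOR = 10

# Hypothetical representation: (content label, token count) per prefix block.
Block = Tuple[str, int]


def shared_prefix_tokens(prev: List[Block], curr: List[Block]) -> int:
    """Count tokens in the leading blocks that are identical in both requests."""
    tokens = 0
    for old, new in zip(prev, curr):
        if old != new:
            break  # first mismatch ends the cached prefix
        tokens += new[1]
    return tokens


def request_cost(prev: List[Block], curr: List[Block]) -> float:
    """Cost of `curr` in full-price token equivalents, given `prev` is cached."""
    cached = shared_prefix_tokens(prev, curr)
    total = sum(count for _, count in curr)
    return cached / CACHE_DISCOUNT_DIVISOR + (total - cached)


# Invented sizes, purely for illustration.
turn_1 = [("system", 3000), ("tools", 5000), ("docs", 2000), ("msg1", 500)]

# Same prefix, one new message appended: almost everything is a cache hit.
turn_2 = turn_1 + [("msg2", 400)]

# Tool definitions changed: everything from the tools block onward is a miss.
turn_2_new_tool = [("system", 3000), ("tools+mcp", 5600), ("docs", 2000),
                   ("msg1", 500), ("msg2", 400)]

print(request_cost(turn_1, turn_2))           # 1450.0
print(request_cost(turn_1, turn_2_new_tool))  # 8800.0
```

With these invented numbers, the turn that keeps the prefix intact costs about a sixth of the turn where the tool definitions changed, even though the two requests are nearly the same size.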

The things I was doing that were silently killing my cache:

  • Switching between Sonnet and Opus mid-conversation with /model. Cache is model-bound, so every switch wiped everything I’d accumulated.

  • Opening a new claude session for every task instead of continuing the previous one.

  • Adding MCP tools mid-session when I needed them. Tool definitions are part of the cached prefix.
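One way to picture why all three habits hurt is to imagine the cache as keyed on the model plus a running hash of the prefix blocks. This is only a mental model I sketched, assuming a cumulative-hash scheme; the real cache implementation is not described in the post, and `prefix_keys` and `cached_block_count` are hypothetical names.

```python
import hashlib
from typing import List, Set


def prefix_keys(model: str, blocks: List[str]) -> List[str]:
    """One cumulative key per prefix length, seeded with the model name."""
    running = hashlib.sha256(model.encode())
    keys = []
    for block in blocks:
        running.update(b"\x00" + block.encode())
        keys.append(running.hexdigest())
    return keys


def cached_block_count(cache: Set[str], model: str, blocks: List[str]) -> int:
    """Number of leading blocks whose cumulative key is already cached."""
    hits = 0
    for key in prefix_keys(model, blocks):
        if key not in cache:
            break  # everything after the first miss is a miss too
        hits += 1
    return hits


# Warm the toy cache with a four-block request on one model.
history = ["system", "tools", "docs", "msg1"]
cache = set(prefix_keys("sonnet", history))

next_turn = history + ["msg2"]
print(cached_block_count(cache, "sonnet", next_turn))  # 4: continuing the session
print(cached_block_count(cache, "opus", next_turn))    # 0: switched model
print(cached_block_count(
    cache, "sonnet", ["system", "tools+mcp", "docs", "msg1", "msg2"]
))                                                     # 1: tool added mid-session
```

A brand-new session behaves like the empty-cache case in this picture: nothing from the previous conversation's messages can match, so the first turns pay full price.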

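The fix that made the biggest difference: claude --resume. It reopens a previous session (claude --continue jumps straight to the most recent one) and picks up the cache chain where it left off, as long as the cached prefix has not expired in the meantime. I’d never used it before.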

Also: long conversations actually get cheaper over time because of how Claude Code’s compaction works. I was doing the opposite — short sessions, frequent restarts — which meant I was always paying full price for the first turns.

Full writeup here if you want the details:

https://www.anthropic.com/engineering/claude-code-prompt-caching

Curious if others have noticed the model-switching issue specifically — that one surprised me the most.
