Why Claude Code Keeps Stopping [Fix]

April 17, 2026 · 5 min read

Claude Code stops mid-sentence, says "I need to start a new session," or goes completely unresponsive. You check the status: the context window is full. But you were not writing that much code. The real culprit is tool output noise. A handful of file reads and grep searches filled your context with 80% garbage.

Claude Code is not crashing. It is doing what you asked—working on your code. But every file read, grep search, and command output dumps tokens into your context. After 30-50 tool calls, the window fills up, and Claude has to stop. The tokens are not going to your prompts or Claude's responses. They are going to timestamps, blank lines, repeated file paths, and decorative formatting that Claude reads but never uses.

What it looks like when you hit the limit

  • Claude Code stops mid-sentence and says "I can't continue"
  • Responses become very slow or time out
  • You get a message about usage limits or rate limiting
  • The session becomes unresponsive for 30+ seconds between messages
  • Claude cannot read files or run commands, even simple ones
  • You have to start a completely new session to continue working

This is NOT a bug. Claude Code is working as designed. Your context window simply ran out of space.

Why does this happen so fast?

A single file read of a 500-line file returns 12,000-15,000 tokens. A grep with 30 matches returns 8,000-12,000 tokens. A build log with errors returns 10,000-20,000 tokens. Do that 5-10 times and you have consumed 50,000-150,000 tokens—and 80% of it is noise: line numbers, file paths, timestamps, blank lines, and status messages that Claude never needed.

Your conversation history (your prompts + Claude's responses) might only be 10,000-20,000 tokens, but the tool outputs are 50,000-100,000 tokens. The signal-to-noise ratio is roughly 1:3 to 1:5. And that is against a 200k token context window. You hit the limit not because you wrote that much code, but because tool outputs are bloated.
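A rough back-of-envelope sketch makes the drain visible. The token counts below are illustrative estimates taken from the ranges above, not measured values:

```python
# Rough illustration of how tool output dominates the context budget.
# All token counts are illustrative estimates, not measurements.
CONTEXT_WINDOW = 200_000

conversation = 15_000              # your prompts + Claude's responses
tool_calls = [
    12_000,  # read a 500-line file
    10_000,  # grep with 30 matches
    15_000,  # build log with errors
    12_000,  # another file read
    10_000,  # another grep
]
tool_total = sum(tool_calls)       # 59,000 tokens from just 5 tool calls

noise = int(tool_total * 0.8)      # ~80% is line numbers, paths, timestamps

print(f"tool output: {tool_total:,} tokens")
print(f"noise: {noise:,} tokens ({noise / (conversation + tool_total):.0%} of total)")
```

Five tool calls and the noise alone already dwarfs the entire conversation; at 30-50 tool calls, the window is gone.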

Fix 1: Token Limits proxy (the permanent fix)

Install Token Limits to strip noise before tool outputs reach your context. This is the permanent fix—one install, compression happens on every request, you stop hitting limits.

  1. npm install -g token-limits
  2. token-limits start (or add to startup scripts)
  3. Claude Code: cmd+comma → Tools → API URL → http://localhost:4800
  4. Done. Compression happens automatically on all future tool calls.

With Token Limits installed, most developers run full-day sessions without hitting context limits. Without it, heavy tool usage typically hits limits every 1-3 hours.
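Conceptually, what such a proxy does is simple: strip the parts of tool output Claude never uses before they enter the context. A minimal sketch of that idea in Python (the patterns here are illustrative only, not Token Limits' actual implementation):

```python
import re

def strip_tool_noise(output: str) -> str:
    """Remove common noise from tool output: blank lines, line-number
    prefixes, and ISO timestamps. Illustrative only -- a real compressor
    is far more aggressive."""
    cleaned = []
    for line in output.splitlines():
        line = re.sub(r"^\s*\d+[:|]\s?", "", line)                   # "42: " / "42|" prefixes
        line = re.sub(r"\d{4}-\d{2}-\d{2}T[\d:.]+Z?\s*", "", line)   # ISO timestamps
        if line.strip():                                             # drop blank lines
            cleaned.append(line.rstrip())
    return "\n".join(cleaned)

raw = """12: 2026-04-17T09:30:01Z building module auth
13:
14: 2026-04-17T09:30:02Z build ok
"""
compressed = strip_tool_noise(raw)
print(f"{len(raw)} chars -> {len(compressed)} chars")
```

The point of doing this at the proxy layer is that it runs on every tool call automatically, before the tokens are ever billed against your window.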

Fix 2: /compact command (immediate relief)

If you are already mid-session and Claude Code is about to stop, type /compact in your chat. This command rewrites your conversation history in compressed form, shrinking it by 40-60% and freeing up space to keep working.

/compact is a one-time fix for the current session. It works, but you will hit the limit again in another 1-3 hours unless you install the proxy.

Fix 3: Start new sessions earlier

If you can't install the proxy yet, start a new Claude Code chat every 60-90 minutes instead of waiting for it to crash. The new session gets a fresh context window. This is not ideal (you lose previous context), but it keeps you working.

Real example: Why stopping happens

You start a session with a 200k token context window. You paste a file (3k tokens), ask Claude to refactor it (2k tokens), and Claude responds (5k tokens). You are at 10k. Then you ask Claude to search the codebase (15k tokens of results), review a second file (12k tokens), run a build (20k tokens), grep for a pattern (18k tokens), and review git history (16k tokens). You are now at 91k tokens consumed, but your actual conversation is only about 10k of meaningful discussion. The other 81k is tool output noise. At 150k, Claude starts slowing down. At 200k, it becomes unresponsive. You have to start over.
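Summing the individual token counts in that walkthrough makes the split explicit:

```python
# Token budget for the example session above (illustrative numbers).
conversation = [
    3_000,   # pasted file
    2_000,   # refactor request
    5_000,   # Claude's response
]
tool_outputs = [
    15_000,  # codebase search results
    12_000,  # second file review
    20_000,  # build output
    18_000,  # grep results
    16_000,  # git history
]

total = sum(conversation) + sum(tool_outputs)
noise_share = sum(tool_outputs) / total

print(f"conversation: {sum(conversation):,} tokens")
print(f"tool output:  {sum(tool_outputs):,} tokens")
print(f"total: {total:,} tokens ({noise_share:.0%} tool output)")
```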

With Token Limits, that same session compresses the tool outputs from 81k down to roughly 10k. Now you can do 8-10x more tool-heavy work before hitting the limit.

Stop Claude Code from stopping

Token Limits proxy stops the stopping. Install once, compression happens automatically on every file read, grep, and command. Full-day sessions without context limits.

FAQ

Is Claude Code actually broken if it keeps stopping?

No. It is working as designed. The context window is full. The issue is tool output noise, not a bug in Claude Code.

Does /compact permanently fix the problem?

No. /compact gives temporary relief for the current session. You will hit the limit again in 1-3 hours. Install Token Limits for a permanent fix.

Can I use both /compact and Token Limits?

Yes, and you should. /compact clears old history when context is already high. Token Limits prevents history from bloating in the first place.

Does this happen on all plans?

Yes—Pro, Max, and Team. Max has higher limits, so you hit limits less often, but tool output noise is the same on all plans.

Why does Claude Code not compress automatically?

Anthropic could, but a built-in compressor has to work for every workload, which trades compression ratio for universal compatibility. Token Limits is purpose-built for coding outputs, so it compresses 60-90% instead of 30-40%.