Stop Wasting Tokens in Claude Code

April 4, 2026 · 6 min read

Most Claude Code sessions waste 60-80% of their tokens on bloated tool outputs. Each grep, file read, and ls command returns kilobytes of formatting, paths, and duplicate content. The Token Limits proxy strips this noise automatically, giving you more conversation per token.

Claude Code relies on tools: file reads, searches, listings, diffs. These tools are verbose by design—they return complete, structured output. But they also return tons of noise: repeated headers, paths you already know, duplicate lines, verbose formatting.

Where are your tokens being wasted?

Tool                  Typical Output Size   Noise %   Compressible Tokens
grep (50 matches)     15,000 tokens         75%       11,000 tokens
file read (medium)    12,000 tokens         70%       8,400 tokens
ls (500 items)        8,000 tokens          80%       6,400 tokens
git diff              16,000 tokens         75%       12,000 tokens
npm list              10,000 tokens         75%       7,500 tokens

A single grep search with 50 matches can return 15,000 tokens. Roughly 75% of that is duplicative: repeated file paths, line numbers, formatting brackets. Remove the noise and you save about 11,000 tokens in a single tool call.

What exactly is being wasted?

  • Repeated file paths: Seen in every match; path is known context
  • Line numbers: Helpful for humans; AI only needs the content
  • Decorative formatting: Brackets, quotes, extra spaces
  • Duplicate headers: Same column headers repeated in lists
  • Blank lines: Visual spacing that costs tokens with zero info
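To make the stripping concrete, here is a minimal sketch of deduplicating grep-style output. This is illustrative only, not Token Limits' actual algorithm; compressGrep is a hypothetical helper.

```typescript
// Sketch: strip noise from "path:line:content" grep output.
// Prints each file path once, drops line numbers and blank lines.
function compressGrep(raw: string): string {
  const out: string[] = [];
  let lastPath = "";
  for (const line of raw.split("\n")) {
    if (line.trim() === "") continue; // blank lines carry no information
    const m = line.match(/^([^:]+):(\d+):(.*)$/);
    if (!m) {
      out.push(line); // pass through anything that is not a match line
      continue;
    }
    const path = m[1];
    const content = m[3];
    if (path !== lastPath) {
      out.push(path + ":"); // the path appears once instead of per match
      lastPath = path;
    }
    out.push("  " + content.trim()); // the content survives, the line number does not
  }
  return out.join("\n");
}
```

On a 50-match result where each path repeats dozens of times, a pass like this alone removes most of the repeated-path and line-number overhead; the actual proxy presumably applies comparable per-tool passes.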

How much can you save?

A typical Claude Code session with 20 tool calls might consume 200k-400k tokens in total, most of it from tool outputs. With compression, that drops by 60-80%: same session, same results, a fraction of the usage.
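The estimate above is simple multiplication. Here is a hedged sketch of the arithmetic; estimateSavings is a made-up helper, and the session shape is an assumption rather than measured data.

```typescript
// Back-of-envelope: tokens saved =
//   session tokens × share of session that is tool output × noise fraction removed.
function estimateSavings(
  sessionTokens: number,
  toolOutputShare: number, // fraction of the session that is tool output
  compressionRatio: number // fraction of tool output removed as noise
): number {
  return sessionTokens * toolOutputShare * compressionRatio;
}

// A 300k-token session where 80% is tool output, compressed at 70%:
const saved = estimateSavings(300_000, 0.8, 0.7); // 168,000 tokens saved
```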

The fix: Token Limits proxy

Install Token Limits as a proxy between Claude Code and the API. It intercepts tool results, compresses them, and forwards the compressed version to Claude. All compression happens locally on your machine.

  1. npm install -g token-limits
  2. Run: token-limits start
  3. In Claude Code settings (cmd+comma): Tools > API URL > http://localhost:4800
  4. Done. All future tool calls are compressed.

Real compression numbers

Tool                  Before Compression   After Compression   Savings
grep (50 matches)     15,000 tokens        2,700 tokens        82%
file read             12,000 tokens        2,100 tokens        82%
ls (500 items)        8,000 tokens         1,200 tokens        85%
git diff              16,000 tokens        2,400 tokens        85%
npm list              10,000 tokens        1,500 tokens        85%

Cut your Claude Code token usage by 80%

Token Limits proxy automatically compresses every grep, file read, ls, and diff before Claude reads it. Set it up in 2 minutes — runs locally, no API key needed for compression.

FAQ

What exactly does Token Limits compress?

Timestamps, blank lines, repeated paths, line numbers, duplicate headers, emoji, and verbose formatting. It preserves file contents, error messages, search matches, and all meaningful information.

Does compression affect accuracy?

No. Compression removes format noise that does not contain information. Claude still sees the actual content—files, errors, search results—just without the padding.

How much bandwidth does the proxy use?

Essentially none beyond normal API traffic. The proxy compresses locally and forwards over the same connection you already use, so there are no meaningful bandwidth savings; the point is reducing the tokens Claude reads, not the bytes on the wire.

Is it secure? Does it store my data?

The proxy runs locally on your machine. No data is sent to external servers. It is open-source and auditable.

What about Cursor, Windsurf, or other IDEs?

For IDEs other than Claude Code, use the Token Limits MCP server instead. It provides 8 compressed tools (local_read, expand, search, etc.) that replace the defaults.
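MCP servers are typically wired into these IDEs through a JSON config entry. As a hedged sketch only: the token-limits-mcp command name below is a guess, so consult the project's docs for the real invocation.

```json
{
  "mcpServers": {
    "token-limits": {
      "command": "token-limits-mcp"
    }
  }
}
```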