Stop Wasting Tokens in Claude Code
Most Claude Code sessions waste 60-80% of tokens on bloated tool outputs. Each grep, file read, and ls command returns kilobytes of formatting, paths, and duplicate content. Token Limits proxy strips this noise automatically, giving you more conversation per token.
Claude Code relies on tools: file reads, searches, listings, diffs. These tools are verbose by design—they return complete, structured output. But they also return tons of noise: repeated headers, paths you already know, duplicate lines, verbose formatting.
Where are your tokens being wasted?
| Tool | Typical Output Size | Noise % | Compressible Tokens |
|---|---|---|---|
| grep (50 matches) | 15,000 tokens | 75% | 11,250 tokens |
| file read (medium) | 12,000 tokens | 70% | 8,400 tokens |
| ls (500 items) | 8,000 tokens | 80% | 6,400 tokens |
| git diff | 16,000 tokens | 75% | 12,000 tokens |
| npm list | 10,000 tokens | 75% | 7,500 tokens |
A single grep search with 50 matches can return 15,000 tokens. Of those, roughly 75% is duplicative—repeated file paths, line numbers, formatting brackets. Remove that noise and you save 11,250 tokens in one tool call.
What exactly is being wasted?
- Repeated file paths: seen in every match; the path is known context
- Line numbers: helpful for humans; the AI only needs the content
- Decorative formatting: brackets, quotes, extra spaces
- Duplicate headers: the same column headers repeated in lists
- Blank lines: visual spacing that costs tokens with zero information
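As an illustration only—not Token Limits' actual algorithm—here is a minimal Python sketch that removes three of the noise sources above (repeated paths, line numbers, and blank lines) from `grep -n`-style output:

```python
import re

def strip_grep_noise(raw: str) -> str:
    """Illustrative sketch: emit each file path once, drop line
    numbers and blank lines from grep -n style output."""
    out = []
    last_path = None
    for line in raw.splitlines():
        if not line.strip():
            continue  # blank lines carry no information
        m = re.match(r"^(.+?):(\d+):(.*)$", line)
        if not m:
            out.append(line)
            continue
        path, _lineno, content = m.groups()
        if path != last_path:
            out.append(f"{path}:")  # emit each path only once
            last_path = path
        out.append(content.strip())
    return "\n".join(out)

raw = (
    "src/app.ts:12:  const user = getUser(id);\n"
    "src/app.ts:47:  const user = cache.get(id);\n"
    "\n"
    "src/db.ts:3:  export function getUser(id: string) {\n"
)
compressed = strip_grep_noise(raw)
print(compressed)
```

Even on this four-line sample the output is noticeably shorter; real grep output with dozens of matches per file compresses far more, because the per-match overhead dominates.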
How much can you save?
A typical Claude Code session with 20 tool calls might consume 200k-400k tokens in total, most of them from tool outputs. With compression, that drops by 60-80%: same session, same results, a fraction of the usage.
The fix: Token Limits proxy
Install Token Limits as a proxy between Claude Code and the API. It intercepts tool results, compresses them, and forwards the compressed version to Claude. All compression happens locally on your machine.
1. Install: npm install -g token-limits
2. Run: token-limits start
3. In Claude Code settings (cmd+comma): Tools > API URL > http://localhost:4800
4. Done. All future tool calls are compressed.
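Once started, you can sanity-check that the proxy is listening before pointing Claude Code at it. A minimal sketch using only the Python standard library (port 4800 is the default from the steps above):

```python
import socket

def proxy_is_up(host: str = "localhost", port: int = 4800) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False

if proxy_is_up():
    print("Token Limits proxy is listening on http://localhost:4800")
else:
    print("Proxy not reachable; run `token-limits start` first")
```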
Real compression numbers
| Tool | Before Compression | After Compression | Savings |
|---|---|---|---|
| grep (50 matches) | 15,000 tokens | 2,700 tokens | 82% |
| file read | 12,000 tokens | 2,100 tokens | 82% |
| ls (500 items) | 8,000 tokens | 1,200 tokens | 85% |
| git diff | 16,000 tokens | 2,400 tokens | 85% |
| npm list | 10,000 tokens | 1,500 tokens | 85% |
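The savings column follows directly from the before/after counts. A quick check in Python (note the file-read row is 82.5% exact; the table rounds it to 82%):

```python
# Before/after token counts from the table above.
rows = {
    "grep (50 matches)": (15_000, 2_700),
    "file read": (12_000, 2_100),
    "ls (500 items)": (8_000, 1_200),
    "git diff": (16_000, 2_400),
    "npm list": (10_000, 1_500),
}

# Savings % = (before - after) / before
savings = {tool: round(100 * (before - after) / before, 1)
           for tool, (before, after) in rows.items()}

for tool, pct in savings.items():
    print(f"{tool}: {pct}% saved")
```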
Cut your Claude Code token usage by 80%
Token Limits proxy automatically compresses every grep, file read, ls, and diff before Claude reads it. Set it up in 2 minutes — runs locally, no API key needed for compression.
FAQ
What exactly does Token Limits compress?
Timestamps, blank lines, repeated paths, line numbers, duplicate headers, emoji, and verbose formatting. It preserves file contents, error messages, search matches, and all meaningful information.
Does compression affect accuracy?
No. Compression removes format noise that does not contain information. Claude still sees the actual content—files, errors, search results—just without the padding.
How much bandwidth does the proxy use?
Almost none. The proxy compresses tool results locally before forwarding them to the API, so request payloads actually get slightly smaller. The meaningful savings, though, are in tokens, not bandwidth.
Is it secure? Does it store my data?
The proxy runs locally on your machine. No data is sent to external servers. It is open-source and auditable.
What about Cursor, Windsurf, or other IDEs?
For IDEs other than Claude Code, use the Token Limits MCP server instead. It provides 8 compressed tools (local_read, expand, search, etc.) that replace the defaults.
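For IDEs that use the standard MCP server config format, an entry might look like the sketch below. Both the server name and the `mcp` subcommand are assumptions for illustration—check the Token Limits documentation for the exact invocation:

```json
{
  "mcpServers": {
    "token-limits": {
      "command": "token-limits",
      "args": ["mcp"]
    }
  }
}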