Updates
Changelog
What's new in Token Limits.
v3.0.0
Launch
March 2026
Smart caching — Repeated file reads, searches, and diffs are detected and deduplicated automatically, saving thousands of tokens per session.
AI-powered summaries — Large outputs, old conversation context, and verbose errors are intelligently summarized so your AI retains key information without the noise.
Token-aware output limits — Results are automatically sized to fit within tool output limits, preventing truncation errors in Claude Code.
Auto-restart on update — Running token-limits update now automatically restarts the proxy.
Settings control — New config --no-optimize flag to prevent Token Limits from modifying your Claude Code settings.
Project mapping — New local_map tool gives the AI a structural overview of your codebase in one call, reducing exploratory reads by 3-5x.
Live dashboard improvements — Cost estimates for AI summaries, compression stats by tool, and real-time request history.
Windows support — Clear guidance for Windows users to use the MCP server via Claude Desktop.
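
To make the smart caching entry above more concrete, here is a minimal sketch of one way repeated tool output could be deduplicated within a session. It is an illustration under stated assumptions, not Token Limits' actual implementation: the DedupCache class, its filter method, and the placeholder message format are all hypothetical names chosen for this example.

```python
import hashlib


class DedupCache:
    """Hypothetical sketch: remembers tool outputs already sent in a session."""

    def __init__(self):
        # Maps a content digest to the label of the call that first produced it.
        self._seen = {}

    def filter(self, label, content):
        """Return content on first sight, or a short placeholder on a repeat."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        first_label = self._seen.get(digest)
        if first_label is not None:
            # Identical output was already delivered, so send a stub instead.
            return f"[unchanged since {first_label}; {len(content)} chars omitted]"
        self._seen[digest] = label
        return content


# Usage: the second identical read collapses to a one-line placeholder.
cache = DedupCache()
file_text = "def main():\n    print('hello')\n" * 50
first = cache.filter("read #1 of app.py", file_text)
second = cache.filter("read #2 of app.py", file_text)
```

Hashing the full content means any change to the file produces a new digest, so a modified file is always passed through in full rather than replaced by a stale placeholder.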
v2.x
Early Access
Claude Code proxy — Automatic compression of all tool results, file reads, and conversation context.
MCP server — Seven compressed tools for Claude Desktop, Cursor, Windsurf, VS Code, and JetBrains.
Paste compressor — Free browser tool for compressing logs, errors, and build output before pasting into AI chats.
Multi-platform — Native binaries for macOS (Intel + Apple Silicon) and Linux (x64 + ARM64).
Live dashboard — Real-time compression stats at localhost:4800.