Updates

Changelog

What's new in Token Limits.

v3.0.0

Launch, March 2026

Smart caching — Repeated file reads, searches, and diffs are detected and deduplicated automatically, saving thousands of tokens per session.

AI-powered summaries — Large outputs, old conversation context, and verbose errors are intelligently summarized so your AI retains key information without the noise.

Token-aware output limits — Results are automatically sized to fit within tool output limits, preventing truncation errors in Claude Code.

Auto-restart on update — Running token-limits update now automatically restarts the proxy.

Settings control — New config --no-optimize flag to prevent Token Limits from modifying your Claude Code settings.

Project mapping — New local_map tool gives the AI a structural overview of your codebase in one call, reducing exploratory reads by 3-5x.

Live dashboard improvements — Cost estimates for AI summaries, compression stats by tool, and real-time request history.

Windows support — Clear guidance for Windows users to use the MCP server via Claude Desktop.
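
To make the smart caching and token-aware output limit entries above more concrete, here is a minimal Python sketch of the two general ideas: replacing an output that was already sent with a short reference, and trimming a result to a rough token budget. It is an illustration only and not Token Limits' actual implementation. The names DedupCache and fit_to_budget, the 4-characters-per-token estimate, and the placeholder strings are all assumptions made for this example.

```python
import hashlib


class DedupCache:
    """Hypothetical helper: swaps repeated tool outputs for a short reference."""

    def __init__(self):
        # Maps a content hash to the key of the first call that produced it.
        self._seen = {}

    def filter(self, key, output):
        digest = hashlib.sha256(output.encode("utf-8")).hexdigest()
        if digest in self._seen:
            # Identical content was already returned once; send a pointer instead.
            return f"[unchanged since earlier call: {self._seen[digest]}]"
        self._seen[digest] = key
        return output


def fit_to_budget(text, max_tokens, chars_per_token=4):
    """Hypothetical helper: trim text to a rough token budget.

    Keeps the head and the tail of the text and marks the cut. The
    chars_per_token ratio is a crude assumption, not a real tokenizer.
    If the budget is smaller than the marker itself, only the marker
    is returned.
    """
    max_chars = max_tokens * chars_per_token
    if len(text) <= max_chars:
        return text
    marker = "\n[... output trimmed ...]\n"
    keep = max(max_chars - len(marker), 0)
    head = keep // 2
    tail = keep - head
    return text[:head] + marker + text[len(text) - tail:]
```

In this sketch, a second read of an unchanged file costs one short line instead of the full file, and an oversized result keeps its beginning and end, which is where errors and summaries tend to appear.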

v2.x

Early Access

Claude Code proxy — Automatic compression of all tool results, file reads, and conversation context.

MCP server — Seven compressed tools for Claude Desktop, Cursor, Windsurf, VS Code, and JetBrains.

Paste compressor — Free browser tool for compressing logs, errors, and build output before pasting into AI chats.

Multi-platform — Native binaries for macOS (Intel + Apple Silicon) and Linux (x64 + ARM64).

Live dashboard — Real-time compression stats at localhost:4800.
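
As a rough illustration of what compressing logs and build output can mean in practice, the sketch below collapses runs of identical consecutive lines into one line with a repeat count. This is one simple technique chosen for the example and is not claimed to be how the paste compressor or the proxy works. The function name collapse_repeats and the "[repeated Nx]" suffix are hypothetical.

```python
from itertools import groupby


def collapse_repeats(log_text):
    """Hypothetical helper: collapse consecutive duplicate lines in a log."""
    out = []
    # groupby yields one group per run of identical adjacent lines.
    for line, group in groupby(log_text.splitlines()):
        count = sum(1 for _ in group)
        out.append(line if count == 1 else f"{line}  [repeated {count}x]")
    return "\n".join(out)
```

For example, a build log that prints the same warning three times in a row before a final error would shrink to two lines, with the warning shown once and tagged with its count.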