Updates

Changelog

What's new in Token Limits.

v3.0.0

Launch · March 2026
Smart caching — Repeated file reads, searches, and diffs are detected and deduplicated automatically, saving thousands of tokens per session.
AI-powered summaries — Large outputs, old conversation context, and verbose errors are intelligently summarized so your AI retains key information without the noise.
Token-aware output limits — Results are automatically sized to fit within tool output limits, preventing truncation errors in Claude Code.
Auto-restart on update — Running token-limits update now automatically restarts the proxy.
Settings control — New config --no-optimize flag to prevent Token Limits from modifying your Claude Code settings.
Project mapping — New local_map tool gives the AI a structural overview of your codebase in one call, reducing exploratory reads by 3-5x.
Live dashboard improvements — Cost estimates for AI summaries, compression stats by tool, and real-time request history.
Windows support — Clear guidance for Windows users to use the MCP server via Claude Desktop.
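
To illustrate the idea behind smart caching, here is a minimal sketch of content-hash deduplication. It is not Token Limits' actual implementation: the `DedupCache` class, the `process` method, the call IDs, and the placeholder text are all hypothetical names chosen for this example. The assumption is simply that a repeated output can be replaced with a short reference to its first occurrence.

```python
import hashlib


class DedupCache:
    """Hypothetical cache that swaps repeated tool output for a short reference."""

    def __init__(self):
        # Maps a content digest to the ID of the call that first produced it.
        self._seen = {}

    def process(self, call_id: str, output: str) -> str:
        # Hash the full output so identical content always yields the same key.
        digest = hashlib.sha256(output.encode("utf-8")).hexdigest()
        first_call = self._seen.get(digest)
        if first_call is not None:
            # Already sent once: return a short placeholder instead of the body.
            return f"[unchanged since {first_call}]"
        self._seen[digest] = call_id
        return output


cache = DedupCache()
first = cache.process("read#1", "def main():\n    pass\n")
second = cache.process("read#2", "def main():\n    pass\n")
print(first)   # full file content on the first read
print(second)  # short placeholder on the repeat
```

In this sketch the second read costs a handful of tokens rather than the full file. A real proxy would presumably also need to invalidate entries when a file changes, which hashing the content handles naturally: different content produces a different digest.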

v2.x

Early Access
Claude Code proxy — Automatic compression of all tool results, file reads, and conversation context.
MCP server — Seven compressed tools for Claude Desktop, Cursor, Windsurf, VS Code, and JetBrains.
Paste compressor — Free browser tool for compressing logs, errors, and build output before pasting into AI chats.
Multi-platform — Native binaries for macOS (Intel + Apple Silicon) and Linux (x64 + ARM64).
Live dashboard — Real-time compression stats at localhost:4800.