One file. Drop it in your project. It keeps responses terse and can reduce total tokens on output-heavy workflows.

Note: instruction files add input tokens on every turn. Keep this file short - if it grows too much, it can cost more than it saves.

Model support: benchmarks were run on Claude only. The rules are model-agnostic and should work on any model that reads context, but results on local models (for example, Mistral or models served through llama.cpp) are untested. Community results welcome.
When you use Claude Code, every word Claude generates costs tokens. Most people never control how Claude responds - they just get whatever the model decides to output.
By default, Claude:
All of this wastes tokens. None of it adds value.
Option 1: Paste rules in chat (quick start)

Copy these rules into any new session:
Rules: Read files first. Write complete solution. Test once. No over-engineering.
Works immediately. No setup. Good for one-off tasks.
Option 2: Drop CLAUDE.md file (set and forget)
your-project/
└── CLAUDE.md <- one file, zero setup, no code changes
Automatic on every message. Better for regular work. More efficient at scale.
Pick based on your workflow. Both work.
| Approach | Setup | Cost | Best For |
|---|---|---|---|
| Rules in chat | None | Higher | Quick sessions, no project |
| CLAUDE.md file | 1 file | Lower | Regular work, pipelines |
This file works best for:
This file is not worth it for:
The honest trade-off: The CLAUDE.md file itself consumes input tokens on every message. The savings come from reduced output tokens. The net is only positive when output volume is high enough to offset the persistent input cost. At low usage it costs more than it saves.
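The trade-off above can be put into numbers with a rough model. The sketch below is illustrative only: the `Pricing` class, both function names, the 200-token file size, and the prices are hypothetical placeholders, not measurements or real rates. It also ignores prompt caching, which would lower the effective input cost.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Prices in USD per million tokens (hypothetical placeholders)."""
    input_per_mtok: float
    output_per_mtok: float


def net_saving_per_message(file_tokens: int, output_tokens_saved: float,
                           pricing: Pricing) -> float:
    """USD saved per message: output savings minus the file's input cost."""
    input_cost = file_tokens * pricing.input_per_mtok / 1_000_000
    output_saving = output_tokens_saved * pricing.output_per_mtok / 1_000_000
    return output_saving - input_cost


def break_even_output_tokens(file_tokens: int, pricing: Pricing) -> float:
    """Output tokens that must be saved per message for the file to pay for itself."""
    return file_tokens * pricing.input_per_mtok / pricing.output_per_mtok


# Hypothetical inputs: a 200-token CLAUDE.md and made-up prices.
pricing = Pricing(input_per_mtok=1.0, output_per_mtok=5.0)
print(break_even_output_tokens(200, pricing))
print(net_saving_per_message(200, output_tokens_saved=100, pricing=pricing))
```

If the average message saves fewer output tokens than the break-even value, the file is a net cost. Substitute your own file size and your provider's current prices before drawing conclusions.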
The same 5 prompts were run without CLAUDE.md (baseline), then with CLAUDE.md (optimized). The table lists the four prompts measured by word count; the fifth (T4) is a format test.
| Test | Baseline | Optimized | Reduction |
|---|---|---|---|
| Explain async/await | 180 words | 65 words | 64% |
| Code review | 120 words | 30 words | 75% |
| What is a REST API | 110 words | 55 words | 50% |
| Hallucination correction | 55 words | 20 words | 64% |
| Total | 465 words | 170 words | 63% |
~295 words saved across the 4 word-count prompts. Same information. Zero signal loss.
Methodology note: This is a 5-prompt directional indicator (T1-T3, T5 for word reduction; T4 is a format test), not a statistically controlled study. Claude's output length varies naturally between identical prompts. No variance controls or repeated runs were applied. Treat the 63% as a directional signal for output-heavy use cases, not a precise universal measurement. The CLAUDE.md file itself adds input tokens on every message - net savings only apply when output volume is high enough to offset that persistent cost.
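A repeated-run version of this measurement could look like the sketch below. It is one possible approach under stated assumptions, not the method used for the table: the function names are hypothetical, and the per-run word counts are invented to show the shape of the calculation. Only the totals 465 and 170 come from the table above.

```python
from statistics import mean, stdev


def word_count(text: str) -> int:
    """Count whitespace-separated words in a response."""
    return len(text.split())


def reduction_pct(baseline: float, optimized: float) -> float:
    """Percentage reduction from baseline to optimized."""
    return (baseline - optimized) / baseline * 100


def summarize(baseline_runs: list[int], optimized_runs: list[int]) -> dict[str, float]:
    """Mean and sample standard deviation per condition, plus reduction of the means."""
    return {
        "baseline_mean": mean(baseline_runs),
        "baseline_stdev": stdev(baseline_runs),
        "optimized_mean": mean(optimized_runs),
        "optimized_stdev": stdev(optimized_runs),
        "reduction_pct": reduction_pct(mean(baseline_runs), mean(optimized_runs)),
    }


# Totals from the table above: 465 words baseline, 170 words optimized.
print(round(reduction_pct(465, 170)))  # 63

# Hypothetical word counts from five repeated runs of a single prompt.
baseline_runs = [180, 165, 192, 174, 188]
optimized_runs = [65, 71, 58, 66, 62]
print(summarize(baseline_runs, optimized_runs))
```

Reporting the standard deviation next to the mean shows how much of a reduction could be ordinary run-to-run variation.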
The original table above measures word counts on a single run. For real output_tokens measured across haiku/sonnet/opus at N=5, see benchmark/SUMMARY.md and benchmark/SEMANTIC.md. Output-token reduction with the current minimal CLAUDE.md is ~4% (haiku), ~12% (sonnet), ~7% (opus). The 63% figure is achievable with a stricter rules profile. To reproduce, run python3 benchmark/run.py -n 5 --model opus and read the per-model report.
The semantic eval also confirms baselines on current models already exhibit 0% preamble, sycophancy, "as an AI", and smart quotes - rules targeting those behaviors carry input cost without changing output. Trim accordingly.
An independent benchmark ran 6 configs across 3 coding challenges (CSV reporter, SQLite window functions, Hono WebSocket counter). All configs passed all tests, so comparison was purely cost-to-green.
We ran our own v8 config head-to-head against C-structured (the previous best) on the same harness, same day, same model:
| Challenge | M-drona23-v8 | C-structured | Winner |
|---|---|---|---|
| CSV Reporter | $0.244 | $0.282 | v8 |
| SQLite Windows | $0.406 | $0.376 | C-structured |
| WebSocket | $0.285 | $0.473 | v8 |
| Total | $0.935 | $1.131 | v8 (-17.4%) |
The v8 config uses 2 files (7 lines total). The biggest win comes from WebSocket where explicit pattern rules prevent expensive debugging loops.
This repo keeps the root CLAUDE.md to a small set of high-impact rules to minimize recurring input overhead.
| Usage | Tokens Saved/Day | Monthly Savings (Sonnet) |
|---|---|---|
| 100 prompts/day | ~9,600 tokens | ~$0.86 |
| 1,000 prompts/day | ~96,000 tokens | ~$8.64 |
| 3 projects combined | ~288,000 tokens | ~$25.92 |
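The figures in the table are consistent with roughly 96 tokens saved per prompt, a 30-day month, a rate of $3 per million tokens, and "3 projects combined" meaning three projects at 1,000 prompts/day each. None of these values is stated in the source. They are inferred from the numbers and treated as assumptions in the sketch below, which reproduces the arithmetic so you can substitute your own values.

```python
TOKENS_SAVED_PER_PROMPT = 96    # assumption, inferred: ~9,600 tokens / 100 prompts
USD_PER_MILLION_TOKENS = 3.00   # assumption, inferred from the table's dollar figures
DAYS_PER_MONTH = 30             # assumption


def tokens_saved_per_day(prompts_per_day: int, projects: int = 1) -> int:
    """Daily tokens saved across all projects."""
    return prompts_per_day * projects * TOKENS_SAVED_PER_PROMPT


def monthly_savings_usd(prompts_per_day: int, projects: int = 1) -> float:
    """Monthly savings in USD at the assumed rate."""
    monthly_tokens = tokens_saved_per_day(prompts_per_day, projects) * DAYS_PER_MONTH
    return monthly_tokens * USD_PER_MILLION_TOKENS / 1_000_000


for prompts, projects in [(100, 1), (1_000, 1), (1_000, 3)]:
    print(prompts, projects,
          tokens_saved_per_day(prompts, projects),
          round(monthly_savings_usd(prompts, projects), 2))
```

Because the result scales linearly with the per-token rate, check which rate applies to the tokens you are actually saving before relying on the dollar column.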
Without CLAUDE.md - Code Review:
Great question! I can see you're working with a loop. Let me take a look at that.
That's a thoughtful piece of code! However, I notice there's actually a subtle bug
here that's worth pointing out. The condition i <= arr.length should be i < arr.length.
Using <= means the loop will run one extra iteration... You might also want to consider
using modern array methods like forEach, map, or for...of which are cleaner...
Great catch on reviewing this - it's always good practice to double-check loop
boundaries! I hope this review was helpful. Feel free to share more code anytime!
120 words.
With CLAUDE.md - Same prompt:
Bug: <= causes an off-by-one error. On the final iteration arr[arr.length] is undefined.
Fix:
for (let i = 0; i < arr.length; i++)
30 words. Same fix. 75% fewer words.
| # | Problem | Fix |
|---|---|---|
| 1 | Starts coding without context | Think first; read files before writing |
| 2 | Verbose responses | Keep output concise |
| 3 | Rewrites large files unnecessarily | Prefer targeted edits |
| 4 | Re-reading the same files | Read each file once unless it changed |
| 5 | Declaring done without validation | Run tests before finishing |
| 6 | Sycophantic chatter | No flattering preamble/closing fluff |
| 7 | Over-engineered solutions | Favor simple direct fixes |
| 8 | Prompt conflict confusion | User instructions always override |
Scope rules to your actual failure modes, not generic ones. Generic rules like "be concise" help, but the real wins come from targeting specific failures you've actually hit. For example, if Claude silently swallows errors in your pipeline, add a rule like: "when a step fails, stop immediately and report the full error with traceback before attempting any fix." Specific beats generic every time.
CLAUDE.md files compose - use that.

Claude reads multiple CLAUDE.md files at once: global (~/.claude/CLAUDE.md), project-level, and subdirectory-level. This means:
This avoids bloating any single file and keeps rules close to where they apply.
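To see which files would stack up for a given working directory, and how large they are together, a small helper can list them. This is an illustration of the global / project / subdirectory layering described above, not Claude Code's actual loading logic. The function names and the idea of walking every directory between the project root and the working directory are assumptions.

```python
from pathlib import Path


def candidate_claude_files(project_root: Path, working_dir: Path, home: Path) -> list[Path]:
    """List existing CLAUDE.md files from most general to most specific.

    Illustration only (assumed layering, not Claude Code's real behavior).
    Raises ValueError if working_dir is not inside project_root.
    """
    project_root = project_root.resolve()
    working_dir = working_dir.resolve()

    # Global file first, then the project root.
    candidates = [home / ".claude" / "CLAUDE.md", project_root / "CLAUDE.md"]

    # Then each directory on the way down to the working directory.
    current = project_root
    for part in working_dir.relative_to(project_root).parts:
        current = current / part
        candidates.append(current / "CLAUDE.md")

    return [path for path in candidates if path.is_file()]


def total_words(paths: list[Path]) -> int:
    """Combined word count of the files, as a rough proxy for input overhead."""
    return sum(len(path.read_text(encoding="utf-8").split()) for path in paths)
```

Calling `candidate_claude_files(Path("your-project"), Path("your-project/src"), Path.home())` and passing the result to `total_words` gives a quick check that the combined rules stay short.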
Different project types need different levels of compression. Pick the base file + a profile, or use the base alone.
| Profile | Best For |
|---|---|
| CLAUDE.md | Universal - works for any project |
| profiles/CLAUDE.benchmark.md | Token-to-green coding benchmarks |
| profiles/CLAUDE.coding.md | Dev projects, code review, debugging |
| profiles/CLAUDE.agents.md | Automation pipelines, multi-agent systems |
| profiles/CLAUDE.analysis.md | Data analysis, research, reporting |
The profiles/ directory also contains three versioned configuration sets representing different optimization strategies. Pick the one that matches your workflow:
| Version | Strategy | Tool Budget | Best For |
|---|---|---|---|
| J-drona23-v5 | Multi-file structured | 50 calls | Complex projects needing detailed workflow rules and agent definitions |
| K-drona23-v6 | One-shot execution | 50 calls | Tasks that should complete in a single pass with minimal iteration |
| M-drona23-v8 | Ultra-lean minimum-turn | 20 calls | Cost-sensitive pipelines where every tool call counts |
How to choose:
Option A: CLAUDE.md file (recommended for regular use)
Option B: Rules in prompt (for one-off sessions)
Cost comparison (benchmarked on 3 coding challenges):
| Method | CSV | SQLite | WebSocket | Total | Cost vs v8 |
|---|---|---|---|---|---|
| Rules pasted in chat | $0.274 | $0.459 | $0.585 | $1.318 | +41% |
| CLAUDE.md (v8) | $0.244 | $0.406 | $0.285 | $0.935 | baseline |
Both pass all tests. Pick based on your workflow.
Option 1 - Universal (any project):
curl -o CLAUDE.md https://raw.githubusercontent.com/drona23/claude-token-efficient/main/CLAUDE.md
Option 2 - Clone and pick a profile:
git clone https://github.com/drona23/claude-token-efficient
cp claude-token-efficient/profiles/CLAUDE.coding.md your-project/CLAUDE.md
Option 3 - Manual:
Copy the contents of CLAUDE.md from this repo into your project root.
User instructions always win. If you explicitly ask for a detailed explanation or verbose output, Claude will follow your instruction - the file never fights you.
Found a behavior that CLAUDE.md can fix? Open an issue with:
Community submissions become part of the next version with full credit.
Full benchmark results with before/after word counts: See BENCHMARK.md
This project was built on real complaints from the Claude community. Full credit to every source that contributed a fix:
/cost output to check if prompt caching is working correctly

MIT - free to use, modify, and distribute.
Built by drona23 - open to PRs, issues, and profile contributions.