Do reasoning or thinking tokens cost money?
Yes. The hidden thinking tokens a model spends before answering are billed at the same rate as output tokens. Higher reasoning effort means more of them, so effort directly raises cost.
Reasoning models think before they answer, and those thinking tokens bill at the same rate as the answer. Crank effort to maximum on every request and you can pay many times over for quality you never needed. The fix is simple: match effort to the shape of the task, not its importance. Here is how to choose, model by model, and how to see what it is costing you.
Quick answer
Reasoning effort controls how many hidden thinking tokens a model spends before answering, and those tokens bill at output rates. Use low or minimal effort for recall, extraction, and formatting; keep medium as your default; reserve high or max for architecture, hard debugging, and genuine multi-step reasoning. A task that spends 4,000 thinking tokens before a 500-token answer can cost roughly 9x the bare answer, while accuracy gains from extra thinking flatten quickly. Tokens 4 Breakfast shows the token spend that effort drives, by session and model, so a high-effort habit does not quietly inflate your bill.
Each step stands on its own — skip to the one that matches where you are.
Reasoning models produce hidden thinking tokens before the visible answer, and higher effort means a larger thinking budget. The catch is that those thinking tokens are billed at the same rate as output tokens, so effort is a direct multiplier on cost, not a free quality dial.
Match effort to how much reasoning the task needs, not how important it feels. Recall, extraction, reformatting, and simple edits need almost none, so use low or minimal. Multi-step problems with real depth, like architecture, tricky debugging, or algorithms, are where high effort earns its cost. When unsure, start at medium and escalate only on evidence.
Claude Code exposes Low, Medium, High, and Max, and you can cap the ceiling with MAX_THINKING_TOKENS (8000 is a sensible default) or dial it per task with /effort. Note that the default recently shifted toward high, which can quietly raise costs unless you override it. OpenAI's GPT-5.x and Codex take a reasoning_effort from none through xhigh, and Gemini exposes a thinking budget. The idea is the same everywhere: spend less thinking on shallow tasks.
A request that burns 4,000 thinking tokens before a 500-token answer costs roughly 9x the bare answer. Multiply that across hundreds of daily calls and max effort becomes one of the largest lines on your bill. The payoff is not linear either: accuracy from extra thinking improves logarithmically, with steep early gains that flatten fast, so most of the spend past medium buys very little.
You cannot tune what you cannot see. Tokens 4 Breakfast tracks the token spend your effort settings drive, broken out by session and model, live in the Mac menu bar. When a day spikes, you can tell whether a high-effort habit or an Opus-heavy session is behind it, and dial effort down where it was never needed.
Pro tips
Short, direct answers to the things people ask most about this.
Yes. The hidden thinking tokens a model spends before answering are billed at the same rate as output tokens. Higher reasoning effort means more of them, so effort directly raises cost.
No. Accuracy from extra thinking improves logarithmically, with steep early gains that flatten quickly. On recall, extraction, and formatting it adds cost with little or no quality gain, so reserve high effort for genuine multi-step reasoning.
Tokens 4 Breakfast shows your token spend by session and model in the Mac menu bar, so you can spot when a high-effort habit or an expensive model is inflating your bill and adjust before the invoice arrives.
Privacy-first analytics
We use optional analytics to understand aggregate website usage and make the product easier to discover. Google Analytics only loads after you accept. No account data, app data, or personal AI usage is sent, and you can change your choice anytime.
Read the privacy policy