The quickest wins
Start a new conversation every 15 to 20 messages instead of letting one chat grow forever. Edit your original message and regenerate rather than sending a follow-up. Turn off extended thinking and web search when you do not need them. Batch related questions into one message. Together these usually matter more than any single setting.
Why long chats burn through your limit
Claude re-reads the whole conversation on every turn, so by message twenty even a simple question carries thousands of tokens of history. The longer the chat, the more each new message costs, even if the message itself is short. Starting fresh resets that weight. If you need context from the old chat, ask Claude to summarize it, then paste the summary as the first message in a new conversation.
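To put rough numbers on this, here is a minimal sketch of how re-reading adds up. It assumes every message and every reply is the same size (200 tokens, a made-up figure) and ignores the cost of the summary you paste into a new chat. The function name and its parameters are illustrative only, not part of any Claude API or an exact account of how usage is metered.

```python
def cumulative_input_tokens(turns: int, tokens_per_message: int = 200) -> int:
    """Total tokens read across a chat in which every turn re-reads all history.

    Assumption (hypothetical): each user message and each reply is exactly
    tokens_per_message long.
    """
    total_read = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_message  # the new user message joins the history
        total_read += history          # the whole history is read for this turn
        history += tokens_per_message  # the reply joins the history too
    return total_read


one_long_chat = cumulative_input_tokens(20)
two_short_chats = 2 * cumulative_input_tokens(10)

print(one_long_chat)    # 80000
print(two_short_chats)  # 40000
```

Under these assumptions, the same twenty messages cost about half as much when split across two chats, because the total grows with the square of the number of turns rather than in a straight line.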
Edit instead of following up
When a reply misses, resist the urge to send a correction. Click the edit icon on your original message, fix the prompt, and regenerate. The bad exchange is replaced rather than added to the history, so you do not pay to keep a wrong turn in context for the rest of the conversation.
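The difference can be sketched with simple arithmetic. The example below assumes the correction is about the same size as the original prompt and that both replies are the same length. All names and token counts are hypothetical, chosen only to illustrate the idea.

```python
def context_after_fix(prior_tokens: int, prompt_tokens: int,
                      reply_tokens: int, edit: bool) -> int:
    """Tokens sitting in the conversation once the reply has been fixed.

    Assumption (hypothetical): the follow-up correction is as long as the
    original prompt, and the second reply is as long as the first.
    """
    one_exchange = prompt_tokens + reply_tokens
    if edit:
        # Editing replaces the failed exchange, so only one exchange remains.
        return prior_tokens + one_exchange
    # A follow-up keeps the failed exchange and adds a second one after it.
    return prior_tokens + 2 * one_exchange


edited = context_after_fix(1000, 100, 400, edit=True)
followed_up = context_after_fix(1000, 100, 400, edit=False)

print(edited)                # 1500
print(followed_up)           # 2000
print(followed_up - edited)  # 500
```

In this sketch the follow-up leaves 500 extra tokens in the history, and that extra weight is re-read on every later turn of the conversation.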
Turn off features that add tokens silently
Extended thinking, web search, and connectors all add tokens to a conversation, even when the task does not need them. They are powerful, but leave them off by default and switch them on only when a specific task calls for deeper reasoning, live lookups, or an external tool. Picking the right model matters too: a heavier model costs more per message, so reserve it for genuinely hard problems.
See your limit before you hit it
These habits help, but you still cannot see how close you are to your limit while you work. Tokens 4 Breakfast keeps your live Claude usage (current session and weekly limit, pulled from claude.ai) in the macOS menu bar, so you get a warning before a session cuts out instead of after. It is free for one provider and takes about two minutes to set up.