The quickest wins
Highlight only the relevant function, failing test, or diff instead of letting Cursor pull in the whole codebase. Ask for a plan before edits when the task is ambiguous, then narrow the implementation to the files that actually need to change. Route simple tasks to a cheaper model. Watch your fast-request consumption so you notice drift before the month ends.
Scope the context
Cursor can include broad codebase context automatically, which is convenient but expensive. For focused work in a single file, select the specific function or block and ask about that. Reserve full-codebase context for genuinely cross-cutting questions. Old terminal output, generated files, and large logs rarely need to be in the request, so leave them out.
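To get a feel for how much scoping saves, a rough estimate is enough. The sketch below assumes about four characters per token, which is a common rule of thumb and not the tokenizer Cursor or any particular model actually uses. The function names are illustrative, not part of any Cursor API.

```python
import math

# Assumption: roughly 4 characters per token. This is a rule of thumb,
# not the tokenizer of any specific model.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def context_savings(selection: str, full_context: str) -> float:
    """Return the fraction of estimated tokens saved by sending only the selection."""
    full = estimate_tokens(full_context)
    if full == 0:
        return 0.0  # nothing to compare against
    return 1 - estimate_tokens(selection) / full
```

Passing the text of a selected function as `selection` and the text of everything that would otherwise be attached as `full_context` gives a ballpark for how much of the request was avoidable.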
Plan before you edit
Ambiguous requests make the model wander, reading files and exploring before it acts, and everything it reads along the way is context you pay for. Ask for a short plan first, confirm the approach, then request the implementation against the specific files. A scoped plan-then-edit loop usually costs less than one broad request that has to discover what you meant.
Match the model to the task
Not every task needs the strongest model. Summaries, renames, boilerplate, and mechanical edits run fine on a cheaper model; save the premium model for ambiguous debugging and architecture calls where a wrong answer costs more than the tokens.
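As a minimal sketch of that routing rule, the snippet below maps a task label to a model tier. The labels and the `pick_tier` helper are hypothetical and only mirror the categories named above; Cursor does not expose such a function, so treat this as a checklist written as code.

```python
from enum import Enum


class Tier(Enum):
    CHEAP = "cheap"
    PREMIUM = "premium"


# Hypothetical task labels based on the categories in the text;
# adjust them to match your own workflow.
MECHANICAL_TASKS = {"summary", "rename", "boilerplate", "mechanical_edit"}


def pick_tier(task_kind: str) -> Tier:
    """Suggest a model tier for a task label."""
    if task_kind in MECHANICAL_TASKS:
        return Tier.CHEAP
    # Anything else, including unrecognized labels, goes to the premium
    # tier, on the assumption that a wrong answer costs more than the tokens.
    return Tier.PREMIUM
```

Defaulting unknown tasks to the premium tier is a deliberate choice in this sketch: it errs on the side of quality, and the mechanical list can grow as you learn which tasks the cheaper model handles well.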
Keep Cursor spend next to the rest of your stack
Cursor is rarely your only AI tool, and its cost is easy to lose among Claude, OpenAI, and your subscriptions. Tokens 4 Breakfast shows Cursor spend alongside every other provider in one macOS menu bar total, so drift is visible while you can still act on it. Free for one provider, one-time $7.99 for the full picture.