How Much Does Gemini Cost? (2026 Pricing Breakdown)

Gemini is unusual. The CLI is free for most developers, and past that point the price goes in two very different directions: a flat Code Assist subscription or pay-per-token API billing. Here's exactly what each one costs in 2026, what you pay when you blow past the free quota, and how to keep an eye on your real spend.

Last updated June 2026 · By Soren Starck

The Short Answer

For most individual developers, Gemini costs $0. The Gemini CLI runs on the free Gemini Code Assist for Individuals tier, which gives you a famously generous quota, roughly 60 requests per minute and 1,000 requests per day, at no charge. You only start paying when you outgrow that, and there are two paths:

  • Gemini Code Assist (paid): a flat monthly subscription, Standard (~$22.80/user/mo, billed annually) or Enterprise (~$45/user/mo), with higher quotas, an SLA, and admin controls.
  • Gemini API: pay-per-token billing through Google AI Studio / Vertex AI. You pay only for what you send and receive, but cost scales directly with usage.

That is the entire Gemini pricing story in one paragraph. The rest of this guide breaks down each tier with real numbers and worked examples so you can predict your bill.

Gemini Pricing at a Glance

Tier | Price | What You Get | Best For
Code Assist (Free) | $0 | ~60 RPM / 1,000 RPD, Flash + Pro models, CLI access | Most solo developers
Code Assist Standard | ~$22.80/user/mo* | Higher quotas, SLA, admin controls | Google Cloud teams
Code Assist Enterprise | ~$45/user/mo* | Highest quotas, code customization, security review | Larger orgs
Gemini API | Per token | Pay-as-you-go, no flat fee, Flash & Pro tiers | Apps & heavy automation

*Standard is typically billed annually (~$22.80/user/mo). Prices shift as Google rebalances tiers, so always cross-check the official Gemini Code Assist pricing page. Quota figures are approximate.

The Free Tier (Where Most People Live)

The free Gemini CLI tier is the most generous in its class. You get access to both Gemini Flash and Gemini Pro models with no credit card, and the daily request budget is high enough that most solo developers never hit it. For day-to-day coding, refactors, and questions, you can run the CLI all day and pay nothing.

The catch is that the free tier meters requests, not dollars, and there is no native warning before you run out. When you exceed roughly 60 requests per minute or 1,000 per day you get a 429 Resource Exhausted error, and the daily quota resets at midnight UTC, not at midnight in your local timezone. If you want the full breakdown of those caps, see our Gemini rate limits guide.
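Because the reset follows UTC rather than your wall clock, it helps to compute how long a daily lockout will actually last. The snippet below is a minimal sketch using only the Python standard library; the function name is hypothetical, and it assumes the midnight-UTC reset described above.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_utc_reset(now: datetime) -> float:
    """Seconds from `now` until the next midnight UTC.

    Assumes the daily quota resets at 00:00 UTC, as described in this guide.
    """
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now_utc).total_seconds()


# Example: 18:30 in a UTC-5 timezone is 23:30 UTC, so 30 minutes remain.
local = datetime(2026, 6, 1, 18, 30, tzinfo=timezone(timedelta(hours=-5)))
wait_s = seconds_until_utc_reset(local)
print(f"Daily quota resets in {wait_s / 60:.0f} minutes")
```

For a developer in a UTC-5 timezone the reset lands in the early evening, so a late-afternoon lockout is short while a morning one costs most of the working day.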

Gemini Code Assist (Paid Subscriptions)

When the free quota becomes a bottleneck, a Code Assist subscription buys headroom for a flat, predictable monthly fee, with no per-token math required.

  • Standard (~$22.80/user/month, billed annually): higher request quotas, an enterprise SLA, and basic admin controls. The natural step up for Google Cloud-native shops standardizing on Gemini.
  • Enterprise (~$45/user/month): the highest quotas plus code customization (grounding on your own repos), security review features, and deeper IDE integration.

The advantage of a subscription is predictability: you know your bill in advance and it doesn't spike on a heavy week. The trade-off is that if your usage is spiky (light most weeks, heavy occasionally), you may pay for headroom you rarely use. The plan calculator can help you compare a flat subscription against projected API spend for your actual workload.
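As a rough illustration of that comparison, the sketch below estimates the monthly token volume at which uncached API spend on Pro would equal the Standard fee. It borrows the Pro list prices from the API section of this guide ($1.25/M input, $10/M output) and assumes a 20% output share; the function name is hypothetical, and this is not how the plan calculator itself works.

```python
def break_even_tokens(subscription_usd, input_rate, output_rate,
                      output_share=0.20):
    """Monthly tokens at which uncached API spend equals a flat fee.

    Rates are USD per million tokens; `output_share` is the assumed
    fraction of all tokens that are output.
    """
    blended = (1 - output_share) * input_rate + output_share * output_rate
    return subscription_usd / blended * 1_000_000


tokens = break_even_tokens(22.80, input_rate=1.25, output_rate=10.00)
print(f"Break-even: {tokens / 1_000_000:.1f}M tokens/month")
```

Under these assumptions the break-even sits around 7.6M tokens/month. Below that, pay-per-token is cheaper on paper. The comparison is only approximate, since subscriptions are metered in requests rather than tokens.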

Gemini API: Pay-Per-Token Pricing

The Gemini API has no monthly fee; you pay only for the tokens you send (input) and receive (output). This is what you fall back on for production apps, automation, or any usage that outgrows the CLI quotas. At 2026 list prices:

Model | Input / M tokens | Output / M tokens | Cached input / M tokens
Gemini 3 Flash | $0.30 | $2.50 | $0.075
Gemini 3 Pro | $1.25 | $10.00 | $0.31

Two things matter here. First, output is far more expensive than input, roughly 8x on both models, so chatty, code-heavy generations cost more than long-context reads. Second, cached input is dramatically cheaper, about a quarter of the standard input rate on both models, which is why reusing the same large context across turns saves real money. The 1M-token context window is powerful, but a single 500K-token request is not free: it counts against your token bill at the input rate (about $0.63 on Pro, $0.15 on Flash).
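To make the per-request arithmetic concrete, here is a minimal sketch that prices a single call from the table above. The function and dictionary names are hypothetical, the rates are copied from this guide rather than fetched from Google, and any cache storage fees (not quoted here) are ignored.

```python
# USD per million tokens, copied from the table in this guide.
RATES = {
    "flash": {"input": 0.30, "output": 2.50, "cached": 0.075},
    "pro": {"input": 1.25, "output": 10.00, "cached": 0.31},
}


def request_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimated USD cost of one request.

    `cached_tokens` is the part of `input_tokens` served from cache and
    billed at the cached rate instead of the standard input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    r = RATES[model]
    fresh = input_tokens - cached_tokens
    total = (fresh * r["input"]
             + cached_tokens * r["cached"]
             + output_tokens * r["output"])
    return total / 1_000_000


# A 500K-token context with a 2K-token answer on Pro, cold vs fully cached.
cold = request_cost("pro", 500_000, 2_000)
warm = request_cost("pro", 500_000, 2_000, cached_tokens=500_000)
print(f"cold: ${cold:.3f}  warm: ${warm:.3f}")
```

The cold call comes to roughly $0.65 and the fully cached one to roughly $0.18, which is the caching saving described above applied to one large-context turn.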

Worked Examples

Token math is abstract until you put real workloads against it. Here are three profiles using the API rates above. (CLI users on the free tier or a subscription pay nothing extra for these as long as they stay under quota; these numbers show what the same work would cost on pay-per-token billing.)

Light: casual CLI use on Flash

A developer asking quick questions and doing small edits might burn ~30K tokens/hour, mostly input. Over a 2-hour day that's ~60K tokens. On Gemini 3 Flash, even assuming a 20% output share, that's about 4 to 5 cents per day, or around $1/month. On the free CLI tier, it's simply $0.

Moderate: daily IDE + CLI on Pro

A working developer running Gemini Pro 4 hours/day at ~75K tokens/hour lands near 300K tokens/day, or roughly 6.5M tokens/month. At Pro rates that's in the range of $15–$35/month on the API: about $20 at a 20% output share before caching, less with moderate caching, and more if output runs heavier. A flat Code Assist Standard subscription (~$22.80/mo) is competitive here and removes the quota anxiety.

Heavy: agentic CLI sessions on Pro

Agentic, all-day CLI work burns 150K–400K tokens/hour. Six hours/day, five days/week puts you at roughly 20M–50M tokens/month. On Gemini 3 Pro that can run $100–$300+/month on the API, which is the point where Enterprise Code Assist or aggressive context caching starts to pay off. This is also where silent 429 lockouts on the free tier cost you time, not just money.
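The three profiles can be projected with a few lines of arithmetic. The sketch below is illustrative only: it assumes 22 working days per month, no caching, and the same 20% output share for every profile. The function name is hypothetical and the rates are the list prices quoted in this guide.

```python
RATES = {  # (input, output) in USD per million tokens, from this guide
    "flash": (0.30, 2.50),
    "pro": (1.25, 10.00),
}


def monthly_cost(model, tokens_per_hour, hours_per_day,
                 output_share=0.20, days_per_month=22):
    """Uncached monthly API cost in USD for a steady workload."""
    input_rate, output_rate = RATES[model]
    tokens = tokens_per_hour * hours_per_day * days_per_month
    blended = (1 - output_share) * input_rate + output_share * output_rate
    return tokens / 1_000_000 * blended


profiles = {
    "light (Flash)": ("flash", 30_000, 2),
    "moderate (Pro)": ("pro", 75_000, 4),
    "heavy, low end (Pro)": ("pro", 150_000, 6),
    "heavy, high end (Pro)": ("pro", 400_000, 6),
}
for name, args in profiles.items():
    print(f"{name}: ${monthly_cost(*args):.2f}/month")
```

With these inputs the light profile comes to about $1/month and the moderate one to about $20/month. The heavy profile lands between roughly $59 and $158 at a 20% output share. Raising `output_share` to 0.4, which is plausible for generation-heavy agentic sessions, moves it to roughly $94–$251, closer to the range quoted above. Output share is the assumption that moves the estimate most.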

The Hidden Cost: Lockouts and Bill Surprises

Gemini's pricing has two failure modes that don't show up on the sticker price. On the free tier, the cost is lost time: you hit a 429 mid-task, the CLI retries silently, and you wait until UTC midnight for the daily quota to reset. On the API, the cost is a surprise invoice: token billing is invisible until it arrives, and a few heavy agentic days can multiply your bill.

Both problems have the same root cause: no real-time visibility. Neither AI Studio nor the CLI warns you in the moment. If you can see you're at 80% of your daily quota by lunch, you can pace yourself or switch to Flash for the cheap turns. If you can watch estimated API spend tick up, you never get a shock invoice.
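The pacing idea reduces to a threshold check against the daily cap. The following is a minimal sketch, not a description of any tool's internals: `requests_used` is a count you would have to keep yourself, the 1,000/day limit is the approximate free-tier figure from this guide, and the 80% and 95% thresholds are illustrative defaults.

```python
DAILY_LIMIT = 1_000  # approximate free-tier requests per day, per this guide


def quota_status(requests_used, daily_limit=DAILY_LIMIT,
                 warn_at=0.80, critical_at=0.95):
    """Classify today's request count against the daily quota."""
    fraction = requests_used / daily_limit
    if fraction >= 1.0:
        return "exhausted"
    if fraction >= critical_at:
        return "critical"
    if fraction >= warn_at:
        return "warning"
    return "ok"


for used in (400, 800, 950, 1_000):
    print(used, quota_status(used))
```

A "warning" before lunch is the signal to slow down or route cheap turns to Flash. "Exhausted" means waiting for the daily reset.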

How to Track Your Real Gemini Cost

There are two practical ways to understand what Gemini is actually costing you:

1. The Google Cloud / AI Studio dashboard: shows historical token usage and billing, but with a delay and no live, in-session view. Useful for the monthly reconciliation, useless for pacing yourself today.

2. SessionWatcher: SessionWatcher for Gemini watches your CLI activity live from the macOS menu bar: requests against the daily quota, tokens per minute on long-context calls, a per-model breakdown (Flash vs Pro), estimated spend, and notifications at 80% / 95% so you can pace before a 429 or a bill spike. It turns Gemini's invisible metering into something you can actually see and control.

Important: that's a different price from Gemini itself.

Google charges you for Gemini (free tier, Code Assist subscription, or API tokens as above). SessionWatcher is a separate, optional macOS app that monitors that usage. Gemini monitoring is included with SessionWatcher Pro, $49 one-time or $24/year, which covers all 7 supported tools (Claude Code, Codex, Cursor, Copilot, Gemini, opencode, and more). Every purchase has a 30-day refund.

Which Option Is Right for You?

  • Solo dev, light-to-moderate use: stay on the free CLI tier. It's genuinely free and generous.
  • Steady daily use, predictable budget: Code Assist Standard (~$22.80/user/mo) buys headroom and an SLA without per-token math.
  • Production apps or heavy automation: the Gemini API, with caching and Flash-for-cheap-turns to control cost.
  • Teams with security/compliance needs: Code Assist Enterprise (~$45/user/mo) for code customization and review features.

Whichever you pick, the only way to make the right call, and to avoid both lockouts and bill surprises, is to know your actual usage. That's exactly what SessionWatcher shows you.

Frequently Asked Questions

Is the Gemini CLI free?

Yes. The Gemini CLI runs on the free Gemini Code Assist for Individuals tier (roughly 60 requests per minute and 1,000 per day), with access to both Flash and Pro models. You only pay when you exceed those quotas and move to a paid Code Assist plan or the per-token Gemini API.

How much does the Gemini API cost per token?

At 2026 list prices, Gemini 3 Flash is about $0.30/M input and $2.50/M output; Gemini 3 Pro is about $1.25/M input and $10/M output. Cached input is far cheaper (~$0.075/M Flash, ~$0.31/M Pro). Output dominates cost on generation-heavy work.

How much is Gemini Code Assist Standard?

Standard is about $22.80 per user per month (billed annually), and Enterprise is roughly $45 per user per month. Both add higher quotas, an SLA, and admin controls on top of the free CLI tier. Prices shift, so check the official Gemini pricing page.

How can I track what Gemini actually costs me?

SessionWatcher tracks your Gemini CLI usage live from the macOS menu bar: requests against quota, tokens per minute, and estimated spend. Note this is separate from Google's charges: Gemini monitoring is included with SessionWatcher Pro ($49 one-time or $24/year, all 7 tools, 30-day refund).