How to Check Gemini Usage – RPM, RPD, Tokens & 1M Context Window

Understanding Gemini Usage Limits

Gemini limits can feel stricter than those of other LLM APIs because they cap usage on three axes at once: RPM (requests per minute), RPD (requests per day), and TPM (tokens per minute). Exceeding any one of them gets you throttled. Each model - Gemini 2.5 Pro, Flash, Flash-Lite - has its own caps.

On the free tier, Pro is the most restricted (5 RPM / 100 RPD). On Tier 1 (paid), the caps rise substantially; the limits table further down lists approximate per-model figures. The 1M-token context window doesn't change rate caps - it just means a single request can use up a large share of your token budget and be very expensive.
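
To make the three axes concrete, here is a minimal Python sketch of a client-side tally that counts requests and tokens inside a rolling time window. `UsageWindow` and its methods are hypothetical names for illustration, not part of any Google SDK, and the rolling window is a simplifying assumption: the provider may reset daily quotas on a fixed clock instead.

```python
from collections import deque


class UsageWindow:
    """Client-side tally of requests and tokens over a rolling time window."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, tokens) pairs, oldest first

    def record(self, timestamp: float, tokens: int = 0) -> None:
        """Log one request and the tokens it consumed."""
        self.events.append((timestamp, tokens))

    def _trim(self, now: float) -> None:
        # Drop events that have aged out of the window.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            self.events.popleft()

    def requests(self, now: float) -> int:
        """Requests still inside the window (compare against RPM or RPD)."""
        self._trim(now)
        return len(self.events)

    def tokens(self, now: float) -> int:
        """Tokens still inside the window (compare against TPM)."""
        self._trim(now)
        return sum(count for _, count in self.events)
```

One instance with a 60-second window covers RPM and TPM; a second with an 86,400-second window approximates RPD. Call `record` after each request and compare the totals against the caps for your model.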

Method 1: Google AI Studio (Free Tier)

Open aistudio.google.com
Click Get API key in the sidebar
Open the key's usage view to see RPM/RPD per model

Method 2: Google Cloud Console (Paid Tier)

Go to console.cloud.google.com
Open Quotas & System Limits → search “Generative Language”
Review per-model RPM, RPD, and TPM
For costs, open Billing → Reports

Both dashboards lag 5–60 minutes and don't alert you in real time. If your CLI session is hammering Gemini, you'll know only after you've already hit the cap.

Method 3: Real-Time Monitoring with SessionWatcher

SessionWatcher for Gemini is a native macOS menu bar app that tracks Gemini CLI activity live.

RPM in last 60 seconds - see how close to throttle you are
Daily request count vs RPD cap
Token burn per model - Pro vs Flash vs Flash-Lite
Cost estimate for paid tier - real-time spend
macOS notifications at 80% and 95%
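
The 80% and 95% thresholds above boil down to a simple comparison. The sketch below is an assumption about how such a check could look, not SessionWatcher's actual implementation; `alert_level` and the level names are hypothetical.

```python
from typing import Optional

WARN_PERCENT = 80      # thresholds taken from the list above
CRITICAL_PERCENT = 95


def alert_level(used: int, cap: int) -> Optional[str]:
    """Return "critical", "warning", or None for a usage count against a cap."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    # Integer arithmetic avoids floating-point edge cases at the boundaries.
    if used * 100 >= cap * CRITICAL_PERCENT:
        return "critical"
    if used * 100 >= cap * WARN_PERCENT:
        return "warning"
    return None
```

With a free-tier Pro cap of 5 RPM, the fourth request inside a minute already reaches 80%, so tight caps leave very little warning room.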

Free vs Tier 1 Limits (Approximate)

Model	Free RPM / RPD	Tier 1 RPM / RPD
Gemini 2.5 Pro	5 / 100	150 / 1,000
Gemini 2.5 Flash	10 / 250	1,000 / 10,000
Gemini 2.5 Flash-Lite	15 / 1,000	4,000 / 30,000

Numbers shift as Google rebalances. Always cross-check against the official docs. The pattern stays: Pro is the tightest, Flash is in the middle, Flash-Lite is the most permissive.
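
If you track usage yourself, the table can be kept as a small lookup structure. The following sketch copies the approximate figures above; the tier and model keys are labels chosen for illustration, and `Caps`, `LIMITS`, and `remaining` are hypothetical names. Update the numbers from the official docs before relying on them.

```python
from typing import NamedTuple


class Caps(NamedTuple):
    rpm: int  # requests per minute
    rpd: int  # requests per day


# Approximate figures copied from the table above; they change over time.
LIMITS = {
    "free": {
        "gemini-2.5-pro": Caps(5, 100),
        "gemini-2.5-flash": Caps(10, 250),
        "gemini-2.5-flash-lite": Caps(15, 1_000),
    },
    "tier1": {
        "gemini-2.5-pro": Caps(150, 1_000),
        "gemini-2.5-flash": Caps(1_000, 10_000),
        "gemini-2.5-flash-lite": Caps(4_000, 30_000),
    },
}


def remaining(tier: str, model: str, used_rpm: int, used_rpd: int) -> Caps:
    """Requests left this minute and today, never below zero."""
    caps = LIMITS[tier][model]
    return Caps(max(caps.rpm - used_rpm, 0), max(caps.rpd - used_rpd, 0))
```

For example, `remaining("free", "gemini-2.5-pro", 3, 100)` reports two requests left in the current minute but none left for the day, which is the case where waiting 60 seconds does not help.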

SessionWatcher vs. Cloud Console

Feature	SessionWatcher	AI Studio / Cloud Console
Real-time RPM	Yes	Lagged
Daily quota count	Live	Snapshot
429 prevention	macOS notifications	None
Setup required	10 seconds	Login each time
Workflow interruption	None (menu bar)	Switch to browser
Cost	$49 one-time (Pro)	Free

Frequently Asked Questions

How do I check my Gemini usage?

Free tier in aistudio.google.com, paid tier in console.cloud.google.com. Both lag. SessionWatcher for Gemini shows live RPM/RPD/tokens in the macOS menu bar.

What are the Gemini free-tier limits?

Pro is the tightest (~5 RPM, 100 RPD). Flash and Flash-Lite are higher. Always check the docs - Google rebalances them.

Why does Gemini CLI sometimes 429?

Most often you hit your RPM cap, though exhausting RPD or TPM produces the same error. Gemini returns 429 with retry-after. SessionWatcher tracks RPM in real time so you can pace yourself before reaching the cap.
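
When a 429 does get through, the usual remedy is to wait and retry. Below is a minimal sketch of that pattern under stated assumptions: `send` is a hypothetical callable wrapping whatever client you use and returning `(status, headers, body)` with lowercase header names, and the retry-after value is assumed to be a number of seconds. It is not the Gemini CLI's own retry logic.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), retrying on HTTP 429 up to max_attempts total calls."""
    response = send()
    for attempt in range(1, max_attempts):
        status, headers, _ = response
        if status != 429:
            return response
        try:
            # Prefer the server's hint when it is a plain number of seconds.
            delay = float(headers.get("retry-after"))
        except (TypeError, ValueError):
            # Otherwise fall back to exponential backoff with a little jitter.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.25)
        sleep(delay)
        response = send()
    return response  # still 429 if every attempt was throttled
```

Retrying only helps with per-minute caps. If the daily quota is exhausted, the loop above will simply run out of attempts, so it is worth checking the daily count before retrying.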

Is there a Gemini monitor app for macOS?

Yes - SessionWatcher for Gemini. Part of Pro ($49 one-time), 10-second setup.