How to Check Your Gemini CLI & API Usage
Gemini enforces requests-per-minute and requests-per-day caps, plus a tokens-per-minute cap, with separate quotas per model. Here's how to track them live and stop hitting 429s mid-flow.
Last updated April 2026 · By Soren Starck
Understanding Gemini Usage Limits
Gemini limits can feel stricter than those of other LLM APIs because they cap on three axes at once: RPM (requests per minute), RPD (requests per day), and TPM (tokens per minute). Each model (Gemini 2.5 Pro, Flash, Flash-Lite) has its own caps.
On the free tier, Pro is the most restricted (about 5 RPM / 100 RPD). On Tier 1 (paid), the caps are much higher; the table below lists approximate per-model figures. The 1M-token context window doesn't change the request caps. It just means a single request can be very expensive, both in tokens counted against TPM and, on the paid tier, in cost.
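To make the three axes concrete, here is a minimal Python sketch of a client-side counter. The class name `UsageWindow`, its methods, and the cap values passed in are all hypothetical. It also assumes a rolling 24-hour window for RPD, which is a simplification and may not match how Google actually resets daily quotas.

```python
import time
from collections import deque


class UsageWindow:
    """Illustrative client-side counter for RPM, RPD, and TPM (not an official API)."""

    def __init__(self, rpm_cap, rpd_cap, tpm_cap):
        self.rpm_cap = rpm_cap
        self.rpd_cap = rpd_cap
        self.tpm_cap = tpm_cap
        self.events = deque()  # (timestamp, tokens) per request, oldest first

    def record(self, tokens, now=None):
        """Log one request and the tokens it consumed."""
        self.events.append((time.time() if now is None else now, tokens))

    def snapshot(self, now=None):
        """Return current usage on all three axes."""
        now = time.time() if now is None else now
        # Assumption: rolling 24 h window for the daily count.
        while self.events and now - self.events[0][0] >= 86400:
            self.events.popleft()
        last_minute = [(t, n) for t, n in self.events if now - t < 60]
        return {
            "rpm": len(last_minute),
            "rpd": len(self.events),
            "tpm": sum(n for _, n in last_minute),
        }

    def would_exceed(self, tokens, now=None):
        """True if one more request of `tokens` tokens would break any cap."""
        s = self.snapshot(now)
        return (
            s["rpm"] + 1 > self.rpm_cap
            or s["rpd"] + 1 > self.rpd_cap
            or s["tpm"] + tokens > self.tpm_cap
        )
```

Calling `would_exceed()` before each request shows which requests a local counter would hold back. Note that a single large-context request can trip the TPM check even when RPM and RPD are nearly untouched.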
Method 1: Google AI Studio (Free Tier)
- Open aistudio.google.com
- Click Get API key in the sidebar
- Open the key's usage view to see RPM/RPD per model
Method 2: Google Cloud Console (Paid Tier)
- Go to console.cloud.google.com
- Open Quotas & System Limits → search “Generative Language”
- Review per-model RPM, RPD, and TPM
- For costs, open Billing → Reports
Both dashboards lag 5–60 minutes and don't alert you in real time. If your CLI session is hammering Gemini, you'll know only after you've already hit the cap.
Method 3: Real-Time Monitoring with SessionWatcher
SessionWatcher for Gemini is a native macOS menu bar app that tracks Gemini CLI activity live.
- RPM in last 60 seconds - see how close to throttle you are
- Daily request count vs RPD cap
- Token burn per model - Pro vs Flash vs Flash-Lite
- Cost estimate for paid tier - real-time spend
- macOS notifications at 80% and 95%
macOS 14+. $2.99 one-time purchase.
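The 80% and 95% notification thresholds listed above come down to a simple ratio check. The sketch below is only an illustration of that idea, not SessionWatcher's actual implementation; the function name `alert_level` and its parameters are hypothetical.

```python
def alert_level(used, cap, thresholds=(0.80, 0.95)):
    """Return the highest threshold crossed, or None if usage is below all of them."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    ratio = used / cap
    crossed = [t for t in sorted(thresholds) if ratio >= t]
    return crossed[-1] if crossed else None
```

With a 100 RPD cap, for instance, 79 requests returns `None`, 80 requests returns `0.8`, and 95 requests returns `0.95`.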

Free vs Tier 1 Limits (Approximate)
| Model | Free RPM / RPD | Tier 1 RPM / RPD |
|---|---|---|
| Gemini 2.5 Pro | 5 / 100 | 150 / 1,000 |
| Gemini 2.5 Flash | 10 / 250 | 1,000 / 10,000 |
| Gemini 2.5 Flash-Lite | 15 / 1,000 | 4,000 / 30,000 |
Numbers shift as Google rebalances. Always cross-check against the official docs. The pattern stays: Pro is the tightest, Flash is in the middle, Flash-Lite is the most permissive.
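As a rough worked example of what these caps mean in practice, the Python sketch below turns the approximate table values into two numbers: the minimum spacing between requests needed to stay under RPM, and how quickly the daily cap runs out at full RPM. The dictionary simply copies the table above, so it inherits the same caveat; the function names are hypothetical.

```python
# Approximate (RPM, RPD) values copied from the table above; check the official docs.
LIMITS = {
    "Gemini 2.5 Pro": {"free": (5, 100), "tier1": (150, 1_000)},
    "Gemini 2.5 Flash": {"free": (10, 250), "tier1": (1_000, 10_000)},
    "Gemini 2.5 Flash-Lite": {"free": (15, 1_000), "tier1": (4_000, 30_000)},
}


def min_interval_seconds(model, tier):
    """Seconds to wait between requests to stay at or under the RPM cap."""
    rpm, _ = LIMITS[model][tier]
    return 60 / rpm


def minutes_to_daily_cap(model, tier):
    """Minutes until RPD is exhausted when sending at the full RPM cap."""
    rpm, rpd = LIMITS[model][tier]
    return rpd / rpm


print(min_interval_seconds("Gemini 2.5 Pro", "free"))   # 12.0
print(minutes_to_daily_cap("Gemini 2.5 Pro", "free"))   # 20.0
```

On these figures, free-tier Pro allows one request every 12 seconds, and running at that pace uses up the whole day's quota in 20 minutes. The daily cap, not the per-minute cap, is usually the one to budget for.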
SessionWatcher vs. Cloud Console
| Feature | SessionWatcher | AI Studio / Cloud Console |
|---|---|---|
| Real-time RPM | Yes | Lagged |
| Daily quota count | Live | Snapshot |
| 429 prevention | macOS notifications | None |
| Setup required | 10 seconds | Log in each time |
| Workflow interruption | None (menu bar) | Switch to browser |
| Cost | $2.99 one-time | Free |
Frequently Asked Questions
How do I check my Gemini usage?
Check the free tier at aistudio.google.com and the paid tier at console.cloud.google.com. Both lag by 5–60 minutes. SessionWatcher for Gemini shows live RPM/RPD/tokens in the macOS menu bar.
What are the Gemini free-tier limits?
Pro is the tightest (~5 RPM, 100 RPD). Flash and Flash-Lite are higher. Always check the docs - Google rebalances them.
Why does Gemini CLI sometimes 429?
You hit a rate cap, most often RPM, though exceeding RPD or TPM produces the same error. Gemini returns HTTP 429 along with a hint for when to retry. SessionWatcher tracks RPM in real time so you can pace before the cap.
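If a 429 does slip through, a client can wait and retry rather than fail. The Python sketch below shows generic exponential backoff with jitter. `RateLimited`, `call_with_backoff`, and the `send` callable are hypothetical stand-ins, not part of any Gemini SDK; how the retry hint is actually exposed depends on the client library you use.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by your own wrapper when the API answers 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the server, if known


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send(); on RateLimited, wait and retry up to max_attempts times."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # Prefer the server's hint; otherwise double the delay each attempt.
            delay = err.retry_after if err.retry_after is not None else base_delay * 2 ** attempt
            # Add up to 25% jitter so parallel clients don't retry in lockstep.
            sleep(delay + random.uniform(0, 0.25 * delay))
```

Backoff helps with per-minute caps, which recover within a minute. It does not help once the daily cap is exhausted, so it complements usage tracking rather than replacing it.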
Is there a Gemini monitor app for macOS?
Yes - SessionWatcher for Gemini. $2.99 one-time, 10-second setup.