
How to Check Your Gemini CLI & API Usage

Gemini enforces both requests-per-minute and requests-per-day caps, with separate quotas per model. Here's how to track both live and stop hitting 429s mid-flow.

Last updated April 2026 · By Soren Starck

Understanding Gemini Usage Limits

Gemini rate limits are easy to trip because they are measured on three axes: RPM (requests per minute), RPD (requests per day), and TPM (tokens per minute). Exceeding any one of them triggers throttling. Each model - Gemini 2.5 Pro, Flash, Flash-Lite - has its own caps.

On the free tier, Pro is the most restricted (5 RPM / 100 RPD). On Tier 1 (paid), Pro's caps rise to roughly 150 RPM and 1,000 RPD (see the Free vs Tier 1 Limits table). The 1M-token context window doesn't change the request caps - it just means a single request can consume a large share of your token budget.
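A quick way to see how the RPM and RPD caps interact is to work out how long you can send at the per-minute cap before the per-day cap runs out. The sketch below is plain arithmetic on the free-tier Pro figures quoted above; the function name is only illustrative.

```python
def minutes_to_exhaust_daily_cap(rpm_cap: int, rpd_cap: int) -> float:
    """Minutes of sustained sending at the RPM cap before the RPD cap is used up."""
    if rpm_cap <= 0:
        raise ValueError("rpm_cap must be positive")
    return rpd_cap / rpm_cap


# Free-tier Pro figures from the text: 5 RPM, 100 RPD.
print(minutes_to_exhaust_daily_cap(5, 100))  # 20.0
```

In other words, a session that keeps free-tier Pro saturated at 5 RPM would use up the whole day's quota in about 20 minutes, so the daily cap is usually the one to watch.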

Method 1: Google AI Studio (Free Tier)

  1. Open aistudio.google.com
  2. Click Get API key in the sidebar
  3. Open the key's usage view to see RPM/RPD per model

Method 2: Google Cloud Console (Paid Tier)

  1. Go to console.cloud.google.com
  2. Open Quotas & System Limits → search “Generative Language”
  3. Review per-model RPM, RPD, and TPM
  4. For costs, open Billing → Reports

Both dashboards lag 5–60 minutes and don't alert you in real time. If your CLI session is hammering Gemini, you'll know only after you've already hit the cap.
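If you control the code that sends requests, a rough local signal is possible without either dashboard. The sketch below counts requests in a trailing 60-second window. It is a minimal illustration, not an official Gemini client feature: the class name, its methods, and the injectable `clock` parameter are all assumptions made for this example, and it only sees requests your own process records.

```python
import time
from collections import deque


class RollingRequestCounter:
    """Counts requests recorded in the trailing window (60 seconds by default)."""

    def __init__(self, rpm_cap, window_seconds=60.0, clock=time.monotonic):
        self.rpm_cap = rpm_cap
        self.window_seconds = window_seconds
        self._clock = clock            # injectable so the counter can be tested
        self._timestamps = deque()

    def _evict_old(self, now):
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()

    def record(self):
        """Call once for every request you send."""
        now = self._clock()
        self._evict_old(now)
        self._timestamps.append(now)

    def current_rpm(self):
        """Number of requests recorded within the trailing window."""
        self._evict_old(self._clock())
        return len(self._timestamps)

    def usage_fraction(self):
        """Share of the RPM cap currently in use (1.0 means at the cap)."""
        return self.current_rpm() / self.rpm_cap
```

Call `record()` right before each request and check `usage_fraction()` to decide whether to pause. The server's own accounting may differ slightly, so treat the number as an estimate.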

Method 3: Real-Time Monitoring with SessionWatcher

SessionWatcher for Gemini is a native macOS menu bar app that tracks Gemini CLI activity live.

  • RPM in last 60 seconds - see how close to throttle you are
  • Daily request count vs RPD cap
  • Token burn per model - Pro vs Flash vs Flash-Lite
  • Cost estimate for paid tier - real-time spend
  • macOS notifications at 80% and 95%
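The 80% and 95% thresholds in the list above can be illustrated with a small helper that reports which thresholds were crossed between two readings, so each alert fires once rather than on every poll. This is a hypothetical sketch of the idea, not SessionWatcher's actual implementation; the function name and signature are assumptions.

```python
def crossed_thresholds(previous_fraction, current_fraction, thresholds=(0.80, 0.95)):
    """Return the thresholds crossed upward between two usage readings.

    Fractions are usage divided by cap, e.g. 0.85 means 85% of the cap.
    """
    return [t for t in thresholds if previous_fraction < t <= current_fraction]


print(crossed_thresholds(0.70, 0.85))  # [0.8]
print(crossed_thresholds(0.85, 0.90))  # []
```

Comparing against the previous reading is what prevents repeat alerts while usage hovers just above a threshold.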

SessionWatcher requires macOS 14 or later and is a $2.99 one-time purchase.

Free vs Tier 1 Limits (Approximate)

Model                    Free RPM / RPD    Tier 1 RPM / RPD
Gemini 2.5 Pro           5 / 100           150 / 1,000
Gemini 2.5 Flash         10 / 250          1,000 / 10,000
Gemini 2.5 Flash-Lite    15 / 1,000        4,000 / 30,000

Numbers shift as Google rebalances. Always cross-check against the official docs. The pattern stays: Pro is the tightest, Flash is in the middle, Flash-Lite is the most permissive.
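To make the table concrete, the sketch below encodes the approximate figures above and checks which models could carry a given workload. The values are copied from this guide and will drift as Google rebalances; the dictionary layout, tier keys, and function name are assumptions for illustration only.

```python
# Approximate (RPM, RPD) caps copied from the table in this guide.
# Verify against the official docs before relying on them.
LIMITS = {
    "Gemini 2.5 Pro":        {"free": (5, 100),    "tier1": (150, 1_000)},
    "Gemini 2.5 Flash":      {"free": (10, 250),   "tier1": (1_000, 10_000)},
    "Gemini 2.5 Flash-Lite": {"free": (15, 1_000), "tier1": (4_000, 30_000)},
}


def models_that_fit(peak_rpm, requests_per_day, tier):
    """Return the models whose RPM and RPD caps both cover the workload."""
    fits = []
    for model, tiers in LIMITS.items():
        rpm_cap, rpd_cap = tiers[tier]
        if peak_rpm <= rpm_cap and requests_per_day <= rpd_cap:
            fits.append(model)
    return fits


print(models_that_fit(peak_rpm=8, requests_per_day=200, tier="free"))
# ['Gemini 2.5 Flash', 'Gemini 2.5 Flash-Lite']
```

This check ignores TPM, which the table does not list, so a workload with very large prompts could still be throttled on tokens.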

SessionWatcher vs. Cloud Console

Feature                  SessionWatcher         AI Studio / Cloud Console
Real-time RPM            Yes                    Lagged
Daily quota count        Live                   Snapshot
429 prevention           macOS notifications    None
Setup required           10 seconds             Login each time
Workflow interruption    None (menu bar)        Switch to browser
Cost                     $2.99 one-time         Free

Frequently Asked Questions

How do I check my Gemini usage?

Free-tier usage is shown in aistudio.google.com; paid-tier quotas are in console.cloud.google.com. Both lag. SessionWatcher for Gemini shows live RPM/RPD/tokens in the macOS menu bar.

What are the Gemini free-tier limits?

Pro is the tightest (~5 RPM, 100 RPD). Flash and Flash-Lite are higher. Always check the docs - Google rebalances them.

Why does Gemini CLI sometimes 429?

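Usually because you hit your RPM cap, though exceeding the RPD or TPM cap produces the same error. Gemini returns HTTP 429 (RESOURCE_EXHAUSTED), typically with a suggested retry delay. SessionWatcher tracks RPM in real time so you can pace before the cap.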
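In your own scripts, a common way to cope with a 429 is to wait and retry: honor the server's suggested delay when one is provided, and otherwise fall back to exponential backoff with jitter. The sketch below shows that pattern in isolation. `RateLimited` is a hypothetical exception that your own request code would raise on a 429, and `send` stands in for whatever function actually performs the call; neither is part of any Gemini SDK.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical: raised by your request code when the API answers 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited (429)")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call send(); on RateLimited, wait and retry up to max_attempts times."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            if err.retry_after is not None:
                delay = err.retry_after  # trust the server's hint
            else:
                # Exponential backoff with full jitter, capped at max_delay.
                delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)
```

Retrying helps with per-minute throttling only. If the daily cap is exhausted, backing off for seconds will not help, so keep `max_attempts` small.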

Is there a Gemini monitor app for macOS?

Yes - SessionWatcher for Gemini. $2.99 one-time, 10-second setup.