How to Get a Google Gemini API Key for Free (2026)
Step-by-step guide to free Gemini API access via Google AI Studio
Google's Gemini API is one of the most generous free AI APIs available. With a free tier that includes 1,500 requests per day for Gemini 2.0 Flash and access to multiple model variants, it is an excellent starting point for developers building AI applications. This guide walks you through getting your free API key and making your first API calls.
What You Get for Free
Google AI Studio provides free API access to Gemini models with the following limits:
| Model | Free Tier Limit | Rate Limit | Context Window |
|---|---|---|---|
| Gemini 2.0 Flash | 1,500 requests/day | 15 RPM | 1M tokens |
| Gemini 2.0 Flash-Lite | 1,500 requests/day | 30 RPM | 1M tokens |
| Gemini 1.5 Pro | 50 requests/day | 2 RPM | 2M tokens |
| Gemini 2.0 Flash Thinking | 1,500 requests/day | 10 RPM | 1M tokens |
RPM = requests per minute. The daily limits reset at midnight Pacific Time.
These are genuinely useful limits. At 1,500 requests per day for Gemini 2.0 Flash, you can build and run production applications for free -- something few other AI providers offer.
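The daily quota is generous, but the per-minute caps (15 RPM for Flash) are easy to trip in a loop. A client-side throttle keeps you under them. This is a minimal sketch -- the RateLimiter class here is illustrative, not part of any Google SDK:

```python
import time


class RateLimiter:
    """Client-side throttle that spaces out calls to respect an RPM cap."""

    def __init__(self, rpm: int):
        self.min_interval = 60.0 / rpm  # seconds between consecutive requests
        self.last_call = 0.0

    def wait(self) -> float:
        """Sleep just long enough to stay under the cap; returns seconds slept."""
        now = time.monotonic()
        delay = max(0.0, self.last_call + self.min_interval - now)
        if delay:
            time.sleep(delay)
        self.last_call = time.monotonic()
        return delay
```

Call `wait()` before each API request; at 15 RPM it spaces requests at least 4 seconds apart.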
Step 1: Go to Google AI Studio
- Open your browser and navigate to aistudio.google.com.
- Sign in with your Google account. Any Gmail account works -- no special developer account needed.
- You land on the AI Studio playground where you can test prompts interactively.
Step 2: Generate Your API Key
- Click "Get API Key" in the left sidebar (or top navigation bar).
- Click "Create API Key".
- Choose either:
- Create API key in new project (recommended for new users)
- Create API key in existing project (if you already have a Google Cloud project)
- Copy the API key that appears. It starts with AIza....
# Store the key as an environment variable
export GEMINI_API_KEY="AIzaSy-your-api-key-here"
Important: The free tier API key works without billing setup. You do not need to add a credit card or enable billing in Google Cloud. However, free tier keys include your data in Google's improvement programs. For production use with data privacy, consider the paid tier through Vertex AI.
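In application code, it helps to fail fast when the key is missing or mis-copied (trailing whitespace from a sloppy paste is a classic cause of "API key not valid" errors). A minimal sketch -- load_gemini_key is an illustrative helper, not part of the SDK:

```python
import os


def load_gemini_key() -> str:
    """Read the API key from the environment, stripping stray whitespace.

    Fails loudly if the key is absent or does not look like an
    AI Studio key (those start with "AIza").
    """
    key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not key.startswith("AIza"):
        raise RuntimeError(
            "GEMINI_API_KEY is missing or malformed; "
            "AI Studio keys start with 'AIza'."
        )
    return key
```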
Step 3: Install the SDK
Google provides official SDKs for Python and JavaScript:
# Python
pip install google-genai
# JavaScript / Node.js
npm install @google/genai
Step 4: Make Your First API Call
Python Example
import os
from google import genai
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Write a Python function that implements binary search on a sorted list. Include type hints and docstring."
)
print(response.text)
JavaScript / Node.js Example
const { GoogleGenAI } = require("@google/genai");
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
async function main() {
  const response = await ai.models.generateContent({
    model: "gemini-2.0-flash",
    contents: "Write a TypeScript utility type that makes all nested properties optional. Explain how it works.",
  });
  console.log(response.text);
}
main();
cURL Example
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [{"text": "Explain the CAP theorem with practical examples."}]
    }]
  }'
Step 5: Use the OpenAI-Compatible Endpoint
Google also provides an OpenAI-compatible endpoint, making it easy to use with tools that already support OpenAI's format:
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Redis caching middleware for Express.js."}
    ]
)

print(response.choices[0].message.content)
This compatibility means you can use your free Gemini API key with:
- Cursor (as a custom API key)
- Continue.dev
- Aider
- LiteLLM
- Any OpenAI SDK-based application
Step 6: Use Multimodal Features
Gemini is natively multimodal. You can send images, audio, video, and documents:
Analyze an Image
import base64
with open("screenshot.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        {"text": "Describe what you see in this screenshot and identify any UI/UX issues."},
        {
            "inline_data": {
                "mime_type": "image/png",
                "data": image_data
            }
        }
    ]
)
print(response.text)
Analyze a PDF Document
with open("report.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode()

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        {"text": "Summarize the key findings in this report and list action items."},
        {
            "inline_data": {
                "mime_type": "application/pdf",
                "data": pdf_data
            }
        }
    ]
)
print(response.text)
Step 7: Use Streaming for Better UX
For chat applications, streaming provides a real-time feel:
response = client.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents="Write a comprehensive guide to database indexing strategies."
)

for chunk in response:
    print(chunk.text, end="", flush=True)
Step 8: Use Structured Output
Gemini supports JSON mode for structured output:
import json

response = client.models.generate_content(
    model="gemini-2.0-flash",
    # Pin the JSON keys in the prompt so the loop below can rely on them.
    contents=(
        "List the top 5 JavaScript frameworks as a JSON array of objects "
        "with keys name, stars, license, and use_case."
    ),
    config={
        "response_mime_type": "application/json"
    }
)

data = json.loads(response.text)
for framework in data:
    print(f"{framework['name']}: {framework['stars']} stars")
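With JSON mode enabled you normally get clean JSON back, but without it (or with older models) the output sometimes arrives wrapped in a Markdown code fence. A small defensive parser, sketched with the standard library only -- parse_model_json is an illustrative helper, not part of the SDK:

```python
import json
import re


def parse_model_json(text: str):
    """Parse JSON from a model response, tolerating Markdown-fenced output."""
    # Pull the payload out of a fence like ```json ... ``` if one is present.
    match = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)
```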
Free Tier Optimization Tips
Use Flash-Lite for simple tasks. Gemini 2.0 Flash-Lite has a higher rate limit (30 RPM vs. 15 RPM) and is perfectly capable for summarization, classification, and simple code generation.
Cache repeated context. If you send the same system prompt or context repeatedly, use Gemini's context caching feature to reduce token usage and improve latency.
Batch requests efficiently. Instead of sending 10 separate API calls, consider batching related work into fewer, more comprehensive requests.
Monitor your usage. Google AI Studio includes a usage dashboard. Check it periodically to ensure you are not approaching daily limits unexpectedly.
Use the 1M context wisely. Gemini 2.0 Flash supports 1 million tokens of context. You can pass entire codebases or documents in a single request, which is more efficient than multiple smaller requests.
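The batching tip above can be sketched as a prompt builder that folds many small items into one request -- build_batch_prompt is an illustrative helper, not an SDK function:

```python
def build_batch_prompt(task: str, items: list[str]) -> str:
    """Fold many small inputs into one request to conserve daily quota.

    One call covering 10 items costs 1 of the 1,500 daily requests
    instead of 10.
    """
    numbered = "\n".join(f"{i + 1}. {item}" for i, item in enumerate(items))
    return (
        f"{task}\n"
        "Answer for each numbered item on its own line, "
        "prefixed with its number.\n\n"
        f"{numbered}"
    )
```

Pass the result as `contents` to a single `generate_content` call, then split the response back out by line number.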
Gemini Free vs. Other Free AI APIs
| Feature | Gemini Free | OpenAI Free Credits | DeepSeek Free | Claude Free |
|---|---|---|---|---|
| Daily request limit | 1,500 | N/A (token budget) | ~2,000 | N/A (rate limited) |
| Best model | Gemini 2.0 Flash | GPT-4o mini | DeepSeek-V3 | Claude Sonnet |
| Context window | 1M tokens | 128K tokens | 64K tokens | 200K tokens |
| Multimodal | Yes (image, video, audio, PDF) | Text + image | Text only | Text + image |
| Credit card required | No | No | No | No |
| OpenAI-compatible | Yes | Native | Yes | No |
| Code quality | Good | Good | Excellent | Excellent |
| Duration | Permanent free tier | Credits expire in 3 months | Credits expire | Permanent free tier |
Gemini stands out with its permanent free tier (no expiring credits), massive context window, and multimodal capabilities.
Common Pitfalls
"API key not valid" error: Make sure you copied the full key including the AIza prefix. Trailing spaces can also cause issues.
"Quota exceeded" error: You have hit the daily or per-minute rate limit. Wait for the limit to reset (midnight PT for daily, 1 minute for RPM).
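For per-minute limits, retrying with exponential backoff is usually enough to ride out a 429. A minimal sketch -- with_backoff is illustrative, and in real code you would catch the SDK's specific rate-limit exception rather than a bare Exception:

```python
import random
import time


def with_backoff(call, max_retries: int = 5, base: float = 1.0):
    """Retry a zero-argument callable with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:  # narrow this to the SDK's rate-limit error in practice
            if attempt == max_retries - 1:
                raise
            # Wait base * 2^attempt seconds, plus jitter to avoid thundering herds.
            time.sleep(base * (2 ** attempt) + random.random() * base)
```

Usage would look like `with_backoff(lambda: client.models.generate_content(...))`.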
Inconsistent responses: Set temperature=0 for deterministic outputs. The default temperature allows some randomness.
Data privacy concerns: Free tier API calls may be used to improve Google's models. For sensitive data, use the paid tier through Vertex AI, which has stricter data handling policies.
Frequently Asked Questions
Is the Gemini free tier really permanent? Google has maintained the free tier since launching AI Studio. While limits may change, the free tier itself has been consistent. There is no indication it will be removed.
Can I use the free tier in production? You can, but be aware of the rate limits (15 RPM for Flash) and the data usage policy. For production applications with user data, consider the paid Vertex AI tier.
Do I need a Google Cloud account? No. A standard Google/Gmail account is sufficient for the free tier through AI Studio. You only need Google Cloud for the paid Vertex AI tier.
Can I get more free requests? Create a Google Cloud project and enable billing to get higher rate limits and pay-as-you-go pricing. There is no way to increase the free tier limits themselves.
Which Gemini model is best for coding? Gemini 2.0 Flash offers the best balance of speed and quality for coding tasks on the free tier. For the most complex coding challenges, Gemini 1.5 Pro (50 free requests/day) provides better reasoning.
Wrapping Up
Google Gemini's free API tier is arguably the best free AI API available in 2026. The combination of 1,500 daily requests, a 1M token context window, multimodal support, and OpenAI compatibility makes it an excellent choice for both prototyping and production use. Getting your API key takes less than 2 minutes, and you can be making API calls immediately.
If your projects also need AI-generated media like images, videos, or talking avatars, consider adding a media generation API to your stack.
Try Hypereal AI free -- 35 credits, no credit card required.
