How to Use Gemini 3 API: Complete Developer Guide (2026)
Integrate Google Gemini 3 into your applications with code examples
Gemini 3 is the latest generation of Google's multimodal AI model family, bringing significant improvements in reasoning, code generation, multimodal understanding, and instruction following. Whether you are building chatbots, content generation tools, code assistants, or multimodal applications, the Gemini 3 API provides a powerful foundation.
This guide covers everything you need to get started: authentication, API calls in Python and JavaScript, multimodal inputs, streaming, function calling, and pricing.
Gemini 3 Model Variants
| Model | Context Window | Best For | Speed |
|---|---|---|---|
| Gemini 3 Ultra | 2M tokens | Complex reasoning, research, coding | Slow |
| Gemini 3 Pro | 2M tokens | Balanced quality and speed | Medium |
| Gemini 3 Flash | 1M tokens | Fast responses, high throughput | Fast |
| Gemini 3 Flash Lite | 512K tokens | Cost-optimized, simple tasks | Very Fast |
Prerequisites
- A Google Cloud account or Google AI Studio account
- An API key from Google AI Studio
- Python 3.9+ or Node.js 18+
- The Google Generative AI SDK
Step 1: Get Your API Key
The fastest way to get an API key:
- Go to Google AI Studio
- Click Get API Key
- Select or create a Google Cloud project
- Copy the generated API key
For production use, create a key in the Google Cloud Console under APIs & Services > Credentials.
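In either case, avoid hardcoding the key in source files. A minimal sketch that reads it from an environment variable instead (`GEMINI_API_KEY` is a naming convention here, not something the SDK requires):

```python
import os

def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is unset."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key

# Then configure the SDK with it:
# genai.configure(api_key=load_api_key())
```

This keeps keys out of version control and lets each environment (dev, CI, production) supply its own credential.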
Step 2: Install the SDK
Python:

```bash
pip install google-generativeai
```

JavaScript/Node.js:

```bash
npm install @google/generative-ai
```
Step 3: Basic Text Generation (Python)
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro")
response = model.generate_content("Explain quantum computing in simple terms")
print(response.text)
```
Step 4: Basic Text Generation (JavaScript)
```javascript
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI("YOUR_API_KEY");
const model = genAI.getGenerativeModel({ model: "gemini-3-pro" });

async function generateText() {
  const result = await model.generateContent(
    "Explain quantum computing in simple terms"
  );
  console.log(result.response.text());
}

generateText();
```
Step 5: Multi-Turn Conversations
Create a chat session for back-and-forth conversations:
Python:
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro")
chat = model.start_chat(history=[])

# First message
response = chat.send_message("I'm building a REST API with Python. What framework should I use?")
print(response.text)

# Follow-up
response = chat.send_message("Can you show me a basic FastAPI example with a POST endpoint?")
print(response.text)

# Another follow-up (context is maintained)
response = chat.send_message("Now add input validation with Pydantic")
print(response.text)
```
JavaScript:
```javascript
const chat = model.startChat({
  history: [],
});

const result1 = await chat.sendMessage(
  "I'm building a REST API with Python. What framework should I use?"
);
console.log(result1.response.text());

const result2 = await chat.sendMessage(
  "Can you show me a basic FastAPI example with a POST endpoint?"
);
console.log(result2.response.text());
```
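You can also seed a chat with earlier turns instead of starting from an empty history. In the Python SDK, history entries are dicts with a role (`"user"` or `"model"`) and a list of parts; `make_history` below is a hypothetical convenience wrapper, not an SDK function:

```python
def make_history(turns):
    """Convert (role, text) pairs into the history shape start_chat expects."""
    return [{"role": role, "parts": [text]} for role, text in turns]

history = make_history([
    ("user", "I'm building a REST API with Python."),
    ("model", "FastAPI is a good default choice for modern Python APIs."),
])

# Resume the conversation with prior context already in place:
# chat = model.start_chat(history=history)
```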
Step 6: Multimodal Input (Image + Text)
Gemini 3 excels at understanding images alongside text:
Python:
```python
import google.generativeai as genai
from pathlib import Path

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro")

# Load an image
image_path = Path("screenshot.png")
image_data = image_path.read_bytes()

response = model.generate_content([
    "Analyze this UI screenshot and suggest 3 specific improvements for accessibility and usability.",
    {
        "mime_type": "image/png",
        "data": image_data
    }
])
print(response.text)
```
JavaScript:
```javascript
import fs from "fs";

const imageData = fs.readFileSync("screenshot.png");
const base64Image = imageData.toString("base64");

const result = await model.generateContent([
  "Analyze this UI screenshot and suggest 3 specific improvements.",
  {
    inlineData: {
      mimeType: "image/png",
      data: base64Image,
    },
  },
]);
console.log(result.response.text());
```
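In the Python example, assembling the inline-data dict by hand gets repetitive once you handle multiple image formats. A small helper (hypothetical, not part of the SDK) can guess the MIME type from the file extension:

```python
import mimetypes
from pathlib import Path

def image_part(path: str) -> dict:
    """Build the mime_type/data dict the SDK accepts from a local image file."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    return {"mime_type": mime_type, "data": Path(path).read_bytes()}

# Usage with the model from the example above:
# response = model.generate_content(["Describe this image.", image_part("screenshot.png")])
```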
Step 7: Streaming Responses
For better user experience, stream responses token by token:
Python:
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro")

response = model.generate_content(
    "Write a detailed guide on setting up a CI/CD pipeline with GitHub Actions",
    stream=True
)

for chunk in response:
    print(chunk.text, end="", flush=True)
print()  # Newline at the end
```
JavaScript:
```javascript
const result = await model.generateContentStream(
  "Write a detailed guide on setting up a CI/CD pipeline with GitHub Actions"
);

for await (const chunk of result.stream) {
  process.stdout.write(chunk.text());
}
console.log();
```
Step 8: Function Calling
Gemini 3 supports function calling, allowing the model to request specific actions:
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Define tools
tools = [
    {
        "function_declarations": [
            {
                "name": "get_weather",
                "description": "Get the current weather for a location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City name, e.g., 'San Francisco, CA'"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "Temperature unit"
                        }
                    },
                    "required": ["location"]
                }
            },
            {
                "name": "search_products",
                "description": "Search for products in the catalog",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Search query"
                        },
                        "max_results": {
                            "type": "integer",
                            "description": "Maximum number of results"
                        }
                    },
                    "required": ["query"]
                }
            }
        ]
    }
]

model = genai.GenerativeModel("gemini-3-pro", tools=tools)
chat = model.start_chat()

response = chat.send_message("What's the weather like in Tokyo?")

# Check if the model wants to call a function
# (note: truthiness, not hasattr -- every part has a function_call attribute,
# but it is only populated when the model actually requests a call)
for part in response.parts:
    if part.function_call:
        function_name = part.function_call.name
        function_args = dict(part.function_call.args)
        print(f"Function call: {function_name}({function_args})")

        # Execute the function and return results
        if function_name == "get_weather":
            # Your actual API call here
            weather_result = {"temperature": 22, "condition": "partly cloudy"}
            response = chat.send_message({
                "function_response": {
                    "name": function_name,
                    "response": weather_result
                }
            })
            print(response.text)
```
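With more than one tool declared, hardcoding an `if` branch per function quickly gets unwieldy. A dispatch table keeps execution logic out of the chat loop; the handler bodies below are placeholders standing in for your real implementations:

```python
def get_weather(location: str, unit: str = "celsius") -> dict:
    # Placeholder: call your real weather API here
    return {"temperature": 22, "condition": "partly cloudy", "unit": unit}

def search_products(query: str, max_results: int = 5) -> dict:
    # Placeholder: query your real product catalog here
    return {"results": [], "query": query, "max_results": max_results}

# Map declared tool names to their handlers
TOOL_HANDLERS = {
    "get_weather": get_weather,
    "search_products": search_products,
}

def execute_tool(name: str, args: dict) -> dict:
    """Look up and run the handler for a function call; report unknown tools."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return {"error": f"Unknown tool: {name}"}
    return handler(**args)

# In the loop above, replace the if/elif chain with:
# result = execute_tool(function_name, function_args)
```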
Step 9: System Instructions
Set the model's behavior with system instructions:
```python
model = genai.GenerativeModel(
    "gemini-3-pro",
    system_instruction="""You are a senior software engineer specializing
in Python and cloud architecture. Always provide code examples.
Prefer FastAPI over Flask. Use type hints in all Python code.
Keep explanations concise and practical.""",
)

response = model.generate_content("How do I implement rate limiting?")
print(response.text)
```
Step 10: Safety Settings
Configure content filtering thresholds:
```python
import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-3-pro")

response = model.generate_content(
    "Your prompt here",
    safety_settings={
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    },
)
```
Pricing (February 2026)
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Caching |
|---|---|---|---|
| Gemini 3 Ultra | $10.00 | $30.00 | $2.50/hr |
| Gemini 3 Pro | $2.50 | $10.00 | $0.625/hr |
| Gemini 3 Flash | $0.15 | $0.60 | $0.0375/hr |
| Gemini 3 Flash Lite | $0.04 | $0.15 | N/A |
Free tier: Google AI Studio provides a generous free tier with rate limits (15 RPM for Pro, 30 RPM for Flash).
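Before sending large prompts, you can estimate spend from the pricing table above. The sketch below hardcodes the February 2026 rates, so treat the numbers as illustrative; in practice you would get the input token count from the SDK's `model.count_tokens()` before sending:

```python
# USD per 1M tokens: (input rate, output rate), from the pricing table above
PRICING = {
    "gemini-3-ultra": (10.00, 30.00),
    "gemini-3-pro": (2.50, 10.00),
    "gemini-3-flash": (0.15, 0.60),
    "gemini-3-flash-lite": (0.04, 0.15),
}

def estimate_cost(model_name: str, input_tokens: int, output_tokens: int) -> float:
    """Rough USD cost for one request, given token counts for each direction."""
    in_rate, out_rate = PRICING[model_name]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
```

For example, a 100K-token prompt with a 10K-token response on Pro comes to about $0.35, while the same call on Flash is under $0.03.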
Using the REST API Directly
If you prefer not to use the SDK:
```bash
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro:generateContent?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Write a Python function to validate email addresses"
      }]
    }],
    "generationConfig": {
      "temperature": 0.7,
      "topK": 40,
      "topP": 0.95,
      "maxOutputTokens": 2048
    }
  }'
```
Generation Config Options
| Parameter | Default | Range | Description |
|---|---|---|---|
| temperature | 1.0 | 0.0-2.0 | Randomness of output |
| topP | 0.95 | 0.0-1.0 | Nucleus sampling threshold |
| topK | 40 | 1-100 | Top-k sampling |
| maxOutputTokens | Model-dependent | 1-8192+ | Maximum response length |
| stopSequences | [] | Up to 5 | Strings that stop generation |
| candidateCount | 1 | 1-8 | Number of response candidates |
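In the Python SDK, these same parameters can be passed as a `generation_config` dict (the SDK also accepts a `genai.types.GenerationConfig` object); the values here are illustrative:

```python
generation_config = {
    "temperature": 0.7,        # lower for factual/code tasks, higher for creative
    "top_p": 0.95,             # nucleus sampling threshold
    "top_k": 40,               # top-k sampling
    "max_output_tokens": 2048, # cap on response length
    "stop_sequences": ["\n\n---"],  # up to 5 strings that halt generation
}

# Usage with a model from the earlier examples:
# response = model.generate_content("Your prompt", generation_config=generation_config)
```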
Error Handling
```python
import google.generativeai as genai
from google.api_core import exceptions

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-3-pro")

response = None
try:
    response = model.generate_content("Your prompt")
    print(response.text)
except exceptions.InvalidArgument as e:
    print(f"Invalid request: {e}")
except exceptions.ResourceExhausted as e:
    print(f"Rate limit exceeded. Retry after a delay: {e}")
except exceptions.PermissionDenied as e:
    print(f"API key invalid or insufficient permissions: {e}")
except exceptions.InternalServerError as e:
    print(f"Google server error. Retry: {e}")
except ValueError as e:
    # Accessing response.text raises ValueError when the content was
    # blocked by safety filters; prompt_feedback explains why
    if response is not None and response.prompt_feedback:
        print(f"Content blocked: {response.prompt_feedback}")
    else:
        print(f"Unexpected error: {e}")
```
Best Practices
- Use the right model: Flash for speed, Pro for quality, Ultra for complex reasoning
- Set appropriate temperature: 0.0-0.3 for factual/code tasks, 0.7-1.0 for creative tasks
- Use system instructions to set consistent behavior
- Implement streaming for better user experience in chat interfaces
- Cache context for repeated prompts to reduce costs (available on Pro and Ultra)
- Handle safety filter blocks gracefully in your application
- Implement exponential backoff for rate limit errors
- Use function calling instead of parsing structured output from text
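As a sketch of the backoff recommendation above, here is a minimal retry wrapper. Both `compute_delay` and `retry_with_backoff` are hypothetical helpers, not SDK functions; in real code you would pass `exceptions.ResourceExhausted` as the retryable type:

```python
import random
import time

def compute_delay(attempt: int, base: float = 1.0, cap: float = 32.0) -> float:
    """Delay before retry `attempt` (0-based): base * 2^attempt, capped."""
    return min(cap, base * (2 ** attempt))

def retry_with_backoff(fn, retries: int = 5, retryable=(Exception,)):
    """Call fn(), retrying retryable exceptions with exponential backoff and jitter."""
    for attempt in range(retries):
        try:
            return fn()
        except retryable:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            # Random jitter spreads out retries from concurrent clients
            time.sleep(compute_delay(attempt) + random.uniform(0, 1))

# Usage sketch against the earlier examples:
# result = retry_with_backoff(
#     lambda: model.generate_content("Your prompt"),
#     retryable=(exceptions.ResourceExhausted,),
# )
```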
Conclusion
The Gemini 3 API offers a powerful and cost-effective option for building AI-powered applications. With its massive context window, strong multimodal capabilities, and competitive pricing (especially Flash), it is a solid choice for both prototyping and production use.
If you need AI media generation capabilities alongside language models -- image generation, video creation, talking avatars, or voice synthesis -- Hypereal AI provides a unified API that complements Gemini nicely. Use Gemini for text and reasoning, and Hypereal for visual and audio content, all through simple API calls with transparent pricing.