How to Use Gemini 3 API: Complete Developer Guide (2026)
Integrate Google Gemini 3 into your applications with code examples
Gemini 3 is the latest generation of Google's multimodal AI model family, bringing significant improvements in reasoning, code generation, multimodal understanding, and instruction following. Whether you are building chatbots, content generation tools, code assistants, or multimodal applications, the Gemini 3 API provides a powerful foundation.
This guide covers everything you need to get started: authentication, API calls in Python and JavaScript, multimodal inputs, streaming, function calling, and pricing.
Gemini 3 Model Variants
| Model | Context Window | Best For | Speed |
|---|---|---|---|
| Gemini 3 Ultra | 2M tokens | Complex reasoning, research, coding | Slow |
| Gemini 3 Pro | 2M tokens | Balanced quality and speed | Medium |
| Gemini 3 Flash | 1M tokens | Fast responses, high throughput | Fast |
| Gemini 3 Flash Lite | 512K tokens | Cost-optimized, simple tasks | Very Fast |
Prerequisites
- A Google Cloud account or Google AI Studio account
- An API key from Google AI Studio
- Python 3.9+ or Node.js 18+
- The Google Generative AI SDK
Step 1: Get Your API Key
The fastest way to get an API key:
- Go to Google AI Studio
- Click Get API Key
- Select or create a Google Cloud project
- Copy the generated API key
For production use, create a key in the Google Cloud Console under APIs & Services > Credentials.
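In either case, avoid hardcoding the key in source files. A minimal sketch that reads it from an environment variable instead (`GEMINI_API_KEY` is a naming convention here, not something the SDK requires):

```python
import os

def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is unset."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key

# Then configure the SDK with it:
# genai.configure(api_key=load_api_key())
```

This keeps keys out of version control and lets each environment (dev, CI, production) supply its own credential.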
Step 2: Install the SDK
Python:

```bash
pip install google-generativeai
```

JavaScript/Node.js:

```bash
npm install @google/generative-ai
```
Step 3: Basic Text Generation (Python)
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro")
response = model.generate_content("Explain quantum computing in simple terms")
print(response.text)
```
Step 4: Basic Text Generation (JavaScript)
```javascript
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI("YOUR_API_KEY");
const model = genAI.getGenerativeModel({ model: "gemini-3-pro" });

async function generateText() {
  const result = await model.generateContent(
    "Explain quantum computing in simple terms"
  );
  console.log(result.response.text());
}

generateText();
```
Step 5: Multi-Turn Conversations
Create a chat session for back-and-forth conversations:
Python:
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro")
chat = model.start_chat(history=[])

# First message
response = chat.send_message("I'm building a REST API with Python. What framework should I use?")
print(response.text)

# Follow-up
response = chat.send_message("Can you show me a basic FastAPI example with a POST endpoint?")
print(response.text)

# Another follow-up (context is maintained)
response = chat.send_message("Now add input validation with Pydantic")
print(response.text)
```
JavaScript:
```javascript
const chat = model.startChat({
  history: [],
});

const result1 = await chat.sendMessage(
  "I'm building a REST API with Python. What framework should I use?"
);
console.log(result1.response.text());

const result2 = await chat.sendMessage(
  "Can you show me a basic FastAPI example with a POST endpoint?"
);
console.log(result2.response.text());
```
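You can also seed a chat with earlier turns instead of starting from an empty history. In the Python SDK, history entries are dicts with a role (`"user"` or `"model"`) and a list of parts; `make_history` below is a hypothetical convenience wrapper, not an SDK function:

```python
def make_history(turns):
    """Convert (role, text) pairs into the history shape start_chat expects."""
    return [{"role": role, "parts": [text]} for role, text in turns]

history = make_history([
    ("user", "I'm building a REST API with Python."),
    ("model", "FastAPI is a good default choice for modern Python APIs."),
])

# Resume the conversation with prior context already in place:
# chat = model.start_chat(history=history)
```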
Step 6: Multimodal Input (Image + Text)
Gemini 3 excels at understanding images alongside text:
Python:
```python
import google.generativeai as genai
from pathlib import Path

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro")

# Load an image
image_path = Path("screenshot.png")
image_data = image_path.read_bytes()

response = model.generate_content([
    "Analyze this UI screenshot and suggest 3 specific improvements for accessibility and usability.",
    {
        "mime_type": "image/png",
        "data": image_data
    }
])
print(response.text)
```
JavaScript:
```javascript
import fs from "fs";

const imageData = fs.readFileSync("screenshot.png");
const base64Image = imageData.toString("base64");

const result = await model.generateContent([
  "Analyze this UI screenshot and suggest 3 specific improvements.",
  {
    inlineData: {
      mimeType: "image/png",
      data: base64Image,
    },
  },
]);
console.log(result.response.text());
```
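In the Python example, assembling the inline-data dict by hand gets repetitive once you handle multiple image formats. A small helper (hypothetical, not part of the SDK) can guess the MIME type from the file extension:

```python
import mimetypes
from pathlib import Path

def image_part(path: str) -> dict:
    """Build the mime_type/data dict the SDK accepts from a local image file."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    return {"mime_type": mime_type, "data": Path(path).read_bytes()}

# Usage with the model from the example above:
# response = model.generate_content(["Describe this image.", image_part("screenshot.png")])
```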
Step 7: Streaming Responses
For better user experience, stream responses token by token:
Python:
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro")

response = model.generate_content(
    "Write a detailed guide on setting up a CI/CD pipeline with GitHub Actions",
    stream=True
)

for chunk in response:
    print(chunk.text, end="", flush=True)
print()  # Newline at the end
```
JavaScript:
```javascript
const result = await model.generateContentStream(
  "Write a detailed guide on setting up a CI/CD pipeline with GitHub Actions"
);

for await (const chunk of result.stream) {
  process.stdout.write(chunk.text());
}
console.log();
```
Step 8: Function Calling
Gemini 3 supports function calling, allowing the model to request specific actions:
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Define tools
tools = [
    {
        "function_declarations": [
            {
                "name": "get_weather",
                "description": "Get the current weather for a location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City name, e.g., 'San Francisco, CA'"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "Temperature unit"
                        }
                    },
                    "required": ["location"]
                }
            },
            {
                "name": "search_products",
                "description": "Search for products in the catalog",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Search query"
                        },
                        "max_results": {
                            "type": "integer",
                            "description": "Maximum number of results"
                        }
                    },
                    "required": ["query"]
                }
            }
        ]
    }
]

model = genai.GenerativeModel("gemini-3-pro", tools=tools)
chat = model.start_chat()

response = chat.send_message("What's the weather like in Tokyo?")

# Check if the model wants to call a function
# (note: truthiness, not hasattr -- every part has a function_call attribute,
# but it is only populated when the model actually requests a call)
for part in response.parts:
    if part.function_call:
        function_name = part.function_call.name
        function_args = dict(part.function_call.args)
        print(f"Function call: {function_name}({function_args})")

        # Execute the function and return results
        if function_name == "get_weather":
            # Your actual API call here
            weather_result = {"temperature": 22, "condition": "partly cloudy"}
            response = chat.send_message({
                "function_response": {
                    "name": function_name,
                    "response": weather_result
                }
            })
            print(response.text)
```
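With more than one tool declared, hardcoding an `if` branch per function quickly gets unwieldy. A dispatch table keeps execution logic out of the chat loop; the handler bodies below are placeholders standing in for your real implementations:

```python
def get_weather(location: str, unit: str = "celsius") -> dict:
    # Placeholder: call your real weather API here
    return {"temperature": 22, "condition": "partly cloudy", "unit": unit}

def search_products(query: str, max_results: int = 5) -> dict:
    # Placeholder: query your real product catalog here
    return {"results": [], "query": query, "max_results": max_results}

# Map declared tool names to their handlers
TOOL_HANDLERS = {
    "get_weather": get_weather,
    "search_products": search_products,
}

def execute_tool(name: str, args: dict) -> dict:
    """Look up and run the handler for a function call; report unknown tools."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return {"error": f"Unknown tool: {name}"}
    return handler(**args)

# In the loop above, replace the if/elif chain with:
# result = execute_tool(function_name, function_args)
```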
Step 9: System Instructions
Set the model's behavior with system instructions:
```python
model = genai.GenerativeModel(
    "gemini-3-pro",
    system_instruction="""You are a senior software engineer specializing
in Python and cloud architecture. Always provide code examples.
Prefer FastAPI over Flask. Use type hints in all Python code.
Keep explanations concise and practical.""",
)

response = model.generate_content("How do I implement rate limiting?")
print(response.text)
```
Step 10: Safety Settings
Configure content filtering thresholds:
```python
import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-3-pro")

response = model.generate_content(
    "Your prompt here",
    safety_settings={
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    },
)
```
Pricing (February 2026)
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Caching |
|---|---|---|---|
| Gemini 3 Ultra | $10.00 | $30.00 | $2.50/hr |
| Gemini 3 Pro | $2.50 | $10.00 | $0.625/hr |
| Gemini 3 Flash | $0.15 | $0.60 | $0.0375/hr |
| Gemini 3 Flash Lite | $0.04 | $0.15 | N/A |
Free tier: Google AI Studio provides a generous free tier with rate limits (15 RPM for Pro, 30 RPM for Flash).
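Before sending large prompts, you can estimate spend from the pricing table above. The sketch below hardcodes the February 2026 rates, so treat the numbers as illustrative; in practice you would get the input token count from the SDK's `model.count_tokens()` before sending:

```python
# USD per 1M tokens: (input rate, output rate), from the pricing table above
PRICING = {
    "gemini-3-ultra": (10.00, 30.00),
    "gemini-3-pro": (2.50, 10.00),
    "gemini-3-flash": (0.15, 0.60),
    "gemini-3-flash-lite": (0.04, 0.15),
}

def estimate_cost(model_name: str, input_tokens: int, output_tokens: int) -> float:
    """Rough USD cost for one request, given token counts for each direction."""
    in_rate, out_rate = PRICING[model_name]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
```

For example, a 100K-token prompt with a 10K-token response on Pro comes to about $0.35, while the same call on Flash is under $0.03.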
Using the REST API Directly
If you prefer not to use the SDK:
```bash
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro:generateContent?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Write a Python function to validate email addresses"
      }]
    }],
    "generationConfig": {
      "temperature": 0.7,
      "topK": 40,
      "topP": 0.95,
      "maxOutputTokens": 2048
    }
  }'
```
Generation Config Options
| Parameter | Default | Range | Description |
|---|---|---|---|
| temperature | 1.0 | 0.0-2.0 | Randomness of output |
| topP | 0.95 | 0.0-1.0 | Nucleus sampling threshold |
| topK | 40 | 1-100 | Top-k sampling |
| maxOutputTokens | Model-dependent | 1-8192+ | Maximum response length |
| stopSequences | [] | Up to 5 | Strings that stop generation |
| candidateCount | 1 | 1-8 | Number of response candidates |
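In the Python SDK, these same parameters can be passed as a `generation_config` dict (the SDK also accepts a `genai.types.GenerationConfig` object); the values here are illustrative:

```python
generation_config = {
    "temperature": 0.7,        # lower for factual/code tasks, higher for creative
    "top_p": 0.95,             # nucleus sampling threshold
    "top_k": 40,               # top-k sampling
    "max_output_tokens": 2048, # cap on response length
    "stop_sequences": ["\n\n---"],  # up to 5 strings that halt generation
}

# Usage with a model from the earlier examples:
# response = model.generate_content("Your prompt", generation_config=generation_config)
```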
Error Handling
```python
import google.generativeai as genai
from google.api_core import exceptions

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-3-pro")

response = None
try:
    response = model.generate_content("Your prompt")
    print(response.text)
except exceptions.InvalidArgument as e:
    print(f"Invalid request: {e}")
except exceptions.ResourceExhausted as e:
    print(f"Rate limit exceeded. Retry after a delay: {e}")
except exceptions.PermissionDenied as e:
    print(f"API key invalid or insufficient permissions: {e}")
except exceptions.InternalServerError as e:
    print(f"Google server error. Retry: {e}")
except ValueError as e:
    # Accessing response.text raises ValueError when the content was
    # blocked by safety filters; prompt_feedback explains why
    if response is not None and response.prompt_feedback:
        print(f"Content blocked: {response.prompt_feedback}")
    else:
        print(f"Unexpected error: {e}")
```
Best Practices
- Use the right model: Flash for speed, Pro for quality, Ultra for complex reasoning
- Set appropriate temperature: 0.0-0.3 for factual/code tasks, 0.7-1.0 for creative tasks
- Use system instructions to set consistent behavior
- Implement streaming for better user experience in chat interfaces
- Cache context for repeated prompts to reduce costs (available on Pro and Ultra)
- Handle safety filter blocks gracefully in your application
- Implement exponential backoff for rate limit errors
- Use function calling instead of parsing structured output from text
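As a sketch of the backoff recommendation above, here is a minimal retry wrapper. Both `compute_delay` and `retry_with_backoff` are hypothetical helpers, not SDK functions; in real code you would pass `exceptions.ResourceExhausted` as the retryable type:

```python
import random
import time

def compute_delay(attempt: int, base: float = 1.0, cap: float = 32.0) -> float:
    """Delay before retry `attempt` (0-based): base * 2^attempt, capped."""
    return min(cap, base * (2 ** attempt))

def retry_with_backoff(fn, retries: int = 5, retryable=(Exception,)):
    """Call fn(), retrying retryable exceptions with exponential backoff and jitter."""
    for attempt in range(retries):
        try:
            return fn()
        except retryable:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            # Random jitter spreads out retries from concurrent clients
            time.sleep(compute_delay(attempt) + random.uniform(0, 1))

# Usage sketch against the earlier examples:
# result = retry_with_backoff(
#     lambda: model.generate_content("Your prompt"),
#     retryable=(exceptions.ResourceExhausted,),
# )
```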
Conclusion
The Gemini 3 API offers a powerful and cost-effective option for building AI-powered applications. With its massive context window, strong multimodal capabilities, and competitive pricing (especially Flash), it is a solid choice for both prototyping and production use.
If you need AI media generation capabilities alongside language models -- image generation, video creation, talking avatars, or voice synthesis -- Hypereal AI provides a unified API that complements Gemini nicely. Use Gemini for text and reasoning, and Hypereal for visual and audio content, all through simple API calls with transparent pricing.