How to Use Veo 3.1 API for Video Generation (2026)
Complete integration guide for Google's latest video model
Google's Veo 3.1 represents a significant leap in AI video generation. Building on the success of Veo 3, the 3.1 release brings higher resolution output, improved temporal consistency, better audio synchronization, and longer clip durations. This guide walks you through everything you need to know to integrate Veo 3.1 into your applications.
What Is Veo 3.1?
Veo 3.1 is Google DeepMind's latest video generation model. It accepts text prompts or image inputs and produces high-quality video clips with natural motion, realistic lighting, and optional synchronized audio. Key improvements over Veo 3 include:
- Up to 1080p output at 24 or 30 fps
- Clips up to 30 seconds (up from 8 seconds in Veo 3)
- Native audio generation with improved lip-sync accuracy
- Image-to-video mode for animating still frames
- Better prompt adherence with fine-grained scene control
Veo 3.1 API Access Options
There are two primary ways to access the Veo 3.1 API:
| Provider | Endpoint | Free Tier | Pricing Model |
|---|---|---|---|
| Google Vertex AI | aiplatform.googleapis.com | Limited free credits | Per-second billing |
| Hypereal AI | api.hypereal.com | Free starter credits | Pay-as-you-go |
Option 1: Google Vertex AI (Direct)
Access Veo 3.1 through Google Cloud's Vertex AI platform. You will need a Google Cloud account with billing enabled.
```bash
# Install the Google Cloud CLI
curl https://sdk.cloud.google.com | bash
gcloud init

# Enable the Vertex AI API
gcloud services enable aiplatform.googleapis.com

# Set your project
gcloud config set project YOUR_PROJECT_ID
```
Option 2: Hypereal AI (Simplified)
Hypereal AI provides a unified API that includes Veo 3.1 alongside other video generation models, with simpler authentication and pay-as-you-go pricing.
```bash
# Get your API key from https://hypereal.ai
# No Google Cloud setup required
```
Step-by-Step: Generate Video with Veo 3.1
Step 1: Authentication
Using Google Vertex AI:
```python
import google.auth
from google.auth.transport.requests import Request

# Uses Application Default Credentials (e.g. from `gcloud auth application-default login`)
credentials, project = google.auth.default()
credentials.refresh(Request())
access_token = credentials.token
```
Using Hypereal AI:
```python
import requests

API_KEY = "your_hypereal_api_key"
BASE_URL = "https://api.hypereal.com/v1"
```
Step 2: Text-to-Video Generation
Here is a complete example using the REST API directly:
```python
import requests
import time

# Submit a video generation request
response = requests.post(
    f"{BASE_URL}/veo-3.1/generate",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "prompt": "A golden retriever running through a sunlit meadow, "
                  "wildflowers swaying in the breeze, cinematic slow motion, "
                  "shallow depth of field, 4K quality",
        "duration": 10,
        "resolution": "1080p",
        "fps": 24,
        "aspect_ratio": "16:9",
        "audio": True
    }
)
task = response.json()
task_id = task["id"]
print(f"Task submitted: {task_id}")

# Poll for completion
while True:
    status = requests.get(
        f"{BASE_URL}/tasks/{task_id}",
        headers={"Authorization": f"Bearer {API_KEY}"}
    ).json()
    if status["state"] == "completed":
        video_url = status["output"]["video_url"]
        print(f"Video ready: {video_url}")
        break
    elif status["state"] == "failed":
        print(f"Error: {status['error']}")
        break
    print(f"Status: {status['state']} ({status.get('progress', 0)}%)")
    time.sleep(5)
```
Step 3: Image-to-Video Generation
Animate a still image using Veo 3.1's image-to-video mode:
```python
import base64

# Read your source image and base64-encode it for the JSON payload
with open("product_photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    f"{BASE_URL}/veo-3.1/image-to-video",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "image": image_b64,
        "prompt": "The camera slowly zooms in while the product rotates, "
                  "soft studio lighting, white background",
        "duration": 6,
        "resolution": "1080p",
        "motion_intensity": 0.5
    }
)
task_id = response.json()["id"]
print(f"Image-to-video task: {task_id}")
```
Prompt Engineering Tips for Veo 3.1
Writing effective prompts is critical for getting high-quality results. Here are proven patterns:
| Technique | Example | Effect |
|---|---|---|
| Specify camera motion | "Slow dolly forward through a forest" | Controls virtual camera movement |
| Define lighting | "Golden hour sunlight, long shadows" | Sets mood and visual tone |
| Include style keywords | "Cinematic, shallow depth of field, 4K" | Improves visual quality |
| Describe motion | "Waves crashing in slow motion" | Controls pacing and dynamics |
| Set scene context | "Interior of a modern coffee shop, morning" | Grounds the generation |
| Add audio cues | "Birds chirping, gentle wind sounds" | Guides audio generation |
Prompt Structure Formula
A well-structured prompt follows this pattern:

```
[Subject] + [Action] + [Setting] + [Camera/Style] + [Lighting] + [Audio]
```

Example:

```
A chef plating a dessert with precise hand movements,
in a professional kitchen with stainless steel surfaces,
close-up shot with rack focus,
warm overhead lighting,
sounds of kitchen ambient noise and plating
```
Veo 3.1 vs Veo 3: What Changed
| Feature | Veo 3 | Veo 3.1 |
|---|---|---|
| Max duration | 8 seconds | 30 seconds |
| Max resolution | 720p | 1080p |
| Audio generation | Basic | Improved lip-sync and ambient |
| Image-to-video | Limited | Full support with motion control |
| Prompt length | 512 tokens | 1024 tokens |
| Batch processing | No | Yes (up to 4 concurrent) |
| Generation speed | ~120s for 8s clip | ~90s for 8s clip |
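Since Veo 3.1 allows up to 4 concurrent jobs (per the table above), a thread pool is a simple way to fan out submissions without exceeding that cap. The `submit_fn` below is a hypothetical stand-in; a real one would POST to the generate endpoint and return the task id:

```python
from concurrent.futures import ThreadPoolExecutor

def submit_batch(prompts, submit_fn, max_concurrent=4):
    """Submit several generation jobs in parallel, capped at the
    documented 4-concurrent-job limit."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(submit_fn, prompts))

# Example with a stand-in submit function:
task_ids = submit_batch(["prompt a", "prompt b"], lambda p: f"task-for-{p}")
```

Because results come back in input order, you can zip `prompts` and `task_ids` directly when recording which prompt produced which task.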
Handling Errors and Rate Limits
The API uses standard HTTP status codes. Here is how to handle common scenarios:
```python
import time

import requests
from requests.exceptions import HTTPError

def generate_with_retry(payload, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(
                f"{BASE_URL}/veo-3.1/generate",
                headers={
                    "Authorization": f"Bearer {API_KEY}",
                    "Content-Type": "application/json"
                },
                json=payload
            )
            response.raise_for_status()
            return response.json()
        except HTTPError as e:
            if e.response.status_code == 429:
                # Honor the server's Retry-After header if present
                wait = int(e.response.headers.get("Retry-After", 30))
                print(f"Rate limited. Waiting {wait}s...")
                time.sleep(wait)
            elif e.response.status_code == 400:
                # Bad requests will not succeed on retry
                print(f"Bad request: {e.response.json()['error']}")
                raise
            else:
                # Exponential backoff for transient server errors
                print(f"Error {e.response.status_code}, retrying...")
                time.sleep(2 ** attempt)
    raise Exception("Max retries exceeded")
```
Common Rate Limits
| Tier | Requests/minute | Concurrent jobs | Max duration/request |
|---|---|---|---|
| Free | 5 | 1 | 10 seconds |
| Standard | 30 | 4 | 30 seconds |
| Enterprise | Custom | Custom | 30 seconds |
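To avoid hitting the per-minute limits in the table at all, you can pace requests client-side. This is a minimal sketch of an interval-based limiter, not a feature of any official SDK:

```python
import time

class RateLimiter:
    """Spaces out calls so they never exceed max_per_minute.
    The default of 5 matches the free tier in the table above."""

    def __init__(self, max_per_minute=5):
        self.interval = 60.0 / max_per_minute
        self._last = 0.0

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = time.monotonic()
        remaining = self.interval - (now - self._last)
        if remaining > 0:
            time.sleep(remaining)
        self._last = time.monotonic()
```

Call `limiter.wait()` immediately before each API request; on the Standard tier you would construct it with `max_per_minute=30`.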
Best Practices
- Cache results. Video generation is expensive. Store generated videos and reuse them.
- Use webhooks. Instead of polling, configure a webhook URL to get notified when generation completes.
- Start short. Test your prompts with 4-second clips before generating longer videos.
- Be specific. Vague prompts produce inconsistent results. Describe exactly what you want.
- Handle failures gracefully. Content filters may reject certain prompts. Always implement error handling.
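If you switch from polling to webhooks as suggested above, your endpoint receives a callback when the task finishes. The parsing sketch below assumes the callback body mirrors the task-status response used when polling (`state`, `output.video_url`, `error`); verify the exact payload shape against your provider's webhook documentation:

```python
import json

def handle_webhook(raw_body: bytes):
    """Parse a completion callback and return the video URL, or None
    for intermediate progress events. Field names are assumed to match
    the polling response shown earlier."""
    event = json.loads(raw_body)
    state = event.get("state")
    if state == "completed":
        return event["output"]["video_url"]
    if state == "failed":
        raise RuntimeError(event.get("error", "generation failed"))
    return None  # progress update, nothing to do yet
```

Remember to respond to the callback with a 2xx status quickly and do any heavy processing (downloading, transcoding) asynchronously, since most webhook senders retry on slow or failed deliveries.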
Conclusion
Veo 3.1 is one of the most capable video generation APIs available in 2026. Whether you are building a content creation platform, automating marketing video production, or adding AI video to your product, the API is straightforward to integrate once you understand the authentication and async task pattern.
If you want simplified access to Veo 3.1 alongside dozens of other video, image, and audio generation models through a single API key, Hypereal AI offers pay-as-you-go pricing with no minimum commitment and free starter credits to test your integration.