How to Use Perplexity AI API (2026)
Developer guide to integrating Perplexity's search-augmented AI into your applications
Perplexity AI provides an API that combines large language models with real-time web search. Unlike standard LLM APIs that rely on static training data, Perplexity's API retrieves current information from the web and synthesizes it into accurate, cited responses. This makes it ideal for applications that need up-to-date information: research tools, news aggregators, competitive analysis, customer support bots, and any product where freshness of information matters.
This guide covers the full API setup, all available models, code examples in multiple languages, and practical integration patterns.
Why Use the Perplexity API?
| Feature | Perplexity API | Standard LLM API |
|---|---|---|
| Real-time web search | Yes, built-in | No (static training data) |
| Source citations | Yes, with URLs | No |
| Knowledge cutoff | None (live search) | Months to years old |
| Hallucination rate | Lower (grounded in search) | Higher |
| Structured output | JSON mode supported | Varies by provider |
| OpenAI-compatible | Yes | Depends on provider |
Prerequisites
- A Perplexity account with API access.
- An API key from the Perplexity developer dashboard.
- Python 3.9+ or Node.js 18+ (for the examples below).
Step 1: Get Your API Key
- Go to perplexity.ai/settings/api.
- Click Generate API Key.
- Copy the key and store it securely.
- Add credits to your account (the API is prepaid, not postpaid).
Set the environment variable:
export PERPLEXITY_API_KEY="pplx-your-api-key-here"
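In application code, read the key from the environment and fail fast if it is missing rather than passing an empty string to the client. A minimal sketch (the variable name matches the export above; the helper name is illustrative):

```python
import os

def get_api_key() -> str:
    """Read the Perplexity API key from the environment, failing fast if unset."""
    key = os.environ.get("PERPLEXITY_API_KEY")
    if not key:
        raise RuntimeError("PERPLEXITY_API_KEY is not set; export it before running.")
    return key
```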
Step 2: Make Your First Request
The Perplexity API is OpenAI-compatible, so you can use the OpenAI SDK:
pip install openai
from openai import OpenAI

client = OpenAI(
    api_key="pplx-your-api-key-here",
    base_url="https://api.perplexity.ai"
)

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful research assistant. Provide accurate, well-cited answers."
        },
        {
            "role": "user",
            "content": "What are the latest developments in nuclear fusion energy in 2026?"
        }
    ]
)

print(response.choices[0].message.content)

# Access citations
if hasattr(response, 'citations'):
    print("\nSources:")
    for citation in response.citations:
        print(f"  - {citation}")
Available Models
Perplexity offers several models optimized for different use cases:
| Model | Best For | Context Window | Search | Price (Input) | Price (Output) |
|---|---|---|---|---|---|
| sonar-pro | Complex research, detailed answers | 200K | Yes | $3/M tokens | $15/M tokens |
| sonar | General questions, fast responses | 128K | Yes | $1/M tokens | $1/M tokens |
| sonar-reasoning-pro | Multi-step analysis, comparisons | 128K | Yes | $2/M tokens | $8/M tokens |
| sonar-reasoning | Light reasoning with search | 128K | Yes | $1/M tokens | $5/M tokens |
| sonar-deep-research | Comprehensive reports, long-form | 128K | Yes (deep) | $2/M tokens | $8/M tokens |
Model selection guide:
- Use sonar for quick factual questions and lightweight queries.
- Use sonar-pro for detailed research and complex questions.
- Use sonar-reasoning-pro for tasks that require analysis and comparison.
- Use sonar-deep-research for comprehensive reports that need multiple search iterations.
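The selection guide above can be encoded as a small routing helper so callers never hardcode model names. This is a sketch, not an official pattern; the task category names are assumptions chosen for illustration:

```python
# Map task categories to the Perplexity model suggested in the guide above.
# The category names are illustrative, not part of the API.
MODEL_BY_TASK = {
    "quick_fact": "sonar",
    "detailed_research": "sonar-pro",
    "analysis": "sonar-reasoning-pro",
    "deep_report": "sonar-deep-research",
}

def pick_model(task: str) -> str:
    """Return the suggested model for a task category, defaulting to sonar."""
    return MODEL_BY_TASK.get(task, "sonar")
```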
Step 3: Using cURL
curl -X POST https://api.perplexity.ai/chat/completions \
  -H "Authorization: Bearer pplx-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar-pro",
    "messages": [
      {
        "role": "user",
        "content": "Compare the top 3 JavaScript frameworks in 2026 by popularity and performance"
      }
    ],
    "temperature": 0.2,
    "max_tokens": 2000
  }'
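The same request can be made from Python without the OpenAI SDK by posting the JSON directly. A sketch using the requests library (the payload mirrors the cURL call above; the API key is read from the environment, and the helper names are illustrative):

```python
import os

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(query: str, model: str = "sonar-pro") -> dict:
    """Assemble the JSON payload for a chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": query}],
        "temperature": 0.2,
        "max_tokens": 2000,
    }

def ask(query: str) -> str:
    """POST the request and return the answer text."""
    import requests  # third-party; pip install requests
    headers = {
        "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
        "Content-Type": "application/json",
    }
    resp = requests.post(API_URL, headers=headers, json=build_request(query), timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```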
Step 4: Node.js / TypeScript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: 'https://api.perplexity.ai',
});

async function search(query: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: 'sonar-pro',
    messages: [
      {
        role: 'system',
        content: 'Provide concise, factual answers with source citations.',
      },
      {
        role: 'user',
        content: query,
      },
    ],
    temperature: 0.2,
    max_tokens: 1500,
  });
  return response.choices[0].message.content ?? '';
}

// Usage
const result = await search('What is the current price of Bitcoin?');
console.log(result);
Step 5: Streaming Responses
For real-time response delivery:
from openai import OpenAI

client = OpenAI(
    api_key="pplx-your-api-key-here",
    base_url="https://api.perplexity.ai"
)

stream = client.chat.completions.create(
    model="sonar",
    messages=[
        {
            "role": "user",
            "content": "What happened in tech news today?"
        }
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # Newline at the end
Step 6: Search Context Control
You can control how Perplexity searches the web:
response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {
            "role": "user",
            "content": "Latest Python 3.14 features and release date"
        }
    ],
    # Perplexity-specific search controls go through extra_body,
    # since the OpenAI SDK rejects unknown keyword arguments
    extra_body={
        "search_domain_filter": ["python.org", "docs.python.org", "peps.python.org"],
        "search_recency_filter": "week",  # Options: "hour", "day", "week", "month"
        "return_citations": True,
        "return_related_questions": True
    }
)

print(response.choices[0].message.content)

# Related questions the user might want to ask next
if hasattr(response, 'related_questions'):
    print("\nRelated questions:")
    for q in response.related_questions:
        print(f"  - {q}")
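Citations arrive as a separate list alongside the message rather than inline in the text. A minimal helper for rendering them as a numbered source list (the formatting is an assumption; citations are plain URL strings in the response):

```python
def format_with_sources(content: str, citations: list[str]) -> str:
    """Append a numbered source list to the answer text."""
    if not citations:
        return content
    lines = [content, "", "Sources:"]
    for i, url in enumerate(citations, start=1):
        lines.append(f"[{i}] {url}")
    return "\n".join(lines)
```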
Practical Use Cases
1. Research Assistant
def research_topic(topic: str, depth: str = "standard") -> dict:
    model = "sonar-deep-research" if depth == "deep" else "sonar-pro"
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": """You are a research analyst. Provide:
1. Key findings (bullet points)
2. Recent developments (last 30 days)
3. Expert opinions
4. Data and statistics
Include citations for all claims."""
            },
            {
                "role": "user",
                "content": f"Research: {topic}"
            }
        ],
        extra_body={"return_citations": True},
        max_tokens=3000
    )
    return {
        "content": response.choices[0].message.content,
        "citations": getattr(response, 'citations', []),
        "model": model
    }

result = research_topic("state of AI regulation in Europe 2026", depth="deep")
print(result["content"])
2. Competitive Intelligence
def analyze_competitor(company: str) -> str:
    response = client.chat.completions.create(
        model="sonar-reasoning-pro",
        messages=[
            {
                "role": "user",
                "content": f"""Analyze {company} as a competitor:
- Recent product launches
- Pricing changes
- Key partnerships
- Market position
- Strengths and weaknesses
Base your analysis on the most recent information available."""
            }
        ],
        extra_body={
            "search_recency_filter": "month",
            "return_citations": True
        },
        max_tokens=2500
    )
    return response.choices[0].message.content

print(analyze_competitor("Vercel"))
3. News Aggregation
def get_news_summary(topic: str) -> str:
    response = client.chat.completions.create(
        model="sonar",
        messages=[
            {
                "role": "user",
                "content": f"Summarize today's top 5 news stories about {topic}. Include sources."
            }
        ],
        extra_body={
            "search_recency_filter": "day",
            "return_citations": True
        },
        max_tokens=1500
    )
    return response.choices[0].message.content

print(get_news_summary("artificial intelligence"))
4. Fact-Checking API
def fact_check(claim: str) -> str:
    response = client.chat.completions.create(
        model="sonar-reasoning-pro",
        messages=[
            {
                "role": "system",
                "content": """You are a fact-checker. For each claim:
1. State whether it is TRUE, FALSE, PARTIALLY TRUE, or UNVERIFIABLE
2. Provide evidence from reliable sources
3. Cite your sources
Be objective and thorough."""
            },
            {
                "role": "user",
                "content": f"Fact-check this claim: {claim}"
            }
        ],
        extra_body={"return_citations": True},
        max_tokens=2000
    )
    return response.choices[0].message.content

print(fact_check("The global average temperature in 2025 was the hottest year on record"))
Rate Limits and Pricing
| Tier | Requests/Min | Tokens/Min | Monthly Spend |
|---|---|---|---|
| Free trial | 5 | 20,000 | $0 (limited) |
| Standard | 50 | 100,000 | Pay as you go |
| Pro | 500 | 1,000,000 | $50+ |
| Enterprise | Custom | Custom | Contact sales |
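To stay under a per-minute request cap on the client side, you can track recent request timestamps in a sliding window and compute how long to wait before the next call. A sketch (the default of 50/min matches the Standard tier above; the clock is injectable so the logic is testable):

```python
import time
from collections import deque

class RequestRateLimiter:
    """Sliding-window limiter: at most max_per_minute requests in any 60s window."""

    def __init__(self, max_per_minute: int = 50, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock
        self.timestamps = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window
        while self.timestamps and now - self.timestamps[0] >= 60:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_per_minute:
            return 0.0
        return 60 - (now - self.timestamps[0])

    def record(self) -> None:
        """Record that a request was just sent."""
        self.timestamps.append(self.clock())
```

Call wait_time() before each request, sleep for the returned duration if it is positive, then record() after sending.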
Cost examples:
| Use Case | Model | Requests/Day | Est. Monthly Cost |
|---|---|---|---|
| Personal research | sonar | 50 | ~$5 |
| News aggregation | sonar | 500 | ~$30 |
| Competitive intel | sonar-pro | 100 | ~$50 |
| Production search app | sonar-pro | 5,000 | ~$500 |
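A rough way to sanity-check these estimates: multiply daily requests by assumed tokens per request and the per-million-token prices from the model table. The token counts per request are assumptions for illustration:

```python
def monthly_cost(requests_per_day: float, input_tokens: float, output_tokens: float,
                 input_price_per_m: float, output_price_per_m: float,
                 days: int = 30) -> float:
    """Estimated monthly cost in dollars for a steady request volume."""
    per_request = (input_tokens * input_price_per_m +
                   output_tokens * output_price_per_m) / 1_000_000
    return requests_per_day * days * per_request
```

With sonar at $1/M for both input and output, 500 requests/day at roughly 500 input and 1,500 output tokens each works out to about $30/month, matching the news aggregation row above.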
Error Handling
from openai import OpenAI, APIError, RateLimitError, APIConnectionError
import time

client = OpenAI(
    api_key="pplx-your-api-key-here",
    base_url="https://api.perplexity.ai"
)

def query_with_retry(query: str, max_retries: int = 3) -> str:
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="sonar",
                messages=[{"role": "user", "content": query}],
                max_tokens=1500
            )
            return response.choices[0].message.content
        except RateLimitError:
            wait_time = 2 ** attempt
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
        except APIConnectionError:
            print("Connection error. Retrying...")
            time.sleep(1)
        except APIError as e:
            print(f"API error: {e}")
            raise
    raise RuntimeError("Max retries exceeded")
Troubleshooting
| Issue | Solution |
|---|---|
| 401 Unauthorized | Check your API key is correct and has credits |
| 429 Rate Limited | Implement backoff; upgrade your tier |
| Empty citations | Not all responses include citations; use return_citations=True |
| Outdated information | Use search_recency_filter to limit to recent results |
| Slow responses | Use sonar instead of sonar-pro for faster results |
| Model not found | Check the model name; Perplexity updates model names periodically |
Conclusion
The Perplexity AI API is one of the strongest options for building applications that need LLM responses grounded in real-time web data. Its OpenAI-compatible interface makes integration straightforward, and the built-in search and citation features eliminate the need to build a separate RAG pipeline for web content.
If you are building applications that also need AI-generated media -- images, videos, talking avatars, or voice synthesis -- check out Hypereal AI. Hypereal offers a unified REST API with pay-as-you-go pricing for generative media, making it easy to combine Perplexity's search intelligence with Hypereal's media generation in a single application.