How to Use Perplexity AI API (2026)
Developer guide to integrating Perplexity's search-augmented AI into your applications
Perplexity AI provides an API that combines large language models with real-time web search. Unlike standard LLM APIs that rely on static training data, Perplexity's API retrieves current information from the web and synthesizes it into accurate, cited responses. This makes it ideal for applications that need up-to-date information: research tools, news aggregators, competitive analysis, customer support bots, and any product where freshness of information matters.
This guide covers the full API setup, all available models, code examples in multiple languages, and practical integration patterns.
Why Use the Perplexity API?
| Feature | Perplexity API | Standard LLM API |
|---|---|---|
| Real-time web search | Yes, built-in | No (static training data) |
| Source citations | Yes, with URLs | No |
| Knowledge cutoff | None (live search) | Months to years old |
| Hallucination rate | Lower (grounded in search) | Higher |
| Structured output | JSON mode supported | Varies by provider |
| OpenAI-compatible | Yes | Depends on provider |
Prerequisites
- A Perplexity account with API access.
- An API key from the Perplexity developer dashboard.
- Python 3.9+ or Node.js 18+ (for the examples below).
Step 1: Get Your API Key
- Go to perplexity.ai/settings/api.
- Click Generate API Key.
- Copy the key and store it securely.
- Add credits to your account (the API is prepaid, not postpaid).
Set the environment variable:
export PERPLEXITY_API_KEY="pplx-your-api-key-here"
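In application code, read the key from the environment and fail fast if it is missing rather than passing an empty string to the client. A minimal sketch (the variable name matches the export above; the helper name is illustrative):

```python
import os

def get_api_key() -> str:
    """Read the Perplexity API key from the environment, failing fast if unset."""
    key = os.environ.get("PERPLEXITY_API_KEY")
    if not key:
        raise RuntimeError("PERPLEXITY_API_KEY is not set; export it before running.")
    return key
```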
Step 2: Make Your First Request
The Perplexity API is OpenAI-compatible, so you can use the OpenAI SDK:
pip install openai
from openai import OpenAI

client = OpenAI(
    api_key="pplx-your-api-key-here",
    base_url="https://api.perplexity.ai"
)

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful research assistant. Provide accurate, well-cited answers."
        },
        {
            "role": "user",
            "content": "What are the latest developments in nuclear fusion energy in 2026?"
        }
    ]
)

print(response.choices[0].message.content)

# Access citations
if hasattr(response, 'citations'):
    print("\nSources:")
    for citation in response.citations:
        print(f"  - {citation}")
Available Models
Perplexity offers several models optimized for different use cases:
| Model | Best For | Context Window | Search | Price (Input) | Price (Output) |
|---|---|---|---|---|---|
| sonar-pro | Complex research, detailed answers | 200K | Yes | $3/M tokens | $15/M tokens |
| sonar | General questions, fast responses | 128K | Yes | $1/M tokens | $1/M tokens |
| sonar-reasoning-pro | Multi-step analysis, comparisons | 128K | Yes | $2/M tokens | $8/M tokens |
| sonar-reasoning | Light reasoning with search | 128K | Yes | $1/M tokens | $5/M tokens |
| sonar-deep-research | Comprehensive reports, long-form | 128K | Yes (deep) | $2/M tokens | $8/M tokens |
Model selection guide:
- Use sonar for quick factual questions and lightweight queries.
- Use sonar-pro for detailed research and complex questions.
- Use sonar-reasoning-pro for tasks that require analysis and comparison.
- Use sonar-deep-research for comprehensive reports that need multiple search iterations.
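The selection guide above can be encoded as a small routing helper so callers never hardcode model names. This is a sketch, not an official pattern; the task category names are assumptions chosen for illustration:

```python
# Map task categories to the Perplexity model suggested in the guide above.
# The category names are illustrative, not part of the API.
MODEL_BY_TASK = {
    "quick_fact": "sonar",
    "detailed_research": "sonar-pro",
    "analysis": "sonar-reasoning-pro",
    "deep_report": "sonar-deep-research",
}

def pick_model(task: str) -> str:
    """Return the suggested model for a task category, defaulting to sonar."""
    return MODEL_BY_TASK.get(task, "sonar")
```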
Step 3: Using cURL
curl -X POST https://api.perplexity.ai/chat/completions \
  -H "Authorization: Bearer pplx-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar-pro",
    "messages": [
      {
        "role": "user",
        "content": "Compare the top 3 JavaScript frameworks in 2026 by popularity and performance"
      }
    ],
    "temperature": 0.2,
    "max_tokens": 2000
  }'
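The same request can be made from Python without the OpenAI SDK by posting the JSON directly. A sketch using the requests library (the payload mirrors the cURL call above; the API key is read from the environment, and the helper names are illustrative):

```python
import os

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(query: str, model: str = "sonar-pro") -> dict:
    """Assemble the JSON payload for a chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": query}],
        "temperature": 0.2,
        "max_tokens": 2000,
    }

def ask(query: str) -> str:
    """POST the request and return the answer text."""
    import requests  # third-party; pip install requests
    headers = {
        "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
        "Content-Type": "application/json",
    }
    resp = requests.post(API_URL, headers=headers, json=build_request(query), timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```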
Step 4: Node.js / TypeScript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: 'https://api.perplexity.ai',
});

async function search(query: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: 'sonar-pro',
    messages: [
      {
        role: 'system',
        content: 'Provide concise, factual answers with source citations.',
      },
      {
        role: 'user',
        content: query,
      },
    ],
    temperature: 0.2,
    max_tokens: 1500,
  });
  return response.choices[0].message.content ?? '';
}

// Usage
const result = await search('What is the current price of Bitcoin?');
console.log(result);
Step 5: Streaming Responses
For real-time response delivery:
from openai import OpenAI

client = OpenAI(
    api_key="pplx-your-api-key-here",
    base_url="https://api.perplexity.ai"
)

stream = client.chat.completions.create(
    model="sonar",
    messages=[
        {
            "role": "user",
            "content": "What happened in tech news today?"
        }
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # Newline at the end
Step 6: Search Context Control
You can control how Perplexity searches the web:
response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {
            "role": "user",
            "content": "Latest Python 3.14 features and release date"
        }
    ],
    # Perplexity-specific search controls go through extra_body,
    # since the OpenAI SDK rejects unknown keyword arguments
    extra_body={
        "search_domain_filter": ["python.org", "docs.python.org", "peps.python.org"],
        "search_recency_filter": "week",  # Options: "hour", "day", "week", "month"
        "return_citations": True,
        "return_related_questions": True
    }
)

print(response.choices[0].message.content)

# Related questions the user might want to ask next
if hasattr(response, 'related_questions'):
    print("\nRelated questions:")
    for q in response.related_questions:
        print(f"  - {q}")
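Citations arrive as a separate list alongside the message rather than inline in the text. A minimal helper for rendering them as a numbered source list (the formatting is an assumption; citations are plain URL strings in the response):

```python
def format_with_sources(content: str, citations: list[str]) -> str:
    """Append a numbered source list to the answer text."""
    if not citations:
        return content
    lines = [content, "", "Sources:"]
    for i, url in enumerate(citations, start=1):
        lines.append(f"[{i}] {url}")
    return "\n".join(lines)
```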
Practical Use Cases
1. Research Assistant
def research_topic(topic: str, depth: str = "standard") -> dict:
    model = "sonar-deep-research" if depth == "deep" else "sonar-pro"
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": """You are a research analyst. Provide:
1. Key findings (bullet points)
2. Recent developments (last 30 days)
3. Expert opinions
4. Data and statistics
Include citations for all claims."""
            },
            {
                "role": "user",
                "content": f"Research: {topic}"
            }
        ],
        extra_body={"return_citations": True},
        max_tokens=3000
    )
    return {
        "content": response.choices[0].message.content,
        "citations": getattr(response, 'citations', []),
        "model": model
    }

result = research_topic("state of AI regulation in Europe 2026", depth="deep")
print(result["content"])
2. Competitive Intelligence
def analyze_competitor(company: str) -> str:
    response = client.chat.completions.create(
        model="sonar-reasoning-pro",
        messages=[
            {
                "role": "user",
                "content": f"""Analyze {company} as a competitor:
- Recent product launches
- Pricing changes
- Key partnerships
- Market position
- Strengths and weaknesses
Base your analysis on the most recent information available."""
            }
        ],
        extra_body={
            "search_recency_filter": "month",
            "return_citations": True
        },
        max_tokens=2500
    )
    return response.choices[0].message.content

print(analyze_competitor("Vercel"))
3. News Aggregation
def get_news_summary(topic: str) -> str:
    response = client.chat.completions.create(
        model="sonar",
        messages=[
            {
                "role": "user",
                "content": f"Summarize today's top 5 news stories about {topic}. Include sources."
            }
        ],
        extra_body={
            "search_recency_filter": "day",
            "return_citations": True
        },
        max_tokens=1500
    )
    return response.choices[0].message.content

print(get_news_summary("artificial intelligence"))
4. Fact-Checking API
def fact_check(claim: str) -> str:
    response = client.chat.completions.create(
        model="sonar-reasoning-pro",
        messages=[
            {
                "role": "system",
                "content": """You are a fact-checker. For each claim:
1. State whether it is TRUE, FALSE, PARTIALLY TRUE, or UNVERIFIABLE
2. Provide evidence from reliable sources
3. Cite your sources
Be objective and thorough."""
            },
            {
                "role": "user",
                "content": f"Fact-check this claim: {claim}"
            }
        ],
        extra_body={"return_citations": True},
        max_tokens=2000
    )
    return response.choices[0].message.content

print(fact_check("The global average temperature in 2025 was the hottest year on record"))
Rate Limits and Pricing
| Tier | Requests/Min | Tokens/Min | Monthly Spend |
|---|---|---|---|
| Free trial | 5 | 20,000 | $0 (limited) |
| Standard | 50 | 100,000 | Pay as you go |
| Pro | 500 | 1,000,000 | $50+ |
| Enterprise | Custom | Custom | Contact sales |
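To stay under a per-minute request cap on the client side, you can track recent request timestamps in a sliding window and compute how long to wait before the next call. A sketch (the default of 50/min matches the Standard tier above; the clock is injectable so the logic is testable):

```python
import time
from collections import deque

class RequestRateLimiter:
    """Sliding-window limiter: at most max_per_minute requests in any 60s window."""

    def __init__(self, max_per_minute: int = 50, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock
        self.timestamps = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window
        while self.timestamps and now - self.timestamps[0] >= 60:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_per_minute:
            return 0.0
        return 60 - (now - self.timestamps[0])

    def record(self) -> None:
        """Record that a request was just sent."""
        self.timestamps.append(self.clock())
```

Call wait_time() before each request, sleep for the returned duration if it is positive, then record() after sending.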
Cost examples:
| Use Case | Model | Requests/Day | Est. Monthly Cost |
|---|---|---|---|
| Personal research | sonar | 50 | ~$5 |
| News aggregation | sonar | 500 | ~$30 |
| Competitive intel | sonar-pro | 100 | ~$50 |
| Production search app | sonar-pro | 5,000 | ~$500 |
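A rough way to sanity-check these estimates: multiply daily requests by assumed tokens per request and the per-million-token prices from the model table. The token counts per request are assumptions for illustration:

```python
def monthly_cost(requests_per_day: float, input_tokens: float, output_tokens: float,
                 input_price_per_m: float, output_price_per_m: float,
                 days: int = 30) -> float:
    """Estimated monthly cost in dollars for a steady request volume."""
    per_request = (input_tokens * input_price_per_m +
                   output_tokens * output_price_per_m) / 1_000_000
    return requests_per_day * days * per_request
```

With sonar at $1/M for both input and output, 500 requests/day at roughly 500 input and 1,500 output tokens each works out to about $30/month, matching the news aggregation row above.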
Error Handling
from openai import OpenAI, APIError, RateLimitError, APIConnectionError
import time

client = OpenAI(
    api_key="pplx-your-api-key-here",
    base_url="https://api.perplexity.ai"
)

def query_with_retry(query: str, max_retries: int = 3) -> str:
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="sonar",
                messages=[{"role": "user", "content": query}],
                max_tokens=1500
            )
            return response.choices[0].message.content
        except RateLimitError:
            wait_time = 2 ** attempt
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
        except APIConnectionError:
            print("Connection error. Retrying...")
            time.sleep(1)
        except APIError as e:
            print(f"API error: {e}")
            raise
    raise RuntimeError("Max retries exceeded")
Troubleshooting
| Issue | Solution |
|---|---|
| 401 Unauthorized | Check your API key is correct and has credits |
| 429 Rate Limited | Implement backoff; upgrade your tier |
| Empty citations | Not all responses include citations; use return_citations=True |
| Outdated information | Use search_recency_filter to limit to recent results |
| Slow responses | Use sonar instead of sonar-pro for faster results |
| Model not found | Check the model name; Perplexity updates model names periodically |
Conclusion
The Perplexity AI API is one of the strongest options for building applications that need LLM responses grounded in real-time web data. Its OpenAI-compatible interface makes integration straightforward, and the built-in search and citation features eliminate the need to build a separate RAG pipeline for web content.
If you are building applications that also need AI-generated media -- images, videos, talking avatars, or voice synthesis -- check out Hypereal AI. Hypereal offers a unified REST API with pay-as-you-go pricing for generative media, making it easy to combine Perplexity's search intelligence with Hypereal's media generation in a single application.