How to Use JSON Format Prompts for LLMs (2026)
Get structured, parseable output from any large language model
Getting large language models to return clean, parseable JSON is one of the most practical skills for developers building AI-powered applications. Whether you are extracting structured data from text, building API responses, or creating tool-use pipelines, controlling the output format is essential. This guide covers every major technique for getting reliable JSON output from LLMs in 2026.
Why JSON Output Matters
When you integrate LLMs into applications, you need machine-readable output -- not freeform text. JSON is the standard interchange format for web applications, and being able to reliably get JSON from an LLM means you can:
- Parse responses directly in your code without regex hacks
- Feed structured data into databases, APIs, and downstream processes
- Build multi-step agent workflows where each step produces structured output
- Create type-safe interfaces between AI and your application logic
Method 1: System Prompt Instructions
The simplest approach is instructing the model to return JSON in the system prompt.
Basic JSON Prompt
```python
from openai import OpenAI
import json

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": """You are a data extraction assistant.
Always respond with valid JSON only. No markdown, no explanations,
no code fences. Just raw JSON."""
        },
        {
            "role": "user",
            "content": """Extract the product information from this text:
"The new MacBook Pro M4 starts at $1,599 for the base model with
16GB RAM and 512GB SSD. The 14-inch display version weighs 3.4 pounds.
It was released in November 2024 and is available in Space Black
and Silver colors."
"""
        }
    ]
)

data = json.loads(response.choices[0].message.content)
print(data)
```
Expected output:
```json
{
  "product_name": "MacBook Pro M4",
  "price": 1599,
  "currency": "USD",
  "ram": "16GB",
  "storage": "512GB SSD",
  "display_size": "14 inches",
  "weight": "3.4 pounds",
  "release_date": "November 2024",
  "colors": ["Space Black", "Silver"]
}
```
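Even with explicit "no code fences" instructions, models occasionally wrap output in markdown anyway, so a defensive stripper before `json.loads` is cheap insurance. A minimal sketch (the `strip_code_fences` helper is hypothetical, not part of any SDK):

```python
import json
import re

def strip_code_fences(text: str) -> str:
    """Remove one wrapping markdown code fence, if present."""
    match = re.match(r"^```(?:json)?\s*\n(.*?)\n```\s*$", text.strip(), re.DOTALL)
    return match.group(1) if match else text.strip()

# Simulated model reply that ignored the "no fences" instruction
raw = '```json\n{"product_name": "MacBook Pro M4", "price": 1599}\n```'
data = json.loads(strip_code_fences(raw))
print(data["price"])  # 1599
```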
Providing a Schema in the Prompt
For more reliable output, include the exact schema you expect:
```python
system_prompt = """You are a data extraction assistant. Extract information
and return it as JSON matching this exact schema:

{
  "product_name": "string",
  "price": number,
  "currency": "string (ISO 4217)",
  "specifications": {
    "ram": "string",
    "storage": "string",
    "display_size": "string",
    "weight": "string"
  },
  "release_date": "string (YYYY-MM-DD)",
  "colors": ["string"],
  "in_stock": boolean
}

Rules:
- Use null for any field where the information is not available
- Always return valid JSON with no trailing commas
- Never wrap the response in markdown code fences
"""
```
Method 2: OpenAI JSON Mode
OpenAI provides a built-in JSON mode that guarantees valid JSON output:
```python
response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": "Extract product data as JSON with keys: name, price, category."
        },
        {
            "role": "user",
            "content": "Sony WH-1000XM5 wireless headphones, $348, electronics"
        }
    ]
)

# Guaranteed to be syntactically valid JSON (but not checked against any schema)
data = json.loads(response.choices[0].message.content)
```
Important: When using json_object mode, you must mention "JSON" somewhere in your messages (system or user). Otherwise the API returns an error.
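A simple preflight guard can catch the missing-"JSON" mistake before the request is sent. A sketch (the `mentions_json` helper is hypothetical, not part of the OpenAI SDK):

```python
def mentions_json(messages: list[dict]) -> bool:
    """json_object mode requires the word 'JSON' in at least one message."""
    return any("json" in m.get("content", "").lower() for m in messages)

messages = [
    {"role": "system", "content": "Extract product data as JSON."},
    {"role": "user", "content": "Sony WH-1000XM5, $348"},
]
# Only send with response_format={"type": "json_object"} if this passes
print(mentions_json(messages))  # True
```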
Method 3: Structured Outputs (Schema Enforcement)
The most reliable method in 2026 is structured outputs, where you provide a JSON Schema and the model is constrained to match it exactly.
OpenAI Structured Outputs
```python
from pydantic import BaseModel
from typing import Optional
from openai import OpenAI

class Product(BaseModel):
    name: str
    price: float
    currency: str
    category: str
    in_stock: bool
    rating: Optional[float]
    tags: list[str]

client = OpenAI()

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Extract product information from the user's text."
        },
        {
            "role": "user",
            "content": "The Dyson V15 vacuum is $749.99, home appliances category, currently in stock. Rated 4.7 stars. Tags: cordless, powerful, premium."
        }
    ],
    response_format=Product
)

product = response.choices[0].message.parsed
print(product.name)   # "Dyson V15"
print(product.price)  # 749.99
print(product.tags)   # ["cordless", "powerful", "premium"]
```
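Even with schema enforcement, a cheap runtime check is worthwhile where the parsed object crosses a trust boundary (SDK upgrades and model swaps are common failure points). A stdlib-only sketch that mirrors the `Product` model above; the `check_product` helper is illustrative, not part of any library:

```python
def check_product(data: dict) -> list[str]:
    """Return a list of type errors for the expected fields; empty means valid."""
    spec = {"name": str, "price": (int, float), "currency": str,
            "category": str, "in_stock": bool, "tags": list}
    errors = []
    for field, expected_type in spec.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            errors.append(f"wrong type for {field}")
    return errors

parsed = {"name": "Dyson V15", "price": 749.99, "currency": "USD",
          "category": "home appliances", "in_stock": True,
          "tags": ["cordless", "powerful", "premium"]}
print(check_product(parsed))  # []
```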
Anthropic Claude Structured Output
Claude supports structured output via tool use:
```python
import anthropic

client = anthropic.Anthropic()

# Define the expected structure as a tool
tools = [
    {
        "name": "extract_product",
        "description": "Extract structured product information from text",
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "description": "Product name"},
                "price": {"type": "number", "description": "Price in USD"},
                "category": {"type": "string", "description": "Product category"},
                "in_stock": {"type": "boolean", "description": "Availability status"},
                "tags": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Product tags"
                }
            },
            "required": ["name", "price", "category", "in_stock", "tags"]
        }
    }
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "tool", "name": "extract_product"},
    messages=[
        {
            "role": "user",
            "content": "Extract product info: The Bose QuietComfort Ultra headphones cost $429, available in electronics. In stock. Tags: noise-cancelling, wireless, premium."
        }
    ]
)

# The tool input is constrained to match the schema
product_data = response.content[0].input
print(product_data)
```
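Indexing `response.content[0]` assumes the tool call is the first content block; a more robust pattern is to scan for the `tool_use` block by name. In the real Anthropic SDK the blocks are typed objects with attributes, but the lookup logic is the same; this sketch uses plain dicts, and `find_tool_input` is a hypothetical helper:

```python
def find_tool_input(content_blocks: list[dict], tool_name: str):
    """Scan Claude-style content blocks for the named tool_use block."""
    for block in content_blocks:
        if block.get("type") == "tool_use" and block.get("name") == tool_name:
            return block.get("input")
    return None

# Simulated response content: a text block followed by the tool call
blocks = [
    {"type": "text", "text": "Extracting the product now."},
    {"type": "tool_use", "name": "extract_product",
     "input": {"name": "Bose QuietComfort Ultra", "price": 429,
               "category": "electronics", "in_stock": True,
               "tags": ["noise-cancelling", "wireless", "premium"]}},
]
print(find_tool_input(blocks, "extract_product")["price"])  # 429
```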
Google Gemini Structured Output
```python
import google.generativeai as genai
import json

genai.configure(api_key="your-api-key")
model = genai.GenerativeModel("gemini-2.0-flash")

response = model.generate_content(
    "Extract product info: iPhone 16 Pro, $999, electronics, in stock",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema={
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "price": {"type": "number"},
                "category": {"type": "string"},
                "in_stock": {"type": "boolean"}
            },
            "required": ["name", "price", "category", "in_stock"]
        }
    )
)

data = json.loads(response.text)
```
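The same schema dict can double as a cheap client-side check on the parsed response. The `missing_required` helper below is a hypothetical stdlib sketch, not part of the Gemini SDK:

```python
import json

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "price": {"type": "number"},
                   "category": {"type": "string"}, "in_stock": {"type": "boolean"}},
    "required": ["name", "price", "category", "in_stock"],
}

def missing_required(data: dict, schema: dict) -> list[str]:
    """Return required keys absent from the parsed response."""
    return [key for key in schema.get("required", []) if key not in data]

# Simulated response.text from the schema-constrained call
data = json.loads('{"name": "iPhone 16 Pro", "price": 999, '
                  '"category": "electronics", "in_stock": true}')
print(missing_required(data, schema))  # []
```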
Method 4: Few-Shot Examples
Providing examples of input/output pairs dramatically improves JSON consistency:
```python
system_prompt = """You are a data extraction assistant. Given a product review,
extract structured information as JSON.

Example input: "The Samsung Galaxy S25 Ultra is amazing! Great camera but
pricey at $1,299. Battery lasts all day. 4.5/5 stars."
Example output:
{"product": "Samsung Galaxy S25 Ultra", "sentiment": "positive", "price": 1299, "pros": ["great camera", "all-day battery"], "cons": ["expensive"], "rating": 4.5}

Example input: "Disappointed with the Pixel 9. Overheats during gaming
and the $899 price is not justified. Camera is okay. 2.5/5."
Example output:
{"product": "Google Pixel 9", "sentiment": "negative", "price": 899, "pros": ["decent camera"], "cons": ["overheating", "overpriced"], "rating": 2.5}

Now extract from the user's review:"""
```
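An alternative to packing examples into the system prompt is encoding them as real user/assistant turns, which many models imitate more faithfully. A sketch of building such a message list; the example data and the `build_messages` helper are illustrative:

```python
import json

EXAMPLES = [
    ("The Samsung Galaxy S25 Ultra is amazing! Great camera but pricey at $1,299.",
     {"product": "Samsung Galaxy S25 Ultra", "sentiment": "positive", "price": 1299}),
    ("Disappointed with the Pixel 9. Overheats and the $899 price is not justified.",
     {"product": "Google Pixel 9", "sentiment": "negative", "price": 899}),
]

def build_messages(review: str) -> list[dict]:
    """Encode few-shot pairs as alternating user/assistant turns."""
    messages = [{"role": "system", "content": "Extract review data as JSON."}]
    for text, output in EXAMPLES:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": json.dumps(output)})
    messages.append({"role": "user", "content": review})
    return messages

msgs = build_messages("The OnePlus 13 is solid value at $799. 4/5.")
print(len(msgs))  # 6: system + two example pairs + the final user turn
```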
Method 5: JSON Repair for Unreliable Models
When using smaller or local models that sometimes produce malformed JSON, use a repair library:
```python
# Install: pip install json-repair
import json_repair

raw_output = """
{
  "name": "Test Product",
  "price": 29.99,
  "tags": ["sale", "new",], // trailing comma
  "description": "A great product
with multiple lines" // unescaped newline
}
"""

# Automatically fix common JSON errors
fixed = json_repair.loads(raw_output)
print(fixed)
# {"name": "Test Product", "price": 29.99, "tags": ["sale", "new"], "description": "A great product\nwith multiple lines"}
```
For more control, use a validation and retry pattern:
```python
import json
from tenacity import retry, stop_after_attempt, retry_if_exception_type

@retry(
    stop=stop_after_attempt(3),
    retry=retry_if_exception_type(json.JSONDecodeError)
)
def get_json_response(client, prompt):
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": "Respond with valid JSON only."},
            {"role": "user", "content": prompt}
        ]
    )
    # This will raise JSONDecodeError if invalid, triggering a retry
    return json.loads(response.choices[0].message.content)
```
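Retries get smarter if you feed the parse error back to the model instead of resending the identical prompt. A sketch of the pure-Python half of that loop; `parse_or_feedback` is a hypothetical helper, and the corrective message wording is just one reasonable choice:

```python
import json

def parse_or_feedback(raw: str):
    """Try to parse; on failure, return a corrective message to append
    to the conversation before the next attempt."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        feedback = {"role": "user",
                    "content": f"Your previous reply was not valid JSON ({exc.msg}). "
                               "Reply again with only the corrected JSON."}
        return None, feedback

data, retry_msg = parse_or_feedback('{"price": 29.99,}')  # trailing comma
print(data)               # None -- parse failed
print(retry_msg["role"])  # user
```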
Common JSON Prompting Patterns
Pattern 1: List Extraction
```python
prompt = """Extract all people mentioned in this text as a JSON array.
Each person should have: name, role, and organization.

Text: "CEO Tim Cook announced Apple's new partnership with
Dr. Sarah Chen from MIT and CFO Luca Maestri presented the financials."

Return format: [{"name": "...", "role": "...", "organization": "..."}]"""
```
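Because this pattern returns a bare array, it is worth checking the shape before iterating. A minimal sketch (the `valid_people` helper is illustrative):

```python
import json

REQUIRED = {"name", "role", "organization"}

def valid_people(raw: str) -> bool:
    """Check the response is a JSON array whose items all carry the required keys."""
    data = json.loads(raw)
    return isinstance(data, list) and all(
        isinstance(item, dict) and REQUIRED <= item.keys() for item in data
    )

raw = ('[{"name": "Tim Cook", "role": "CEO", "organization": "Apple"},'
       ' {"name": "Sarah Chen", "role": "researcher", "organization": "MIT"}]')
print(valid_people(raw))  # True
```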
Pattern 2: Classification with Confidence
```python
prompt = """Classify this customer support ticket. Return JSON with:
- category: one of ["billing", "technical", "account", "general"]
- priority: one of ["low", "medium", "high", "urgent"]
- confidence: float between 0 and 1
- reasoning: brief explanation

Ticket: "I've been charged twice for my subscription this month
and I need an immediate refund. This is the third time this has happened!"
"""
```
Expected output:
```json
{
  "category": "billing",
  "priority": "urgent",
  "confidence": 0.97,
  "reasoning": "Double charge complaint with repeated occurrence and refund demand indicates urgent billing issue."
}
```
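Enum-style fields are where prompt-only JSON drifts most, so validate membership explicitly rather than trusting the model to stay inside the allowed values. A stdlib sketch (the `validate_ticket` helper is illustrative):

```python
CATEGORIES = {"billing", "technical", "account", "general"}
PRIORITIES = {"low", "medium", "high", "urgent"}

def validate_ticket(data: dict) -> list[str]:
    """Check enum membership and confidence range; empty list means valid."""
    errors = []
    if data.get("category") not in CATEGORIES:
        errors.append("bad category")
    if data.get("priority") not in PRIORITIES:
        errors.append("bad priority")
    confidence = data.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0 <= confidence <= 1:
        errors.append("bad confidence")
    return errors

result = {"category": "billing", "priority": "urgent", "confidence": 0.97,
          "reasoning": "Double charge with refund demand."}
print(validate_ticket(result))  # []
```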
Pattern 3: Comparison Table as JSON
```python
prompt = """Compare these two products and return structured JSON:

Product A: MacBook Air M4, $1,099, 13.6" display, 18hr battery, 8GB RAM
Product B: Dell XPS 13, $999, 13.4" display, 12hr battery, 16GB RAM

Return format:
{
  "products": [{"name": "...", "specs": {...}}],
  "comparison": [{"category": "...", "winner": "...", "reason": "..."}],
  "recommendation": "..."
}"""
```
Pattern 4: Multi-Step Extraction Pipeline
```python
# Step 1: Extract entities
entities_prompt = """Extract all entities from this article as JSON:
{"people": [...], "organizations": [...], "locations": [...], "dates": [...]}"""

# Step 2: Extract relationships
relationships_prompt = """Given these entities: {entities}
Extract relationships as JSON:
[{"subject": "...", "predicate": "...", "object": "..."}]"""

# Step 3: Generate summary
summary_prompt = """Given these entities and relationships: {entities}, {relationships}
Generate a structured summary as JSON:
{"title": "...", "key_facts": [...], "timeline": [...]}"""
```
Tips for Reliable JSON Output
Comparison of Techniques
| Technique | Reliability | Flexibility | Supported Models |
|---|---|---|---|
| System prompt instructions | Medium | High | All models |
| JSON mode (response_format) | High | Medium | OpenAI, some others |
| Structured outputs (schema) | Highest | Low (strict schema) | OpenAI, Claude (via tools), Gemini |
| Few-shot examples | Medium-High | High | All models |
| JSON repair libraries | Fallback | High | Any (post-processing) |
Do's and Don'ts
| Do | Don't |
|---|---|
| Provide a clear schema with field descriptions | Assume the model knows your desired format |
| Use structured outputs when available | Rely on prompt-only JSON for production systems |
| Validate output against a schema (e.g., Pydantic, Zod) | Parse JSON without error handling |
| Include examples of edge cases (null fields, empty arrays) | Ignore empty or null value handling |
| Use json_repair as a fallback | Parse raw output with eval() (security risk) |
| Test with varied inputs to catch format drift | Deploy without testing diverse inputs |
Conclusion
Reliable JSON output from LLMs is a solved problem in 2026 if you use the right technique for your use case. For maximum reliability, use structured outputs with a schema. For flexibility, combine system prompt instructions with few-shot examples and validation. For production systems, always validate the output against a schema and implement retry logic.
For developers building applications that combine structured data extraction with AI-generated visual content, Hypereal AI offers a pay-as-you-go API for AI video generation, talking avatars, image creation, and voice cloning -- all returning structured JSON responses that integrate seamlessly into your existing data pipelines.