How to Set Up Open WebUI with Ollama (2026)
Deploy a ChatGPT-like interface for your local AI models
Open WebUI is an open-source, self-hosted web interface for interacting with large language models. Paired with Ollama -- a tool for running LLMs locally -- it gives you a ChatGPT-like experience that runs entirely on your own hardware. No API keys, no subscription fees, no data leaving your machine.
This guide walks you through the complete setup, from installing Ollama to configuring Open WebUI with advanced features.
Why Open WebUI + Ollama?
| Feature | ChatGPT | Open WebUI + Ollama |
|---|---|---|
| Cost | $20-200/month | Free (your hardware) |
| Privacy | Data sent to OpenAI | Everything stays local |
| Internet required | Yes | No (after setup) |
| Model choice | OpenAI models only | Any open-source model |
| Customization | Limited | Full control |
| Rate limits | Yes | No |
| Multi-user | One user per account | Yes (built-in) |
Prerequisites
- Hardware: A computer with at least 8GB of RAM. For good performance with larger models, 16GB+ of RAM and a GPU with 8GB+ of VRAM are recommended.
- OS: macOS, Linux, or Windows (WSL2 for Docker).
- Docker: Required for Open WebUI. Install from docker.com.
Step 1: Install Ollama
Ollama is the backend that downloads and runs AI models locally.
macOS
# Download and install from the website
# Or use Homebrew:
brew install ollama
Linux
curl -fsSL https://ollama.com/install.sh | sh
Windows
Download the installer from ollama.com/download.
Verify installation
ollama --version
# Should output: ollama version 0.x.x
Step 2: Download Your First Model
Pull a model before setting up the UI:
# Recommended starter model (good balance of quality and speed)
ollama pull llama3.1:8b
# For more capable responses (needs 16GB+ RAM)
ollama pull llama3.3:70b
# For coding
ollama pull qwen2.5-coder:14b
# For fast, lightweight use
ollama pull phi4-mini
Model size guide
| Model | RAM Needed | VRAM Needed | Quality |
|---|---|---|---|
| phi4-mini (3.8B) | 4GB | 3GB | Good for simple tasks |
| llama3.1:8b | 8GB | 6GB | Good general purpose |
| qwen2.5-coder:14b | 12GB | 10GB | Great for coding |
| llama3.3:70b | 48GB | 40GB | Excellent all-around |
| deepseek-r1:32b | 32GB | 24GB | Top-tier reasoning |
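A quick way to sanity-check these numbers: a model's weight file is roughly parameters × (bits per weight / 8), plus a couple of GB of runtime overhead for the KV cache and buffers. A minimal sketch, where the parameter count and bit width are inputs you supply:

```shell
# Rough weight-file size: parameters (billions) * bits-per-weight / 8.
# Ollama's default tags are usually 4-bit quantized.
params_b=8   # e.g. llama3.1:8b
bits=4       # 4-bit quantization
awk -v p="$params_b" -v b="$bits" 'BEGIN { printf "~%.1f GB of weights\n", p * b / 8 }'
```

Add roughly 1-2GB on top for the KV cache and runtime buffers, which is why the RAM figures in the table exceed the raw file sizes.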
Test your model:
ollama run llama3.1:8b "What is the capital of France?"
If you get a response, Ollama is working correctly.
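The interactive check above can also be scripted: Ollama serves a REST API on port 11434, and POSTing to /api/generate with "stream": false returns the whole completion as a single JSON object. A sketch (the curl call is left commented because it needs Ollama running):

```shell
# Build a request for Ollama's /api/generate endpoint (default port 11434).
payload='{"model":"llama3.1:8b","prompt":"What is the capital of France?","stream":false}'

# With Ollama running, the "response" field of the returned JSON holds the answer:
#   curl -s http://localhost:11434/api/generate -d "$payload"

echo "$payload"
```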
Step 3: Install Open WebUI with Docker
The easiest way to run Open WebUI is with Docker. One command does everything:
docker run -d \
-p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
This command:
- Runs Open WebUI on port 3000.
- Connects to your local Ollama instance automatically.
- Persists data (chats, settings, users) in a Docker volume.
- Restarts automatically if your computer reboots.
Alternative: Docker Compose
For more control, use a docker-compose.yml file:
version: "3.8"
services:
open-webui:
image: ghcr.io/open-webui/open-webui:main
container_name: open-webui
ports:
- "3000:8080"
volumes:
- open-webui:/app/backend/data
extra_hosts:
- "host.docker.internal:host-gateway"
environment:
- OLLAMA_BASE_URL=http://host.docker.internal:11434
- WEBUI_AUTH=true
- WEBUI_SECRET_KEY=your-secret-key-here
restart: always
volumes:
open-webui:
Start it:
docker compose up -d
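The WEBUI_SECRET_KEY in the compose file above is a placeholder used to sign session tokens; replace it with a random value. Assuming openssl is installed, one way to generate one:

```shell
# 32 random bytes, hex-encoded -> a 64-character secret
openssl rand -hex 32
```

Paste the output into WEBUI_SECRET_KEY before first launch; changing it later will invalidate existing login sessions.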
Step 4: Initial Configuration
- Open your browser and go to http://localhost:3000.
- Create an admin account. The first user to register becomes the administrator.
- You should see your Ollama models listed in the model selector dropdown.
If models are not appearing:
- Verify Ollama is running: ollama list
- Check the Ollama URL in Open WebUI: go to Admin Panel > Settings > Connections and confirm the URL is http://host.docker.internal:11434.
Step 5: Pull Models from the UI
You can download new models directly from Open WebUI:
- Go to Admin Panel > Settings > Models.
- Enter a model name (e.g., qwen2.5:14b) in the "Pull a model" field.
- Click the download button.
- Wait for the download to complete. Progress is shown in the UI.
Step 6: Configure Advanced Features
Enable Web Search
Open WebUI supports web search via several providers:
- Go to Admin Panel > Settings > Web Search.
- Enable web search.
- Choose a search engine (SearXNG for self-hosted, Google, Brave, etc.).
- Add your API key if required.
For a fully self-hosted solution, deploy SearXNG alongside Open WebUI:
# Add to docker-compose.yml
searxng:
image: searxng/searxng:latest
container_name: searxng
ports:
- "8888:8080"
volumes:
- ./searxng:/etc/searxng
restart: always
Then set the SearXNG query URL in Open WebUI to http://searxng:8080/search?q=<query> (the containers share the Compose network, so the service name resolves directly).
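One common snag: Open WebUI queries SearXNG's JSON API, which SearXNG disables by default. In the settings.yml mounted from ./searxng, make sure the json output format is enabled (a fragment; the rest of the file keeps its defaults):

```yaml
# ./searxng/settings.yml (fragment)
search:
  formats:
    - html
    - json   # required for Open WebUI's web search queries
```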
Enable RAG (Document Chat)
Open WebUI has built-in RAG capabilities:
- In any chat, click the + button and upload a document (PDF, TXT, DOCX, etc.).
- Open WebUI will chunk, embed, and index the document.
- Ask questions about the document content.
For the embedding model, go to Admin Panel > Settings > Documents and configure:
- Embedding model: nomic-embed-text (pull it via Ollama first)
- Chunk size: 1000 (the default is fine for most use cases)
- Chunk overlap: 200
# Pull the embedding model
ollama pull nomic-embed-text
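To get intuition for the chunking settings: with size 1000 and overlap 200, each chunk starts 800 characters after the previous one, so a document of N characters yields roughly ceil((N - overlap) / (size - overlap)) chunks. A back-of-the-envelope check for a 10,000-character document:

```shell
awk -v n=10000 -v size=1000 -v overlap=200 'BEGIN {
  step = size - overlap                       # 800 chars of new text per chunk
  print int((n - overlap + step - 1) / step)  # ceil division -> 13 chunks
}'
```

Larger overlaps improve retrieval continuity across chunk boundaries at the cost of more chunks to embed and store.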
Enable Image Generation
Connect Open WebUI to a local Stable Diffusion or DALL-E instance:
- Go to Admin Panel > Settings > Images.
- Choose your backend (Automatic1111, ComfyUI, or OpenAI-compatible).
- Enter the API URL (e.g., http://host.docker.internal:7860 for Automatic1111).
Multi-User Setup
Open WebUI supports multiple users with role-based access:
- Go to Admin Panel > Users.
- Set the default role for new signups (user, pending, or admin).
- Manage individual user permissions.
- Each user gets their own chat history and settings.
This makes it perfect for teams, families, or classroom environments.
Step 7: Connect External APIs (Optional)
Open WebUI can also connect to remote APIs alongside Ollama:
OpenAI API
- Go to Admin Panel > Settings > Connections.
- Under "OpenAI API," add your API key.
- Models like GPT-4o will appear in the model selector alongside local Ollama models.
Any OpenAI-Compatible API
You can add any provider that uses the OpenAI format:
URL: https://api.groq.com/openai/v1
Key: your-groq-api-key
This lets you mix local models (via Ollama) and remote models (via APIs) in the same interface.
Performance Optimization
GPU Acceleration
Ensure Ollama is using your GPU:
# Load a model, then check which processor it is running on
ollama run llama3.1:8b "hello" --verbose
ollama ps
# The PROCESSOR column should show "100% GPU"
For NVIDIA GPUs, install the NVIDIA Container Toolkit for Docker GPU passthrough:
# Ubuntu/Debian (after adding NVIDIA's apt repository)
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Memory Management
If you are running out of memory:
# Use quantized models (smaller, slightly lower quality).
# Note: the default llama3.1:8b tag is already 4-bit quantized.
ollama pull llama3.1:8b-instruct-q4_0  # 4-bit quantization, ~4.7GB
ollama pull llama3.1:8b-instruct-q8_0  # 8-bit quantization, ~8.5GB
Context Length
To increase context length for a model, create a custom Modelfile:
# Create a Modelfile
cat > Modelfile << 'EOF'
FROM llama3.1:8b
PARAMETER num_ctx 16384
PARAMETER temperature 0.7
SYSTEM You are a helpful coding assistant.
EOF
# Create the custom model
ollama create llama3.1-16k -f Modelfile
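Raising num_ctx is not free: the KV cache grows linearly with context length. For a llama3.1:8b-shaped model (assumed here: 32 layers, 8 KV heads, head dimension 128, fp16 cache), a 16,384-token context costs roughly:

```shell
# KV cache bytes = 2 (K and V) * layers * kv_heads * head_dim * ctx * 2 (fp16)
awk -v layers=32 -v kv_heads=8 -v head_dim=128 -v ctx=16384 'BEGIN {
  bytes = 2 * layers * kv_heads * head_dim * ctx * 2
  printf "%.1f GiB of KV cache\n", bytes / (1024 ^ 3)
}'
```

So the 16k-context variant needs about 2 GiB more memory than the same model at a short context; budget for that when choosing num_ctx.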
Troubleshooting
| Issue | Solution |
|---|---|
| "Ollama not connected" | Ensure Ollama is running (ollama serve). Check the connection URL in settings. |
| Models not loading | Check RAM. Use ollama ps to see running models. |
| Slow responses | Use a smaller model or enable GPU acceleration. |
| Docker permission denied | Add your user to the docker group: sudo usermod -aG docker $USER |
| Chat history lost | Ensure the Docker volume is persistent (-v open-webui:/app/backend/data). |
Wrapping Up
Open WebUI with Ollama gives you a fully private, customizable ChatGPT alternative running on your own hardware. The setup takes about 15 minutes, and once running, you have unlimited access to powerful AI models with no subscription fees, no rate limits, and no data privacy concerns.
If you need AI-generated media capabilities beyond text, such as image generation, video creation, or talking avatars, you can try Hypereal AI free (35 credits, no credit card required). It complements your local LLM setup with cloud-powered media generation via a simple API.