Open WebUI - Self-hosted AI interface platform
Deploy Open WebUI, the extensible self-hosted AI platform supporting Ollama, OpenAI API, RAG, voice/video calls, and multi-tenancy.
- Step 1
Overview
Open WebUI is a powerful, extensible AI interface that supports multiple LLM providers (Ollama, OpenAI API, Anthropic, Google Gemini, etc.) with built-in RAG, voice/video calling, Python function execution, and multi-tenant user management.
Key capabilities:
- Connect to Ollama or any OpenAI-compatible API endpoint
- Run AI inference locally with CUDA support (`:cuda` image)
- Bundled Ollama support in a single container (`:ollama` image)
- RAG with 9 vector database options and multiple content extraction engines
- Voice input (Whisper) and TTS (Azure, ElevenLabs, OpenAI, local)
- Image generation (DALL-E, ComfyUI, AUTOMATIC1111)
- Python code execution sandbox (Pyodide)
- Collaborative editing (YJS + Prosemirror)
- Enterprise authentication (LDAP/AD, OAuth, SCIM 2.0)
- Horizontal scaling with Redis-backed sessions
- OpenTelemetry observability
- Step 2
Tech Stack Reference
Understanding the technology stack helps with troubleshooting and customization:
Frontend:
- SvelteKit (framework), Vite 5 (tooling)
- Tailwind CSS 4, Shiki (syntax highlighting)
- Socket.IO-client (real-time), i18next (internationalization)
- Pyodide (browser Python), YJS + Prosemirror (collaboration)
- Tiptap (rich text editor), Mermaid (diagrams), KaTeX (LaTeX)
- Chart.js, Leaflet, PDF.js, CodeMirror
Backend:
- FastAPI + uvicorn (async Python server)
- Pydantic 2, SQLAlchemy (async ORM), Alembic (migrations)
- Starsessions + Redis (session management)
- LangChain (RAG orchestration)
- sentence-transformers (embeddings)
- faster-whisper (STT), transformers, ONNX Runtime
- ChromaDB/PGVector/Qdrant/Milvus/Elasticsearch/OpenSearch/Pinecone (vector stores)
- boto3, azure-* SDKs, google-* SDKs (cloud integrations)
- OpenTelemetry (traces/metrics/logs)
Deployment:
- Docker/Docker Compose, Kubernetes (Helm, kustomize)
- PostgreSQL or SQLite (main database)
- Redis (session/caching for scale)
Relevant packages:
- Frontend (npm): @sveltejs/kit, @sveltejs/adapter-static, tailwindcss@4, tiptap, pyodide, yjs, @azure/msal-browser, socket.io-client
- Backend (pip): fastapi, uvicorn, pydantic, sqlalchemy[asyncio], langchain, chromadb, sentence-transformers, faster-whisper, transformers, onnxruntime, tiktoken, openai, anthropic, opentelemetry-sdk
- Step 3
Quick Start with Docker (Ollama on same machine)
The simplest setup runs Open WebUI with Ollama accessible from your host machine. Port 3000 exposes the UI.
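After the container from this step is started, the UI may not answer on port 3000 immediately. The sketch below is a small retry helper that polls a command until it succeeds. The function name `wait_until` is hypothetical, and the `/health` path in the usage comment is an assumption about the Open WebUI backend; any URL that returns a success status once the UI is up would serve the same purpose.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0, or gives up after ATTEMPTS tries.
# Hypothetical helper, not part of Open WebUI or Docker.
wait_until() (
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      exit 0            # command succeeded
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  exit 1                # command never succeeded
)

# Usage (assumes the -p 3000:8080 mapping from this step and that the
# backend serves a /health endpoint):
#   wait_until 30 2 curl -fsS http://localhost:3000/health && echo "UI is up"
```

The function body runs in a subshell, so its variables do not leak into the calling shell.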
The `-v open-webui:/app/backend/data` mount is critical: it persists your database, models, and documents. Without it, data is lost when the container is removed or recreated (for example, when upgrading to a newer image).

    # Start Open WebUI connecting to local Ollama
    docker run -d \
      -p 3000:8080 \
      --add-host=host.docker.internal:host-gateway \
      -v open-webui:/app/backend/data \
      --name open-webui \
      --restart always \
      ghcr.io/open-webui/open-webui:main

    # Access at http://localhost:3000

    # Ensure Ollama is running locally:
    ollama serve

    # Verify:
    ollama list   # shows available models

- Step 4
Quick Start (Ollama on different server)
If Ollama runs on a different machine, set `OLLAMA_BASE_URL` to its address. Replace the URL with the exact host:port serving Ollama.

    # Connect to remote Ollama server
    docker run -d \
      -p 3000:8080 \
      -e OLLAMA_BASE_URL=http://192.168.1.100:11434 \
      -v open-webui:/app/backend/data \
      --name open-webui \
      --restart always \
      ghcr.io/open-webui/open-webui:main

- Step 5
Quick Start with CUDA GPU acceleration
The `:cuda` tag includes PyTorch with CUDA for running embedding models and Whisper on a GPU. It requires the NVIDIA Container Toolkit on the host.
Prerequisites:
- Install NVIDIA Container Toolkit
- Restart Docker service
See NVIDIA docs for installation.
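Before pulling the larger `:cuda` image, it can help to check whether the host has an NVIDIA driver at all. The sketch below picks an image tag depending on whether a probe command is found on `PATH`. The function name `pick_image_tag` is hypothetical. `nvidia-smi` is the usual driver utility, but finding it does not prove that the Container Toolkit is configured for Docker.

```shell
# pick_image_tag [PROBE_COMMAND]
# Prints "cuda" if PROBE_COMMAND (default: nvidia-smi) is found on PATH,
# otherwise prints "main". Hypothetical helper for illustration only.
pick_image_tag() (
  probe=${1:-nvidia-smi}
  if command -v "$probe" >/dev/null 2>&1; then
    echo cuda
  else
    echo main
  fi
)

tag=$(pick_image_tag)
echo "Suggested image: ghcr.io/open-webui/open-webui:$tag"
```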

    # First, ensure NVIDIA Container Toolkit is installed
    # (assumes NVIDIA's apt repository is already configured):
    sudo apt-get install -y nvidia-container-toolkit
    # Register the NVIDIA runtime with Docker, then restart the daemon:
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker

    # Then run with GPU:
    docker run -d \
      -p 3000:8080 \
      --gpus all \
      --add-host=host.docker.internal:host-gateway \
      -v open-webui:/app/backend/data \
      --name open-webui \
      --restart always \
      ghcr.io/open-webui/open-webui:cuda

- Step 6
One-Container setup (Open WebUI + Ollama bundled)
The `:ollama` image bundles Ollama inside the container, eliminating external dependencies. Choose the command based on your hardware.
Benefits:
- Single container, no external Ollama setup needed
- Ollama models persist in the `ollama` volume
- GPU passthrough works with `--gpus=all`

    # With GPU support (recommended if you have an NVIDIA GPU):
    docker run -d \
      -p 3000:8080 \
      --gpus=all \
      -v ollama:/root/.ollama \
      -v open-webui:/app/backend/data \
      --name open-webui \
      --restart always \
      ghcr.io/open-webui/open-webui:ollama

    # CPU-only mode:
    docker run -d \
      -p 3000:8080 \
      -v ollama:/root/.ollama \
      -v open-webui:/app/backend/data \
      --name open-webui \
      --restart always \
      ghcr.io/open-webui/open-webui:ollama

- Step 7
Docker Compose setup (recommended for production)
Docker Compose is ideal for multi-container setups with persistent data and complex networking. Create `docker-compose.yml` in your project directory.

    version: '3.8'

    services:
      open-webui:
        image: ghcr.io/open-webui/open-webui:main
        ports:
          - "3000:8080"
        environment:
          - OLLAMA_BASE_URL=http://ollama:11434
        volumes:
          - open-webui:/app/backend/data
        depends_on:
          - ollama
        restart: always

      ollama:
        image: ollama/ollama:latest
        ports:
          - "11434:11434"
        volumes:
          - ollama:/root/.ollama
        restart: always

    volumes:
      open-webui:
      ollama:

- Step 8
Python pip installation
Install Open WebUI as a Python package. Use Python 3.11 for compatibility.
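Since this step recommends Python 3.11, a quick interpreter check before installing can save a failed install. The sketch below inspects the output of `python3 --version`. The function name `is_python_311` is hypothetical, and the `.venv` path in the message is only an example.

```shell
# is_python_311 VERSION_STRING
# Succeeds if VERSION_STRING looks like "Python 3.11" or "Python 3.11.x".
is_python_311() {
  case "$1" in
    "Python 3.11"|"Python 3.11."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report on the interpreter found on PATH, if any.
if version=$(python3 --version 2>&1) && is_python_311 "$version"; then
  echo "OK: $version"
else
  echo "python3 is not Python 3.11; consider: python3.11 -m venv .venv"
fi
```

Installing into a dedicated virtual environment keeps the package's many dependencies separate from the system Python.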

    # Install
    pip install open-webui

    # Start the server
    open-webui serve

    # Access at http://localhost:8080
    # Default Ollama endpoint: http://127.0.0.1:11434

    # Optional: expose on network
    open-webui serve --host 0.0.0.0 --port 8080

- Step 9
Configuration options
Open WebUI is highly configurable via environment variables.
Model providers:
- `OLLAMA_BASE_URL`: Ollama server endpoint
- `OPENAI_API_KEY`: OpenAI or compatible API key
- `OPENAI_API_BASE_URL`: Custom endpoint for non-OpenAI providers
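A malformed `OLLAMA_BASE_URL` (for instance a missing scheme or a stray trailing slash) is an easy mistake to make when typing the value by hand. The following sketch normalizes a value before it is passed to `docker run`. The function name `normalize_base_url` is hypothetical, and defaulting to `http://` is an assumption that suits a plain Ollama server on a local network.

```shell
# normalize_base_url URL
# Adds "http://" when no scheme is given and strips trailing slashes.
# Hypothetical helper; not part of Open WebUI.
normalize_base_url() (
  url=$1
  case "$url" in
    http://*|https://*) ;;            # scheme already present
    *) url="http://$url" ;;           # assumption: plain HTTP by default
  esac
  # Remove any trailing slashes, one at a time.
  while [ "${url%/}" != "$url" ]; do
    url=${url%/}
  done
  printf '%s\n' "$url"
)

normalize_base_url "192.168.1.100:11434/"   # prints http://192.168.1.100:11434
```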
Database:
- `DATABASE_URL`: PostgreSQL connection string (default: SQLite)
- `DB_ENCRYPT_KEY`: Encryption key for SQLite (optional)
Vector database (RAG):
- `RAG_EMBEDDING_MODEL`: Sentence transformer model (default: `sentence-transformers/all-MiniLM-L6-v2`)
- `VECTOR_DB`: Vector DB backend (chroma, qdrant, pgvector, etc.)
Security:
- `WEBUI_SECRET_KEY`: Session secret (auto-generated if not set)
- `ENABLE_SIGNUP`: Allow user registration (default: true)
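Passing many `-e` flags gets unwieldy, and an explicitly set `WEBUI_SECRET_KEY` stays the same however the container is recreated. One possible approach, sketched below, generates a random key and writes the settings to a file for Docker's `--env-file` option. The function names `gen_secret` and `write_env_file` and the file name `open-webui.env` are hypothetical; the variable values are examples.

```shell
# gen_secret: print 64 hex characters (32 random bytes).
# Uses /dev/urandom and od, so no extra tools are needed.
gen_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# write_env_file PATH: write a Docker env file readable only by its owner.
# Hypothetical helper; adjust the variables to your deployment.
write_env_file() (
  umask 077
  {
    printf 'WEBUI_SECRET_KEY=%s\n' "$(gen_secret)"
    printf 'ENABLE_SIGNUP=false\n'
    printf 'OLLAMA_BASE_URL=%s\n' "${OLLAMA_BASE_URL:-http://192.168.1.50:11434}"
  } > "$1"
)

# Usage:
#   write_env_file ./open-webui.env
#   docker run -d -p 3000:8080 --env-file ./open-webui.env \
#     -v open-webui:/app/backend/data --name open-webui \
#     ghcr.io/open-webui/open-webui:main
```

Keeping secrets in a file with restrictive permissions also keeps them out of shell history, where `-e OPENAI_API_KEY=...` on the command line would otherwise end up.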

    # Example with custom configuration:
    docker run -d \
      -p 3000:8080 \
      -e OLLAMA_BASE_URL=http://192.168.1.50:11434 \
      -e OPENAI_API_KEY=sk-*** \
      -e DATABASE_URL=postgresql://user:pass@postgres:5432/openwebui \
      -e RAG_EMBEDDING_MODEL=intfloat/multilingual-e5-large \
      -e ENABLE_SIGNUP=false \
      -v open-webui:/app/backend/data \
      --name open-webui \
      ghcr.io/open-webui/open-webui:main

- Step 10
PostgreSQL database setup
For production deployments, use PostgreSQL instead of SQLite for better performance and multi-instance support.
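The `DATABASE_URL` for PostgreSQL has the form `postgresql://user:password@host:port/dbname`. As a minimal sketch, the helper below assembles that string from its parts and refuses user names or passwords containing characters that would need percent-encoding. The function name `pg_url` is hypothetical, and rejecting such characters (rather than encoding them) is a simplification.

```shell
# pg_url USER PASSWORD HOST PORT DBNAME
# Prints a postgresql:// connection string. Fails if USER or PASSWORD is
# empty or contains characters that would need percent-encoding in a URL.
pg_url() (
  for part in "$1" "$2"; do
    case "$part" in
      ""|*[!A-Za-z0-9._~-]*)
        echo "pg_url: user and password must be non-empty and URL-safe" >&2
        exit 1 ;;
    esac
  done
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
)

pg_url openwebui openwebui postgres 5432 openwebui
# prints postgresql://openwebui:openwebui@postgres:5432/openwebui
```

For anything beyond a local test, replace the example password with a generated one and keep it consistent between `DATABASE_URL` and `POSTGRES_PASSWORD`.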

    version: '3.8'

    services:
      open-webui:
        image: ghcr.io/open-webui/open-webui:main
        ports:
          - "3000:8080"
        environment:
          - OLLAMA_BASE_URL=http://ollama:11434
          - DATABASE_URL=postgresql://openwebui:openwebui@postgres:5432/openwebui
        volumes:
          - open-webui:/app/backend/data
        depends_on:
          - ollama
          - postgres
        restart: always

      ollama:
        image: ollama/ollama:latest
        volumes:
          - ollama:/root/.ollama
        restart: always

      postgres:
        image: postgres:16-alpine
        environment:
          - POSTGRES_USER=openwebui
          - POSTGRES_PASSWORD=openwebui
          - POSTGRES_DB=openwebui
        volumes:
          - postgres-data:/var/lib/postgresql/data
        restart: always

    volumes:
      open-webui:
      ollama:
      postgres-data:

- Step 11
Troubleshooting
Connection errors: If Open WebUI can't reach Ollama, use `--add-host=host.docker.internal:host-gateway` to map the Docker host.
GPU not detected: For CUDA images, ensure `--gpus all` is passed to `docker run`.
Data persistence: Always mount `-v open-webui:/app/backend/data` to prevent data loss.

    # Check container logs
    docker logs open-webui

    # Test Ollama connection from container
    docker exec -it open-webui sh
    # then, inside the container shell:
    wget -qO- http://host.docker.internal:11434/api/tags

    # Monitor resources
    docker stats open-webui

    # Fallback: use --network=host flag (port becomes 8080):
    docker run -d --network=host \
      -v open-webui:/app/backend/data \
      -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
      --name open-webui \
      ghcr.io/open-webui/open-webui:main

- Step 12
Key Features
Pipelines Plugin Framework: Extend Open WebUI with custom Python code for rate limiting, logging, translation, content filtering.
Function Calling (Tools): Integrate external APIs and functions with LLMs.
RAG: Upload documents (PDF, DOCX, images) for contextual chat. Supports 15+ web search providers.
Web Browsing: Inject webpages into chat using the `#<url>` command.
Image Generation: DALL-E, ComfyUI, AUTOMATIC1111.
Voice/Video Calls: STT (Whisper) and TTS (Azure, ElevenLabs, OpenAI).
Collaborative Editing: Real-time multi-user editing with YJS.
Python Sandbox: Execute Python in browser with Pyodide.