Building a Self-Hosted AI Stack with Ollama and Docker
A practical guide to setting up a local AI development environment using Ollama, Open WebUI, and Docker Compose for privacy-first machine learning workflows.
Why Self-Host Your AI?
Running AI models locally gives you full control over your data, removes per-request API costs, and lets you experiment freely without rate limits. In this post, I’ll walk you through setting up a local AI stack you can build on.
The Stack
Our self-hosted AI environment consists of three key components:
- Ollama — Local model runtime for running LLMs
- Open WebUI — Beautiful chat interface for interacting with models
- Docker Compose — Orchestration for the entire stack
Docker Compose Configuration
Here’s the docker-compose.yml that ties everything together:
version: '3.8'
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    restart: unless-stopped
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
    restart: unless-stopped
volumes:
  ollama_data:
Pulling Your First Model
Once the stack is running (start it with docker compose up -d from the directory that holds docker-compose.yml), pull a model with:
docker exec -it ollama ollama pull llama3.2
This downloads the model weights into the ollama_data volume, so they survive container restarts. You can now interact with the model through Open WebUI at http://localhost:3000.
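If you'd rather confirm the model responds before opening the browser, you can call Ollama's HTTP API directly on the published port. The snippet below is a minimal sketch: the OLLAMA_URL variable and the prompt text are my own placeholders, and it assumes curl is installed on the host.

```shell
# Assumption: Ollama is published on localhost:11434, as in the compose file.
# OLLAMA_URL is a hypothetical variable used only in this example.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Request body for Ollama's /api/generate endpoint. "stream": false asks for
# one JSON reply instead of a stream of partial chunks.
payload='{"model": "llama3.2", "prompt": "Say hello in one sentence.", "stream": false}'

# -f makes curl treat HTTP errors as failures, so the fallback message is
# printed if the container is down or the model has not been pulled yet.
curl -sf --connect-timeout 5 "$OLLAMA_URL/api/generate" -d "$payload" \
  || echo "Could not get a reply from Ollama at $OLLAMA_URL"
```

On success the reply is a JSON object whose response field holds the generated text.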
Performance Tips
- GPU Passthrough — If you have an NVIDIA GPU, add a deploy section with GPU resources to the ollama service
- Model Selection — Start with smaller models (7B parameters or fewer) for faster inference
- Persistent Storage — Always use Docker volumes to persist model data
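The GPU passthrough tip can be sketched as a Compose override file. This is a minimal example, assuming the NVIDIA Container Toolkit is already installed on the host. The file name docker-compose.gpu.yml is my own choice, not something the stack requires.

```shell
# Write a Compose override that reserves NVIDIA GPUs for the ollama service.
# Assumptions: the NVIDIA Container Toolkit is installed on the host, and
# docker-compose.gpu.yml is a hypothetical file name chosen for this example.
cat > docker-compose.gpu.yml <<'EOF'
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF
```

Starting the stack with docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d merges the override into the base file. Keeping the GPU settings separate means the same docker-compose.yml still works on machines without a GPU.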
Conclusion
Self-hosting AI is more accessible than ever. With Docker and Ollama, you can have a fully functional AI development environment running in minutes.