Self-Hosted with Docker
The lightest deployment path. Bring up the full Routero AI gateway with Docker Compose — suitable for on-prem, single-node production, air-gap evaluation, or as the base for a custom container orchestrator.
Minimum viable stack
The bare minimum to run the gateway: the proxy container + a Postgres database.
# docker-compose.minimal.yml
services:
litellm:
image: ghcr.io/filigrain/routero-proxy:latest
ports:
- "4000:4000"
environment:
MASTER_KEY: "your-master-key"
DATABASE_URL: "postgresql://user:pass@db:5432/litellm"
depends_on: [db]
db:
image: pgvector/pgvector:pg16
environment:
POSTGRES_DB: litellm
POSTGRES_USER: user
POSTGRES_PASSWORD: pass
volumes:
- pgdata:/var/lib/postgresql/data
volumes:
pgdata:
docker compose -f docker-compose.minimal.yml up -d
The dashboard is available at http://localhost:4000/_experimental/out/. Add your first model and API key there.
Full stack (recommended for production)
The bundled docker-compose.yml includes the proxy, coworker spend-sync service, Postgres (pgvector), Redis, and Prometheus:
git clone https://github.com/Filigrain/llmrouter.git
cd llmrouter/cicd/compose
MASTER_KEY=your-secret-key docker compose up -d
Services started:
| Service | Port | Purpose |
|---|---|---|
litellm (proxy) |
4000 | Gateway — inference + management API |
coworker |
8001 | Spend-sync worker (Redis → Postgres) |
db |
5432 | Postgres + pgvector (keys, spend, config) |
redis |
6379 | Rate limiting, key cache, spend queue |
prometheus |
9090 | Metrics scraping |
Enable Advanced Features (memory tier)
The memory tier (required for Memory-as-a-Service and semantic caching in Token Saving) runs as an optional Compose profile:
docker compose --profile semantic up -d
This adds:
neo4j— graph database for Cognee long-term memoryqdrant— vector store for semantic cachingredis-stack— Redis with vector search for semantic cachingmem0_db/cognee_db— dedicated Postgres instances for Mem0 and Cognee
Required environment variables
| Variable | Required | Description |
|---|---|---|
MASTER_KEY |
Yes | Admin key for the proxy — keep secret |
DATABASE_URL |
Yes | Postgres connection string |
REDIS_URL |
No | Redis for caching and rate limiting |
LITELLM_PROXY_BASE_URL |
No | Public base URL of this instance (for memory/cache loopback) |
USE_DDTRACE |
No | Set to true to enable Datadog APM tracing |
SEPARATE_HEALTH_APP |
No | Run health endpoints on a separate port |
Notes
- The proxy and coworker images run as non-root users.
- This deployment has no built-in HA or autoscaling — the customer is responsible for process supervision, DB durability, and horizontal scaling.
- For production use with multiple replicas, use a shared external Postgres and Redis, and run multiple proxy containers behind a load balancer.
- For the full AWS-native HA topology, see Self-Hosted on AWS.