Self-Hosted with Docker

The lightest deployment path. Bring up the full Routero AI gateway with Docker Compose — suitable for on-prem, single-node production, air-gap evaluation, or as the base for a custom container orchestrator.


Minimum viable stack

The bare minimum to run the gateway: the proxy container + a Postgres database.

# docker-compose.minimal.yml
services:
  litellm:
    image: ghcr.io/filigrain/routero-proxy:latest
    ports:
      - "4000:4000"
    environment:
      MASTER_KEY: "your-master-key"
      DATABASE_URL: "postgresql://user:pass@db:5432/litellm"
    depends_on: [db]

  db:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_DB: litellm
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
docker compose -f docker-compose.minimal.yml up -d

The dashboard is available at http://localhost:4000/_experimental/out/. Add your first model and API key there.


The bundled docker-compose.yml includes the proxy, coworker spend-sync service, Postgres (pgvector), Redis, and Prometheus:

git clone https://github.com/Filigrain/llmrouter.git
cd llmrouter/cicd/compose
MASTER_KEY=your-secret-key docker compose up -d

Services started:

Service Port Purpose
litellm (proxy) 4000 Gateway — inference + management API
coworker 8001 Spend-sync worker (Redis → Postgres)
db 5432 Postgres + pgvector (keys, spend, config)
redis 6379 Rate limiting, key cache, spend queue
prometheus 9090 Metrics scraping

Enable Advanced Features (memory tier)

The memory tier (required for Memory-as-a-Service and semantic caching in Token Saving) runs as an optional Compose profile:

docker compose --profile semantic up -d

This adds:

  • neo4j — graph database for Cognee long-term memory
  • qdrant — vector store for semantic caching
  • redis-stack — Redis with vector search for semantic caching
  • mem0_db / cognee_db — dedicated Postgres instances for Mem0 and Cognee

Required environment variables

Variable Required Description
MASTER_KEY Yes Admin key for the proxy — keep secret
DATABASE_URL Yes Postgres connection string
REDIS_URL No Redis for caching and rate limiting
LITELLM_PROXY_BASE_URL No Public base URL of this instance (for memory/cache loopback)
USE_DDTRACE No Set to true to enable Datadog APM tracing
SEPARATE_HEALTH_APP No Run health endpoints on a separate port

Notes

  • The proxy and coworker images run as non-root users.
  • This deployment has no built-in HA or autoscaling — the customer is responsible for process supervision, DB durability, and horizontal scaling.
  • For production use with multiple replicas, use a shared external Postgres and Redis, and run multiple proxy containers behind a load balancer.
  • For the full AWS-native HA topology, see Self-Hosted on AWS.