Release v0.1.4 is out — 100% open source

The ultra-fast AI Gateway written in Rust.

Melis unifies OpenAI, Anthropic, Google Vertex, OCI GenAI and Ollama behind a single OpenAI-compatible contract. Stateless, sub-2ms overhead, under 32Mi RSS — built for production LLMOps.

Apache-style open source · Stateless & horizontally scalable · Kubernetes native
Run Melis in one command
bash
docker run -d \
  --name melis-gateway \
  -p 9090:9090 \
  -v "$(pwd)/config.yaml:/app/config.yaml:ro" \
  -v "$(pwd)/routes.yaml:/app/routes.yaml:ro" \
  -e MELIS_SERVER_PORT=9090 \
  melis-gateway:latest

One contract — every major provider

OpenAI · Anthropic · Google Vertex AI · OCI GenAI · Ollama · DeepSeek · Llama 3

Features

Everything an LLM platform team needs — in one tiny Rust binary.

Move load balancing, circuit breaking, token compression and rate limiting out of your application code and into a high-performance infrastructure layer.

OpenAI-compatible contract

Exposes POST /v1/chat/completions. Melis transpiles payloads on the fly to each upstream provider's schema.
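
To illustrate what this kind of translation involves, here is a minimal Python sketch that maps an OpenAI-style chat request onto the shape of Anthropic's Messages API. The function name `to_anthropic` and the default `max_tokens` value are assumptions made for illustration; Melis does this work internally in Rust and its actual mapping may differ.

```python
from typing import Any


def to_anthropic(payload: dict[str, Any], default_max_tokens: int = 1024) -> dict[str, Any]:
    """Map an OpenAI-style chat payload to an Anthropic Messages-style payload."""
    system_parts = []
    messages = []
    for msg in payload["messages"]:
        if msg["role"] == "system":
            # Anthropic takes the system prompt as a top-level field, not as a message.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    out: dict[str, Any] = {
        "model": payload["model"],
        "messages": messages,
        # max_tokens is required by the Messages API; this default is an assumption.
        "max_tokens": payload.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)
    # Sampling and streaming options that share a name in both schemas pass through.
    for key in ("temperature", "top_p", "stream"):
        if key in payload:
            out[key] = payload[key]
    return out
```

The sketch covers plain text messages only; tool calls, images and provider-specific options would need their own handling.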

Sub-2ms overhead

Non-blocking async Rust core. Internal processing under 2ms with a memory footprint below 32Mi RSS.

Weighted multi-provider routing

Native support for openai, anthropic, google_vertex_ai, oci_genai and ollama with configurable traffic weights.
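
Weighted routing can be pictured as a weighted random draw per request. The Python sketch below shows that idea only; the function name `pick_provider` and the example weights are hypothetical, and the selection strategy Melis actually uses is not described here.

```python
import random


def pick_provider(weights: dict[str, float], rng: random.Random) -> str:
    """Choose one provider name with probability proportional to its weight."""
    candidates = [(name, w) for name, w in weights.items() if w > 0]
    if not candidates:
        raise ValueError("at least one provider needs a positive weight")
    names, positive_weights = zip(*candidates)
    return rng.choices(names, weights=positive_weights, k=1)[0]


# Hypothetical split: 70% OpenAI, 20% Anthropic, 10% Ollama, OCI GenAI disabled.
weights = {"openai": 70, "anthropic": 20, "ollama": 10, "oci_genai": 0}
rng = random.Random(42)
counts = {name: 0 for name in weights}
for _ in range(10_000):
    counts[pick_provider(weights, rng)] += 1
```

Over many requests the observed share of each provider approaches its configured weight, and a provider with weight 0 receives no traffic.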

Adaptive context trimming

Tokenizes inputs locally and trims repetitive metadata before sending to the cloud — protect your token budget.
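
One plausible reading of context trimming, sketched in Python: drop exact duplicate messages first, then drop the oldest non-system turns until the conversation fits a token budget. The whitespace-based `count_tokens` is a crude stand-in for a real tokenizer, and both function names are hypothetical; the heuristics Melis applies may be different.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def trim_messages(messages: list[dict], budget: int) -> list[dict]:
    """Drop duplicate messages, then the oldest non-system turns, until under budget."""
    seen = set()
    deduped = []
    for msg in messages:
        key = (msg["role"], msg["content"])
        if key not in seen:  # keep only the first copy of a repeated message
            seen.add(key)
            deduped.append(msg)

    def total(msgs: list[dict]) -> int:
        return sum(count_tokens(m["content"]) for m in msgs)

    trimmed = list(deduped)
    while total(trimmed) > budget:
        # Oldest message that is not a system prompt; the final message is never dropped.
        idx = next((i for i, m in enumerate(trimmed[:-1]) if m["role"] != "system"), None)
        if idx is None:
            break  # nothing left that is safe to drop
        del trimmed[idx]
    return trimmed
```

Keeping the system prompt and the latest user turn is a design choice of this sketch: those two usually carry the instructions and the actual question.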

Enterprise resiliency

Distributed token-bucket rate limiting and circuit breaking with exponential backoff, orchestrated via Redis.
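
For readers unfamiliar with token buckets, here is a minimal single-process Python sketch of the algorithm. It keeps the counter in memory, whereas the text above places it in Redis; the class name `TokenBucket` and its parameters are illustrative assumptions, not Melis configuration keys.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so the bucket can be tested without sleeping
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return False when the caller is over limit."""
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A bucket with `capacity=2` and `rate=1.0` admits a burst of two requests, then one more per second. In a distributed setup the refill and spend steps would have to run atomically against the shared store.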

Cloud-native by design

Hot-reload routes.yaml, Prometheus /metrics, OpenTelemetry tracing and Kubernetes-compliant probes.

Architecture

Stateless. Horizontal. Production-grade.

Melis instances keep no local state, so they scale horizontally inside Kubernetes. Volatile cluster state, blocklists and token-bucket counters live in an external high-speed Redis layer.

  • Pure reverse proxy
    Sits between your apps and any LLM provider.
  • Zero-impact migrations
    Swap providers in routes.yaml — zero application code changes.
  • Hot-reload config
    routes.yaml reloads within seconds without dropping active connections.
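
A common way to reload configuration without interrupting traffic is to poll the file, parse it off to the side, and swap a single reference once the new version is valid. The Python sketch below shows that pattern; the `ConfigWatcher` class is hypothetical, JSON stands in for YAML because PyYAML is not in the standard library, and Melis may detect changes differently.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class ConfigWatcher:
    """Reload a config file when its content changes, keeping the last good version."""

    def __init__(self, path, parse):
        self.path = Path(path)
        self.parse = parse
        self.digest = None
        self.config = None
        self.poll()

    def poll(self) -> bool:
        """Return True if a new, valid config was loaded."""
        raw = self.path.read_bytes()
        digest = hashlib.sha256(raw).hexdigest()
        if digest == self.digest:
            return False
        try:
            new_config = self.parse(raw.decode("utf-8"))
        except ValueError:
            return False  # invalid edit: keep serving the previous config
        # One assignment swaps the reference; in-flight requests keep the old object.
        self.config, self.digest = new_config, digest
        return True


# Usage: a valid edit is picked up, a broken edit is ignored.
with tempfile.TemporaryDirectory() as tmp:
    routes = Path(tmp) / "routes.json"
    routes.write_text('{"provider": "openai"}')
    watcher = ConfigWatcher(routes, json.loads)
    routes.write_text('{"provider": "ollama"}')
    changed = watcher.poll()
    provider_after_swap = watcher.config["provider"]
    routes.write_text('{"provider": ')  # truncated, invalid JSON
    rejected = not watcher.poll()
    provider_after_bad_edit = watcher.config["provider"]
```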
[ App Python (FastAPI) ] ──┐
                           ├──► [ Melis AI Gateway Pod ] ──► [ OpenAI / Claude / Gemini ]
[ App Java (Quarkus)   ] ──┘                │
                                            ▼
                               [ Redis ] ◄──┴──► [ Prometheus / OTel ]
<2ms gateway overhead
<32Mi memory RSS
100% open source

Declarative routing

Swap providers without touching application code.

Move from a costly OpenAI setup to a local Ollama Llama 3 model by editing a single YAML file. Melis intercepts, translates and streams responses natively.

routes.yaml
yaml
routes:
  - path: "/v1/chat/completions"
    method: "POST"
    provider: "ollama"         # Swapped from "openai" instantly
    model: "llama3.2"          # Overrides the payload target model
    token_optimization:
      strategy: "adaptive_trimming"
      compress_above_tokens: 4096
Your app — unchanged
python
from openai import OpenAI

client = OpenAI(base_url="http://melis:9090/v1", api_key="sk-anything")

resp = client.chat.completions.create(
    model="gpt-4o",  # the route above overrides this with "llama3.2"
    messages=[{"role": "user", "content": "Hello from Melis!"}],
)
print(resp.choices[0].message.content)

Deploy

First-class citizen of modern cloud infra.

Run as a standalone Docker container, or deploy natively to Kubernetes with ConfigMaps, Secrets and Horizontal Pod Autoscaling.

Docker standalone
bash
docker run -d \
  --name melis-gateway \
  -p 9090:9090 \
  -v "$(pwd)/config.yaml:/app/config.yaml:ro" \
  -v "$(pwd)/routes.yaml:/app/routes.yaml:ro" \
  -e MELIS_SERVER_PORT=9090 \
  melis-gateway:latest
Kubernetes native
bash
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secret.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/hpa.yaml

Observability

Turn the third-party AI black box into a transparent stream of metrics.

Scrape /metrics from Prometheus, ship traces with OpenTelemetry and build Grafana dashboards your SRE team will actually trust.

  • Token volumetrics
    Real-time tracking of input vs output tokens per API key or client.
  • Network performance
    Isolated latency profiles: gateway overhead vs provider round-trip.
  • Resiliency lifecycles
    Live circuit breaker status, failure ratios and fallback activations.
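
The lifecycle behind a circuit breaker status can be sketched as a small state machine: closed while calls succeed, open after repeated failures, and half-open once a backoff delay has passed. The Python class below is a generic illustration with assumed names and thresholds; it does not describe how Melis encodes its states or metric values.

```python
import time


class CircuitBreaker:
    """Closed -> open after repeated failures; open -> half-open after a backoff delay."""

    def __init__(self, threshold=3, base_delay=1.0, max_delay=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.clock = clock
        self.failures = 0      # consecutive failures while closed
        self.trips = 0         # consecutive times the breaker has opened
        self.opened_at = None  # None means the breaker is closed

    def _delay(self) -> float:
        # Exponential backoff: base, 2*base, 4*base, ... capped at max_delay.
        return min(self.max_delay, self.base_delay * 2 ** (self.trips - 1))

    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self._delay():
            return "half_open"
        return "open"

    def allow(self) -> bool:
        return self.state() != "open"

    def record_success(self) -> None:
        self.failures = 0
        self.trips = 0
        self.opened_at = None

    def record_failure(self) -> None:
        if self.state() == "half_open":
            self._trip()  # the probe request failed: reopen with a longer delay
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self._trip()

    def _trip(self) -> None:
        self.trips += 1
        self.failures = 0
        self.opened_at = self.clock()
```

While the breaker is open, a gateway would typically route the request to a fallback provider instead of calling the failing one.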
curl /metrics
bash
# HELP melis_request_duration_seconds Gateway overhead per request
# TYPE melis_request_duration_seconds histogram
melis_request_duration_seconds_bucket{provider="openai",le="0.002"} 18421
melis_tokens_total{provider="anthropic",direction="input"}   1284912
melis_tokens_total{provider="anthropic",direction="output"}   421038
melis_circuit_breaker_state{provider="google_vertex_ai"} 0
melis_ratelimit_drops_total{client="tenant-a"} 12
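
As a quick way to work with this output outside Prometheus, the Python sketch below parses the sample lines shown above and derives an output-to-input token ratio. The parser is deliberately simplified (no timestamps, no escaped quotes in label values), and `parse_metrics` is an illustrative helper, not part of Melis.

```python
import re

SAMPLE = """\
# HELP melis_request_duration_seconds Gateway overhead per request
# TYPE melis_request_duration_seconds histogram
melis_request_duration_seconds_bucket{provider="openai",le="0.002"} 18421
melis_tokens_total{provider="anthropic",direction="input"}   1284912
melis_tokens_total{provider="anthropic",direction="output"}   421038
melis_circuit_breaker_state{provider="google_vertex_ai"} 0
melis_ratelimit_drops_total{client="tenant-a"} 12
"""

LINE = re.compile(r'^(\w+)(?:\{(.*)\})?\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Return (name, labels, value) for every sample line, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match:
            name, labels, value = match.groups()
            samples.append((name, dict(LABEL.findall(labels or "")), float(value)))
    return samples


samples = parse_metrics(SAMPLE)
tokens = {
    labels["direction"]: value
    for name, labels, value in samples
    if name == "melis_tokens_total" and labels.get("provider") == "anthropic"
}
ratio = tokens["output"] / tokens["input"]
print(f"anthropic output/input token ratio: {ratio:.3f}")
```

In a real deployment the same ratio is easier to compute with a PromQL query over `melis_tokens_total`; the script is only meant for ad hoc checks against a `curl` dump.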
Forever open source

Built in the open. Run it anywhere.

Melis is — and will always be — 100% open source. No paid tier, no vendor lock-in, no "open core" surprises. Fork it, deploy it, contribute back.

Ship your AI features behind a real gateway.

Drop Melis in front of any OpenAI SDK and gain routing, resiliency, observability and cost control — without rewriting a line of application code.