decisionshared last reviewed 2026-05-20

All LLM / vision / OCR traffic egresses via Aspire LLM Gateway only

Context

Before 2026-05-09, Aspire apps (Postiz, KO worker-ocr, Zac landing, agents) each held their own provider keys: Anthropic OAuth, cloud-first.ai LiteLLM, OpenAI SDK, etc. Result: scattered credentials, no central budget, no spend ledger, no ability to swap providers without per-app code changes, security incidents on any one Coolify host would expose multiple provider keys.

Detail

Options considered

Option	Pros	Cons
A — Single Aspire LiteLLM proxy at `llm.aspiredigital.group/v1`	One central budget, one set of upstream creds, OpenAI-compat clients drop in trivially, per-app virtual keys with budget caps	One more piece of infra to keep healthy
B — Keep per-app provider SDKs	Maximum flexibility	Credential sprawl, no central spend visibility, swap = code change
C — Use cloud-first.ai as-is (single vendor)	Already partially adopted	Single-vendor lock-in; doesn't cover Claude Max OAuth or ChatGPT Pro use

Decision

We chose: Aspire LLM Gateway (option A). Self-hosted LiteLLM proxy at https://llm.aspiredigital.group/v1, deployed on Coolify, unifying Claude Max OAuth + cloud-first.ai Qwen + (future) ChatGPT Pro behind one OpenAI-compatible endpoint.

Rationale

Per-app virtual keys (e.g., knowledge-os, postiz) with budget caps and audit trail.
Swap a provider behind an alias = config-only change, no code touch.
Spend ledger middleware enforces MONTHLY_BUDGET_AUD per app.
Master key + admin-only fix paths kept off application Coolify hosts.

Constraints we accepted

The gateway is a single point of failure: if llm.aspiredigital.group goes down, ALL downstream consumers stop. Mitigation: gateway runs on the Aspire Coolify host (already monitored).
No silent fallback to a different vendor if a model is down upstream — fail loudly.
New providers require a config change on the gateway, not in each app.

Revisit trigger

Gateway downtime exceeds 1 hour/month (SLA breach), OR a regulatory requirement forces per-app provider isolation (e.g., HIPAA-style separation).

Actions

[x] LiteLLM Phase 1 LIVE 2026-05-09 (gqo0jgnxdkdxkmipnqzx6ras on Coolify).
[x] Postiz migrated to gateway with virtual key sk-woK9Fvz... (2026-05-09).
[x] Knowledge OS worker-ocr migrated to gateway with virtual key alias knowledge-os (2026-05-17).
[ ] codex-shim sidecar for OpenAI gpt-* aliases (Phase 1b — currently DEFERRED).

🔗 Relationships

graph LR aspire_llm_gateway_only_egress["aspire-llm-gateway-only-egress"]:::self aspire_llm_gateway_only_egress --> aspire_llm_gateway["aspire-llm-gateway"] aspire_llm_gateway_only_egress --> knowledge_os_stage_1["knowledge-os-stage-1"] classDef self fill:#715EE3,color:#fff,stroke:#291F50;

Generated from the Knowledge OS markdown vault · diagrams via Mermaid · source of truth = .md