All LLM / vision / OCR traffic egresses via Aspire LLM Gateway only
Context
Before 2026-05-09, Aspire apps (Postiz, KO worker-ocr, Zac landing, agents) each held their own provider keys: Anthropic OAuth, cloud-first.ai LiteLLM, OpenAI SDK, etc. Result: scattered credentials, no central budget, no spend ledger, no ability to swap providers without per-app code changes, security incidents on any one Coolify host would expose multiple provider keys.
Detail
Options considered
| Option | Pros | Cons |
|---|---|---|
A โ Single Aspire LiteLLM proxy at llm.aspiredigital.group/v1 | One central budget, one set of upstream creds, OpenAI-compat clients drop in trivially, per-app virtual keys with budget caps | One more piece of infra to keep healthy |
| B โ Keep per-app provider SDKs | Maximum flexibility | Credential sprawl, no central spend visibility, swap = code change |
| C โ Use cloud-first.ai as-is (single vendor) | Already partially adopted | Single-vendor lock-in; doesn't cover Claude Max OAuth or ChatGPT Pro use |
Decision
We chose: Aspire LLM Gateway (option A). Self-hosted LiteLLM proxy at https://llm.aspiredigital.group/v1, deployed on Coolify, unifying Claude Max OAuth + cloud-first.ai Qwen + (future) ChatGPT Pro behind one OpenAI-compatible endpoint.
Rationale
- Per-app virtual keys (e.g.,
knowledge-os,postiz) with budget caps and audit trail. - Swap a provider behind an alias = config-only change, no code touch.
- Spend ledger middleware enforces
MONTHLY_BUDGET_AUDper app. - Master key + admin-only fix paths kept off application Coolify hosts.
Constraints we accepted
- The gateway is a single point of failure: if
llm.aspiredigital.groupgoes down, ALL downstream consumers stop. Mitigation: gateway runs on the Aspire Coolify host (already monitored). - No silent fallback to a different vendor if a model is down upstream โ fail loudly.
- New providers require a config change on the gateway, not in each app.
Revisit trigger
Gateway downtime exceeds 1 hour/month (SLA breach), OR a regulatory requirement forces per-app provider isolation (e.g., HIPAA-style separation).
Actions
- [x] LiteLLM Phase 1 LIVE 2026-05-09 (
gqo0jgnxdkdxkmipnqzx6rason Coolify). - [x] Postiz migrated to gateway with virtual key
sk-woK9Fvz...(2026-05-09). - [x] Knowledge OS worker-ocr migrated to gateway with virtual key alias
knowledge-os(2026-05-17). - [ ] codex-shim sidecar for OpenAI gpt-* aliases (Phase 1b โ currently DEFERRED).