Self-hosted deployment

Run Midcore services in your own infrastructure for air-gapped networks, data residency, or compliance. You operate the API, gate runner, and evidence ledger; clients (CLI, VS Code, Desktop) point to your instance.

When to self-host

  • Data must stay inside your network or region.
  • You need to use your own IdP and network policies.
  • You run in an air-gapped or otherwise restricted environment.
  • You have custom scaling or high-availability requirements.

Components

A typical self-hosted setup includes:

Component | Role
API / gateway | Handles auth, routing, and agent/gate requests
Gate runner | Executes gates and writes evidence
Evidence ledger | Append-only store for gate results
Optional: database and caches | State and performance; see deployment guide

Exact components and topology depend on the official deployment package or Helm chart. Refer to the deployment guide and release notes for your version.

Deployment checklist

  • Network — TLS for all endpoints; restrict egress/ingress as required.
  • Secrets — Store DB credentials, IdP client secrets, and API keys in a secret manager; never in images or Git.
  • Persistence — Evidence ledger and any DB need durable storage and backups.
  • Health — Configure health checks and readiness probes for API and gate runner.

Prerequisites

  • Kubernetes cluster (or supported runtime per the deployment guide).
  • Container registry access for Midcore images.
  • Identity provider (OIDC/SAML) if you use SSO.
  • TLS certificates for HTTPS.
  • Resource limits and storage for logs and evidence.

Config and secrets

Configure API URL, auth, and feature flags via environment variables or a config file. Store secrets (DB credentials, IdP client secrets, API keys) in a secret manager (e.g. Kubernetes Secrets, Vault) and inject them at runtime. Never bake secrets into images or commit them to Git. See Security and Environment variables.
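As a minimal sketch of runtime injection, the loader below reads settings from environment variables and fails fast when a secret is missing. The variable names (MIDCORE_API_URL, MIDCORE_DB_PASSWORD, MIDCORE_IDP_CLIENT_SECRET) and the Settings class are hypothetical illustrations, not names taken from a Midcore release; substitute the ones listed under Environment variables.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_url: str
    db_password: str
    idp_client_secret: str


# Hypothetical variable names, used for illustration only.
REQUIRED = ("MIDCORE_API_URL", "MIDCORE_DB_PASSWORD", "MIDCORE_IDP_CLIENT_SECRET")


def load_settings(env=None):
    """Build Settings from variables injected at runtime (for example by a secret manager)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        # Report the names only; never echo secret values into logs.
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return Settings(
        api_url=env["MIDCORE_API_URL"].rstrip("/"),
        db_password=env["MIDCORE_DB_PASSWORD"],
        idp_client_secret=env["MIDCORE_IDP_CLIENT_SECRET"],
    )
```

Passing a plain dict as env keeps the loader easy to test without touching the real process environment.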

Production Compose and registry images

docker-compose.prod.yaml pulls three application images built by GitHub Actions (.github/workflows/build-and-push.yml): ghcr.io/<owner>/<repo>/api, .../web, and .../auth. Set GHCR_IMAGE_SLUG in your environment (e.g. GHCR_IMAGE_SLUG=myorg/maestro-midcore) so that it matches the lowercase GitHub owner/repo that publishes the packages. Set IMAGE_TAG to the digest or release tag you want to deploy.
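The following sketch shows how the two variables could combine into image references, and catches the common mistake of a mixed-case slug before a pull fails. It assumes the Compose file joins the pieces as ghcr.io/<slug>/<service>:<tag>; the image_refs helper is hypothetical, and a digest reference may be written differently in your Compose file.

```python
import os

REGISTRY = "ghcr.io"
SERVICES = ("api", "web", "auth")


def image_refs(env=None):
    """Return the image reference for each service, keyed by service name."""
    env = os.environ if env is None else env
    slug = env.get("GHCR_IMAGE_SLUG", "")
    tag = env.get("IMAGE_TAG", "")
    owner, _, repo = slug.partition("/")
    # The slug must be exactly owner/repo and entirely lowercase.
    if not owner or not repo or "/" in repo or slug != slug.lower():
        raise ValueError("GHCR_IMAGE_SLUG must be a lowercase owner/repo")
    if not tag:
        raise ValueError("IMAGE_TAG is not set")
    # Assumption: images are addressed as <registry>/<slug>/<service>:<tag>.
    return {svc: f"{REGISTRY}/{slug}/{svc}:{tag}" for svc in SERVICES}
```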

Docker: web frontend and API

In a Docker Compose setup, the web frontend runs as a container alongside postgres, redis, and the API. The web image is built from the app and docs in apps/web. To apply any frontend or documentation changes to the running container, rebuild the web image and restart the web (and API) services—for example:

  • docker compose build web then docker compose --profile with-api up -d web api, or
  • Using the repo Makefile: make docker-rebuild-web.

After rebuilding, the web container serves the latest docs and UI. Ensure the API URL and auth endpoints are set correctly for the environment (see Environment variables).

Deployment guide

For step-by-step install, scaling, and upgrade instructions, use the deployment guide or Helm values that ship with your Midcore release. They define the exact resources and settings for a supported setup.

Automation daemon (scheduler · workers · watchdog)

Every API pod runs an AutomationDaemon that drives scheduled and webhook-triggered automations. Because the daemon runs server-side, this work continues while the user's laptop is closed. Tune its behavior with these environment variables on the API deployment:

Env var | Default | Effect
MIDCORE_AUTOMATION_DAEMON | 1 (enabled) | Set to "0" to disable the daemon on this pod (useful for test deploys).
MIDCORE_AUTOMATION_TICK_SECONDS | 30 | Scheduler tick period: how often automation_triggers.next_fire_at is scanned.
MIDCORE_AUTOMATION_WORKERS | 4 | Concurrent worker coroutines that claim queued runs via SELECT … FOR UPDATE SKIP LOCKED.
MIDCORE_AUTOMATION_HEARTBEAT_STALE | 180 | Seconds without heartbeat before the watchdog re-queues a run.

Horizontal scaling is safe: multiple pods can run the daemon. The scheduler tick is guarded by a Postgres advisory lock, so only one pod owns trigger evaluation at a time, while workers on every pod claim queued runs via SKIP LOCKED. To have only a subset of replicas execute automations, set MIDCORE_AUTOMATION_DAEMON=0 on the others (for example, on read-replica pods).
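To make the defaults concrete, here is a small sketch that parses the four variables with the default values listed above and applies the watchdog's staleness rule. The DaemonSettings class and both helper functions are hypothetical. Treating any value other than "0" as enabled, and using a strict greater-than comparison for staleness, are assumptions rather than documented behavior.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DaemonSettings:
    enabled: bool
    tick_seconds: int
    workers: int
    heartbeat_stale: int


def daemon_settings(env=None):
    """Read the automation daemon variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return DaemonSettings(
        # Assumption: only the literal "0" disables the daemon.
        enabled=env.get("MIDCORE_AUTOMATION_DAEMON", "1") != "0",
        tick_seconds=int(env.get("MIDCORE_AUTOMATION_TICK_SECONDS", "30")),
        workers=int(env.get("MIDCORE_AUTOMATION_WORKERS", "4")),
        heartbeat_stale=int(env.get("MIDCORE_AUTOMATION_HEARTBEAT_STALE", "180")),
    )


def is_stale(last_heartbeat, now, settings):
    """True when a run has exceeded the stale threshold (both times in seconds)."""
    # Assumption: a run exactly at the threshold is not yet considered stale.
    return (now - last_heartbeat) > settings.heartbeat_stale
```

With the defaults, a run whose last heartbeat is more than 180 seconds old would be a candidate for re-queueing, which is six scheduler ticks at 30 seconds each.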

LLM provider keys (Anthropic / OpenAI / Gemini / xAI / DeepSeek)

Cloud LLM provider keys live in the Azure Key Vault maestro-beta-kv-tjzct7wj and are mounted into the cluster via the Kubernetes Secret maestro-secrets. The model catalog (/api/v1/autonomy/models) reads them at startup and again on the daily APScheduler refresh.

  • Rotation: az keyvault secret set --vault-name maestro-beta-kv-tjzct7wj --name <provider>-api-key --value <new>, then kubectl rollout restart deploy/maestro-api -n maestro.
  • Windows kubectl: use kubelogin convert-kubeconfig -l azurecli after az login; without this the kubeconfig stays on a stale device-code token.
  • BYOM: tenants may store their own provider keys via the Studio settings UI; tenant keys land in the encrypted billing.api_keys table and override the cluster default for that tenant only.

Integration credentials (encrypted at rest)

Slack / Notion / HubSpot / Linear / X / LinkedIn / Gmail / Outlook OAuth tokens live in automation.agent_credentials, encrypted with Fernet via the EncryptedText SQLAlchemy TypeDecorator. The encryption key comes from MIDCORE_FIELD_ENCRYPTION_KEY (set in the same Kubernetes Secret as the LLM provider keys). Cross-tenant access is rejected at the resolver, not the route handler: every adapter call resolves the credential through CredentialsResolver.resolve(id, tenant_id), which validates the tenant boundary in one SQL query.
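The tenant check described above can be illustrated with a query that filters on both the credential id and the tenant id, so a credential owned by another tenant is indistinguishable from one that does not exist. This is a sketch, not the actual CredentialsResolver. In-memory SQLite stands in for Postgres, the table columns are hypothetical, and Fernet decryption is omitted (the token column holds placeholder ciphertext).

```python
import sqlite3


class CredentialNotFound(LookupError):
    """Raised when no credential matches both the id and the tenant."""


def resolve(conn, credential_id, tenant_id):
    """Fetch a credential only if it belongs to the given tenant."""
    # One query enforces the boundary: the id and the tenant_id must both match.
    row = conn.execute(
        "SELECT token FROM agent_credentials WHERE id = ? AND tenant_id = ?",
        (credential_id, tenant_id),
    ).fetchone()
    if row is None:
        raise CredentialNotFound(credential_id)
    return row[0]


def demo_connection():
    """Create an in-memory table with one credential owned by tenant-a."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE agent_credentials ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, token TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO agent_credentials VALUES (1, 'tenant-a', 'ciphertext-a')")
    return conn
```

Calling resolve(conn, 1, "tenant-a") returns the stored value, while resolve(conn, 1, "tenant-b") raises CredentialNotFound. Because the check lives in the query, a route handler that forgets to verify ownership still cannot read another tenant's credential.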

See Reference: 11 SaaS integrations for the per-adapter credential schema and Automation setup for end-to-end connect flows.

Grounded research backends

To unlock biographical / factual research (instead of having the agent refuse), configure at least one of these env vars on the API deployment:

Env var | Provider | Notes
BRAVE_SEARCH_API_KEY | Brave Search | Recommended: fast, well-cited, generous free tier.
TAVILY_API_KEY | Tavily | Good fallback. Search-depth: advanced.
SEARXNG_INSTANCE_URL | SearXNG | Self-hosted, no key. Best for air-gapped deployments.
MIDCORE_ALLOW_DDG_HTML | DuckDuckGo HTML scrape | Dev fallback only. Set to "1" to enable.

Without any of these configured, biographical and factual prompts are refused rather than answered from training-data memory. See Reference: GroundedResearchEngine for the full anti-fabrication contract.
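A quick way to check which backend a given environment would enable is sketched below. The pick_backend helper and the provider labels are hypothetical, and the preference order (table order, with the DuckDuckGo scrape last) is an assumption. The actual engine may choose or combine backends differently.

```python
import os

# (env var, provider label) pairs in the order the table lists them.
# Assumption: this order is also the preference order.
BACKENDS = (
    ("BRAVE_SEARCH_API_KEY", "brave"),
    ("TAVILY_API_KEY", "tavily"),
    ("SEARXNG_INSTANCE_URL", "searxng"),
)


def pick_backend(env=None):
    """Return the first configured backend label, or None if research would be refused."""
    env = os.environ if env is None else env
    for var, provider in BACKENDS:
        if env.get(var):
            return provider
    # The DuckDuckGo HTML scrape is opt-in and only enabled by the literal "1".
    if env.get("MIDCORE_ALLOW_DDG_HTML") == "1":
        return "duckduckgo-html"
    return None
```

A None result corresponds to the refusal case: no backend is configured, so biographical and factual prompts are not answered.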

Signed desktop release pipeline

Windows installers ship signed via Azure Trusted Signing. The signing workstation retrieves the certificate at sign time; the cluster only needs to know how to flip the published version:

  • Re-sign + publish: pwsh ./scripts/build-sign-windows-onprem.ps1 (signing workstation).
  • Upload + flip: node ./scripts/publish-release.js --signed-msi <path> then kubectl set env deploy/maestro-api MIDCORE_CURRENT_VERSION=<version> -n maestro.
  • Verify: curl https://midcore.ai/api/v1/app/downloads — the new version + SHA-256 should be visible immediately after the env flip.
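For the verify step, it can help to hash the signed installer locally and compare the result with the SHA-256 reported by the downloads endpoint. The sketch below covers only the local hashing and comparison. The response format of /api/v1/app/downloads is not described here, so extracting the published hash is left to the reader, and both helper names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large installers are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare the local hash with a published hex digest, ignoring case and whitespace."""
    return sha256_of(path) == published_hex.strip().lower()
```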

Offline and BYOM add-ons

Two recurring add-ons can be activated on top of a Cloud subscription and govern what your self-hosted or hybrid deployment is entitled to do:

  • Offline stack add-on — required to run the desktop IDE and self-hosted backend in offline-capable mode while keeping a Cloud subscription active. Enable it from the Billing page; entitlements sync to your tenant within a few minutes.
  • BYOM add-on — lets your tenant route LLM calls through your own provider keys (OpenAI, Anthropic, Ollama, or any compatible endpoint). The CLI and Desktop pick up BYOM credentials from your tenant once the add-on is active.

Both add-ons are managed centrally in Billing and surface as feature flags on every workspace; no extra deployment step is required to switch them on once entitlements are present.

Security · Authentication · Environment variables