Operations

This section covers everything that happens after you've deployed OrbitalReg and put real traffic through it.

Backup + disaster recovery

The two questions that should have answers before a release goes live:

  • Are my backups working? Backup verification documents the weekly-restore-into-ephemeral-cluster job, the Prometheus alert it fires when stale, and the admin UI card that surfaces the last-good-restore.
  • What do I do when the cluster is gone? Disaster recovery is the runbook. Three scenarios — DB loss, S3 loss, total loss — with copy-pasteable commands.

Postgres on CloudNativePG describes the recommended production setup: HA replicas, point-in-time recovery (PITR), and Barman-managed S3 backups. The chart also documents migrating from a stand-alone Postgres install.

Observability

Observability covers metrics, logs, and traces:

  • Prometheus metrics over /metrics (every chi handler is wrapped)
  • ServiceMonitor CRD shipped in the chart
  • JSON-structured logs to stdout (Loki + Promtail-friendly)
  • Optional OpenTelemetry traces via OTLP/HTTP (off by default)

Monitoring is the metrics deep-dive: the full metric catalogue, the Grafana import recipe for the bundled overview + deep-dive dashboards, three Alertmanager routing examples (PagerDuty / Slack / email-only) for the bundled alert + recording rules, sample PromQL for the five most common triage questions, and a section-per-alert runbook.

Versioning policy

Versioning policy covers Calendar Versioning (YYYY.MAJOR.MINOR), the daily update-channel poll, and the 18-month support window per year-major. The Admin Overview page shows the currently-installed version, when it was built, and whether an update is available.
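
As a rough sketch of how a YYYY.MAJOR.MINOR string could be parsed and compared to decide whether an update is available: the type `calVer`, the function names, and the version numbers below are illustrative assumptions, not part of OrbitalReg.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// calVer holds the three numeric components of a YYYY.MAJOR.MINOR version.
type calVer struct{ Year, Major, Minor int }

// parseCalVer splits on dots and requires exactly three non-negative integers.
func parseCalVer(s string) (calVer, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return calVer{}, fmt.Errorf("calver %q: want YYYY.MAJOR.MINOR", s)
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil || v < 0 {
			return calVer{}, fmt.Errorf("calver %q: bad component %q", s, p)
		}
		n[i] = v
	}
	return calVer{Year: n[0], Major: n[1], Minor: n[2]}, nil
}

// newerThan compares year first, then major, then minor.
func (a calVer) newerThan(b calVer) bool {
	if a.Year != b.Year {
		return a.Year > b.Year
	}
	if a.Major != b.Major {
		return a.Major > b.Major
	}
	return a.Minor > b.Minor
}

func main() {
	installed, err := parseCalVer("2025.1.4") // hypothetical installed version
	if err != nil {
		panic(err)
	}
	latest, err := parseCalVer("2025.2.0") // hypothetical channel answer
	if err != nil {
		panic(err)
	}
	fmt.Println("update available:", latest.newerThan(installed))
}
```

Comparing numerically rather than as strings matters once a component reaches two digits: "2025.10.0" sorts before "2025.9.0" lexically but is the newer release.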

Release pipeline

Release pipeline describes how every CalVer tag becomes three multi-arch container images on ghcr.io, signed with cosign-keyless OIDC and attested with a CycloneDX SBOM. The workflow runs entirely from GitHub Actions identity — no long-lived OrbitalReg key is involved — so a downstream auditor can verify provenance without a vendor handshake.

Demo seeder

Demo seeder (orbital seed) documents the test-user matrix the seeder writes alongside the demo projects, repos, and trust policies. Five graduated access profiles (alice / bob / carol / dave / eve) on the default tier let an operator (or an integration test) walk the per-project RBAC matrix without an auth roundtrip; full-plus adds three richer profiles (token-only CI bot, SAML-asserted maintainer, second org admin) for sales walkthroughs.

Air-gapped mode

Air-gapped operations is the runbook for installs that can't reach the public internet:

  • Egress is blocked by default on fresh installs
  • Each integration (webhooks, OSV, Sigstore Rekor, telemetry, OTel) has its own opt-in toggle under Admin → System → Egress allowlist
  • Documentation, the chart, and container images all ship as air-gap-friendly bundles
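
The default-deny behaviour with per-integration opt-in can be modelled in a few lines. This is only a sketch of the idea: the type `egressPolicy`, its methods, and the integration identifiers are hypothetical and do not reflect OrbitalReg's actual configuration schema.

```go
package main

import "fmt"

// integration names one outbound dependency; the string values are hypothetical.
type integration string

const (
	webhooks  integration = "webhooks"
	osv       integration = "osv"
	rekor     integration = "sigstore-rekor"
	telemetry integration = "telemetry"
	otel      integration = "otel"
)

// egressPolicy denies everything that has not been explicitly allowed.
type egressPolicy struct {
	allowed map[integration]bool
}

// newEgressPolicy returns the fresh-install state: nothing is allowed.
func newEgressPolicy() *egressPolicy {
	return &egressPolicy{allowed: map[integration]bool{}}
}

// allow opts a single integration in.
func (p *egressPolicy) allow(i integration) { p.allowed[i] = true }

// permits reports whether outbound calls for i may proceed.
// A missing map entry reads as false, which gives default-deny for free.
func (p *egressPolicy) permits(i integration) bool { return p.allowed[i] }

func main() {
	p := newEgressPolicy()
	fmt.Println(p.permits(osv)) // false

	p.allow(osv)
	fmt.Println(p.permits(osv), p.permits(rekor)) // true false
}
```

Keeping the toggles independent means enabling vulnerability lookups does not silently open telemetry or tracing egress.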

Day-2 checklist

A short list of things that should be enabled before you go live:

| Item | Doc |
| --- | --- |
| Postgres backups verified end-to-end | Backup verification |
| At least one tested DR drill in the last 90 days | Disaster recovery |
| ServiceMonitor scraping the API | Observability |
| Alert rules with runbook links | Observability |
| TLS via cert-manager, with a renewal monitor | (operator's existing dashboard) |
| Air-gapped egress allowlist matches your security posture | Air-gapped operations |
| At least one project-owner exists per project | Core concepts |
| Retention policies on long-tail repos | Core concepts |
| Sigstore trust policies pinned for prod-deploy repos | Core concepts |

Released under the Apache-2.0 License.