
Troubleshooting

The operator reports failures through status.conditions[type=Synced]. Each non-True state has a reason code; this page maps each to a resolution path.

Canonical reference: docs/operator/troubleshooting.md.

Inspecting status

bash
kubectl get orbitalregproject acme -o jsonpath='{.status.conditions}' | jq

Or print a compact summary of the Synced condition's status and reason:

bash
kubectl get orbitalregproject acme \
  -o custom-columns='NAME:.metadata.name,SYNCED:.status.conditions[?(@.type=="Synced")].status,REASON:.status.conditions[?(@.type=="Synced")].reason'

Reason-code index

RefUnresolved

A spec.projectRef (or repositoryRef, serviceAccountRef) doesn't match a sibling CR's status.<id>.

Common causes:

  • The parent CR doesn't exist yet (race during initial GitOps apply)
  • The parent CR exists but hasn't reconciled yet (its status.<id> is still empty)
  • A typo in spec.<…>Ref.name

Diagnosis:

bash
kubectl get <parent-kind> <parent-name> \
  -o jsonpath='{.status.projectID}{"\n"}'

If empty, wait ~5s and re-check. Persistent emptiness means the parent CR is also failing — chase that one first.
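The wait-and-re-check step can be scripted. The following is a minimal sketch, not part of the operator: `wait_for_value` is a hypothetical helper name, and the attempt count and delay in the example are assumptions to tune for your cluster.

```shell
# Run a command repeatedly until it prints a non-empty value.
# Usage: wait_for_value <attempts> <delay-seconds> <command...>
# Returns 0 and prints the value on success, 1 if every attempt was empty.
wait_for_value() {
  local attempts="$1" delay="$2" value="" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # A failing command is treated the same as an empty result.
    value="$("$@" 2>/dev/null)" || value=""
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (same placeholders as the diagnosis command):
#   wait_for_value 6 5 kubectl get <parent-kind> <parent-name> \
#     -o jsonpath='{.status.projectID}'
```

If the helper returns non-zero after all attempts, treat the parent CR as failing and inspect its own Synced condition.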

CreateFailed

The upstream POST returned non-2xx. The full error body lives in status.conditions[type=Synced].message.

Common causes:

  • 403 Forbidden — the API token in the credentials Secret doesn't have admin scope. Mint a new admin token.
  • 409 Conflict — a row with the same deterministic key ((project_id, name), (repo_id, name), etc.) already exists upstream and was created by a different mechanism (UI, Terraform). Either delete the upstream row or change the CR's name to adopt it.
  • 422 Unprocessable Entity — a field failed server-side validation that the CRD schema didn't catch. Review the message body.
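The mapping above can be turned into a quick triage helper. This is a rough sketch: `suggest_create_fix` is a hypothetical name, and it assumes the upstream HTTP status code appears somewhere in the condition message, which this page does not guarantee.

```shell
# Suggest a next step from the text of a CreateFailed message.
# Assumption: the HTTP status code appears in the message text.
# This is a plain substring match, so an unrelated "403" would also match.
suggest_create_fix() {
  case "$1" in
    *403*) echo "403: mint a new admin token and update the credentials Secret" ;;
    *409*) echo "409: delete the conflicting upstream row or change the CR's name" ;;
    *422*) echo "422: fix the spec field named in the message body" ;;
    *)     echo "unknown: read the full message body" ;;
  esac
}

# Example: feed it the live message (resource name as used on this page).
#   suggest_create_fix "$(kubectl get orbitalregproject acme \
#     -o jsonpath='{.status.conditions[?(@.type=="Synced")].message}')"
```

The helper only narrows down where to look; the message body remains the authoritative source for the failure.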

UpdateFailed

The upstream PATCH returned non-2xx. Same triage as CreateFailed, but you'll often see this when:

  • A field was made immutable upstream (a major-version migration)
  • The row's lock / version column conflicts with another writer

Reconciling

Transient. Lasts at most one reconcile interval. Persistent Reconciling means the controller is being killed mid-reconcile — inspect kubectl describe pod orbitalreg-operator-… for OOM kills or readiness-probe failures.
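As a complement to kubectl describe, the restart count and last termination reason can be read directly from the Pod status. A small sketch, assuming you already know the operator Pod's name and namespace; `container_restart_summary` is a hypothetical helper name.

```shell
# Print name, restart count and last termination reason for each container.
# A reason of "OOMKilled" means the container exceeded its memory limit.
container_restart_summary() {
  # $1 = Pod name, $2 = namespace
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}

# Example:
#   container_restart_summary <operator-pod> <operator-namespace>
```

A rising restart count with an empty reason column points away from OOM kills and towards probe failures, which show up in the Pod's events instead.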

SecretWriteFailed (Token controller only)

The plaintext token was minted upstream but the Secret materialisation failed. The controller rolls back: it revokes the freshly-minted upstream row so the cluster doesn't end up with a credential the consumer can't read.

Common causes:

  • ServiceAccount RBAC missing secrets/create in the target namespace
  • securityContext.runAsNonRoot mismatch on the namespace's PodSecurity standard
  • The Secret name conflicts with a pre-existing one not owned by the CR
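The RBAC cause can be checked without waiting for another failed reconcile by impersonating the controller's ServiceAccount. A minimal sketch; `can_create_secrets` is a hypothetical helper, and the ServiceAccount name and namespaces are values you must supply.

```shell
# Ask the API server whether a ServiceAccount may create Secrets in a namespace.
# kubectl prints "yes" or "no" and exits non-zero when the answer is no.
can_create_secrets() {
  # $1 = ServiceAccount namespace, $2 = ServiceAccount name, $3 = target namespace
  kubectl auth can-i create secrets \
    --as="system:serviceaccount:$1:$2" \
    --namespace "$3"
}

# Example:
#   can_create_secrets <operator-namespace> <operator-serviceaccount> <target-namespace>
```

Running this requires that your own user is allowed to impersonate ServiceAccounts.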

CredsInvalid

The configured API token (in credentials.existingSecret) is rejected on every call. Mint a new admin token and rotate the Secret. The controller picks up the new value at the next reconcile.

Common operational tripwires

Webhook subscription's HMAC secret won't update

The reconciler only re-pushes the secret when the referenced Kubernetes Secret's metadata.resourceVersion advances. If you edited the Secret data in place (which doesn't always advance resourceVersion), force a re-push by annotating the Secret:

bash
kubectl annotate secret <name> -n <ns> \
  orbitalreg.io/force-resync="$(date +%s)" --overwrite

Token rotated but Pod still has the old value

envFrom: secretRef: doesn't re-read the Secret on update; environment variables are fixed when the container starts, so Pods see the new value only on restart. Either restart the Pod manually or run a controller such as Reloader to do it for you.
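For a manual restart, a rolling restart of the owning workload is usually gentler than deleting Pods one by one. The sketch below assumes the consumer is a Deployment; `restart_consumer` is a hypothetical helper name and the 120s timeout is an arbitrary choice.

```shell
# Trigger a rolling restart and wait for the new Pods to become ready.
restart_consumer() {
  # $1 = Deployment name, $2 = namespace
  kubectl rollout restart "deployment/$1" -n "$2" &&
    kubectl rollout status "deployment/$1" -n "$2" --timeout=120s
}

# Example:
#   restart_consumer <deployment-name> <namespace>
```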

CR stuck deleting

The finalizer hasn't fired (the controller is down or can't reach the API). Two options:

  1. Recommended — bring the controller back, let it finalize cleanly.
  2. Last resort — patch the finalizer off:
    bash
    kubectl patch orbitalregproject acme \
      -p '{"metadata":{"finalizers":[]}}' --type=merge
    This leaves the upstream row orphaned.
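Before removing a finalizer by hand, it can help to confirm that the object is actually waiting on one. A minimal sketch; `show_deletion_state` is a hypothetical helper name.

```shell
# Print the deletion timestamp (line 1) and remaining finalizers (line 2).
# A non-empty first line means deletion was requested and is blocked
# by whatever finalizers appear on the second line.
show_deletion_state() {
  # $1 = resource kind, $2 = resource name
  kubectl get "$1" "$2" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
}

# Example:
#   show_deletion_state orbitalregproject acme
```

If the first line is empty, the CR is not being deleted at all and the finalizer is not the problem.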

Reconcile loop in logs but no spec change

A drift detector is returning false positives. The most common cause is the JSONB round-trip on retention rules: run kubectl describe orbitalregretentionpolicy <name> and check that the rules block in the CR matches the rules block in the upstream API verbatim (whitespace included). The retention reconciler normalises whitespace-only rule drift away, but novel JSONB serialisation differences may still trip it.
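To tell whitespace-only drift apart from a real serialisation difference, compare the two rules blocks with whitespace stripped. This is a crude sketch, not the reconciler's actual normalisation: `same_ignoring_whitespace` is a hypothetical helper, and it also ignores whitespace inside string values.

```shell
# Succeed when two documents are identical once all whitespace is removed.
same_ignoring_whitespace() {
  # $1 = rules block from the CR, $2 = rules block from the upstream API
  [ "$(printf '%s' "$1" | tr -d '[:space:]')" = "$(printf '%s' "$2" | tr -d '[:space:]')" ]
}

# Example:
#   same_ignoring_whitespace "$cr_rules" "$upstream_rules" \
#     && echo "whitespace-only difference" \
#     || echo "real difference: compare key order and value formatting"
```

If the blocks still differ, key reordering is a likely suspect, since a JSONB round-trip may not preserve key order. Piping both sides through `jq -S .` sorts keys and makes a plain diff meaningful.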

Released under the Apache-2.0 License.