System Upgrades and Bootstrap
Relevant source files
- .taskfiles/Flux/Taskfile.yaml
- .taskfiles/bootstrap/Taskfile.yaml
- bootstrap/helmfile/apps.yaml
- kubernetes/apps/actions-runner-system/kustomization.yaml
- kubernetes/apps/actions-runner-system/namespace.yaml
- kubernetes/apps/cert-manager/kustomization.yaml
- kubernetes/apps/kube-system/kustomization.yaml
- kubernetes/apps/network/envoy-gateway/app/envoy.yaml
- kubernetes/apps/network/envoy-gateway/app/helmrelease.yaml
- kubernetes/apps/network/envoy-gateway/app/kustomization.yaml
- kubernetes/apps/network/envoy-gateway/app/scaledobject.yaml
- kubernetes/apps/network/kustomization.yaml
- kubernetes/apps/system-upgrade/kustomization.yaml
- kubernetes/apps/system-upgrade/tuppr/app/helmrelease.yaml
- kubernetes/apps/system-upgrade/tuppr/app/kustomization.yaml
- kubernetes/apps/system-upgrade/tuppr/app/ocirepository.yaml
- kubernetes/apps/system-upgrade/tuppr/ks.yaml
- kubernetes/apps/system-upgrade/tuppr/upgrades/kubernetesupgrade.yaml
- kubernetes/apps/system-upgrade/tuppr/upgrades/kustomization.yaml
- kubernetes/apps/system-upgrade/tuppr/upgrades/talosupgrade.yaml
- kubernetes/apps/system-upgrade/versions.env
- scripts/bootstrap-cluster.sh
- scripts/render-machine-config.sh
This page details the lifecycle management of the cluster, from the initial provisioning of Talos Linux and Kubernetes to the automated upgrade orchestration of the underlying system components. It focuses on the system-upgrade namespace and the automated bootstrap sequence that hands off control to Flux CD.
Bootstrap Process
The bootstrap process is a multi-stage sequence designed to bring a bare Talos Linux node to a fully functional GitOps-managed Kubernetes cluster. This process is orchestrated primarily through the bootstrap-cluster.sh script and a dedicated Taskfile.
Phase 1: Talos Provisioning
The initial phase involves applying machine configurations to Talos nodes. The apply_talos_config function in scripts/bootstrap-cluster.sh renders Jinja2 templates (e.g., controlplane.yaml.j2) into machine configurations using render-machine-config.sh and applies them via talosctl apply-config (scripts/bootstrap-cluster.sh10-62). Once nodes are configured, bootstrap_talos triggers the initial cluster formation on a controller node (scripts/bootstrap-cluster.sh65-80).
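The per-node apply step can be sketched as a dry run. This is a minimal illustration, not the script's actual code: the function name plan_talos_apply, the rendered/ output path, and the example addresses are hypothetical, and the real apply_talos_config may pipe the rendered config instead of writing a file. Only the talosctl apply-config subcommand and its --nodes and --file flags are taken as given.

```shell
# Dry-run sketch: print the talosctl command that would be run for each node.
# Assumption: one rendered config file per node under a hypothetical rendered/ directory.
plan_talos_apply() {
  local template="$1"; shift
  local node
  for node in "$@"; do
    # Printing instead of executing keeps the sketch safe to run anywhere.
    echo "talosctl apply-config --nodes ${node} --file rendered/${node}.yaml  # rendered from ${template}"
  done
}

# Example invocation with documentation-range addresses (placeholders).
plan_talos_apply controlplane.yaml.j2 192.0.2.10 192.0.2.11
```

Printing the plan first makes it easy to review which nodes would be touched before anything is applied.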
Phase 2: CRD and Base Resource Installation
Before Flux can take over, certain Custom Resource Definitions (CRDs) and fundamental resources must exist.
- CRD Pre-installation: The apply_crds function uses helmfile to template OCI-based charts and applies only the CustomResourceDefinition kinds using Server-Side Apply scripts/bootstrap-cluster.sh115-132
- Core Resources: Fundamental namespaces and secrets are applied via apply_resources using the template at bootstrap/resources.yaml.j2 scripts/bootstrap-cluster.sh135-154
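The "apply only the CustomResourceDefinition kinds" step amounts to filtering a multi-document YAML stream. The sketch below does this with awk as a stand-in for whatever filter the script actually uses. The function name filter_crds is hypothetical, and it assumes each document carries a top-level kind: line and that documents are separated by --- lines.

```shell
# Keep only the YAML documents whose top-level kind is CustomResourceDefinition.
filter_crds() {
  awk '
    function emit_doc() {
      if (is_crd) printf "---\n%s", doc
      doc = ""; is_crd = 0
    }
    /^---[ \t]*$/ { emit_doc(); next }                            # document boundary
    /^kind:[ \t]*CustomResourceDefinition[ \t]*$/ { is_crd = 1 }  # top-level kind only
    { doc = doc $0 "\n" }
    END { emit_doc() }
  '
}

# Intended use (assumption; the helmfile arguments are elided here):
#   helmfile ... template | filter_crds | kubectl apply --server-side --force-conflicts -f -
```

Matching kind: only at column zero avoids false positives from nested fields that happen to be named kind inside a CRD schema.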
Phase 3: Flux Handover
The final bootstrap stage involves installing the Flux controllers and the initial GitOps configuration.
- Prometheus Operator CRDs: Applied manually to ensure observability primitives exist [.taskfiles/Flux/Taskfile.yaml15-18](https://github.com/chaijunkin/home-ops/blob/b5f8d898/.taskfiles/Flux/Taskfile.yaml#L15-L18)
- Flux Installation: The bootstrap task applies the Flux kustomization and creates the sops-age secret for decrypting Git-stored secrets .taskfiles/Flux/Taskfile.yaml19-21
- Cluster Sync: Once Flux is running, it reconciles the cluster-apps Kustomization to deploy the rest of the repository .taskfiles/Flux/Taskfile.yaml57
Bootstrap Logic Flow
The following diagram illustrates the execution flow within scripts/bootstrap-cluster.sh.
Bootstrap Execution Flow
[Flowchart Diagram]
Sources: scripts/bootstrap-cluster.sh174-197, .taskfiles/Flux/Taskfile.yaml12-26
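As a rough illustration of the phase ordering, the sketch below chains stub versions of the four functions named on this page and stops at the first failure. The stubs, the run_bootstrap_sketch name, and the exact ordering are assumptions based on the order in which the phases are described here. The real main routine may include additional steps.

```shell
# Stubs standing in for the real functions in scripts/bootstrap-cluster.sh.
apply_talos_config() { echo "apply_talos_config"; }
bootstrap_talos()    { echo "bootstrap_talos"; }
apply_crds()         { echo "apply_crds"; }
apply_resources()    { echo "apply_resources"; }

# Run the phases in order and abort on the first failure.
run_bootstrap_sketch() {
  local step
  for step in apply_talos_config bootstrap_talos apply_crds apply_resources; do
    # A failed phase stops the run so later phases never act on a half-built cluster.
    "$step" || { echo "phase failed: $step" >&2; return 1; }
  done
}
```

Failing fast matters here because each phase depends on the previous one: CRDs cannot be applied before the API server exists, and Flux cannot reconcile before its CRDs and secrets are present.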
System Upgrades with TUPPR
System upgrades for Talos Linux and Kubernetes are managed by TUPPR (Talos/Kubernetes Upgrade Controller), located in the system-upgrade namespace kubernetes/apps/system-upgrade/kustomization.yaml4-11
Version Pinning
Versions for the core system components are centralized in a versions.env file. This allows Renovate to track and update versions automatically using specific datasources kubernetes/apps/system-upgrade/versions.env1-5
| Variable | Renovate Datasource | Target |
|---|---|---|
| KUBERNETES_VERSION | docker:ghcr.io/siderolabs/kubelet | Kubernetes Binaries |
| TALOS_VERSION | docker:ghcr.io/siderolabs/installer | Talos OS Image |
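A small helper for reading a pinned value out of such a file might look like the following. The read_pinned_version name is hypothetical, and the file layout shown in the comments (a Renovate annotation comment above each KEY=value line, with placeholder versions) is an assumption about how versions.env is structured.

```shell
# Assumed layout of versions.env (versions are placeholders, not real pins):
#   # renovate: datasource=docker depName=ghcr.io/siderolabs/kubelet
#   KUBERNETES_VERSION=v0.0.0-example
#   # renovate: datasource=docker depName=ghcr.io/siderolabs/installer
#   TALOS_VERSION=v0.0.1-example

# Print the value of KEY from FILE; comment lines never match because they start with "#".
read_pinned_version() {
  local file="$1" key="$2"
  sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Usage: read_pinned_version kubernetes/apps/system-upgrade/versions.env TALOS_VERSION
```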
Upgrade Controllers
TUPPR utilizes two primary Custom Resources to manage the rollout of updates:
- TalosUpgrade: Manages the Talos OS version. It includes a rebootMode (e.g., powercycle) and health checks to ensure the cluster is stable before proceeding. For example, it checks that volsync is not currently synchronizing data kubernetes/apps/system-upgrade/tuppr/upgrades/talosupgrade.yaml1-16
- KubernetesUpgrade: Manages the Kubernetes control plane and worker versions, ensuring they match the pinned KUBERNETES_VERSION kubernetes/apps/system-upgrade/tuppr/upgrades/kubernetesupgrade.yaml1-14
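The idea of gating an upgrade on health checks can be illustrated in shell. This is a conceptual sketch only and does not describe how TUPPR evaluates its checks internally. The all_checks_pass name and the example check commands are hypothetical.

```shell
# Run every health check command; succeed only if all of them pass.
all_checks_pass() {
  local check
  for check in "$@"; do
    if ! eval "$check" >/dev/null 2>&1; then
      echo "health check failed: $check" >&2
      return 1
    fi
  done
}

# Example with trivial placeholder checks; a real check would query the cluster.
if all_checks_pass "true" "test 1 -eq 1"; then
  echo "safe to upgrade"
fi
```

The upgrade proceeds only when every check succeeds, which mirrors the intent of deferring a reboot while a backup sync is still running.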
Upgrade Entity Mapping
This diagram maps the logical upgrade concepts to the specific code entities used in the system-upgrade namespace.
TUPPR Upgrade Entity Mapping
[Flowchart Diagram]
Sources: kubernetes/apps/system-upgrade/versions.env1-5, kubernetes/apps/system-upgrade/tuppr/upgrades/talosupgrade.yaml1-9, kubernetes/apps/system-upgrade/tuppr/app/ocirepository.yaml1-13
Technical Implementation Details
CRD Management
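During bootstrap, CRDs are applied using --server-side and --force-conflicts. This matters for large CRDs such as those from kube-prometheus-stack or envoy-gateway: client-side kubectl apply stores the full manifest in the kubectl.kubernetes.io/last-applied-configuration annotation, and these CRDs often exceed the 262144-byte limit on annotations. Server-Side Apply does not write that annotation, so it avoids the limit (scripts/bootstrap-cluster.sh123-126).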
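A quick way to see whether a manifest is likely to hit the annotation limit under client-side apply is to compare its size against 262144 bytes. The helper below is a rough sketch: the needs_server_side_apply name is hypothetical, and the file size is only an approximation because the annotation stores the object as JSON, not as the original YAML.

```shell
# Kubernetes caps the total size of an object's annotations at 256 KiB.
ANNOTATION_LIMIT_BYTES=262144

# Succeed if FILE is larger than the annotation limit (approximate check).
needs_server_side_apply() {
  local size
  # Arithmetic expansion strips any padding that wc may add on some platforms.
  size=$(( $(wc -c < "$1") ))
  [ "$size" -gt "$ANNOTATION_LIMIT_BYTES" ]
}

# Usage: needs_server_side_apply crds.yaml && echo "use --server-side"
```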
Bootstrap Taskfile
The .taskfiles/Flux/Taskfile.yaml file provides the bootstrap task, which acts as the entry point for operators, along with supporting tasks for day-to-day Flux operations.
| Task | Purpose | Key Commands |
|---|---|---|
| bootstrap | Initial Flux deployment | kubectl apply --kustomize ./bootstrap/flux |
| apply | Manual Flux build/apply | flux build ks ... \| kubectl apply --server-side |
| reconcile | Force Git sync | flux reconcile kustomization cluster-apps --with-source |
| github-deploy-key | Secret setup | sops --decrypt github-deploy-key.sops.yaml \| kubectl apply |
Sources: .taskfiles/Flux/Taskfile.yaml11-72, scripts/bootstrap-cluster.sh115-132