Infrastructure Provisioning

This section provides an overview of the physical and virtual infrastructure layers of the home-ops repository. The infrastructure is managed through a combination of Terraform for virtualized resources, Ansible for host-level configuration, and Talos Linux for the Kubernetes operating system.

The provisioning process follows a layered approach:

  1. Host Configuration: Preparing the physical Proxmox host.
  2. Virtual Infrastructure: Provisioning VMs and cloud resources via Terraform.
  3. Operating System: Bootstrapping Talos Linux nodes.
  4. External Resilience: Deploying critical services outside the main cluster.

Infrastructure Architecture

The following diagram illustrates the relationship between the provisioning tools and the resulting infrastructure entities.

[Diagram: Provisioning Flow, Code to Entity]

Sources: infrastructure/terraform/proxmox/providers.tf:23-30, infrastructure/terraform/proxmox/talos/config.tf:92-102, infrastructure/ansible/requirements.yaml:1-10


2.1 Proxmox and Talos Node Provisioning

The core of the on-premises compute is a Proxmox VE cluster hosting Talos Linux VMs. Terraform is used to define the virtual hardware and the Talos machine configuration.
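
As a rough illustration of that split, the sketch below pairs a VM definition from the bpg/proxmox provider with a Talos machine configuration from the siderolabs/talos provider. The resource and data source types match those listed in the Resource Relationship Map. Every name, size, address, and patch value is an assumption made for illustration and is not taken from virtual-machines.tf or config.tf. The repository appears to keep its patches in machine-config/common.yaml.tftpl; an inline patch stands in for that template here.

```hcl
terraform {
  required_providers {
    proxmox = { source = "bpg/proxmox" }
    talos   = { source = "siderolabs/talos" }
  }
}

# Virtual hardware for one node. All values below are hypothetical.
resource "proxmox_virtual_environment_vm" "talos_node" {
  name      = "talos-cp-01" # assumed VM name
  node_name = "pve"         # assumed Proxmox node name

  cpu {
    cores = 4
    type  = "host"
  }

  memory {
    dedicated = 8192 # MiB
  }

  disk {
    datastore_id = "local-zfs" # assumed datastore
    interface    = "scsi0"
    size         = 50 # GiB
  }

  network_device {
    bridge = "vmbr0" # assumed bridge
  }
}

# Cluster-wide secrets shared by every generated machine configuration.
resource "talos_machine_secrets" "this" {}

# Talos machine configuration for a control-plane node.
data "talos_machine_configuration" "controlplane" {
  cluster_name     = "home-ops"                # assumed cluster name
  cluster_endpoint = "https://192.0.2.10:6443" # assumed (documentation-range address)
  machine_type     = "controlplane"
  machine_secrets  = talos_machine_secrets.this.machine_secrets

  # Inline patch used in place of the repository's template file.
  config_patches = [
    yamlencode({
      machine = {
        network = { hostname = "talos-cp-01" }
      }
    })
  ]
}
```

Keeping the hardware definition and the machine configuration in the same Terraform module lets values such as the hostname be derived from one set of variables instead of being repeated per tool.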

For details, see Proxmox and Talos Node Provisioning.

Sources: infrastructure/terraform/proxmox/talos/virtual-machines.tf:1-149, infrastructure/terraform/proxmox/talos/image.tf:47-73, infrastructure/terraform/proxmox/talos/machine-config/common.yaml.tftpl:1-83


2.2 Ansible and Host Configuration

While the Kubernetes nodes are immutable Talos VMs, the underlying Proxmox host requires traditional configuration for storage and monitoring.

  • Storage Management: Ansible playbooks manage the physical ZFS pools and configure NFS/SMB exports for bulk storage (the /tank mount).
  • Host Monitoring: Deployment of node_exporter directly on the Proxmox host to allow the cluster’s Prometheus instance to scrape hardware metrics.
  • Requirements: The setup utilizes collections such as ansible.posix and community.general (infrastructure/ansible/requirements.yaml:3-8).

For details, see Ansible and Host Configuration.

Sources: infrastructure/ansible/requirements.yaml:1-19


2.3 Terraform Infrastructure-as-Code

Beyond the hypervisor, Terraform manages the global footprint of the home-ops project, including networking, identity, and off-site storage.

  • State Management: Terraform state is stored in Cloudflare R2 through Terraform's "s3" backend, which works with R2's S3-compatible API (infrastructure/terraform/proxmox/providers.tf:1-17).
  • Cloudflare: Management of DNS records, Tunnels, and WAF rules.
  • Storage: Provisioning of S3 buckets, both on Backblaze B2 for off-site backups and on local Garage S3 instances.
  • Identity: Configuration of Authentik resources (Providers, Applications, and Flows) via the Authentik Terraform provider.
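
The state-management item above can be sketched as a backend block. This is a minimal example assuming Terraform 1.6 or later; the bucket name, object key, and account ID placeholder are assumptions and are not copied from providers.tf. The skip_* flags disable AWS-specific checks that an S3-compatible service such as R2 does not answer.

```hcl
terraform {
  backend "s3" {
    bucket = "terraform-state"           # assumed bucket name
    key    = "proxmox/terraform.tfstate" # assumed object key
    region = "auto"                      # R2 does not use AWS regions

    # Replace <ACCOUNT_ID> with the Cloudflare account ID.
    endpoints = {
      s3 = "https://<ACCOUNT_ID>.r2.cloudflarestorage.com"
    }

    # Turn off AWS-only validation steps.
    skip_credentials_validation = true
    skip_metadata_api_check     = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
    skip_s3_checksum            = true
    use_path_style              = true
  }
}
```

With a block like this, the R2 API token is typically supplied through the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, which keeps credentials out of the repository.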

For details, see Terraform Infrastructure-as-Code.

Sources: infrastructure/terraform/proxmox/providers.tf:1-17, infrastructure/terraform/proxmox/variables.tf:127-131


2.4 Fly.io External Resilience Workloads

To ensure critical services remain available even if the primary Proxmox host is offline, specific workloads are deployed to Fly.io.

  • Resilience Pattern: Services like Gatus (monitoring) and Vaultwarden (passwords) are hosted externally to avoid circular dependencies during a total site failure.
  • Deployment: Managed via Taskfile commands (e.g., task fly:app:*) and defined using fly.toml configurations.
  • Gatus Cloud: The external Gatus instance monitors the public endpoints of the home cluster, providing an “outside-in” view of availability.

For details, see Fly.io External Resilience Workloads.


Resource Relationship Map

The following table summarizes the primary infrastructure components and their management tools.

| Component      | Provider / Tool     | Code Reference |
|----------------|---------------------|----------------|
| Hypervisor VMs | bpg/proxmox         | proxmox_virtual_environment_vm (infrastructure/terraform/proxmox/talos/virtual-machines.tf:1) |
| K8s OS Config  | siderolabs/talos    | talos_machine_configuration (infrastructure/terraform/proxmox/talos/config.tf:45) |
| OS Images      | Talos Image Factory | talos_image_factory_schematic (infrastructure/terraform/proxmox/talos/image.tf:35) |
| Remote State   | Cloudflare R2       | terraform.backend "s3" (infrastructure/terraform/proxmox/providers.tf:2) |
| Host Services  | Ansible             | infrastructure/ansible/requirements.yaml:1 |

Sources: infrastructure/terraform/proxmox/providers.tf:1-32, infrastructure/terraform/proxmox/talos/config.tf:45-90, infrastructure/terraform/proxmox/talos/virtual-machines.tf:1-10