Storage and Data Management

This page gives a high-level overview of the multi-tier storage architecture used in the home-ops cluster. The architecture combines high-performance local storage for databases, NFS-backed bulk storage for media, and distributed S3-compatible object storage for application data and backups. Data durability comes from an automated pipeline of snapshots, local backups, and off-site replication.

Storage Architecture Overview

The cluster uses a tiered approach to storage, matching each application's needs to the hardware and protocol best suited to them.

  1. Local CSI (High Performance): Managed by democratic-csi, providing local-hostpath storage for latency-sensitive applications like databases.
  2. NFS/Bulk Storage: Connects to external fileservers (e.g., /tank/Apps on smb.cloudjur.com) for high-capacity requirements.
  3. S3 Object Storage: Distributed S3 API provided by Garage for internal services and public web hosting.
  4. Backup & Replication: Automated volume lifecycle management using VolSync and the K8s Snapshot Controller.
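The tier choice above can be read as a small decision rule. The following is a minimal sketch of that rule, not code from the repository: the `Workload` fields, the `pick_tier` function, and the tier labels `nfs` and `garage-s3` are hypothetical names introduced for illustration. Only `local-hostpath` appears in the cluster configuration described on this page.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOCAL_CSI = "local-hostpath"  # democratic-csi, latency-sensitive data
    NFS_BULK = "nfs"              # hypothetical label: external fileserver
    S3_OBJECT = "garage-s3"       # hypothetical label: Garage object storage


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of what an application needs from storage."""
    name: str
    latency_sensitive: bool = False
    needs_object_api: bool = False


def pick_tier(workload: Workload) -> Tier:
    """Map a workload to one of the three storage tiers."""
    if workload.needs_object_api:
        return Tier.S3_OBJECT
    if workload.latency_sensitive:
        return Tier.LOCAL_CSI
    # Everything else is treated as capacity-oriented bulk data.
    return Tier.NFS_BULK


if __name__ == "__main__":
    for w in (
        Workload("postgres", latency_sensitive=True),
        Workload("media-library"),
        Workload("static-site", needs_object_api=True),
    ):
        print(f"{w.name}: {pick_tier(w).value}")
```

Backup and replication is left out of the rule on purpose: it is a lifecycle layer applied on top of volumes, not a tier an application selects.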

System Mapping: Storage Components to Code

The following diagram illustrates how storage abstractions in the cluster map to specific Helm releases and configurations in the codebase.

Storage Provider Mapping

[Flowchart Diagram]


5.1 VolSync and Volume Backup

The cluster backs up persistent volumes with the VolSync operator, which automates snapshot creation and synchronizes data to multiple destinations. VolSync relies on the snapshot-controller to interface with the democratic-csi driver for point-in-time local copies.

  • Local Backups: Uses the Kopia method to back up PVC data to local NFS targets.
  • Off-site Replication: Leverages Cloudflare R2 via S3-compatible replication for disaster recovery.
  • Kustomize Components: Standardized backup configurations are defined as reusable components in kubernetes/components/volsync/, including kopia.yaml and r2.yaml.
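To make the shape of such a backup definition concrete, the sketch below builds a VolSync `ReplicationSource` manifest as a Python dictionary and prints it as JSON. It is an illustration under assumptions, not the contents of kopia.yaml: the function name, the Secret name, the PVC name, and the schedule are hypothetical, and the fields under the `kopia` key are assumed to mirror those of VolSync's other repository-based movers (`repository`, `copyMethod`).

```python
import json


def replication_source(pvc: str, mover: str, repository_secret: str,
                       schedule: str) -> dict:
    """Build a VolSync ReplicationSource manifest as a plain dict.

    `mover` is the key of the mover section (for example "kopia").
    The fields inside that section are assumptions for illustration.
    """
    return {
        "apiVersion": "volsync.backube/v1alpha1",
        "kind": "ReplicationSource",
        "metadata": {"name": pvc},
        "spec": {
            "sourcePVC": pvc,
            # Cron-style trigger; the value passed in is hypothetical.
            "trigger": {"schedule": schedule},
            mover: {
                # Name of the Secret holding repository credentials.
                "repository": repository_secret,
                # Take a CSI snapshot first, then back up from the snapshot.
                "copyMethod": "Snapshot",
            },
        },
    }


if __name__ == "__main__":
    local = replication_source(
        pvc="example-app",                          # hypothetical PVC name
        mover="kopia",
        repository_secret="example-app-volsync",    # hypothetical Secret
        schedule="0 * * * *",                       # hypothetical: hourly
    )
    print(json.dumps(local, indent=2))
```

An off-site variant could be produced by the same function with a different repository Secret and schedule. Which mover r2.yaml actually uses is not stated on this page, so the sketch does not guess.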

For details, see VolSync and Volume Backup.


5.2 Garage S3 Object Storage

Garage is a distributed, S3-compatible storage service that runs directly within the cluster. It provides a highly available object storage layer for internal applications and serves static assets for public-facing websites.

  • Configuration: Managed via garage.toml, which defines the RPC secrets, S3 API endpoints, and web hosting parameters.
  • Deployment: The cluster runs a primary garage instance and a staticgarage instance for specialized workloads.
  • Integrations: Used by CloudNativePG for WAL archiving and as a target for various application-level backups.

For details, see Garage S3 Object Storage.


5.3 Database Layer (CloudNativePG and DragonflyDB)

The database layer provides structured data storage with high availability and automated lifecycle management.

  • CloudNativePG: Manages PostgreSQL clusters with features like Point-In-Time Recovery (PITR) and automated WAL (Write-Ahead Log) archiving to Garage S3.
  • DragonflyDB: A high-performance, Redis-compatible multi-threaded data store used for caching and session management (e.g., for Authentik).
  • Persistence: Databases typically utilize the local-hostpath StorageClass for maximum IOPS, with VolSync handling the backup of these local volumes.
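The combination of local persistence and WAL archiving to S3 can be sketched as a CloudNativePG `Cluster` manifest, built here as a Python dictionary. This is a minimal sketch under assumptions: the cluster name, bucket, endpoint URL, Secret name, Secret keys, size, and instance count are all hypothetical, and the `barmanObjectStore` section reflects the in-tree backup configuration, which newer CloudNativePG releases may configure through a plugin instead.

```python
import json


def cnpg_cluster(name: str, bucket: str, endpoint_url: str,
                 creds_secret: str, size: str = "20Gi",
                 instances: int = 3) -> dict:
    """Build a CloudNativePG Cluster manifest as a plain dict."""
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            "instances": instances,
            # Local CSI tier for maximum IOPS.
            "storage": {"size": size, "storageClass": "local-hostpath"},
            "backup": {
                "barmanObjectStore": {
                    "destinationPath": f"s3://{bucket}/",
                    # S3 endpoint of the Garage service (hypothetical URL).
                    "endpointURL": endpoint_url,
                    "s3Credentials": {
                        "accessKeyId": {
                            "name": creds_secret, "key": "ACCESS_KEY_ID"},
                        "secretAccessKey": {
                            "name": creds_secret, "key": "SECRET_ACCESS_KEY"},
                    },
                    # Compress archived WAL segments.
                    "wal": {"compression": "gzip"},
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = cnpg_cluster(
        name="postgres-example",                  # hypothetical
        bucket="postgres-backups",                # hypothetical
        endpoint_url="http://garage.example.invalid:3900",  # hypothetical
        creds_secret="postgres-s3-credentials",   # hypothetical
    )
    print(json.dumps(manifest, indent=2))
```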

For details, see Database Layer (CloudNativePG and DragonflyDB).


Data Flow and Lifecycle

The following diagram describes the lifecycle of data from an application’s PVC through the various storage and backup tiers.

Data Lifecycle Diagram

[Flowchart Diagram]
