Gatus, Exporters, and Health Monitoring

This section details the specialized monitoring components of the home-ops cluster. While Prometheus and Loki form the core metrics and logging stack, the exporters and health-checking tools described here add visibility into external infrastructure, hardware health, and service availability.

Gatus: Status Monitoring

Gatus serves as the primary health dashboard and service checker for the homelab. It performs active probing of endpoints and provides a public-facing status page.
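
To illustrate what active probing looks like in practice, the following is a minimal sketch of a Gatus endpoint definition. It uses Gatus's documented `endpoints` and `conditions` syntax, but the endpoint name, group, URL, interval, and thresholds are hypothetical placeholders and are not taken from this repository's config.yaml.

```yaml
# Hypothetical Gatus endpoint; all names, URLs, and thresholds are
# illustrative only and do not come from the repository.
endpoints:
  - name: example-service          # label shown on the status page
    group: internal                # optional grouping on the dashboard
    url: https://example.internal/healthz
    interval: 1m                   # how often Gatus probes the URL
    conditions:
      - "[STATUS] == 200"          # HTTP status code must be 200
      - "[RESPONSE_TIME] < 500"    # response time in milliseconds
```

Each condition is evaluated on every probe; the endpoint is reported as unhealthy on the status page when any condition fails.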

Implementation and Discovery

The Gatus deployment utilizes a gatus-sidecar for automated discovery of cluster resources.

[Flowchart diagram: Gatus Monitoring Logic (data flow for Gatus service monitoring)]

Sources: kubernetes/apps/observability/gatus/app/helmrelease.yaml (lines 31-82), kubernetes/apps/observability/gatus/app/resources/config.yaml (lines 5-31)

Blackbox Exporter

The Prometheus Blackbox Exporter allows for probing of endpoints over HTTP, HTTPS, DNS, TCP, and ICMP.
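
As a rough sketch of how such probes are commonly wired up, the fragment below pairs a Blackbox Exporter module definition with a Prometheus Operator `Probe` resource. The module name, resource name, exporter service address, and target IP are assumptions chosen for illustration; the repository's helmrelease.yaml and probe.yaml may differ.

```yaml
# Document 1: Blackbox Exporter module configuration (illustrative values).
modules:
  icmp:
    prober: icmp                   # probe with ICMP echo requests
    timeout: 5s
    icmp:
      preferred_ip_protocol: ip4
---
# Document 2: Prometheus Operator Probe that applies the module above.
# The names and addresses below are hypothetical placeholders.
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: example-devices
spec:
  module: icmp                     # must match a module name defined above
  prober:
    url: blackbox-exporter.observability.svc.cluster.local:9115
  targets:
    staticConfig:
      static:
        - 192.0.2.10               # documentation-range address, not a real host
```

With this arrangement Prometheus scrapes the exporter, which in turn probes each static target and returns metrics such as `probe_success`.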

Sources: kubernetes/apps/observability/exporters/blackbox-exporter/app/helmrelease.yaml (lines 42-62), kubernetes/apps/observability/exporters/blackbox-exporter/app/probe.yaml (lines 1-45)

Specialized Exporters

The cluster utilizes several specialized exporters to pull metrics from non-standard sources:

| Exporter | Target | Implementation Details |
| --- | --- | --- |
| OPNsense Exporter | Firewall | Scrapes OPNsense API via HTTPS (insecure mode enabled for local). Source: kubernetes/apps/observability/exporters/opnsense-exporter/app/helmrelease.yaml (lines 21-42) |
| SMARTCTL Exporter | Disk Health | Monitors S.M.A.R.T. data for physical drives. |
| NUT Exporter | UPS | Interfaces with Network UPS Tools to monitor power status. |
| Speedtest Exporter | WAN Performance | Periodically runs speed tests to track ISP bandwidth. |

Sources: kubernetes/apps/observability/exporters/opnsense-exporter/app/helmrelease.yaml (lines 21-42)

k8s-monitoring Alloy Stack

The k8s-monitoring stack, based on Grafana Alloy, acts as the cluster's telemetry collector: it gathers observability signals and routes them to their backends.
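
The collector-to-backend routing can be sketched as Helm values for the k8s-monitoring chart. This fragment assumes the chart's v2 values layout (`destinations` plus per-feature toggles); the cluster name, destination names, and URLs are hypothetical, and the values actually set in this repository's helmrelease.yaml may be organized differently.

```yaml
# Hypothetical k8s-monitoring values, assuming the v2 chart layout.
cluster:
  name: example-cluster            # placeholder cluster label
destinations:
  - name: metrics-backend
    type: prometheus               # remote-write target for metrics
    url: http://prometheus.example.svc:9090/api/v1/write
  - name: logs-backend
    type: loki                     # push target for logs
    url: http://loki.example.svc:3100/loki/api/v1/push
clusterMetrics:
  enabled: true                    # collect cluster-level metrics
podLogs:
  enabled: true                    # collect pod logs
alloy-metrics:
  enabled: true                    # Alloy instance that handles metrics
alloy-logs:
  enabled: true                    # Alloy instance that handles logs
```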

[Flowchart diagram: Alloy Collector Routing (code entity mapping for the Alloy telemetry pipeline)]

Sources: kubernetes/apps/observability/k8s-monitoring/app/helmrelease.yaml (lines 42-132)

Alerting and Remediation

Alertmanager coordinates the response to health monitoring failures.
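
For orientation, here is a minimal sketch of an Alertmanager routing configuration in the native Alertmanager format. The receiver names, grouping labels, timings, and webhook URL are assumptions for illustration; the repository's alertmanager.yaml may use different receivers or a different resource type.

```yaml
# Hypothetical Alertmanager configuration; receivers and URLs are placeholders.
route:
  receiver: default                # fallback receiver for unmatched alerts
  group_by: ["alertname", "job"]   # batch related alerts into one notification
  group_wait: 30s                  # wait before sending the first notification
  group_interval: 5m               # wait before notifying about new group members
  repeat_interval: 12h             # resend interval for still-firing alerts
  routes:
    - receiver: "null"             # silently drop the always-firing Watchdog alert
      matchers:
        - 'alertname = "Watchdog"'
receivers:
  - name: "null"
  - name: default
    webhook_configs:
      - url: http://example.invalid/alert-hook
        send_resolved: true        # also notify when the alert clears
```

Routes are evaluated top to bottom, so more specific matchers are usually placed before the fallback receiver handles everything else.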

Sources: kubernetes/apps/observability/kube-prometheus-stack/app/resources/alertmanager.yaml (lines 4-53)