AI and MCP Services

The ai namespace provides a stack for Large Language Model (LLM) inference, a user-facing chat interface, and a set of Model Context Protocol (MCP) servers. Through a centralized MCP gateway managed by the ToolHive operator, LLMs can interact with cluster services (Home Assistant, Grafana, Flux) and external data sources (GitHub, SearXNG).

Core AI Infrastructure

The foundation of the AI stack consists of Ollama for model serving and Open WebUI as the primary interaction layer.

Open WebUI

Open WebUI serves as the central chat interface, configured to communicate with Ollama and integrated with the cluster’s search and observability stack.

Ollama

Ollama handles LLM inference. It is deployed via a standard Flux Kustomization (kubernetes/apps/ai/ollama/ks.yaml:1-21) and serves the API used by Open WebUI and other internal consumers.
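As a rough illustration of how an internal consumer might talk to Ollama, the sketch below builds (but does not send) a request to Ollama's `/api/generate` endpoint. The in-cluster URL `http://ollama.ai.svc.cluster.local:11434` and the model name `llama3` are assumptions made for this example; the actual Service name, port, and available models are defined by the deployment and are not shown in this document.

```python
import json
import urllib.request

# Assumed in-cluster address. 11434 is Ollama's default port, but the real
# Service name and port depend on how the release is configured.
OLLAMA_URL = "http://ollama.ai.svc.cluster.local:11434"


def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url=f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The request is only constructed here; nothing is sent over the network.
req = build_generate_request("Summarize the cluster status in one sentence.")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` from inside the cluster would return a JSON object whose `response` field holds the generated text.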

ToolHive MCP Operator

The cluster uses the ToolHive operator (Stacklok) to manage Model Context Protocol servers. MCP standardizes how an LLM discovers and calls external APIs, which are exposed to the model as “tools”.

Configuration and Security

MCP Gateway Routing

The gateway is exposed via Envoy Gateway using HTTPRoute resources.

| Route Name | Hostname | Backend Service | Port |
|---|---|---|---|
| mcp-gateway | mcp.cloudjur.com | vmcp-mcp-gateway | 4483 |
| mcp-gateway-internal | mcp-direct.cloudjur.com | vmcp-mcp-gateway-internal | 4483 |
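To make the routing table concrete, the following sketch generates Gateway API `HTTPRoute` manifests from its rows. It is an illustration, not the manifests stored in the repository: the `ai` namespace for the routes is an assumption, and `parentRefs` is left out because the name of the Envoy Gateway resource is not given here.

```python
import json

# Rows copied from the routing table: (route name, hostname, backend, port).
ROUTES = [
    ("mcp-gateway", "mcp.cloudjur.com", "vmcp-mcp-gateway", 4483),
    ("mcp-gateway-internal", "mcp-direct.cloudjur.com", "vmcp-mcp-gateway-internal", 4483),
]


def http_route(name, hostname, backend, port, namespace="ai"):
    """Return a Gateway API HTTPRoute manifest as a plain dict.

    The namespace default is an assumption. parentRefs is deliberately
    omitted because the Gateway these routes attach to is not named
    in this document.
    """
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "hostnames": [hostname],
            "rules": [{"backendRefs": [{"name": backend, "port": port}]}],
        },
    }


manifests = [http_route(*row) for row in ROUTES]
print(json.dumps(manifests[0], indent=2))
```

Both routes target port 4483 and differ only in hostname and backend Service, so the internal route can be given a different exposure policy without changing the backend port.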

MCP Server Integrations

Multiple specialized MCP servers are deployed to grant the AI context and control over the home-ops environment.

System Integration Diagram (Natural Language to Code)

This diagram maps the logical MCP tools to their specific Kubernetes Custom Resources and backend service targets.

[Flowchart Diagram]

Implementation Details

| MCP Server | Image / Source | Purpose | Key Config |
|---|---|---|---|
| Context7 | context7-mcp | Specialized context retrieval | CONTEXT7_API_KEY via ExternalSecret (kubernetes/apps/ai/context7-mcp/app/externalsecret.yaml:12) |
| GitHub | github-mcp | Repository and issue management | Managed via github-mcp-secret |
| Grafana | mcp-grafana:0.14.0 | Querying metrics and dashboards | GRAFANA_URL (kubernetes/apps/ai/grafana-mcp/app/mcpserver.yaml:16) |
| Home Assistant | ha-mcp | IoT device control | Connects to the home-assistant service |
| SearXNG | mcp-searxng:1.0.5 | Privacy-respecting web search | SEARXNG_URL (kubernetes/apps/ai/searxng-mcp/app/mcpserver.yaml:14) |
| Flux Operator | flux-operator-mcp | GitOps lifecycle management | Depends on toolhive-operator-crds (kubernetes/apps/ai/flux-operator-mcp/ks.yaml:12) |

Data Flow: Chat to Tool Execution

The following diagram traces a request from the user interface through the MCP routing layer to a specific tool execution (e.g., searching the web via SearXNG).

sequenceDiagram
    participant User
    participant OWU as Open-WebUI [app]
    participant GW as MCP Gateway [vmcp-mcp-gateway]
    participant SearxMCP as SearXNG MCP [MCPServer]
    participant SearxSvc as SearXNG [searxng.default]
    User->>OWU: "Search for Kubernetes news"
    OWU->>GW: Request Tool: searxng_search
    GW->>SearxMCP: Forward via streamable-http [mcpserver.yaml:9]
    SearxMCP->>SearxSvc: HTTP GET /search?q=... [mcpserver.yaml:14]
    SearxSvc-->>SearxMCP: Results (JSON)
    SearxMCP-->>GW: MCP Tool Response
    GW-->>OWU: Tool Output
    OWU-->>User: "Here are the latest news..."
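The "Request Tool" step in the diagram corresponds to an MCP `tools/call` message, which is a JSON-RPC 2.0 request. The sketch below builds such a message for the `searxng_search` tool. The argument name `query` is an assumption for illustration; the real input schema is whatever the SearXNG MCP server advertises in its `tools/list` response. A real client would also complete the MCP `initialize` handshake before sending this.

```python
import itertools
import json

# Simple monotonically increasing JSON-RPC request ids.
_request_ids = itertools.count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" is an assumed argument name, not taken from the server's schema.
request = build_tool_call("searxng_search", {"query": "Kubernetes news"})
print(json.dumps(request, indent=2))
```

With the streamable-http transport, a message like this is sent as the body of an HTTP POST to the MCP endpoint, and the tool result comes back in the matching JSON-RPC response.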

Persistence and Storage

Open WebUI uses a PersistentVolumeClaim (PVC) named open-webui to store application data (users, chat history, local RAG database) at /app/backend/data (kubernetes/apps/ai/open-webui/app/helmrelease.yaml:92-97).
