Skip to content

Metrics Agent

The Metrics Agent is a lightweight application that runs inside your Kubernetes cluster, collects pod metrics, node metrics, pod logs, and Kubernetes events, and sends them to the AlertHawk Metrics API. It is typically deployed via the Agents Chart (Helm).


Overview

  • Runs in-cluster: Uses in-cluster Kubernetes config and a ServiceAccount with read access to the cluster (or specified namespaces).
  • Collection loop: Periodically runs three collectors (pod metrics, node metrics, events) and optionally pod logs; then sends data to the Metrics API over HTTP.
  • No database: The agent does not store data; it only reads from the Kubernetes API and POSTs to the Metrics API.
  • Cluster identification: Every payload is tagged with CLUSTER_NAME (and optionally CLUSTER_ENVIRONMENT) so the Metrics API can distinguish multiple clusters.

Environment Variables

The agent reads all configuration from environment variables. When deployed with the Agents Chart, these are set in the chart’s env block.

Required

VariableDescriptionExample
CLUSTER_NAMEName of the cluster (used in all payloads sent to the Metrics API)aks-tools-01
METRICS_API_URLBase URL of the Metrics API (must be reachable from the cluster)http://alerthawk-metrics-api.alerthawk.svc.cluster.local:8080
NAMESPACES_TO_WATCHComma-separated list of namespaces to collect from (pod metrics, events, pod logs)alerthawk,clickhouse,production

Optional — Collection

VariableDescriptionDefault
METRICS_COLLECTION_INTERVAL_SECONDSSeconds between collection cycles30
COLLECT_LOGSEnable pod log collection and send to Metrics APIfalse
LOG_TAIL_LINESNumber of tail lines to collect per container when COLLECT_LOGS=true100

Optional — Cluster / logging

VariableDescriptionDefault
CLUSTER_ENVIRONMENTEnvironment label sent with node metrics (e.g. PROD, DEV)PROD
LOG_LEVELSerilog log level (Verbose, Debug, Information, Warning, Error, Fatal)Information

Optional — Sentry

VariableDescriptionDefault
SENTRY_DSNSentry DSN for error tracking(chart may leave empty)
ENVIRONMENTEnvironment name sent to SentryProduction

What It Collects

The agent runs a loop every METRICS_COLLECTION_INTERVAL_SECONDS and, in order:

  1. Pod metrics (PodMetricsCollector)
    For each pod in NAMESPACES_TO_WATCH:

    • Reads container CPU/memory from the metrics.k8s.io API (if available) or estimates from node capacity.
    • Sends one payload per container to the Metrics API: POST /api/metrics/pod (namespace, pod, container, CPU/memory, node name, pod state, restart count, age).
    • If COLLECT_LOGS is true, collects LOG_TAIL_LINES lines per container and sends POST /api/metrics/pod/log.
  2. Node metrics (NodeMetricsCollector)
    For each node in the cluster:

    • Reads node capacity and conditions (Ready, MemoryPressure, DiskPressure, PIDPressure), Kubernetes version, and (when possible) cloud provider, region, instance type, OS, architecture.
    • Sends one payload per node: POST /api/metrics/node.
      The Metrics API may use this to send node-status notifications and to fetch/store Azure prices.
  3. Kubernetes events (EventsCollector)
    For each namespace in NAMESPACES_TO_WATCH:

    • Lists events and sends new/updated events to the Metrics API: POST /api/events.

All requests include CLUSTER_NAME (and, for node metrics, CLUSTER_ENVIRONMENT) so the Metrics API can store and query by cluster.


Data Sent to the Metrics API

DataMetrics API endpointAuth
Pod/container metricsPOST /api/metrics/podNone (ingest)
Node metricsPOST /api/metrics/nodeNone (ingest)
Pod logsPOST /api/metrics/pod/logNone (ingest)
Kubernetes eventsPOST /api/eventsNone (ingest)

The agent does not use JWT; ingest endpoints on the Metrics API are AllowAnonymous.


Deployment

  • Helm (recommended): Use the Agents Chart to deploy the Metrics Agent in each cluster you want to monitor. See Agents Chart — Installation and Environment variables.
  • Prerequisites: Kubernetes cluster (1.19+), ServiceAccount with read access to the cluster or to NAMESPACES_TO_WATCH, and the Metrics API running and reachable (e.g. METRICS_API_URL).

Troubleshooting

  • No data in Metrics API: Ensure METRICS_API_URL is correct and reachable from the pod (e.g. kubectl run -it --rm debug --image=curlimages/curl -- curl -v $METRICS_API_URL/api/Version). Check that NAMESPACES_TO_WATCH contains existing namespaces and that the ServiceAccount has permission to list pods/nodes/events (and logs if COLLECT_LOGS is true).
  • Agent exits at startup: If the agent logs "CLUSTER_NAME environment variable is required but not set!", set CLUSTER_NAME (and NAMESPACES_TO_WATCH) in the deployment.
  • Logs: Check agent logs with kubectl logs -n <namespace> <metrics-agent-pod>.

AlertHawk - Self-hosted monitoring solution.