Metrics Agent Chart

Documentation moved

The full documentation for the Metrics Agent Helm chart is now in the Agents Chart section: Overview, Installation, Configuration, Environment Variables.

The AlertHawk Metrics Agent is a separate Helm chart that deploys the metrics collection agent. It runs in your Kubernetes cluster, collects metrics from the cluster, and sends them to the AlertHawk Metrics API.

  • Chart name: alerthawk-metrics-agent
  • Same Helm repo as main chart: https://thiagoloureiro.github.io/AlertHawk.Chart/
  • Artifact Hub: alerthawk (search for metrics-agent)

What it does

  1. Collects Kubernetes metrics — Pods, deployments, services, and other resources in the cluster (optionally scoped by namespace).
  2. Sends to Metrics API — Forwards metrics to the AlertHawk Metrics API service (must be running and reachable).
  3. Cluster identification — Tags metrics with a cluster name for multi-cluster setups.
  4. Configurable — Collection interval, namespaces to watch, optional log collection.

Prerequisites

  • Kubernetes cluster (1.19+)
  • Helm 3.x
  • AlertHawk Metrics API running (to receive metrics)
  • ClickHouse (used by the Metrics API to store metrics — not installed by this chart)
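
A quick pre-flight check for the client-side prerequisites might look like the sketch below. The `missing_tools` helper is hypothetical (it is not part of the chart), and it only confirms that the CLIs are on your PATH; it does not check that the Metrics API or ClickHouse are running.

```bash
# Hypothetical helper: print every named tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report missing CLIs; otherwise print versions so they can be compared
# by hand against the minimums listed above (Kubernetes 1.19+, Helm 3.x).
missing=$(missing_tools helm kubectl)
if [ -n "$missing" ]; then
  echo "Missing tools: $missing" >&2
else
  helm version --short
  kubectl version
fi
```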

Installation

1. Add the Helm repository

```bash
helm repo add alerthawk https://thiagoloureiro.github.io/AlertHawk.Chart/
helm repo update
```

2. Create a values file

Create metrics-agent-values.yaml and set the required env vars:

```yaml
env:
  CLUSTER_NAME: "YOUR-CLUSTER-NAME"   # e.g. aks-tools-01
  METRICS_API_URL: "http://alerthawk-metrics-api.alerthawk.svc.cluster.local:8080"
  NAMESPACES_TO_WATCH: "alerthawk,clickhouse"   # comma-separated namespaces to monitor
```

See Configuration and Environment variables below for full options.

3. Install the chart

```bash
helm install alerthawk-metrics-agent alerthawk/alerthawk-metrics-agent -f metrics-agent-values.yaml
```

With a specific namespace:

```bash
helm install alerthawk-metrics-agent alerthawk/alerthawk-metrics-agent -f metrics-agent-values.yaml -n alerthawk --create-namespace
```
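
After installing, it can help to confirm that the release deployed and that its pod is running. The sketch below wraps two standard commands in a hypothetical `verify_release` helper; the release name and namespace are whatever you passed to `helm install`.

```bash
# Hypothetical helper: show release status, then the pods in its namespace.
# $1 = release name, $2 = namespace (both as used at install time).
verify_release() {
  helm status "$1" -n "$2" &&
  kubectl get pods -n "$2"
}

# Usage (assumes the namespaced install shown above):
# verify_release alerthawk-metrics-agent alerthawk
```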

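4. Upgrade

If you installed into a specific namespace, pass the same -n flag here (for example -n alerthawk).

```bash
helm upgrade alerthawk-metrics-agent alerthawk/alerthawk-metrics-agent -f metrics-agent-values.yaml
```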
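
To change a single setting without editing the values file, one option is `--reuse-values` combined with `--set-string`, which keeps the value a string as the env map expects. This is a sketch: the `set_agent_env` helper is hypothetical, and it escapes commas because Helm's `--set` family treats an unescaped comma as a separator between assignments.

```bash
# Hypothetical helper: change one env entry on an existing release.
# $1 = namespace, $2 = env variable name, $3 = new value.
set_agent_env() {
  # Escape commas so a value like "a,b" stays a single string.
  value=$(printf '%s' "$3" | sed 's/,/\\,/g')
  helm upgrade alerthawk-metrics-agent alerthawk/alerthawk-metrics-agent \
    -n "$1" --reuse-values --set-string "env.$2=$value"
}

# Usage:
# set_agent_env alerthawk METRICS_COLLECTION_INTERVAL_SECONDS 60
```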

5. Uninstall

Pass -n <namespace> if the release lives outside your current namespace.

```bash
helm uninstall alerthawk-metrics-agent
```

Configuration

Top-level values

| Key | Description | Default |
| --- | --- | --- |
| nameOverride | Override deployment name prefix | alerthawk |
| replicas | Number of pod replicas | 1 |

Image

| Key | Description | Default |
| --- | --- | --- |
| image.repository | Container image repository | thiagoguaru/alerthawk.metrics |
| image.tag | Image tag | 3.1.12 (or chart appVersion) |
| image.pullPolicy | Pull policy | Always |

Deployment strategy

| Key | Description | Default |
| --- | --- | --- |
| strategy.type | RollingUpdate or Recreate | RollingUpdate |
| strategy.rollingUpdate.maxSurge | Max surge | 25% |
| strategy.rollingUpdate.maxUnavailable | Max unavailable | 25% |

Service account and RBAC

| Key | Description | Default |
| --- | --- | --- |
| serviceAccount.create | Create a ServiceAccount | true |
| serviceAccount.name | ServiceAccount name | alerthawk-sa |
| serviceAccount.annotations | Annotations | |
| serviceAccount.clusterRoleBinding.create | Create ClusterRoleBinding | true |
| serviceAccount.clusterRoleBinding.clusterRole | ClusterRole to bind | cluster-admin |

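The agent needs read access to cluster (or namespace) resources to collect metrics. By default the chart creates a ServiceAccount and binds it to cluster-admin, which grants full control of the cluster and is far broader than read access. To limit the agent's permissions, set serviceAccount.clusterRoleBinding.clusterRole to a narrower ClusterRole. If you set serviceAccount.create: false, create the ServiceAccount and ClusterRoleBinding yourself.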
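
Whichever role you bind, you can ask the API server what the agent's ServiceAccount is actually allowed to do. The sketch below uses `kubectl auth can-i` with impersonation, so your own user needs permission to impersonate service accounts. The `check_agent_rbac` helper is hypothetical, and the resource list is an assumption drawn from the "What it does" section; the agent may read other resources too.

```bash
# Hypothetical helper: check whether the agent's ServiceAccount may list
# a few resource types. $1 = namespace, $2 = ServiceAccount name.
# The resource list is an assumption, not taken from the chart.
check_agent_rbac() {
  for resource in pods deployments.apps services; do
    printf '%s: ' "$resource"
    kubectl auth can-i list "$resource" \
      --as="system:serviceaccount:$1:$2" -n "$1"
  done
}

# Usage (assumes the default ServiceAccount name in the alerthawk namespace):
# check_agent_rbac alerthawk alerthawk-sa
```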

Security context

| Key | Description | Default |
| --- | --- | --- |
| securityContext.allowPrivilegeEscalation | Allow privilege escalation | false |
| securityContext.privileged | Privileged container | false |
| securityContext.readOnlyRootFilesystem | Read-only root filesystem | false |
| securityContext.runAsNonRoot | Run as non-root | false |

Resources (optional)

```yaml
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi
```

Other

| Key | Description | Default |
| --- | --- | --- |
| progressDeadlineSeconds | Deployment progress deadline | 600 |
| revisionHistoryLimit | ReplicaSet history limit | 10 |
| terminationGracePeriodSeconds | Pod termination grace period | 30 |
| podAnnotations | Pod annotations | |
| imagePullSecrets | Image pull secret names | |

Environment variables

All agent configuration is passed via the env map in values.yaml. Keys are environment variable names; values are strings, so quote numbers and booleans (for example "40" and "false").

Required

| Variable | Description | Example |
| --- | --- | --- |
| CLUSTER_NAME | Name of the cluster (used to tag metrics) | aks-tools-01 |
| METRICS_API_URL | Metrics API base URL (must be reachable from the pod) | http://alerthawk-metrics-api.alerthawk.svc.cluster.local:8080 |
| NAMESPACES_TO_WATCH | Comma-separated list of namespaces to monitor | alerthawk,clickhouse,production |
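
Before installing, a values file can be checked for the three required variables. The sketch below is a hypothetical helper, not part of the chart. It uses a simple line match rather than a YAML parser, so it only confirms that each key appears on an indented line with a non-empty value; it does not verify that the key sits under env.

```bash
# Hypothetical helper: fail if a required env key is missing or empty
# in the given values file ($1). Line-based match, not a YAML parser.
check_required_env() {
  status=0
  for key in CLUSTER_NAME METRICS_API_URL NAMESPACES_TO_WATCH; do
    # Indented "KEY:" followed by a value whose first character
    # (after an optional opening quote) is not a quote, space, or '#'.
    if ! grep -Eq "^[[:space:]]+$key:[[:space:]]*[\"']?[^\"'[:space:]#]" "$1"; then
      echo "missing or empty: $key" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
# check_required_env metrics-agent-values.yaml && echo "required env vars present"
```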

Optional — Collection

| Variable | Description | Default |
| --- | --- | --- |
| METRICS_COLLECTION_INTERVAL_SECONDS | Interval in seconds between metric collections | 40 |
| COLLECT_LOGS | Enable log collection | false |

Optional — Cluster / logging

| Variable | Description | Default |
| --- | --- | --- |
| CLUSTER_ENVIRONMENT | Environment label (e.g. PROD, DEV) | PROD |
| LOG_LEVEL | Log level (Verbose, Debug, Information, Warning, Error, Fatal) | Information |

Optional — Sentry

| Variable | Description | Default |
| --- | --- | --- |
| SENTRY_DSN | Sentry DSN for error tracking | |
| ENVIRONMENT | Environment name sent to Sentry | Production |

Example values.yaml

```yaml
nameOverride: "alerthawk"
replicas: 1

image:
  repository: thiagoguaru/alerthawk.metrics
  tag: "3.1.12"
  pullPolicy: Always

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 25%
    maxUnavailable: 25%

serviceAccount:
  create: true
  name: alerthawk-sa
  clusterRoleBinding:
    create: true
    clusterRole: cluster-admin

securityContext:
  allowPrivilegeEscalation: false
  privileged: false
  readOnlyRootFilesystem: false
  runAsNonRoot: false

env:
  CLUSTER_NAME: "aks-tools-01"
  METRICS_API_URL: "http://alerthawk-metrics-api.alerthawk.svc.cluster.local:8080"
  METRICS_COLLECTION_INTERVAL_SECONDS: "40"
  NAMESPACES_TO_WATCH: "alerthawk,clickhouse,production"
  COLLECT_LOGS: "false"
  CLUSTER_ENVIRONMENT: "PROD"
  LOG_LEVEL: "Information"
  SENTRY_DSN: ""
  ENVIRONMENT: "Production"
```
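
To see what a values file like this produces without touching the cluster, the chart can be rendered locally with `helm template`. The `render_agent` helper below is a hypothetical wrapper and assumes the alerthawk repo has already been added.

```bash
# Hypothetical helper: render the chart's manifests locally with a values
# file so they can be reviewed before install. $1 = values file.
render_agent() {
  helm template alerthawk-metrics-agent alerthawk/alerthawk-metrics-agent -f "$1"
}

# Usage:
# render_agent values.yaml | less
```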

Rancher

The chart includes questions.yml and values.schema.json for a form-based UI in Rancher. You can configure replicas, image, strategy, env vars, security context, and resources from the Rancher UI.


Troubleshooting

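  • Metrics not collected: Check that METRICS_API_URL is correct and reachable; ensure the ServiceAccount has permission to read cluster/namespace resources; confirm that every namespace listed in NAMESPACES_TO_WATCH exists.
  • Connection errors: Verify network connectivity from inside the cluster to the Metrics API. Substitute your actual URL, because a literal $METRICS_API_URL would be expanded by your local shell, where it is usually unset (e.g. kubectl run -it --rm debug --image=curlimages/curl --restart=Never -n <namespace> -- curl -v <metrics-api-url>).
  • Pod not starting: Check logs with kubectl logs -n <namespace> <pod-name>; confirm the required env vars are set; verify that the ServiceAccount exists.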
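
The checks above can be bundled into one pass. The sketch below is a hypothetical `diagnose_pod` helper built from standard kubectl commands; the namespace and pod name are yours to fill in.

```bash
# Hypothetical helper: gather common diagnostics for one agent pod.
# $1 = namespace, $2 = pod name.
diagnose_pod() {
  # Pod spec, status, and recent events (scheduling, image pull, restarts).
  kubectl describe pod "$2" -n "$1"
  # Logs from the previous container instance, if the pod has restarted.
  kubectl logs "$2" -n "$1" --previous
  # Namespace events, oldest first.
  kubectl get events -n "$1" --sort-by=.lastTimestamp
}

# Usage:
# diagnose_pod alerthawk <pod-name>
```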
