Configuration

This section is a reference for the command-line interface, the environment variables, and the config.yaml file. Statusgraph has a simple client-server architecture: the server serves the SPA frontend, stores the graph data on disk, and proxies metrics requests to Prometheus.

CLI

The server serves the web app and acts as the API server: it stores the graph information and issues requests to Prometheus and Alertmanager.

Usage:
  statusgraph server [flags]

Flags:
      --config string       path to the config file which contains the server configuration (default "/etc/statusgraph/config.yaml")
      --data-dir string     path to the data dir (default "/data")
  -h, --help                help for server
      --static-dir string   path to the static dir (default "/www")

Global Flags:
      --loglevel string   set the loglevel (default "info")
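
As a rough illustration of how these flags combine, the following Python sketch assembles the argument list for `statusgraph server`. The helper name `build_server_command` is hypothetical and not part of statusgraph; the flag names and default values are copied from the help text above.

```python
import shlex

# Defaults copied from the `statusgraph server --help` output above.
DEFAULTS = {
    "config": "/etc/statusgraph/config.yaml",
    "data-dir": "/data",
    "static-dir": "/www",
    "loglevel": "info",
}


def build_server_command(**overrides):
    """Return the argv list for `statusgraph server` (hypothetical helper).

    Keyword arguments use underscores instead of dashes, e.g. data_dir.
    """
    known = {flag.replace("-", "_") for flag in DEFAULTS}
    unknown = set(overrides) - known
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")

    argv = ["statusgraph", "server"]
    for flag, default in DEFAULTS.items():
        value = overrides.get(flag.replace("-", "_"), default)
        argv += [f"--{flag}", str(value)]
    return argv


if __name__ == "__main__":
    # Print a shell-safe command line for a local development setup.
    cmd = build_server_command(config="./config.yaml", data_dir="./data", loglevel="debug")
    print(shlex.join(cmd))
```

Passing every flag explicitly, including those left at their defaults, keeps the resulting command line self-documenting in deployment scripts.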

Overview

This config file has three main purposes:

specify connection information for Prometheus and Alertmanager
define how statusgraph selects alerts and how it maps them to a graph node
define which metrics statusgraph fetches from Prometheus and how it maps them to a graph node

Example

See the following annotated config example for further explanation.

upstream:
  prometheus:
    # you can use http basic auth here in the form of http://user:pass@example.com
    url: http://localhost:9090
  alertmanager:
    url: http://localhost:9093

mapping:
  # this defines which alerts we display and how to find the corresponding graph node
  # use a `label_selector` to filter for specific alerts
  # and `service_labels` and `service_annotations` to specify to which graph node this alert belongs
  alerts:
    label_selector:
      - severity: "critical"
      - severity: "warning"
        important: "true"

    # red & green lamp indicator
    # Use this if your alerts use a specific label for a service (e.g. app=frontend / app=backend ...)
    # this tells statusgraph to map alerts to nodes using the following labels/annotations
    service_labels:
      - "service_id"
    service_annotations:
      - "statusgraph-node"

  metrics:
    # green lamp indicator!
    # this helps statusgraph to find all existing services by fetching the label values
    # reference: https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values
    service_labels:
      - 'service_id'

    queries:
      # just as an example
      - name: cpu wait
        query: sum(rate(node_pressure_cpu_waiting_seconds_total[1m])) by (service_id) * 100
        service_label: service_id
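
To make the `mapping` section more concrete, here is a minimal Python sketch of how such a configuration could be interpreted. It is not statusgraph's actual implementation. It assumes that an alert is selected when any `label_selector` entry matches and that an entry matches only when all of its labels match, and it assumes that `service_labels` are consulted before `service_annotations`. All function names are hypothetical. The two URL helpers use the Prometheus HTTP API paths for label values and instant queries.

```python
from urllib.parse import quote, urlencode

# Mirrors the `mapping.alerts` section of the example config above.
ALERTS_CFG = {
    "label_selector": [
        {"severity": "critical"},
        {"severity": "warning", "important": "true"},
    ],
    "service_labels": ["service_id"],
    "service_annotations": ["statusgraph-node"],
}


def alert_selected(labels, selectors):
    """Assumed semantics: keep the alert if ANY selector entry matches;
    an entry matches only if ALL of its key/value pairs are present."""
    return any(
        all(labels.get(key) == value for key, value in selector.items())
        for selector in selectors
    )


def node_for_alert(alert, cfg):
    """Assumed lookup order: service labels first, then annotations."""
    for name in cfg["service_labels"]:
        if name in alert["labels"]:
            return alert["labels"][name]
    for name in cfg["service_annotations"]:
        if name in alert["annotations"]:
            return alert["annotations"][name]
    return None  # the alert cannot be attached to a graph node


def label_values_url(base_url, label):
    """Prometheus HTTP API endpoint that lists all values of a label."""
    return f"{base_url.rstrip('/')}/api/v1/label/{quote(label, safe='')}/values"


def instant_query_url(base_url, query):
    """Prometheus HTTP API endpoint for an instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


if __name__ == "__main__":
    alert = {
        "labels": {"severity": "warning", "important": "true", "service_id": "frontend"},
        "annotations": {},
    }
    print(alert_selected(alert["labels"], ALERTS_CFG["label_selector"]))
    print(node_for_alert(alert, ALERTS_CFG))
    print(label_values_url("http://localhost:9090", "service_id"))
```

Under these assumptions, a `warning` alert without `important: "true"` is filtered out, while any `critical` alert is kept regardless of its other labels.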