Configuration · Lookout

Lookout is configured with one YAML file. The installer writes it to /etc/lookout/config.yaml (mode 600, owned by the lookout user). Edit it and restart the service to apply changes:

sudo nano /etc/lookout/config.yaml
sudo systemctl restart lookout

If the file is missing, Lookout runs on built-in defaults. Out-of-range values are clamped back to safe ones with a warning in the log, so a misconfigured file degrades gracefully instead of crashing or going silent.

Durations are Go-style strings: 5s, 2m, 1h.
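
For tooling that reads the same file outside Lookout, the sketch below converts these strings to seconds. It covers only a subset of what Go's `time.ParseDuration` accepts (no signs, no bare `0`), and the name `parse_duration` is illustrative, not part of Lookout.

```python
import re

# Seconds per unit, for the suffixes used by Go-style duration strings.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)")


def parse_duration(text: str) -> float:
    """Convert a duration such as "5s", "2m" or "1h30m" to seconds."""
    pos = 0
    total = 0.0
    while pos < len(text):
        match = _PART.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0:
        # Empty input: nothing was parsed.
        raise ValueError("empty duration")
    return total
```

Compound values such as `1h30m` are summed part by part; anything that is not a number followed by a known unit raises `ValueError`.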

Top level

Key	Default	Description
`collection_interval`	`30s`	How often collectors run (minimum `5s`)
`state_file`	`/var/lib/lookout/state.json`	Where firing-alert state is persisted across restarts

Alerts

The alerts block holds global timing plus one section per check.

Key	Default	Description
`renotify_after`	`1h`	How long to wait before re-alerting on a still-firing issue
`stale_after`	`3 × collection_interval`	Alert if a metric stops reporting for this long
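
As a rough illustration of how the two timers relate to the collection interval, here is a minimal sketch. The function names are hypothetical, and whether Lookout compares with "at least" or "strictly more than" is an assumption.

```python
def default_stale_after(collection_interval_s: float) -> float:
    """Default staleness window: three collection intervals."""
    return 3 * collection_interval_s


def is_stale(last_seen_s: float, now_s: float, stale_after_s: float) -> bool:
    """True once a metric has gone unreported for stale_after (assumed inclusive)."""
    return now_s - last_seen_s >= stale_after_s


def should_renotify(last_notified_s: float, now_s: float, renotify_after_s: float) -> bool:
    """True once a still-firing alert has been quiet for renotify_after (assumed inclusive)."""
    return now_s - last_notified_s >= renotify_after_s
```

With the default 30-second interval, `default_stale_after(30)` gives 90 seconds, which is three missed collections.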

Threshold checks

memory, disk, cpu, swap, and load are threshold checks. Each takes:

Field	Description
`threshold`	The value above which the check starts counting toward an alert
`resolve_below`	The value at which it clears (defaults to a little under `threshold`)
`for`	How long the value must stay over `threshold` before alerting (`0s` = immediate)
`severity`	`warning` or `critical`

Keeping threshold and resolve_below apart stops an alert from flapping when a value sits right on the line.
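
The interplay of `threshold`, `resolve_below`, and `for` can be sketched as a small state machine. This is an illustration of the behaviour described above, not Lookout's actual code: the class name is hypothetical, and the strict comparisons (`>` and `<`) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdCheck:
    threshold: float
    resolve_below: float
    for_seconds: float
    firing: bool = False
    over_since: Optional[float] = None  # time the value first went over threshold

    def observe(self, value: float, now: float) -> bool:
        """Feed one reading taken at `now` (seconds); return whether the alert is firing."""
        if self.firing:
            # Clear only once the value drops under resolve_below,
            # not merely under threshold. This gap prevents flapping.
            if value < self.resolve_below:
                self.firing = False
                self.over_since = None
            return self.firing
        if value > self.threshold:
            if self.over_since is None:
                self.over_since = now
            # Fire once the value has stayed over threshold for the full `for` window.
            if now - self.over_since >= self.for_seconds:
                self.firing = True
        else:
            # Any dip back under threshold restarts the `for` timer.
            self.over_since = None
        return self.firing
```

With `threshold=85`, `resolve_below=80`, and `for_seconds=120`, a reading of 83 after the alert has fired keeps it firing; only a reading under 80 clears it.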

alerts:
  memory:
    threshold: 85        # percent
    resolve_below: 80
    for: 2m
    severity: warning
  cpu:
    threshold: 85
    resolve_below: 80
    for: 2m
    severity: warning
  swap:
    threshold: 80
    resolve_below: 75
    for: 2m
    severity: warning
  load:
    # 1-minute load average divided by CPU cores
    threshold: 1.5
    resolve_below: 1.0
    for: 2m
    severity: warning

Defaults: memory, CPU, and disk 85 percent, swap 80 percent, load 1.5 (per core). All five threshold checks run out of the box.
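
Because the load check divides the 1-minute load average by the core count, a threshold of 1.5 means the same thing on a 2-core and a 32-core machine. A minimal sketch of that normalisation follows; the function names are illustrative, and `os.getloadavg` is available on Unix only.

```python
import os


def normalised_load(load_1m: float, cores: int) -> float:
    """1-minute load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_1m / cores


def current_normalised_load() -> float:
    """Read the live value from the operating system (Unix only)."""
    load_1m, _, _ = os.getloadavg()
    return normalised_load(load_1m, os.cpu_count() or 1)
```

A load average of 3.0 on a 2-core host normalises to 1.5, which sits exactly on the default threshold.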

Disk

Disk is a threshold check with two extras: the mount points to watch, and an optional fill-rate prediction.

alerts:
  disk:
    threshold: 85
    resolve_below: 80
    for: 2m
    severity: warning
    predict_full_within: 4h   # also alert if a mount is on pace to fill within this window
    mounts:
      - /
      - /var

The default mount is /. Set predict_full_within: 0s to disable growth prediction.
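
To make "on pace to fill" concrete, here is one way such a prediction could work. It assumes simple linear extrapolation between two samples; the method Lookout actually uses is not documented here, and both function names are hypothetical.

```python
from typing import Optional


def seconds_until_full(used_then: float, used_now: float,
                       elapsed_s: float, capacity: float) -> Optional[float]:
    """Linearly extrapolate growth between two samples; None if usage is not growing."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    growth = used_now - used_then
    if growth <= 0:
        return None
    return (capacity - used_now) * elapsed_s / growth


def predicts_full(eta_s: Optional[float], window_s: float) -> bool:
    """Alert when the mount would fill within the window; a window of 0 disables the check."""
    return window_s > 0 and eta_s is not None and eta_s <= window_s
```

For example, a 100 GB mount that grew from 50 GB to 60 GB over one hour has 40 GB left at 10 GB per hour, so it would fill in four hours, right at the edge of a `4h` window.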

Service and presence checks

These have no threshold — they alert when something isn't where it should be. They default to critical severity and do nothing until you list what to watch.

alerts:
  systemd:
    severity: critical
    services:
      - nginx
      - postgresql
  process:
    severity: critical
    names:
      - nginx
  http:
    severity: critical
    checks:
      - name: app
        url: "https://example.com/health"
        timeout: 5s           # default 5s
        expected_status: 200  # default 200
  tcp:
    severity: critical
    checks:
      - name: redis
        address: "127.0.0.1:6379"
        timeout: 5s

Check	Alerts when
`systemd`	A listed service is not `active`
`process`	A named process isn't found in `/proc`
`http`	A URL doesn't return its `expected_status` within `timeout`
`tcp`	A TCP `address` won't accept a connection within `timeout`
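
As an illustration of the `tcp` row, the sketch below tries to open a connection to an `address` within `timeout`. It is a stand-in written with Python's standard library, not Lookout's implementation, and the function name is hypothetical.

```python
import socket


def tcp_check(address: str, timeout_s: float = 5.0) -> bool:
    """Return True if `address` ("host:port") accepts a TCP connection within the timeout."""
    host, _, port = address.rpartition(":")
    host = host.strip("[]")  # allow bracketed IPv6 literals such as "[::1]:6379"
    try:
        with socket.create_connection((host, int(port)), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `tcp_check("127.0.0.1:6379")` mirrors the redis entry in the example above. An address without a numeric port raises `ValueError` rather than returning `False`.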

Notifiers

Alerts fan out to every notifier you configure — Google Chat, Discord, Slack, Microsoft Teams, Telegram, PagerDuty, a generic webhook, or email. With none configured, alerts print to the journal. See Notifications for each one.
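
The fan-out can be pictured with the sketch below. It assumes each notifier is a callable taking the message text, and that one failing notifier does not block the rest; both are assumptions made for illustration, and `dispatch` is a hypothetical name.

```python
from typing import Callable, Iterable

Notifier = Callable[[str], None]


def dispatch(message: str, notifiers: Iterable[Notifier],
             fallback: Notifier = print) -> int:
    """Send `message` to every notifier; with none configured, use the fallback.

    Returns the number of notifiers that accepted the message.
    """
    targets = list(notifiers)
    if not targets:
        # Under systemd, anything printed to stdout ends up in the journal.
        fallback(message)
        return 0
    delivered = 0
    for notify in targets:
        try:
            notify(message)
            delivered += 1
        except Exception:
            # Assumed behaviour: skip a failing notifier and keep going.
            continue
    return delivered
```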

Heartbeat

Lookout can ping a dead-man's-switch URL on an interval, so you're alerted if the whole box, its network, or the agent itself goes dark. See Heartbeat monitoring.

heartbeat:
  url: "https://lookout.kelvinamoaba.com/ping/lk_ping_xxxxxxxxxxxx"
  interval: 60s
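
Conceptually, the agent side of a dead-man's switch is just a loop that pings and sleeps; the monitoring service raises the alarm when pings stop arriving. A minimal sketch, assuming a plain HTTP GET is enough to register a ping (the function names and the `max_pings` parameter are illustrative):

```python
import time
import urllib.request
from typing import Callable, Optional


def ping(url: str, timeout_s: float = 10.0) -> bool:
    """Send one GET to the heartbeat URL; True on a 2xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except OSError:
        # Network errors and HTTP error statuses both count as a missed ping.
        return False


def run_heartbeat(url: str, interval_s: float,
                  send: Callable[[str], bool] = ping,
                  sleep: Callable[[float], None] = time.sleep,
                  max_pings: Optional[int] = None) -> int:
    """Ping `url` every `interval_s` seconds; stop after `max_pings` if given."""
    sent = 0
    while max_pings is None or sent < max_pings:
        send(url)  # a failed ping is not retried; the next one follows after the interval
        sent += 1
        sleep(interval_s)
    return sent
```

The `send` and `sleep` parameters exist only so the loop can be exercised without real network calls or waiting.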

Docker

Container monitoring, off by default. See Docker monitoring.

docker:
  enabled: false
  severity: critical
  restart_threshold: 3
  restart_window: 10m
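
One plausible reading of `restart_threshold` and `restart_window` is "alert when a container restarts at least this many times within this window". The sketch below implements that reading with a sliding window; the exact semantics are an assumption, and `RestartTracker` is a hypothetical name.

```python
from collections import deque


class RestartTracker:
    """Counts one container's restarts inside a sliding time window."""

    def __init__(self, restart_threshold: int, restart_window_s: float):
        self.restart_threshold = restart_threshold
        self.restart_window_s = restart_window_s
        self._restarts = deque()  # timestamps of recent restarts, oldest first

    def record_restart(self, now: float) -> bool:
        """Record a restart at `now` (seconds); return True if the alert should fire."""
        self._restarts.append(now)
        # Drop restarts that have fallen out of the window.
        while self._restarts and now - self._restarts[0] > self.restart_window_s:
            self._restarts.popleft()
        return len(self._restarts) >= self.restart_threshold
```

With the defaults shown above (3 restarts, 10 minutes), three restarts at 0, 100, and 200 seconds would fire, while restarts at 0, 500, and 1000 seconds would not.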

Prometheus metrics

Optionally expose the latest readings as Prometheus text at GET /metrics. Off by default; serves only the current in-memory snapshot. See Prometheus metrics.

metrics:
  enabled: false
  listen: "127.0.0.1:9100"   # loopback by default
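
For orientation, this is roughly what rendering an in-memory snapshot as Prometheus text looks like. The metric name in the usage note is made up for illustration; the names Lookout actually exports are listed on the Prometheus metrics page, not here.

```python
from typing import Mapping


def render_metrics(readings: Mapping[str, float]) -> str:
    """Render a snapshot of gauge readings in the Prometheus text exposition format."""
    lines = []
    for name in sorted(readings):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {readings[name]}")
    # Each line, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)
```

For a hypothetical reading `{"lookout_memory_used_percent": 42.5}`, this produces a `# TYPE` line followed by `lookout_memory_used_percent 42.5`.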

Full example

The complete default config, which is also what the installer writes on first run:

collection_interval: 30s
state_file: /var/lib/lookout/state.json

metrics:
  enabled: false
  listen: "127.0.0.1:9100"

alerts:
  renotify_after: 1h
  stale_after: 90s
  memory:
    threshold: 85
    resolve_below: 80
    for: 2m
    severity: warning
  disk:
    threshold: 85
    resolve_below: 80
    for: 2m
    severity: warning
    predict_full_within: 4h
    mounts:
      - /
  load:
    threshold: 1.5
    resolve_below: 1.0
    for: 2m
    severity: warning
  cpu:
    threshold: 85
    resolve_below: 80
    for: 2m
    severity: warning
  swap:
    threshold: 80
    resolve_below: 75
    for: 2m
    severity: warning
  systemd:
    severity: critical
    services: []
  http:
    severity: critical
    checks: []
  tcp:
    severity: critical
    checks: []
  process:
    severity: critical
    names: []

notifiers:
  # google_chat:
  #   webhook_url: "https://chat.googleapis.com/v1/spaces/XXX/messages?key=...&token=..."
  # discord:
  #   webhook_url: "https://discord.com/api/webhooks/XXX/YYY"

heartbeat:
  # url: "https://lookout.kelvinamoaba.com/ping/lk_ping_xxxxxxxxxxxx"
  interval: 60s

docker:
  enabled: false
  severity: critical
  restart_threshold: 3
  restart_window: 10m