Monitoring My Servers With Prometheus

2023-07-31 16:09:10Z (last updated 2023-10-16 08:55:03Z)



This blog post is loosely a response to a blog post by Sindastra titled "How to monitor your servers the easy way". Instead of replying to it directly, though, I decided to document my personal monitoring setup.



My setup


I use Prometheus, because I didn't have any better ideas than what sourcehut was already doing (which is using Prometheus).


Prometheus

SourceHut Prometheus


To those who suggest Uptime Kuma: Yes, I have used it. But I didn't like having to run a load of JavaScript just to check some hosts, or having the program installed in a weird, exotic way. I also wanted to get rid of the JavaScript required to even view the uptime.


Prometheus setup


So here's what's in my Prometheus config:

Some scraping configuration

An Alertmanager endpoint

Some rule files (`rule_files` in config file)

Prometheus config file
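
As a rough illustration, a config with those three parts could look something like the sketch below. This is not the actual config from this setup: the job name, the targets, the scrape interval, and the rule file path are all placeholders.

```yaml
# prometheus.yml (sketch; hosts, ports, and paths are placeholders)
global:
  scrape_interval: 1m

# Scraping configuration: what Prometheus pulls metrics from.
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

# Alertmanager endpoint: where Prometheus sends firing alerts.
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

# Rule files: alerting and recording rules, loaded by glob.
rule_files:
  - '/PATH/TO/rules/*.yml'
```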


The specifics of what I hook up to Prometheus aren't important. The rule is just "if it supports Prometheus scraping, then I'll hook it up to Prometheus". That also includes some exporters, like the Node exporter and Blackbox exporter.


Node exporter

Blackbox exporter
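
For anyone who hasn't wired up these exporters before, here is a hedged sketch of the scrape jobs they typically need. It assumes both exporters run locally on their default ports (9100 for the Node exporter, 9115 for the Blackbox exporter) and that the Blackbox exporter has its stock `http_2xx` module; the probed URL is a placeholder. The relabeling is the usual pattern for the Blackbox exporter: the target becomes a URL parameter, and the scrape itself goes to the exporter.

```yaml
scrape_configs:
  # Node exporter: host metrics (CPU, memory, filesystems, ...).
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']

  # Blackbox exporter: probes other endpoints on Prometheus's behalf.
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://domain.example  # placeholder URL to probe
    relabel_configs:
      # Pass the listed target as the ?target= parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Actually scrape the Blackbox exporter itself.
      - target_label: __address__
        replacement: localhost:9115
```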


For the rule files, most are alerting rules, but I have made a couple of recording rules too.


Prometheus Alerting Rules

Prometheus Recording rules
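
To show the difference between the two kinds of rule, here is a small sketch of a rule file containing one of each. The group name, rule names, threshold, and `severity` value are made up for illustration; only the metric `node_cpu_seconds_total` (from the Node exporter) is real.

```yaml
groups:
  - name: example
    rules:
      # Recording rule: precomputes an expression and stores the
      # result as a new time series under the name in `record`.
      - record: instance:node_cpu_utilisation:rate5m
        expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))

      # Alerting rule: fires once the expression has returned
      # results continuously for the `for` duration.
      - alert: HighCpuUsage
        expr: instance:node_cpu_utilisation:rate5m > 0.9
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: 'CPU usage above 90% on {{ $labels.instance }}'
```

The alerting rule reuses the recorded series, which is one common reason to have recording rules at all.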


Many of the alerting rules are copied from whatever sourcehut has, but only what can also apply to me (so no `soju` alerts).


Alerts that sourcehut has


My alerting rules are:

Abnormal backup behavior (active for too long or not active at all)

Websites and other services being unreachable (or serving bad SSL/TLS certs)

Resource exhaustion (free space, inodes, file descriptors, memory, battery)

High resource usage (CPU, IO)

Something Prometheus is trying to probe doesn't work

Test alarm (make sure alerting is working)
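
A few of the categories above could be expressed roughly like this. These are sketches, not the rules actually in use: the alert names, thresholds, durations, and `severity` values are assumptions, and the always-firing test alarm is just one possible way to check that alerting works. The metrics come from the Blackbox exporter (`probe_*`), the Node exporter (`node_*`), and Prometheus itself (`up`).

```yaml
groups:
  - name: example-alerts
    rules:
      # A probed website or service is unreachable.
      - alert: ProbeFailed
        expr: probe_success == 0
        for: 5m
        labels:
          severity: important

      # TLS certificate expires in less than 14 days.
      - alert: TlsCertExpiringSoon
        expr: probe_ssl_earliest_cert_expiry - time() < 14 * 86400
        labels:
          severity: warning

      # Resource exhaustion: less than 10% free space.
      - alert: DiskSpaceLow
        expr: node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.10
        for: 15m
        labels:
          severity: warning

      # Resource exhaustion: less than 10% free inodes.
      - alert: InodesLow
        expr: node_filesystem_files_free / node_filesystem_files < 0.10
        for: 15m
        labels:
          severity: warning

      # Something Prometheus scrapes is not responding.
      - alert: TargetDown
        expr: up == 0
        for: 5m
        labels:
          severity: important

      # Test alarm: always fires, to exercise the alerting path.
      - alert: TestAlarm
        expr: vector(1)
        labels:
          severity: test
```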


Alertmanager setup


Now onto Alertmanager. It alerts. Obviously.


Alertmanager


If you didn't know, Alertmanager doesn't *create* the alerts. Prometheus creates the alerts. Alertmanager just manages many aspects of the alerts, including grouping and repeat intervals.


My config is roughly as follows:

Alerts that don't make sense together

Specific `severity` routes, which send alerts either to ntfy only or to both ntfy and email

Receivers for the `severity` routes
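
A config with that shape might look roughly like the sketch below. Several things here are assumptions rather than details from the actual setup: "alerts that don't make sense together" is interpreted as inhibit rules, the `severity` values are invented, the SMTP settings are placeholders, and ntfy is reached through a generic webhook at a hypothetical local URL (Alertmanager has no built-in ntfy receiver, so some kind of bridge is assumed).

```yaml
global:
  # Placeholder SMTP settings; adjust for a real mail server.
  smtp_smarthost: 'localhost:25'
  smtp_from: 'alertmanager@domain.example'

route:
  # Default: ntfy only.
  receiver: 'ntfy'
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    # More severe alerts also go out by email.
    - matchers:
        - severity="important"
      receiver: 'ntfy-and-email'

# Suppress a warning when a more severe alert with the same
# alertname and instance is already firing.
inhibit_rules:
  - source_matchers:
      - severity="important"
    target_matchers:
      - severity="warning"
    equal: ['alertname', 'instance']

receivers:
  - name: 'ntfy'
    webhook_configs:
      - url: 'http://localhost:8080/hook'  # hypothetical ntfy bridge
  - name: 'ntfy-and-email'
    webhook_configs:
      - url: 'http://localhost:8080/hook'  # hypothetical ntfy bridge
    email_configs:
      - to: 'example@domain.example'
```

The grouping and repeat-interval settings on the route are what the paragraph above means by Alertmanager "managing" alerts that Prometheus created.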


One unusual thing about my setup is plaintext email. HTML email is too fancy, and I now prefer plaintext email, even for my alerts.


The Alertmanager template and Alertmanager email configuration are available as a paste:

https://paste.sr.ht/~jacksonchen666/26ee58537130952bbde027c8c95790a51e4715ea


They're also available in the next 2 sections. Links will be available for skipping around (mainly for screen readers).


Alertmanager template for plaintext email


The Alertmanager template (put on your filesystem somewhere):


{{ define "__alertmanager" }}Alertmanager{{ end }}
{{ define "__alertmanagerURL" }}{{ .ExternalURL }}/#/alerts?receiver={{ .Receiver | urlquery }}{{ end }}

{{ define "__subject" }}[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }} {{ if gt (len .CommonLabels) (len .GroupLabels) }}({{ with .CommonLabels.Remove .GroupLabels.Names }}{{ .Values | join " " }}{{ end }}){{ end }}{{ end }}
{{ define "__less_subject" }}[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }}{{ end }}

{{ define "email.default.subject" }}{{ template "__subject" . }}{{ end }}
{{/* is anything before necessary? */}}

{{ define "email.plaintext.__text_alert_list" }}{{ range . }}Labels:
{{ range .Labels.SortedPairs }}{{ .Name }}="{{ .Value }}"
{{ end }}Annotations:
{{ range .Annotations.SortedPairs }}{{ .Name }}="{{ .Value }}"
{{ end }}Source: {{ .GeneratorURL }}
{{ end }}
{{ end }}

{{ define "email.plaintext" }}
{{- template "__subject" . }}

{{ .Alerts | len }} alert{{ if gt (len .Alerts) 1 }}s{{ end }} for
{{ range .GroupLabels.SortedPairs }}{{ .Name }}="{{ .Value }}"
{{ end }}
View in {{ template "__alertmanager" . }}: {{ template "__alertmanagerURL" . }}

{{ if gt (len .Alerts.Firing) 0 -}}
[{{ .Alerts.Firing | len }}] Firing
{{ template "email.plaintext.__text_alert_list" .Alerts.Firing }}
{{ end }}

{{- if gt (len .Alerts.Resolved) 0 -}}
[{{ .Alerts.Resolved | len }}] Resolved
{{ template "email.plaintext.__text_alert_list" .Alerts.Resolved }}
{{ end }}
Sent by {{ template "__alertmanager" . }}: {{ .ExternalURL }}
{{- end }}

Alertmanager configuration for plaintext email


Special config is required for Alertmanager with plaintext emails. Here's an example (with a made up email address):


templates:
- '/PATH/TO/alertmanager-plaintext.tmpl'
receivers:
  - name: 'email'
    email_configs:
      - to: 'example@domain.example'
        html: ''
        text: '{{ template "email.plaintext" . }}'
        headers:
          # custom
          Subject: '{{ template "__less_subject" . }}'

public inbox (comments and discussions)

public inbox archives

(mailing list etiquette for public inbox)
