Fighting a Heisenbug in Agate


I don’t know whether the problem sits in layer 8, but Agate, the Gemini server I use, doesn’t always apply the correct meta. It gets it right only about 95 percent of the time. The config looks like this:


**/*[!.de-DE].gmi: ;lang=en-US
**/*.de-DE.gmi: ;lang=de-DE

These are filesystem globs, not regular expressions; they are matched against the files in the filesystem.
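
One property of glob syntax worth keeping in mind here: `[!...]` is a bracket expression that excludes a set of single characters, not a suffix. The sketch below checks a few file names against the two patterns using the shell's own `case` matching. It assumes that Agate's glob library treats bracket expressions the way the shell does, which I have not verified, and the file name `world.gmi` is made up. The `**/` directory prefix is left out because only the end of the file name matters for the bracket. The `e-D` part of the bracket forms a range whose handling may differ between implementations, so the examples stick to characters that are listed outside of it.

```bash
#!/usr/bin/env bash

# Report which of the two config patterns a file name matches, in config order.
# Assumption: shell glob rules stand in for Agate's glob implementation.
classify() {
  case "$1" in
    *[!.de-DE].gmi) echo "en-US" ;;
    *.de-DE.gmi)    echo "de-DE" ;;
    *)              echo "none" ;;
  esac
}

classify "index.gmi"        # "x" is not in the bracket set
classify "index.de-DE.gmi"  # "E" is in the set, so only the second pattern matches
classify "world.gmi"        # hypothetical name: "d" is in the set, nothing matches
```

If Agate behaves the same way, a file whose name ends in `d` right before `.gmi` would match neither rule.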


Because the set of files that Agate serves incorrectly changes after restarting the server with the same config (!), I wrote a Bash script called `check-meta.sh` to find the wrongly served files. The script walks the server’s file system, requests each of the capsule’s files from the server over the Gemini protocol using `gemget`, and logs the result. Here’s a hypothetical sample output:


20 text/gemini;lang=en-US | /index.gmi
20 text/gemini;lang=de-DE | /index.de-DE.gmi
20 text/gemini | /nasty.gmi
20 text/gemini | /nasty.de-DE.gmi
20 image/webp | /the-moon.webp
20 text/plain;lang=ia;charset=EBCDIC | /grandpas-diary.txt
20 application/atom+xml | /atom.xml
[…]

With a little command line magic, we can limit the output to find those nasty non-conformists:


check-meta.sh michaelnordmeyer.com | grep "20 text/gemini" | grep -v "text/gemini;lang"

20 text/gemini | /nasty.gmi
20 text/gemini | /nasty.de-DE.gmi
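
For a quick overview of how the served files are spread across meta values, the same output can also be tallied. This is a minimal sketch; `tally_meta` is a hypothetical helper that is not part of `check-meta.sh`, and it assumes the `<status> <meta> | <path>` line format shown above.

```bash
#!/usr/bin/env bash

# Count how often each "<status> <meta>" value occurs in check-meta.sh output
# read from stdin, most frequent first.
tally_meta() {
  awk -F' [|] ' '{ count[$1]++ } END { for (m in count) print count[m], m }' | sort -rn
}

# Sample input copied from the hypothetical output above.
printf '%s\n' \
  '20 text/gemini;lang=en-US | /index.gmi' \
  '20 text/gemini;lang=de-DE | /index.de-DE.gmi' \
  '20 text/gemini | /nasty.gmi' \
  '20 text/gemini | /nasty.de-DE.gmi' | tally_meta
```

Against the real server it would be called as `check-meta.sh michaelnordmeyer.com | tally_meta`.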

So now I have a nice little script that is useful for a couple of use cases, but I’m none the wiser as to why Agate is trolling me. Following a hunch, let’s test Agate’s behavior right after a restart and some time later with `check-meta-infinitively.sh`:


#!/usr/bin/env bash

check_meta="check-meta.sh"

if [ ${#} -ne 1 ]; then
  echo "Restarts Agate and calls ${check_meta} in an endless loop to display the number of text/gemini replies without a lang attribute"
  echo "Usage: $(basename "${0}") <example.com>"
  exit 1
fi

hostname=${1}

echo "$(date +'%Y-%m-%d %T %z') Restarting Agate..."
sudo systemctl restart agate
sleep 2

i=1

while true; do
  echo "$(date +'%Y-%m-%d %T %z') Requesting URLs..."
  "${check_meta}" "${hostname}" | grep "20 text/gemini" | grep -v "text/gemini;" > "${i}.agate"
  echo "$(date +'%Y-%m-%d %T %z') $(wc -l "${i}.agate")"
  i=$((i + 1))
done

The output:


2023-06-30 18:16:38 Restarting Agate...
2023-06-30 18:16:40 Requesting URLs...
2023-06-30 18:16:59 990 1.agate
2023-06-30 18:16:59 Requesting URLs...
2023-06-30 18:17:18 33 2.agate
2023-06-30 18:17:18 Requesting URLs...
2023-06-30 18:17:36 33 3.agate
2023-06-30 18:17:36 Requesting URLs...
2023-06-30 18:17:53 33 4.agate
2023-06-30 18:17:53 Requesting URLs...
2023-06-30 18:18:11 33 5.agate
2023-06-30 18:18:11 Requesting URLs...
<ctrl-c>

Okay, it looks like the first batch after the restart has 990 text/gemini URLs without proper meta, and from the second batch on the count drops to 33. I have 1099 Gemini files on the server, so 109 of them apparently got the correct meta even in the first batch. Maybe Agate needs a file to be requested once before it applies the meta properly. But then why 990 and not 1099, and why 33? The 33 doesn’t change after many more requests.
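
Since every batch is saved to its own file, one follow-up question is whether the 33 are a leftover of the 990 or whether some of them were served correctly in the first batch. A minimal sketch of that comparison follows; `new_failures` is a hypothetical helper, and the sample lines are made-up stand-ins for the real `1.agate` and `2.agate`.

```bash
#!/usr/bin/env bash

# Print the lines of the second result file that are missing from the first,
# i.e. files that had the correct meta in batch 1 but not in batch 2.
# -F: fixed strings, -x: whole-line match, -v: invert, -f: read patterns from file
new_failures() {
  grep -Fxv -f "$1" "$2"
}

# Hypothetical sample data standing in for 1.agate and 2.agate.
dir=$(mktemp -d)
printf '%s\n' '20 text/gemini | /a.gmi' '20 text/gemini | /b.gmi' > "${dir}/1.agate"
printf '%s\n' '20 text/gemini | /b.gmi' '20 text/gemini | /c.gmi' > "${dir}/2.agate"

new_failures "${dir}/1.agate" "${dir}/2.agate"
rm -rf "${dir}"
```

With the real files the call would be `new_failures 1.agate 2.agate`. No output would mean that every one of the 33 was already among the 990.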


Now let’s wait some time after requesting the first two batches with `check-meta-restart.sh`:


#!/usr/bin/env bash

check_meta="check-meta.sh"

if [ ${#} -ne 2 ]; then
  echo "Restarts Agate twice and calls ${check_meta} three times, with a delay between the 2nd and 3rd call, to display the number of text/gemini replies without a lang attribute"
  echo "Usage: $(basename "${0}") <example.com> <sleep-minutes>"
  exit 1
fi

hostname=${1}
sleep_minutes=${2}

for i in $(seq 1 3); do
  if [ ${i} -ne 3 ]; then
    echo "$(date +'%Y-%m-%d %T %z') Restarting Agate..."
    sudo systemctl restart agate
    sleep 2
  fi
  echo "$(date +'%Y-%m-%d %T %z') Requesting URLs..."
  "${check_meta}" "${hostname}" | grep "20 text/gemini" | grep -v "text/gemini;" > "${i}.agate"
  echo "$(date +'%Y-%m-%d %T %z') $(wc -l "${i}.agate")"
  if [ ${i} -eq 2 ]; then
    echo "$(date +'%Y-%m-%d %T %z') Sleeping for ${sleep_minutes} minutes..."
    sleep "${sleep_minutes}m"
    echo "$(date +'%Y-%m-%d %T %z') Waking up"
  fi
done

No dice. Still 33 stragglers after waiting 30 minutes. It looks like I have to find two different errors when I have the time for that. At least I can reproduce it now.


The Bash script `check-meta.sh` for checking the meta of served files:


#!/usr/bin/env bash

if [ ${#} -ne 1 ]; then
  echo "Usage: $(basename "${0}") <example.com>"
  echo "Needs 'content_directory' and 'gemget' to be configured in the script once"
  exit 1
fi

# The path from the location of this script to the content directory of the Gemini server.
# It is assumed that the content directory has virtual host subdirectories like example.com and example.net, which are the root dirs for those capsules.
content_directory="$(dirname "${0}")/../content/"

# The location of gemget. If it's in the path, `gemget` is the correct value
gemget="${HOME}/gemget/gemget"

# For people having spaces in their filenames
IFS=$'\n'

# Find all files of a capsule excluding hidden files
for file_path in $(find "${content_directory}${1}" -type f -not -name ".*"); do
  # Cut the content directory and the hostname off the front, leaving the URL path
  path=${file_path:$((${#1}+${#content_directory}))}
  "${gemget}" --header -q -o - "gemini://${1}${path}" | head -n1 | cut -d' ' -f2- | { tr -d '\n' ; echo " | ${path}"; }
done
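
The `path=` line in the script turns a file path into a URL path by skipping as many characters as the content directory and the hostname take up. The sketch below shows the same mapping with prefix removal instead of an offset, as an alternative and not as what the script does. `url_path` and the directory names are hypothetical.

```bash
#!/usr/bin/env bash

# Turn a file path into the URL path by removing the leading
# "<content_directory><hostname>" prefix.
# $1: content directory (with trailing slash), $2: hostname, $3: file path
url_path() {
  local prefix="${1}${2}"
  printf '%s\n' "${3#"${prefix}"}"
}

# Hypothetical directory layout
url_path "/srv/gemini/content/" "example.com" "/srv/gemini/content/example.com/blog/post.gmi"
# prints /blog/post.gmi
```

Unlike the offset, prefix removal leaves the path unchanged if the prefix doesn’t match, which makes a wrong `content_directory` easier to spot.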
