I don’t know whether the problem sits in layer 8, but Agate, the Gemini server I use, doesn’t apply the correct meta every time, only 95 percent of the time. The config looks like this:
```
**/*[!.de-DE].gmi: ;lang=en-US
**/*.de-DE.gmi: ;lang=de-DE
```
These are filesystem globs, not regular expressions. They are matched against the paths of files in the filesystem.
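Since globs and regular expressions are easy to mix up, here is a minimal sketch of how a bracket expression behaves in Bash pattern matching. In glob syntax, `[!…]` matches exactly one character that is not in the listed set. It does not negate a whole substring. The function name and the file names are made up for illustration, and the class is written as `[!.deDE-]` with the hyphen last so it is taken literally. That is an assumption to sidestep range handling, which may differ between glob implementations, Agate's included. Whether any of this is related to the behavior described in this post is not established.

```bash
#!/usr/bin/env bash
# Sketch: a negated bracket expression matches ONE character, not a substring.
# Assumption: [!.deDE-] stands in for the class from the config; the hyphen is
# placed last so that it is literal and no range is formed.

# Prints "match" if the name matches *[!.deDE-].gmi, otherwise "no match".
matches_negated_class() {
  case "${1}" in
    *[!.deDE-].gmi) echo "match" ;;
    *) echo "no match" ;;
  esac
}

# Hypothetical file names, chosen by their last character before ".gmi".
for name in index.gmi index.de-DE.gmi archive.gmi road.gmi; do
  printf '%-16s %s\n' "${name}" "$(matches_negated_class "${name}")"
done
```

In this sketch only `index.gmi` matches. The other three names end in `E`, `e` and `d` before the `.gmi` suffix, and those characters are all in the excluded set.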
Because the set of files Agate serves incorrectly changes after restarting the server with the same config (!), I wrote a Bash script called `check-meta.sh` to find the wrongly served files. The script walks the server’s file system, requests each of the capsule’s files from the server over the Gemini protocol using `gemget`, and logs the result. Here’s a hypothetical sample output:
```
20 text/gemini;lang=en-US | /index.gmi
20 text/gemini;lang=de-DE | /index.de-DE.gmi
20 text/gemini | /nasty.gmi
20 text/gemini | /nasty.de-DE.gmi
20 image/webp | /the-moon.webp
20 text/plain;lang=ia;charset=EBCDIC | /grandpas-diary.txt
20 application/atom+xml | /atom.xml
[…]
```
With a little command line magic, we can limit the output to find those nasty non-conformists:
```
check-meta.sh michaelnordmeyer.com | grep "20 text/gemini" | grep -v "text/gemini;lang"
```

```
20 text/gemini | /nasty.gmi
20 text/gemini | /nasty.de-DE.gmi
```
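The same filter can be sketched as a single `awk` call that only looks at the meta part of each line, so a path that happens to contain `text/gemini` cannot confuse it. This is an alternative, not what the scripts in this post use. The function name `filter_missing_lang` is made up, and it assumes the `<status> <meta> | <path>` line format shown above.

```bash
#!/usr/bin/env bash
# Sketch: keep only text/gemini responses whose meta has no lang parameter.
# Assumption: input lines look like "<status> <meta> | <path>".

filter_missing_lang() {
  # Split on " | "; $1 is "<status> <meta>". Matching lines are printed whole.
  awk -F' [|] ' '$1 ~ /^20 text\/gemini(;|$)/ && $1 !~ /;lang=/'
}

# Demo with lines taken from the sample output.
printf '%s\n' \
  '20 text/gemini;lang=en-US | /index.gmi' \
  '20 text/gemini | /nasty.gmi' \
  '20 text/gemini | /nasty.de-DE.gmi' \
  '20 image/webp | /the-moon.webp' |
  filter_missing_lang
```

Usage would be `check-meta.sh michaelnordmeyer.com | filter_missing_lang`.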
So now I have a nice little script that is useful for a couple of use cases, but I’m none the wiser as to why Agate is trolling me. Following a hunch, let’s test Agate’s behavior right after a restart and some time later with `check-meta-infinitively.sh`:
```
#!/usr/bin/env bash

check_meta="check-meta.sh"

if [ ${#} -ne 1 ]; then
  echo "Restarts Agate and calls ${check_meta} infinitively to display the number of text/gemini replies without a lang attribute"
  echo "Usage: $(basename "${0}") <example.com>"
  exit 1
fi

hostname=${1}

echo "$(date +'%Y-%m-%d %T %z') Restarting Agate..."
sudo systemctl restart agate
sleep 2

# Request all files over and over; batch i is written to <i>.agate
i=1
while true; do
  echo "$(date +'%Y-%m-%d %T %z') Requesting URLs..."
  "${check_meta}" "${hostname}" | grep "20 text/gemini" | grep -v "text/gemini;" > "${i}.agate"
  echo "$(date +'%Y-%m-%d %T %z') $(wc -l "${i}.agate")"
  i=$((i + 1))
done
```
The output:
```
2023-06-30 18:16:38 Restarting Agate...
2023-06-30 18:16:40 Requesting URLs...
2023-06-30 18:16:59 990 1.agate
2023-06-30 18:16:59 Requesting URLs...
2023-06-30 18:17:18 33 2.agate
2023-06-30 18:17:18 Requesting URLs...
2023-06-30 18:17:36 33 3.agate
2023-06-30 18:17:36 Requesting URLs...
2023-06-30 18:17:53 33 4.agate
2023-06-30 18:17:53 Requesting URLs...
2023-06-30 18:18:11 33 5.agate
2023-06-30 18:18:11 Requesting URLs...
<ctrl-c>
```
Okay, it looks like the first batch has 990 text/gemini URLs without proper meta, and after that the number drops to 33. I have 1099 Gemini files on the server. Maybe Agate needs a file to be requested once before it applies the meta properly. But then why 990 and not all 1099, and why do 33 remain? The 33 doesn’t change, no matter how many more batches are requested.
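A count alone doesn't say whether it is the same 33 files every time. A minimal sketch for checking that is to intersect two of the batch files written by the loop. The function name `persistent_failures` is made up, and it assumes each path appears at most once per batch file.

```bash
#!/usr/bin/env bash
# Sketch: print the lines that two batch files have in common.
# Assumption: every line is "<status> <meta> | <path>" and each path
# appears at most once per file.

persistent_failures() {
  # De-duplicate each file on its own first, so that any line occurring
  # twice in the combined stream must be present in both files.
  { sort -u "${1}"; sort -u "${2}"; } | sort | uniq -d
}

# Example call (uncomment once the batch files exist):
# persistent_failures 2.agate 3.agate | wc -l
```

If the count printed for `2.agate` and `3.agate` equals the size of each file, the stragglers are the same files in both batches, which would point at something about those files rather than at timing.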
Now let’s wait some time after requesting the first two batches with `check-meta-restart.sh`:
```
#!/usr/bin/env bash

check_meta="check-meta.sh"

if [ ${#} -ne 2 ]; then
  echo "Restarts Agate twice and calls ${check_meta} three times with a delay between the 2nd and 3rd to display the number of text/gemini replies without a lang attribute"
  echo "Usage: $(basename "${0}") <example.com> <sleep-minutes>"
  exit 1
fi

hostname=${1}
sleep_minutes=${2}

for i in $(seq 1 3); do
  # Restart before the 1st and 2nd batch only
  if [ "${i}" -ne 3 ]; then
    echo "$(date +'%Y-%m-%d %T %z') Restarting Agate..."
    sudo systemctl restart agate
    sleep 2
  fi

  echo "$(date +'%Y-%m-%d %T %z') Requesting URLs..."
  "${check_meta}" "${hostname}" | grep "20 text/gemini" | grep -v "text/gemini;" > "${i}.agate"
  echo "$(date +'%Y-%m-%d %T %z') $(wc -l "${i}.agate")"

  # Wait between the 2nd and 3rd batch
  if [ "${i}" -eq 2 ]; then
    echo "$(date +'%Y-%m-%d %T %z') Sleeping for ${sleep_minutes} minutes..."
    sleep "${sleep_minutes}m"
    echo "$(date +'%Y-%m-%d %T %z') Waking up"
  fi
done
```
No dice. Still 33 stragglers after waiting 30 minutes. It looks like I have to track down two different errors when I find the time. At least I can reproduce it now.
The Bash script `check-meta.sh` for checking the meta of served files:
```
#!/usr/bin/env bash

if [ ${#} -ne 1 ]; then
  echo "Usage: $(basename "${0}") <example.com>"
  echo "Needs 'content_directory' and 'gemget' to be configured in the script once"
  exit 1
fi

# The path from the location of this script to the content directory of the Gemini server.
# It is assumed that the content directory has virtual host subdirectories like example.com and example.net, which are the root dirs for those capsules.
content_directory="$(dirname "${0}")/../content/"

# The location of gemget. If it's in the path, `gemget` is the correct value
gemget="${HOME}/gemget/gemget"

# For people having spaces in their filenames
IFS=$'\n'

# Find all files of a capsule excluding hidden files
for file_path in $(find "${content_directory}${1}" -type f -not -name ".*"); do
  # Strip the content directory and the hostname to get the URL path
  path=${file_path:$((${#1}+${#content_directory}))}
  "${gemget}" --header -q -o - "gemini://${1}${path}" | head -n1 | cut -d' ' -f2- | { tr -d '\n' ; echo " | ${path}"; }
done
```
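The substring expansion that computes `path` is probably the least obvious line of the script, so here is a small worked example of it. The directory, hostname and file name are assumed values for illustration and don't come from the real setup. The prefix-stripping variant is shown only as an equivalent for comparison.

```bash
#!/usr/bin/env bash
# Sketch: turning a path found on disk into the path part of a Gemini URL.
# All three values below are assumed examples.
content_directory="./../content/"
host="example.com"
file_path="./../content/example.com/blog/post.gmi"

# Offset = length of the content directory plus length of the hostname.
offset=$(( ${#content_directory} + ${#host} ))

# Variant used in check-meta.sh: cut everything before the offset.
path_by_offset=${file_path:${offset}}

# Equivalent for this input: strip the literal prefix.
path_by_prefix=${file_path#"${content_directory}${host}"}

echo "${offset} ${path_by_offset} ${path_by_prefix}"
```

Both variants yield `/blog/post.gmi` for this input. The offset version relies on `find` printing every result with exactly the prefix it was given as its starting point.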