shit.cx

Learning That Gemini Isn't HTTP

2020-11-10T19:18

In my logs, I've noticed that people are browsing to the parent directory of documents. I haven't provided any links to those paths, but there is obviously an expectation that something should be served there. I've also noticed that Gemini Portal automatically provides links to the parent and root directories, perhaps further reinforcing that expectation.

When I created this Gemlog, I didn't think at all about how the documents ought to be structured. I blindly adopted the long-established weblog pattern of putting the article date in the path. I have since come to believe that this pattern isn't appropriate for Gemlogs.

Breadcrumbs are typically used to give the web structure. Those breadcrumbs may or may not correspond with the URL path. There were some good reasons to break the association. Circa 2005, in the midst of the Web 2.0 hype, tagging was becoming popular and Google were (or were rumoured to be) punishing duplicate content: documents with more than one canonical URL. This led to a disconnect between breadcrumb and path.

In the spirit of simplicity, using the path as the breadcrumb for Gemini is a no-brainer, especially given that Gemini docs cannot place links inline, which would make displaying a breadcrumb kind of icky. The fact that URL paths on the web have lost meaning (Google even experimented with hiding them¹) shouldn't influence how data is structured in the Gemini space. It was a mistake to bloat the path with useless data. It deserves more respect than that.
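
To make the idea concrete, here is a rough Python sketch of how a client or proxy could derive parent links from nothing but the path. The function name `breadcrumb_links` is made up for illustration, and this is not how Gemini Portal or any particular client actually does it.

```python
from pathlib import PurePosixPath


def breadcrumb_links(path: str) -> list[str]:
    """Return one gemtext link line per ancestor of `path`, root first."""
    # .parents yields the nearest parent first and "/" last, so reverse it.
    parents = reversed(PurePosixPath(path).parents)
    lines = []
    for parent in parents:
        target = "/" if str(parent) == "/" else f"{parent}/"
        # Gemtext links sit on their own line: "=> URL label".
        lines.append(f"=> {target} {target}")
    return lines


for line in breadcrumb_links("/tech/devenv/2020-11-04-a-macos-linux-hybrid-part-3/"):
    print(line)
```

This only works if every ancestor path actually serves something, which is exactly the expectation I've been seeing in my logs.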

So I've been thinking about how to address this. Specifically, how posts should be classified and how to create helpful directory indexes. As an aside, I will also be looking into document heading structure — should the document or the site be the main heading? If the latter, should the document contain any metadata pertaining to its source and if so, where should that metadata go? (If you have thoughts, please email me, though it might deserve a thread on the mailing list.)

To address the first problem of document hierarchy, I will use a tree of directories that becomes more specific the deeper you go. There is nothing limiting the depth of the tree, though documents may only live at the tip of each limb. That may change in the future if the need arises.

Currently, the tree looks like this:

├── bikes
│   └── hartley
│       ├── 2020-10-31-i-picked-up-my-next-bike-build
│       └── 2020-11-01-first-look-at-the-hartley-in-daylight
├── keyboards
│   └── h0002
│       ├── 2020-10-30-a-new-keyboard-build
│       ├── 2020-11-05-keyboard-stablisers-fail
│       ├── 2020-11-07-started-making-the-keyboard-case
│       └── 2020-11-09-continuing-the-keyboard-case-dovetails
├── shed
│   └── 2020-11-06-my-shed
└── tech
    ├── devenv
    │   ├── 2020-11-01-a-macos-linux-hybrid-part-1
    │   ├── 2020-11-03-a-macos-linux-hybrid-part-2
    │   └── 2020-11-04-a-macos-linux-hybrid-part-3
    └── meta
        ├── 2020-10-30-initial-post
        ├── 2020-10-30-the-shit-cx-infra
        └── 2020-11-02-measuring-traffic-to-shit-cx

Document directories are those whose names begin with a date. Each holds the document (`index.gmi`) and its assets.

All other directories are what I'm calling container directories. Their `index.gmi` files are automatically generated and show two lists. The first recursively locates and links to every document beneath the container. The second lists the directories of `./`. As you delve deeper into the directory structure, the first list shows fewer, more relevant documents.

As an example, the /tech/index.gmi looks like this:

# shit.cx - /tech/

## Latest Posts

=> ./devenv/2020-11-04-a-macos-linux-hybrid-part-3/ [2020/11/04] a macos/linux hybrid laptop - part 3
=> ./devenv/2020-11-03-a-macos-linux-hybrid-part-2/ [2020/11/03] a macos/linux hybrid laptop - part 2
=> ./meta/2020-11-02-measuring-traffic-to-shit-cx/ [2020/11/02] measuring traffic to shit.cx
=> ./devenv/2020-11-01-a-macos-linux-hybrid-part-1/ [2020/11/01] a macos/linux hybrid laptop - part 1
=> ./meta/2020-10-30-the-shit-cx-infra/ [2020/10/30] the shit.cx infra
=> ./meta/2020-10-30-initial-post/ [2020/10/30] initial post

## Directory Index

=> ../
=> ./devenv/
=> ./meta/

The content for this site is CC-BY-SA-4.0.
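
For illustration only, here is a minimal Python sketch of how a container index like the one above might be generated. It is not the actual generator. It assumes document directories are named `YYYY-MM-DD-slug`, it uses the slug as a placeholder title (the real titles come from the documents), and it leaves document directories out of the second list since the first already links them. The names `DOC_DIR`, `is_document` and `build_index` are hypothetical.

```python
import re
from pathlib import Path

# Assumption: a document directory is named YYYY-MM-DD-slug.
DOC_DIR = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)$")


def is_document(path: Path) -> bool:
    return path.is_dir() and DOC_DIR.match(path.name) is not None


def build_index(container: Path, root: Path, site: str = "shit.cx") -> str:
    """Return the gemtext for one container directory's index.gmi."""
    rel = container.relative_to(root).as_posix()
    heading = "/" if rel == "." else f"/{rel}/"

    # First list: every document anywhere below this container, newest first.
    docs = sorted(
        (p for p in container.rglob("*") if is_document(p)),
        key=lambda p: p.name,
        reverse=True,
    )
    # Second list: the immediate child containers, plus the parent.
    children = sorted(
        p for p in container.iterdir() if p.is_dir() and not is_document(p)
    )

    lines = [f"# {site} - {heading}", "", "## Latest Posts", ""]
    for doc in docs:
        year, month, day, slug = DOC_DIR.match(doc.name).groups()
        link = doc.relative_to(container).as_posix()
        # Placeholder title: the slug with hyphens turned into spaces.
        title = slug.replace("-", " ")
        lines.append(f"=> ./{link}/ [{year}/{month}/{day}] {title}")

    lines += ["", "## Directory Index", ""]
    if container != root:
        lines.append("=> ../")
    lines += [f"=> ./{child.name}/" for child in children]
    return "\n".join(lines) + "\n"
```

Calling `build_index(root / "tech", root)` for every container and writing the result to its `index.gmi` would cover the whole tree. Sorting on the directory name works because the date prefix sorts chronologically as plain text.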

Deprecating the old URLs requires a server that supports redirects; otherwise the old URLs will break. Unfortunately, Agate² doesn't, so I must move on. Molly Brown³ does, so she will do.
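
For context, a Gemini redirect is just a response header: status 30 for a temporary redirect or 31 for a permanent one, followed by a space, the new URL, and CRLF. The Python sketch below shows what a redirect-capable server has to send for a moved path. It is not Molly Brown's configuration or code, and the old path in `REDIRECTS` is a made-up placeholder rather than one of my real old URLs.

```python
from typing import Optional

# Hypothetical old-to-new mapping; the old path format here is invented.
REDIRECTS = {
    "/old/my-shed/": "/shed/2020-11-06-my-shed/",
}


def respond(path: str, host: str = "shit.cx") -> Optional[bytes]:
    """Return a permanent-redirect header for a moved path, else None."""
    target = REDIRECTS.get(path)
    if target is None:
        return None  # not a moved path; serve it normally
    # Status 31 means "redirect - permanent"; the header ends with CRLF.
    return f"31 gemini://{host}{target}\r\n".encode("utf-8")
```

A permanent redirect is the right fit here because the old paths are never coming back.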

Expect to see this change some time in the next few days. I've got it mostly working on a branch. I just need to finish a couple of things and find a couple of hours where I'm not likely to be interrupted. I don't want to take the server down, only to be dragged away before I can stand it back up.

Unlike previous posts, I've not dumped in the code to generate the indexes. It's a bit unwieldy right now, but within the next few weeks I intend to share the Git repository.

References

¹ Google testing feature to hide parts of the URL in Chrome’s address bar

² Agate

³ Molly Brown