Trailing slashes and relative URLs


I discovered a bug in my own gemini server implementation (and in my own gemini client implementation as well, but more about that later). My server did not distinguish between a request URL that ended with a slash ("/") and one that did not. If an existing resource was requested, the server simply returned it with status code 20 in both cases: with and without the trailing slash. This caused a problem whenever the requested resource contained a link with a relative URL.


For example, the resource at `gemini://henn.es/gemlog/` is an overview of my gemlog entries. Each gemlog entry is listed as a link with a relative URL, like this:


=> en/2020-12-23-hello-world.gmi 2020-12-23 Hello world

If a client follows the link, it has to build the absolute URL for that link first (and use that absolute URL for the next request to the server). You need a base URL for a given relative URL to get an absolute URL:


f(base URL, relative URL) = absolute URL


The base URL that a client will use is most likely the previously requested URL. That's the URL of the resource that contains the link with the relative URL. In my example that's the overview page for my gemlog: `gemini://henn.es/gemlog/`.


The function `f` is the algorithm to resolve a relative URL. This algorithm is described in

RFC 1808 - Relative Uniform Resource Locators.


To understand why a trailing slash matters, you have to read page 11 of RFC 1808:


> Step 6: The last segment of the base URL's path (anything following the rightmost slash "/", or the entire path if no slash is present) is removed and the embedded URL's path is appended in its place. [...]


Let's see what my server did when the gemlog overview page was requested with and without a trailing slash.


Case 1 - with trailing slash:

If the overview page was requested by `gemini://henn.es/gemlog/`, a gemini client would resolve the relative URL from the link to `gemini://henn.es/gemlog/en/2020-12-23-hello-world.gmi`.

That's good! :-)


Case 2 - without trailing slash:

If the overview page was requested as `gemini://henn.es/gemlog`, a gemini client would resolve the relative URL from the link to `gemini://henn.es/en/2020-12-23-hello-world.gmi`.

The `/gemlog` segment after `gemini://henn.es` would be missing because it was removed by step 6 of the algorithm described in RFC 1808.

That's not good! :-( This resource doesn't exist. The link cannot be resolved. It's broken.
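
Both cases can be reproduced with a small Python sketch of step 6. This is not the code of my server or client, only an illustration: the function name `resolve_step6` is made up, and it deliberately handles nothing but a plain relative path (no scheme, no leading "/", no "." or ".." segments, no query).

```python
def resolve_step6(base_url: str, relative_path: str) -> str:
    """Simplified sketch of RFC 1808 step 6 for a plain relative path."""
    scheme, rest = base_url.split("://", 1)
    host, slash, path = rest.partition("/")
    path = slash + path  # "", "/gemlog" or "/gemlog/"

    # Remove the last segment of the base path: keep everything up to and
    # including the rightmost "/". An empty base path is treated as "/".
    directory = path[: path.rfind("/") + 1] or "/"

    return f"{scheme}://{host}{directory}{relative_path}"


link = "en/2020-12-23-hello-world.gmi"
print(resolve_step6("gemini://henn.es/gemlog/", link))  # case 1
print(resolve_step6("gemini://henn.es/gemlog", link))   # case 2
```

With the trailing slash, the last segment of the base path is empty, so nothing is lost. Without it, the last segment is `gemlog`, and that is exactly the part that gets thrown away.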


What was wrong? My gemini server shouldn't have responded with status code 20 and the overview page as response body when the trailing slash in the URL was missing.


To fix the broken link, I programmed my server to respond with a REDIRECT (status code 31) to the right URL whenever a trailing slash is missing in the requested URL. This seems to be common practice.


Additionally, my server also responds with a REDIRECT if there is one trailing slash too many. For example, `gemini://henn.es/gemlog/index.gmi/` will redirect to `gemini://henn.es/gemlog/index.gmi`.
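
The decision behind both redirects can be sketched in a few lines of Python. This is a simplified illustration, not my actual server code: the names `redirect_target` and `response_header` are made up, and the flag `is_directory` stands for whatever lookup a server does to find out whether the requested path is a directory or a file. In the Gemini specification the 3x status codes are redirects, and 31 is the permanent one.

```python
from typing import Optional


def redirect_target(url: str, is_directory: bool) -> Optional[str]:
    """Return the URL to redirect to, or None if the URL is fine as it is."""
    if is_directory and not url.endswith("/"):
        return url + "/"        # trailing slash is missing
    if not is_directory and url.endswith("/"):
        return url.rstrip("/")  # trailing slash is too much
    return None


def response_header(url: str, is_directory: bool) -> str:
    """Build the response header line (the MIME type is fixed for brevity)."""
    target = redirect_target(url, is_directory)
    if target is not None:
        return f"31 {target}\r\n"
    return "20 text/gemini\r\n"


print(repr(response_header("gemini://henn.es/gemlog", is_directory=True)))
print(repr(response_header("gemini://henn.es/gemlog/index.gmi/", is_directory=False)))
```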


... and a bug in my client


The second bug was in my own client. When it received a REDIRECT response, it requested the new resource before closing the old connection. That's not very nice, and it can also cause a timeout if the server has no connection available. My server is still very simple and works with only one single thread at the moment. Maybe you can imagine what happened. :-) My client wasn't able to handle a REDIRECT response from my own server properly: the server was still busy with the old connection and could not accept the new one. Other servers were no problem because they can handle multiple requests in parallel.


The fix was to close the old connection first and to perform the redirection after that.
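
A minimal Python sketch of that order of operations could look like the following. It is not my real client: the names `fetch` and `follow_redirects` and the limit of 5 redirects are assumptions for illustration, certificate verification is switched off only to keep the sketch short, and the redirect target is assumed to be an absolute URL.

```python
import socket
import ssl
from urllib.parse import urlsplit


def fetch(url: str):
    """Send one request and return (status, meta, body).

    Both `with` blocks end before the function returns, so the
    connection is always closed before the caller sees the response.
    """
    parts = urlsplit(url)
    context = ssl.create_default_context()
    context.check_hostname = False       # assumption: no certificate checks,
    context.verify_mode = ssl.CERT_NONE  # only to keep the sketch short
    address = (parts.hostname, parts.port or 1965)
    with socket.create_connection(address, timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=parts.hostname) as tls:
            tls.sendall((url + "\r\n").encode("utf-8"))
            data = b""
            while chunk := tls.recv(4096):
                data += chunk
    header, _, body = data.partition(b"\r\n")
    status, _, meta = header.decode("utf-8").partition(" ")
    return status, meta, body


def follow_redirects(url: str, fetch=fetch, max_redirects: int = 5):
    """Follow 3x responses, one closed connection after the other."""
    for _ in range(max_redirects + 1):
        status, meta, body = fetch(url)  # old connection is closed here
        if not status.startswith("3"):
            return url, status, meta, body
        url = meta  # assumption: the redirect target is an absolute URL
    raise RuntimeError("too many redirects")
```

Because `fetch` only returns after its connection is closed, even a single-threaded server is free again by the time the redirected request arrives.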
