
On Gemtext Specification 0.24.0


A new version of the Gemtext specification is out, and I decided to comment on it publicly rather than privately, because only a public comment lets more people join the discussion.


Gemtext Specification 0.24.0


First, I have to say that I’m far more interested in Gemtext than in the Gemini network protocol, because Gemtext is the thing I actively use, while the protocol works diligently in the background. And I love writing and reading Gemtext. So my critique comes from a place of love, and from the perspective of a user with a developer’s background who takes the shift from server-side to client-side styling seriously.


I Was Mistaken – Gemini is Awesome


That’s why you will find only suggestions for improvement in here: Gemtext is already a great thing.


Missing Parts


When looking at the spec, the most obvious gap is the missing fourth-level heading, because the spec document itself uses the first-level heading many times. The first-level heading should only be used once, as a title, if we care about semantics. And we should.


That’s why I propose a fourth-level heading.


Judging from my own writing, where I use third-level headings often and fourth-level headings occasionally, but never fifth- or sixth-level headings, I think this is warranted and takes little effort for all involved. It won’t break anything, because old clients would just display a visible `#` in front of the text of the unsupported fourth-level heading.
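
To illustrate that fallback, here is a minimal Python sketch of how a client that only knows three heading levels might treat a fourth-level heading. The function name `parse_heading_legacy` and the way it strips whitespace are assumptions for illustration; real clients may behave differently.

```python
def parse_heading_legacy(line: str):
    """Parse a heading the way a three-level-only client might (hypothetical)."""
    for level in (3, 2, 1):  # try the longest token first
        token = "#" * level
        if line.startswith(token):
            # Everything after the token counts as heading text.
            return level, line[level:].lstrip()
    return None  # not a heading line


print(parse_heading_legacy("#### Formal Errors"))  # (3, '# Formal Errors')
```

The extra `#` ends up in the heading text, so the line is still shown as a heading, just one level too high and with a stray character.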


It is also hard to render fifth- and sixth-level headings properly, because they tend to get too small. I only see them in very large documents.


I know three levels is the sweet spot. I almost needed a fourth-level heading below to subdivide Formal Errors, but I deleted its second section and got away without one.


Improvable Parts


The following points are only subtle changes to the spec.


Formal Errors


I’m a beginner at augmented BNF, but judging from RFC 5234 “3.6. Variable Repetition”, `gemtext-document = *1gemtext-line` means a document is only a Gemtext document if it has 0 or 1 Gemtext lines. It should be `1*gemtext-line`, meaning one or more Gemtext lines.
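
The difference between the two prefixes can be made concrete with a small Python sketch that turns an ABNF repetition prefix into a minimum and maximum count. The helper `parse_repeat` is hypothetical and only covers the `<a>*<b>` and bare `<n>` forms from RFC 5234.

```python
def parse_repeat(prefix: str):
    """Return (minimum, maximum) for an ABNF repetition prefix; None means unbounded."""
    if "*" in prefix:
        low, high = prefix.split("*", 1)
        # A missing lower bound defaults to 0, a missing upper bound to infinity.
        return (int(low) if low else 0, int(high) if high else None)
    # A bare number n means exactly n repetitions.
    return (int(prefix), int(prefix))


for prefix in ("*1", "1*", "1*4"):
    print(prefix, parse_repeat(prefix))
```

Here `*1` yields `(0, 1)`, while `1*` yields `(1, None)`, which is the "one or more" reading the document rule needs.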


RFC 5234 3.6. Variable Repetition

RFC 5234 in text format


I’ve already informed Solderpunk about this.


Line Type Token Delimiters


Gemtext strives to be easily parseable, but makes it harder than it should be. While a parser can currently check the beginning of strings for line types, like `=>`, `#`, `##`, `###`, `* `, and `>`, from a developer’s perspective as well as a user’s perspective I would like all line type tokens to have a single mandatory space character acting as a delimiter after them. No tabs, but only a single space. Easy to understand, easy to write, easy to read in TUI clients, easy to parse.


Currently only the list token has a mandatory space, and it is fair to ask why the space matters for lists, but not for headings or quotes.


Geopard, a GNOME Gemini browser, renders the space after the quote token, which means users have to write `>A quote.` instead of `> A quote.` to avoid the extraneous space in front of the quote. Spaces after the heading token are not rendered, and probably nobody ever wants them rendered, but the behavior should be the same for both line types. Even plain-text email replies have a space after the quote character.


From the spec-writing and spec-reading perspective, something like `=> `, `# `, `## `, `### `, `* `, and `> ` looks much more consistent and removes any doubt about what should be rendered and what should not. Depending on parsing circumstances, it is helpful to use a `split` function to separate the token from the content at a single known delimiter. Delimiters are your friends, and you can watch them at work in the pseudocode below.


The preformatted line toggle (three backticks) is an exception, because it is a rendering no-op and just toggles the render mode.


preformat = false

for line in document.getLines()
  if line.startsWith(lineToken.preformat)
    // The toggle line renders nothing; it only switches the mode.
    preformat = !preformat
    continue
  if preformat
    renderPreformat(line)
  else
    // split(" ", 2) stands for splitting at the first space only,
    // so [1] holds the complete content after the token.
    if line.startsWith(lineToken.link)
      renderLink(line)
    else if line.startsWith(lineToken.heading)
      renderHeading(line.split(" ", 2)[1])
    else if line.startsWith(lineToken.list)
      renderList(line.split(" ", 2)[1])
    else if line.startsWith(lineToken.quote)
      renderQuote(line.split(" ", 2)[1])
    else
      renderText(line)
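
The pseudocode above could be made runnable roughly as follows. This is a sketch in Python under the assumption that every line type token is followed by exactly one space, as proposed here; the `TOKENS` table, the `parse` generator, and the `(kind, content)` tuples are illustrative names, not part of any spec or client.

```python
FENCE = "`" * 3  # the preformat toggle token (three backticks)

# Tokens that, under the proposal, must be followed by a single space.
TOKENS = {"=>": "link", "#": "heading", "##": "heading",
          "###": "heading", "*": "list", ">": "quote"}


def parse(document: str):
    """Yield (kind, content) pairs for each rendered line of a Gemtext document."""
    preformat = False
    for line in document.split("\n"):
        if line.startswith(FENCE):
            preformat = not preformat  # the toggle line itself renders nothing
            continue                   # go on with the next line
        if preformat:
            yield "preformat", line
            continue
        # Split at the first space only: token before it, content after it.
        token, space, content = line.partition(" ")
        kind = TOKENS.get(token) if space else None
        if kind:
            yield kind, content
        else:
            yield "text", line


sample = "\n".join(["# Title", "> A quote.", ">No space", FENCE, "* raw", FENCE])
for kind, content in parse(sample):
    print(kind, repr(content))
```

With a mandatory delimiter, `>No space` is simply a text line, and a single `partition` call replaces the per-type prefix checks.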

<whitespace>


The spec uses `<whitespace>` for delimiting the parts of the link line type, and `<whitespace>` includes tabs as well. The textual description disagrees with the formal grammar:


>All lines beginning with the two characters "=>" are link lines.

>Link lines have the following syntax: `=>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]`

><whitespace> is any non-zero number of consecutive spaces or tabs


vs.


>link-line = "=>" *SP URI-reference [1*SP 1*(SP / VCHAR)] *SP CRLF


The formal grammar doesn’t mention tabs (`HTAB` in augmented BNF) at all, which probably has to be corrected one way or the other, but I argue against allowing more than one consecutive space, and against allowing tabs, altogether.
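
The mismatch can be demonstrated with two regular expressions, one per definition. This is a Python sketch, not a faithful implementation of either text: `URI-reference` is approximated as a run of visible ASCII characters, and the example URLs are made up.

```python
import re

# Prose description: "=>", optional spaces or tabs, URL, optional whitespace plus link name.
PROSE = re.compile(r"=>[ \t]*([^ \t]+)(?:[ \t]+(.+))?")

# Formal grammar: "=>" *SP URI-reference [1*SP 1*(SP / VCHAR)] *SP
# URI-reference is simplified to 1*VCHAR (%x21-7E) here; that is an assumption.
GRAMMAR = re.compile(r"=> *([!-~]+)(?: +([ -~]+))? *")

with_spaces = "=> gemini://example.org/ Home"
with_tabs = "=>\tgemini://example.org/\tHome"

for line in (with_spaces, with_tabs):
    print(repr(line), bool(PROSE.fullmatch(line)), bool(GRAMMAR.fullmatch(line)))
```

The space-separated line matches both patterns, while the tab-separated line only matches the prose pattern, so a client written from the grammar would reject a line that the description allows.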


Because Gemtext is not a fixed-width format, but has a flow-like structure, it should only use a single space character for separating the parts. While some prefer to have their links aligned in source view, padding ultimately fails to achieve this for long links with long link texts:


=> gemini://my-fancy-username.on-an-awesome-bsd-pubnix.withacooltopleveldomain/en/gemlog/2024/04/01/an-awesome-or-not-april-fools-joke.gmi New Gemini Spec Will Add JavaScript Capability to Appease Complaining Ad-Tech Companies
=> /                                                                                                                                       Home

If the benefit of an option is very limited, it’s better to simplify it by not having this option.


Unnecessary Parts


Gemtext is very lean and has no unnecessary parts. But when I started using Gemtext, I wanted to have a horizontal line to separate the footer from the content, like Markdown does with `---`. Easy to understand, easy to write, easy to read in TUI clients, easy to parse, easy to draw. But it violates the idea of leaving the styling to the client. It’s extraneous content, or, better, no content at all. I think it is a bad idea to introduce those kinds of features to Gemtext, even if it is tempting and easy to do.


A Proposal


Based on the above remarks, I propose the following changes for Gemtext specification 0.24.1, including using the suffix `-line` for every rule in the ABNF to emphasize the line-based approach:


gemtext-document      = 1*gemtext-line
gemtext-line          =  text-line / link-line / heading-line / list-line
gemtext-line          =/ quote-line / preformat-toggle-line
link-line             = "=>" SP URI-reference [SP 1*(VCHAR / SP)] CRLF
heading-line          = 1*4"#" SP text-line
list-line             = "*" SP text-line
quote-line            = ">" SP text-line
preformat-toggle-line = "```" text-line
text-line             = *(VCHAR / SP) CRLF

A rule named `list-item` in particular is out of place, because in a line-based approach lists have neither a beginning nor an end.


If repeating the `-line` suffix feels like too much for some line types, a concise version without it is an alternative:


gemtext-document = 1*gemtext-line
gemtext-line     = text / link / heading / list / quote / preformat-toggle
link             = "=>" SP URI-reference [SP 1*(VCHAR / SP)] CRLF
heading          = 1*4"#" SP text
list             = "*" SP text
quote            = ">" SP text
preformat-toggle = "```" text
text             = *(VCHAR / SP) CRLF
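
To get a feel for what the proposed rules accept, they can be translated into regular expressions. The following Python sketch does that for single lines without their line ending. It takes the rules literally, so text is limited to printable ASCII (`VCHAR` and `SP`), and `URI-reference` is simplified to a run of `VCHAR`; the `classify` helper and the rule order are assumptions for illustration.

```python
import re

FENCE = "`" * 3          # the preformat toggle token (three backticks)
TEXT = r"[ -~]*"         # *(VCHAR / SP), exactly as the proposal is written

# Specific line types are tried first; "text" is the fallback.
RULES = [
    ("preformat-toggle", re.compile(re.escape(FENCE) + TEXT)),
    ("link", re.compile(r"=> [!-~]+(?: [ -~]+)?")),   # URI-reference simplified
    ("heading", re.compile(r"#{1,4} " + TEXT)),
    ("list", re.compile(r"\* " + TEXT)),
    ("quote", re.compile(r"> " + TEXT)),
    ("text", re.compile(TEXT)),
]


def classify(line: str):
    """Return the name of the first proposed rule the line matches, or None."""
    for name, pattern in RULES:
        if pattern.fullmatch(line):
            return name
    return None


for line in ("#### Deep heading", "#No space", "> A quote.", "=>\t/home"):
    print(repr(line), classify(line))
```

The sketch does not track whether a preformatted block is open, and a line containing a tab matches no rule at all, because the proposed `text` rule has no `HTAB`.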

The ABNF can be validated if all rules it uses from the referenced RFCs are included, without duplicates. I collected them in a downloadable text file.


ABNF Validator

Full ABNF for Validation


Note: The referenced RFCs relax the demands for `CRLF` and allow `LF` in general as well. Only the request and first response line in the Gemini protocol must end with `CRLF`.
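
A client that follows this note might split a document body on either line ending. The Python sketch below shows one way to do that; `split_lines` is a hypothetical helper, and it assumes the body has already been decoded to a string.

```python
import re


def split_lines(text: str):
    """Split a Gemtext body into lines, accepting both CRLF and bare LF endings."""
    lines = re.split(r"\r?\n", text)
    if lines and lines[-1] == "":
        lines.pop()  # a trailing line ending does not start another line
    return lines


print(split_lines("# Title\r\nFirst line\nSecond line\n"))
```

Splitting on an explicit pattern avoids `str.splitlines()`, which also breaks on characters such as form feed that are not line endings here.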


-- Page fetched on Tue May 7 03:35:15 2024