2024-05-11 Why Not Just Improve Gemini? Scroll's Approach in Diverging From Gemini

To explore this question, I must link to a video on platform values that I have linked to in a previous post:

Platform as a Reflection of Values by Bryan Cantrill

The suggestion that I should improve Gemini or use Markdown instead of creating my own stuff echoes some of the same criticisms that Gemini faced early on, many of which are answered well in its FAQ. Everything in Gemini's FAQ regarding the creation of a new protocol applies here as well, except that the question is no longer "why not just use HTML?" but "why not just use Gemini?"

Instead, I'm going to take a different approach based on *community values* and implementation inconsistency. I tried to emphasize the community values aspect in my post on Scroll's diverging values, but perhaps it wasn't successful, so I will continue to expand on this idea:

2024-05-06 Diverging Values and the Scroll Protocol

I should note that this post was prompted by a well-reasoned critique of the above linked post, and I intend to answer many of its questions:

breakfast_champion's Response to Diverging Values and the Scroll Protocol

Gemini Can Handle Any Document Type

It is true that Gemini can handle any document type. In practice, however, a document type is only usable if browsers (or document viewers) and servers support it. When I tried to emphasize "Platform as a Reflection of Values," I was implying that the community and its values drive the document formats that are used and supported. Gemini's values lean heavily toward minimalism, and let's say over half of its community agrees with that (I do not know the real stats here, but a lot of people do seem to agree with the heavy focus on minimalism). Markdown has not become widespread within Geminispace because it is not minimalist enough for a community whose driving value is minimalism (and because Markdown parsing is much harder to do).

This has led to two things, however:

1. Many people leave or never join Gemini because its document format is *too* minimalist. Their values do not match the community's values.

2. Many people within Geminispace have to find workarounds to fit what they need into Gemtext. They may have some values that match the community's, but not all of them match.

Evidence of this is the lack of new document formats within Geminispace, and the perception among outsiders that the use of other document formats wouldn't be welcome. None of this is based on the technicals of Gemini, but rather on the community's values. For document formats to be used, for them to be added to existing Gemini browsers, for them to be supported on servers, they must be embraced by the community itself.

An example of this is Markdown. Most browsers don't support Markdown completely, and Profectus doesn't even support all of Markdown (inline links aren't implemented yet). There is in fact a reason for this, but it also suggests that there are limits to how far the community will go in supporting other document formats.

Lastly, other document formats cannot be used as the index without servers adding support for them.

Why Not Use Markdown?

Gemini didn't use Markdown for several reasons relating to wanting a line-based document format that is easy to parse by looking at roughly the first 3-4 characters of a line. Scroll takes a similar approach, but there are more reasons to not prefer Markdown.
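As a rough illustration of what "looking at the first few characters of a line" means in practice, here is a minimal sketch of a Gemtext line classifier. The function name and the returned labels are made up for this example; only the line prefixes themselves come from Gemtext.

```python
def classify_line(line: str, in_preformatted: bool = False) -> str:
    """Classify one Gemtext line by inspecting only its first few characters.

    The labels returned here are illustrative names, not terms from any spec.
    """
    if line.startswith("```"):
        return "preformat_toggle"
    if in_preformatted:
        # Inside a preformatted block every other line is taken verbatim.
        return "preformatted"
    if line.startswith("=>"):
        return "link"
    # Check the longest heading prefix first so "###" is not seen as "#".
    if line.startswith("###"):
        return "heading3"
    if line.startswith("##"):
        return "heading2"
    if line.startswith("#"):
        return "heading1"
    if line.startswith("* "):
        return "list_item"
    if line.startswith(">"):
        return "quote"
    return "text"
```

The only state carried between lines is whether a preformatted block is open. Every other decision is made from at most three characters of the current line, with no need to look at the lines before or after it.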

List Nesting is Overly Complicated

In Markdown, you nest list items by indenting them with whitespace. Nesting under a nested item therefore takes even more indentation. This is good for those who read Markdown in plain text, but reading in plain text isn't exactly useful for browsers that are meant to display markup. It is also easy for writers. However, it's much harder for parsers to figure this out, and there can in fact be ambiguities (which will be discussed later).

A simpler solution that is just as easy to read and write is what AsciiDoc does, which I've talked about here:

2024-03-25 The Simplicity of List Nesting: How AsciiDoc Does It
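To show why the AsciiDoc approach is easier on parsers, here is a small sketch that assumes AsciiDoc-style markers, where the nesting level is simply the number of asterisks at the start of the line. The function name is hypothetical, and this is not claimed to be scrolltext's exact syntax.

```python
def list_depth(line: str) -> int:
    """Return the nesting depth of an AsciiDoc-style list line.

    "* item" is depth 1, "** item" is depth 2, and so on.
    Returns 0 when the line is not a list item.
    """
    stripped = line.lstrip("*")
    depth = len(line) - len(stripped)
    # Require a space after the markers so "**bold**" is not a list item.
    if depth > 0 and stripped.startswith(" "):
        return depth
    return 0
```

No indentation has to be measured and no earlier line has to be remembered: the depth is known from the current line alone, which keeps the format line-based.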

Too Many Heading Variants

Markdown has too many unnecessary heading variants:

Part of this was to ease transition to Markdown from other markup languages, but it is unnecessary, creates more work for parsers, breaks the idea of a line-based parser, and creates non-intuitive interactions with text reflowing.

Text Reflowing and Thematic Breaks

Text reflowing in Markdown is one of the most unnecessary parts of Markdown; it gets you very little benefit for terrible complexity and ambiguity. It turns out hard-wrapping lines allows for poetry, ascii art, and other textual art, while still retaining the semantics of paragraphs and paragraph breaks. It is also very intuitive.

In Markdown, reflowing text behaves in weird ways when we get to list items and blockquotes. To fully detach from a blockquote, one must put two newlines after it. If you placed one newline, the text reflows.

I *frequently* trip up on this while writing Markdown, because it isn't intuitive at all.

Additionally, it turns out thematic breaks use the same syntax as headings, but they are distinguished based on whether there's an empty-ish line above it. I say empty-ish because if it is nested inside blockquotes, then it should look like this:

I ended up learning about the complexities of thematic breaks in Markdown when helping out with Markdown syntax highlighting for the Zed editor. I wrote a comment on an issue here about this:

Markdown Parsing for Zed

To demonstrate the complexities of Markdown, I will post a snippet of Markdown:

Markdown uses the exact same syntax for thematic breaks as for headings. Some variants of Markdown "fixed" this by adding an *additional variant* of thematic breaks that look like this:

So now there are two variants of thematic breaks *and* two variants of heading syntax. The craziness is crazying! Thank God Gemini didn't use Markdown!
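The ambiguity described above can be sketched in a few lines. This is a deliberately simplified model, not a conforming Markdown implementation: it assumes the line of dashes sits in the same blockquote as the line above it, and it ignores lists, code blocks, and every other rule a real parser must apply. Both function names are invented for the example.

```python
def is_emptyish(line: str) -> bool:
    """True if nothing remains after removing leading blockquote markers and spaces."""
    return line.lstrip("> \t").strip() == ""


def classify_dashes(previous_line: str) -> str:
    """Decide what a line consisting of '---' means, given the line above it."""
    if is_emptyish(previous_line):
        return "thematic_break"
    # Text directly above turns the very same dashes into a heading underline.
    return "heading_underline"
```

Even in this toy version, the meaning of a line cannot be decided from the line itself, which is exactly what a line-based format tries to avoid.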

Headings Inside List Items

Yep! Markdown allows headings to be placed *inside* list items. What the what?!? Why is this useful?

Here's an example where a Markdown renderer (GitHub) actually renders this:

Markdown Craziness

Markdown has some good ideas, but it turns out most of its good ideas actually just come from other older markup languages. More on this will be discussed in a future post.

Italics and Bold

Italics and bold in Markdown also have two variants, underscores and asterisks. Two asterisks = bold; one asterisk = italics. This just creates more *unnecessary* work for the parser, and creates an inconsistency for writers and those reading the plain text of Markdown. Additionally, if you wanted to do both bold and italics, then you would use three asterisks or three underscores. However, what happens when italics should end before bold ends?

Implementing this requires either a lookahead or storing two characters at once: you *must* look at the next character whenever there is an asterisk to check whether it is a double asterisk before deciding that it's a single asterisk. Therefore, triple asterisks will *always* be parsed as a bold toggle and then an italic toggle. This is how my own code works. This is certainly doable, but is it easy to write and read?
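The lookahead described above might look roughly like the following. This is a minimal sketch written for this post's argument, not the code referred to above; it handles only asterisks, and the token names are made up.

```python
def tokenize_emphasis(text: str) -> list[tuple[str, str]]:
    """Split text into ("text", ...) and ("toggle", "bold" | "italic") tokens."""
    tokens: list[tuple[str, str]] = []
    buffer: list[str] = []
    i = 0
    while i < len(text):
        if text[i] == "*":
            # Flush any plain text collected before this marker.
            if buffer:
                tokens.append(("text", "".join(buffer)))
                buffer = []
            # Lookahead: a second asterisk must be checked for first.
            if i + 1 < len(text) and text[i + 1] == "*":
                tokens.append(("toggle", "bold"))
                i += 2
            else:
                tokens.append(("toggle", "italic"))
                i += 1
        else:
            buffer.append(text[i])
            i += 1
    if buffer:
        tokens.append(("text", "".join(buffer)))
    return tokens
```

With this approach, three asterisks in a row always come out as a bold toggle followed by an italic toggle, both when opening and when closing, so the closing markers do not mirror the opening ones.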

AsciiDoc takes a different approach that is easier to read: one asterisk for bold, one underscore for italic. Underscore+Asterisk then becomes a bold+italic, but you don't need to do a lookahead like with the double asterisks/underscores in Markdown. Unfortunately, as I discussed in my emphasis and strong article, this goes counter to common usage:

The Necessary Semantics of Emphasis and Strong

Markup Essentials

The other critique is that many of Scroll's additions to scrolltext are niceties that aren't compelling. I do not agree. I view them as essentials that are particularly compelling because of my own, and other people's, frustrations with using Gemtext. In fact, many of these features were discussed on the mailing list, and many people, including myself, were in support of italics and bold then. The side that viewed these as too complicated to parse ended up winning out because Solderpunk decided to go along with them. Ever since then I have found this to be very unfortunate, especially when people use Markdown syntax for italics and bold in their Gemtext anyway.

List Nesting, Italics, Bold, and 4th-level Headings

I've already discussed the uses of list nesting, italics, and bold in previous articles:

2024-03-26 The Case for a 4th-Level Heading

2024-03-25 The Simplicity of List Nesting: How AsciiDoc Does It

2024-03-24 The Necessary Semantics behind Emphasis and Strong

Geminispace might not use these, but that doesn't mean they aren't necessary. Rather, it means people found less-than-ideal workarounds or they just didn't publish the type of content that requires these markup essentials.

In my own case, there is quite a bit of content I have not posted to Gemini *because* it requires these, and I couldn't find good workarounds. Rather than trying to shoehorn this stuff into Gemtext, I just chose not to put it up on Geminispace. There's really not much I can do about that.

For example, I have a Senior Thesis that I *could* try to post here on Geminispace in Gemtext, except I can't because removing the use of 4th-level headings would require *extensive* work and make the document significantly less readable. The same can be said for outlines, which require list nesting. And citations with links in them? Yeah, good luck.

In the case of emphasis and strong, sometimes I just don't care and end up using markdown-syntax for bold and italics anyways even if it's unreadable in most Gemini clients, like I did with Talmud commentaries from the Sefaria proxy. Emphasis and strong are actually essential for inlined commentaries, like that with the Talmud and Biblical scripture, because it distinguishes between the commentary and the translated text. Currently, all users of clients that don't support italics and bold will find that it's pretty hard to read. There's no good solution for this, as other symbols are sometimes used in the text or commentaries themselves (like square brackets), so I just don't even try to do anything about this.

Berakhot 2a

I *frequently* run into wanting list nesting for feature lists and todo lists for my projects, but I can't use them, so I use awful workarounds that are less readable, like using colons followed by a list of items after the colon all on the same bullet. I've run into cases where this literally doesn't work well at all, and introduces the need to use both commas *and* semicolons to differentiate between sub-items that use commas and the other sub-items. To add insult to injury, some Gemini clients will even strip whitespace at the beginning of lines, so you can't even *simulate* nested lists, or indent paragraphs, if you wanted to, unless of course you use preformatted blocks. It's so obnoxious.

I was emailed by someone recently about how they find Gemtext to be annoying as well, and how plain text is actually more usable and *less restrictive* than Gemtext. Quoted lists are one of the things they mentioned as something they would find useful.

Gemtext's restrictiveness, even in cases where a feature could stick to the rule that the first 3-4 characters of a line determine the line type (as AsciiDoc's list nesting does), is the biggest criticism of Gemini that I have seen. A lot of people are turned away from Gemini *because* of the document format. They might even like that the protocol is simple, that it's not extensible, but the document format is frequently viewed as way too restrictive. [^1]

[^1] It is true that Gemini doesn't need to appeal to everyone. Hence why I created the Scroll protocol...

Streamable Document Format

The emphasis on a streamable document format in Scroll is not because Gemtext is not streamable, or because Gemini doesn't support streaming, but because of the volatile streaming situation within Gemini early on, before streaming was explicitly allowed in the spec. Many of these issues exist even today, unfortunately.

Gemini didn't emphasize or explicitly allow for text streaming early on. In fact, the idea originates with ruminations on IRC and then a post by Tomasino on the mailing list, followed by a post from Solderpunk, and then a change in the spec.

gemini streaming (Tomasino)

A Vision for Gemini Applications (Solderpunk)

Mozz's Response

There are a couple of things that happened because of the way this played out:

1. Clients that didn't get updated after this spec change will probably not handle text streaming.

2. Because it's optional, not all clients support text streaming, and even today many clients wait until the connection closes before presenting the document.

3. It was mostly too late to consider keep-alive packets and other things that would help streaming.

4. Streaming remains controversial in some parts of the community, just as it was at the time. There was some pushback against streaming before the spec update because the spec didn't *explicitly* allow for handling incoming data before the connection has ended, and I guess nobody brought up keep-alive TCP packets, which gave the impression that in streaming situations, one wouldn't be able to tell if one side of the connection broke or errored out.

5. All of this applies to other streaming, like audio.

I consider this one of the biggest blunders in Gemini that wasn't intentional at all, because now we have a wide mix of old clients that don't present documents until the connection closes, which doesn't just prevent streaming, it also hinders the UX of these clients when on slow connections. Great examples include GemiNaut and Amfora.

Stating in the spec from the very beginning that clients may present text *before* the connection closes means developers will actually consider this aspect when they write their clients, and we won't have a situation where older clients function differently from new clients. Having keep-alive packets is a bonus that helps determine when one side of the connection errored out or broke. Keep-alive packets are a part of TCP and they let the other side of the connection know that you are still connected. They basically fix a lot of the problems with keeping connections open indefinitely in Gemini. Unfortunately, they weren't added to the Gemini spec, meaning many clients probably don't utilize them at all, and yet some will, which again creates an inconsistency, and prevents servers from knowing whether this will be supported or not.
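For reference, turning on TCP keep-alive is usually a single socket option. The sketch below uses Python's standard socket module; the helper name is made up, and the probe timing is left at the operating system's defaults, which can be quite long unless tuned separately.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the OS to send TCP keep-alive probes on an otherwise idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
```

A client or server would call this on the connection's underlying TCP socket before starting a long-lived stream, so that a dead peer is eventually reported as an error instead of the connection hanging forever.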

In order to have documents be displayed as they are streamed in, the document format needs to be parseable in a linear stream fashion. This is particularly important if we were to consider adding things like tables, for example. Acidus wrote about why this is important for tables:

gemini://gemi.dev/gemlog/2023-06-30-tables-in-geminispace.gmi

I disagree, however, with Acidus' overall conclusion that tables wouldn't work well with streaming, would break the line-oriented system Gemtext has[^2], and that it would complicate clients, but these are discussions for a future article.
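To make the earlier point about linear parsing concrete, here is a minimal sketch of how a client could hand complete lines to its parser while data is still arriving. The class and method names are hypothetical, and UTF-8 is assumed for the text.

```python
class LineStream:
    """Turn arbitrary network chunks into complete lines as soon as they arrive."""

    def __init__(self) -> None:
        self._pending = b""

    def feed(self, chunk: bytes) -> list[str]:
        """Add a chunk and return every line that is now complete."""
        self._pending += chunk
        # Everything before the last newline is complete; the rest must wait.
        *complete, self._pending = self._pending.split(b"\n")
        return [line.rstrip(b"\r").decode("utf-8") for line in complete]

    def close(self) -> list[str]:
        """Return any final line that was not terminated by a newline."""
        rest, self._pending = self._pending, b""
        return [rest.decode("utf-8")] if rest else []
```

Because each returned line can be classified on its own in a line-based format, a client can render it immediately instead of waiting for the connection to close.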

So, here we also meet the "Platform as a Reflection of Values." Does Gemini support streaming well? I would say yes, considering I've got music streaming, text streaming, and a public radio all working very well on AuraGem. Others will say no. Does Gemini support large documents? I would say yes, but the FAQ says (or used to say, at least) no. Should all clients start presenting documents as they are streamed in rather than waiting for the connection to close to present them? I would say yes, but many clients don't in fact do this, including GemiNaut and Rosy Crow (and Amfora). All of this creates an inconsistency in the usability of Gemini browsers. In fact, not to be a downer, but this is like the web all over again: the UX is vastly different in different browsers with something as simple as viewing/downloading files, and servers now have to reduce themselves to what is supported on as many browsers as possible, or just deal with not supporting certain browsers.

Unfortunately, by the time Gemini was able to consider some of these things, it was too late, as there were too many client implementations that would have broken or become spec-non-compliant had some of these things been added to the spec. Quick adoption isn't always a good thing unless you can get the essentials in a spec nailed down from the very beginning. Other issues that resulted from the way Gemini's spec developed include the handling of certificates: should expiry dates be validated? What about common names and SANs? Different clients implement these differently. Not having it in the spec or the reference implementation(s) before the explosive growth meant client and server authors often didn't consider many of these problems. Hindsight's 20/20, as they say.

[^2] Nothing in the spec says you cannot rerender previous lines, and rendering is often disconnected from parsing in general, especially in TUI and GUI clients, although it's not common in CLI clients. I would say this *doesn't* add "a ton of complexity to the clients" when rendering and parsing are disconnected. Tables have other problems, but this really isn't one of them.

Protocol Essentials

Language and Internationalization

One of Gemini's biggest improvements upon Gopher is that it added a mimetype to responses, giving clients the type of document *and* its charset. Gemini also made UTF-8 text the default, which allowed for internationalization. In fact, you can even put a list of languages in the mimetype for multi-language documents. Unfortunately, this isn't used properly, probably because servers don't have easy mechanisms for describing or detecting the language of a document. This was described well by Martin/clehaxze in his article on crawler pitfalls where he talks about servers sending the wrong mimetype.

Common Gemini crawler pitfalls

While the above post emphasizes mimetypes in general, I want to add that the language is often either missing or just plain incorrect on a lot of servers. This is a complete failure of Gemini servers in general, and it has effectively diminished a very useful function that the protocol provides us.
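For readers unfamiliar with how the language travels in a Gemini response, the mimetype can carry parameters such as "charset" and "lang", for example "text/gemini; charset=utf-8; lang=en,fr". Here is a small sketch of reading them on the client side. The function names are made up, and quoted parameter values are not handled.

```python
def parse_meta(meta: str) -> tuple[str, dict[str, str]]:
    """Split a mimetype string into its type and its lowercase-keyed parameters."""
    mimetype, *raw_params = [part.strip() for part in meta.split(";")]
    params: dict[str, str] = {}
    for raw in raw_params:
        key, sep, value = raw.partition("=")
        if sep:  # Skip fragments that are not key=value pairs.
            params[key.strip().lower()] = value.strip()
    return mimetype.lower(), params


def languages(meta: str) -> list[str]:
    """Return the language tags declared in the "lang" parameter, if any."""
    _, params = parse_meta(meta)
    return [tag.strip() for tag in params.get("lang", "").split(",") if tag.strip()]
```

A crawler or client that gets an empty list back from a text response has no reliable way to know what language it is looking at, which is the failure described above.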

This is why Scroll's reference server implementation tries to correct this by providing a means of choosing a default language for one's scrollery[^3], and also the ability to choose a language for each of your documents. More work needs to be done to add more options, like providing the same document in different languages, but this first step is important.

Scroll also adds another option, which is the ability to request documents in a language, and if no matches were made, then the default language would be received. I view this to be very useful for wikis and multi-language scrolleries. Rather than having to provide several different links for every language variant of a document[^4], one could provide one link, and the browser handles choosing the language. This is one more aspect of giving the *user* control. In Gemini, the person who posted the link controls what language you get. This works, but it's not ideal. I see Scroll's addition of languages on requests as a finishing of the job that Gemini started when it allowed for languages in responses. Languages in requests is also particularly useful when you want your error messages to be in the user's language, especially in circumstances where errors are not fully explained by their error codes.

[^3] Scrollery is the equivalent of capsule for the Scroll protocol.

[^4] Having to show different links for different languages of the same document also puts more pressure on servers to create a readable and easy to use navigation system on their pages. Being able to request documents in a language by the client effectively moves this navigational aspect to clients rather than servers. Servers already have enough to deal with, and clients are best positioned to handle languages anyways, since they know what language the user has set in their settings. The server is merely responsible for giving documents in the requested language if available, and falling back to a default language if not.
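The server-side behavior described here, serving the requested language when available and falling back to a default otherwise, can be sketched as follows. This is only an illustration of the idea: the function name, the dictionary layout, and the file names are all hypothetical and say nothing about Scroll's actual request format or its reference server.

```python
def choose_variant(
    requested: list[str],
    available: dict[str, str],
    default_lang: str,
) -> tuple[str, str]:
    """Pick the first requested language the server has, else the default.

    `available` maps a language tag to the document stored for it and is
    assumed to always contain `default_lang`.
    """
    for lang in requested:
        if lang in available:
            return lang, available[lang]
    return default_lang, available[default_lang]
```

With one link and a lookup like this, the person publishing the link no longer decides which language the reader gets; the reader's client settings do.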

Different Content Types

The final thing I want to emphasize is that Geminispace has gotten away with not having some of these markup essentials because of the type of content that is written in Geminispace. This is another big divergence that Scroll takes from Gemini in being more *document-centric* rather than page-centric. Scroll is tailored for longer documents that require more markup. Things like academic papers, essays, books, outlines, etc. are what I am trying to tailor Scroll to. This is very different from Geminispace, which is mostly gemlogs and short documents as pages. The idea of splitting long documents into pages is in fact a tradition on the web, something that I am making a conscious decision to move away from. This is one reason I chose the name "Scroll" to begin with. More on this will be discussed in that future post I mentioned previously.