Gemipedia Improvements

2022-05-23 |#gemipedia #wikipedia #cgi | @Acidus


I'm really pleased with Gemipedia, both with people's response to it and with how much I'm personally using it. I've basically stopped using the dedicated Wikipedia app on my phone, and I use Gemipedia exclusively on my Kindle. Since launching it about 2 weeks ago, I've been hard at work improving it.


Featured Content


One of my favorite experiences with Wikipedia is stumbling onto a topic I would never have thought to read about. For example, prepare to spend about 15 minutes on this lovely piece:


The List of Common Misconceptions


Finding content like that is fun. Wikipedia actually has several features to surface cool content, and makes them available via their API.


I have created a "Featured Content" page. This updates daily and shows the Featured Article from the front page of Wikipedia. Featured Article is a special designation that only the best, most complete, and best-written articles receive. Only about 0.1% of all articles in the English Wikipedia are Featured Articles. Even if it's a topic you have never heard of, it will be well-rounded and engaging content.


Wikipedia also lists the most read articles for that day. Personally, I use this as a kind of crowd-sourced filter to see what the current topics in the zeitgeist are without having to actually read a news site. For example, apparently Australia had a national election and has a new PM.
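
As a rough sketch of how a daily page like this could be assembled, the Python below builds a URL for Wikipedia's featured-content feed and turns a response into gemtext lines. This post doesn't say which endpoint Gemipedia calls, so the URL pattern, the `tfa` and `mostread` keys, and the `article_base` link prefix are all assumptions for illustration.

```python
from datetime import date
from urllib.parse import quote

# Assumed endpoint: Wikimedia's REST featured feed. Not confirmed as Gemipedia's source.
FEED_URL = "https://en.wikipedia.org/api/rest_v1/feed/featured/{:04d}/{:02d}/{:02d}"


def feed_url(day: date) -> str:
    """Build the feed URL for a given day."""
    return FEED_URL.format(day.year, day.month, day.day)


def article_link(title: str, article_base: str) -> str:
    """Format one gemtext link line; article_base is a hypothetical route prefix."""
    label = title.replace("_", " ")
    return f"=> {article_base}{quote(title)} {label}"


def render_featured(feed: dict, article_base: str = "/view?") -> list:
    """Turn an (assumed-shape) feed response into gemtext lines."""
    lines = []
    tfa = feed.get("tfa")  # assumed key for the day's Featured Article
    if tfa:
        lines.append("## Featured Article")
        lines.append(article_link(tfa["title"], article_base))
    most_read = feed.get("mostread", {}).get("articles", [])  # assumed shape
    if most_read:
        lines.append("## Most Read")
        for article in most_read:
            lines.append(article_link(article["title"], article_base))
    return lines
```

Keeping the rendering separate from the fetch means the page can be rebuilt from a cached response once a day rather than on every request.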


Gemipedia's Featured Content (updated daily)


New Search Results


Originally I exposed search, but it wasn't a great experience. You would search for "Ethernet", get a search results page, and then click the "Ethernet" article. The "Go To Article" feature works much better when you are just trying to get to an article about something. Search was just adding a step.


The other problem was that the search API I was using was pretty old. The results were fine, but the metadata returned was poor. You basically got the name of the article and a snippet of the article where the text was found. I found that, unlike with a general search engine, the snippet where the term was found is often less valuable. Instead, you are trying to figure out "what is this article about?" to decide if you should read it.


To help, I switched to a more modern Wikipedia API for search, which allows me to display richer results:

Thumbnail of the result's featured image, if available, making it easier to see whether a result is interesting/relevant

Description of the article, if available
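
To make the richer results concrete, here is a minimal sketch that renders search hits as gemtext. It assumes each hit is a dict with `title`, an optional `description`, and an optional `thumbnail` containing a `url`, loosely modeled on the MediaWiki REST search response. The field names and the `article_base` prefix are assumptions, not Gemipedia's actual code.

```python
from urllib.parse import quote


def render_results(pages: list, article_base: str = "/view?") -> list:
    """Render search hits as gemtext: optional thumbnail, article link, optional description."""
    lines = []
    for page in pages:
        title = page["title"]
        thumb = page.get("thumbnail")
        if thumb and thumb.get("url"):
            url = thumb["url"]
            # Protocol-relative URLs ("//host/path") need a scheme to be usable links.
            if url.startswith("//"):
                url = "https:" + url
            lines.append(f"=> {url} Image: {title}")
        # article_base is a hypothetical route prefix.
        lines.append(f"=> {article_base}{quote(title)} {title}")
        description = page.get("description")
        if description:
            lines.append(description)
        lines.append("")  # blank line between results
    return lines
```

Both the thumbnail and the description are treated as optional, matching the "if available" caveat above, so a hit with neither still renders as a plain link.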


Once I had this built, I realized search was much more of an additive experience ("show me more articles related to X") than a targeted one ("find me X"). As an example, here are the results for "Amiga", surfacing interesting articles all about Amiga computers:


Gemipedia: Amiga search results


Because of this, each article now includes a link to search for more articles which mention the topic of the article you are reading. This is a great way to find extra articles to read, without needing to dive into the article references in each section.


New Parser and Renderer


Without getting into the weeds too much, the core of Gemipedia is the code which parses Wikipedia HTML and then renders it to gemtext. To get started, I ignored most of the content (tables, media, sidebars, lists, etc.) and added just enough logic to render a simple article. Then I grew it organically, adding more capabilities as I tried more complex articles. Ultimately this led to messy, inflexible code that made assumptions about the structure and order of content. Additionally, the renderer didn't track when new lines started, so it might attempt to write out a new list line or link line in the middle of a text line, leading to broken gemtext that wouldn't render correctly.


So I rewrote the parser, making it more generic and able to handle strangely formatted content without needing special code. I also built a new renderer that ensures it is at the start of a new line before writing out line types like links or block quotes. Together, these improvements solved many issues where article content was missing or not rendered properly, especially on more complex pages.
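
The line-start tracking can be sketched in a few lines of Python. This is an illustration of the idea only, not Gemipedia's renderer; the class and method names are hypothetical.

```python
class GemtextWriter:
    """Accumulates gemtext while tracking whether output is at the start of a line."""

    def __init__(self):
        self._parts = []
        self._at_line_start = True

    def write(self, text: str) -> None:
        """Append inline text, which may or may not end the current line."""
        if not text:
            return
        self._parts.append(text)
        self._at_line_start = text.endswith("\n")

    def ensure_line_start(self) -> None:
        """Finish the current line if we are in the middle of one."""
        if not self._at_line_start:
            self.write("\n")

    def write_link(self, url: str, label: str) -> None:
        # A link line is only valid gemtext when "=>" begins the line.
        self.ensure_line_start()
        self.write(f"=> {url} {label}\n")

    def write_quote(self, text: str) -> None:
        # Quote lines likewise need ">" as the first character of the line.
        self.ensure_line_start()
        self.write(f"> {text}\n")

    def getvalue(self) -> str:
        return "".join(self._parts)
```

Because every line-type method calls `ensure_line_start()` first, a link emitted in the middle of a paragraph ends that text line instead of corrupting it, and consecutive links don't pick up stray blank lines.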


Other Improvements


Gemipedia has entered a great virtuous cycle in the last 2 weeks. It's pretty good, so I use it more. By using it more, I make small improvements which make it better. Because it's even better, I use it even more. Here are some of the other things I've added:


Add white background to transparent images for better reading on clients with dark mode

Sharper, clearer-looking math formulas by referencing Wikipedia's PNGs directly

Serve media with the proper extension and MIME type for better downloading (easier to tell if something is an animated GIF, etc.)

Support for Image Galleries (images would show up using generic media finder, but weren't getting the appropriate captions)

Use 1.5x resolution images for thumbnails if 2x ones aren't available

Support for more table types

Support links to other articles via image maps on all images, not just timelines
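
As one example from the list above, the thumbnail resolution fallback might look something like the sketch below. It assumes the candidate URLs come from an HTML `srcset` attribute with density descriptors such as `1.5x` and `2x`. That is an assumption about where the alternatives come from, and the function name is hypothetical.

```python
def pick_thumbnail(src: str, srcset: str = "") -> str:
    """Prefer the 2x candidate, fall back to 1.5x, then to the plain src."""
    candidates = {}
    # Simplification: assumes URLs contain no literal commas or spaces.
    for entry in srcset.split(","):
        parts = entry.split()
        if len(parts) == 2:
            url, density = parts
            candidates[density] = url
    for density in ("2x", "1.5x"):
        if density in candidates:
            return candidates[density]
    return src
```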
