


▀█▄ ▄▄ ▄▄ ██ ▄▄

▀▀ ██ ██ ██ ▀▀ ██

██▄███▄ ▄█████▄ ▄█████▄ ██ ▄██▀ ███████ ████ ▄█████▄ ██ ▄██▀

██▀ ▀██ ▀ ▄▄▄██ ██▀ ▀ ██▄██ ██ ██ ██▀ ▀ ██▄██

██ ██ ▄██▀▀▀██ ██ ██▀██▄ ██ ██ ██ ██▀██▄

███▄▄██▀ ██▄▄▄███ ▀██▄▄▄▄█ ██ ▀█▄ ██▄▄▄ ▄▄▄██▄▄▄ ▀██▄▄▄▄█ ██ ▀█▄

▀▀ ▀▀▀ ▀▀▀▀ ▀▀ ▀▀▀▀▀ ▀▀ ▀▀▀ ▀▀▀▀ ▀▀▀▀▀▀▀▀ ▀▀▀▀▀ ▀▀ ▀▀▀


Backtick API

2022-09-11 | #tilde.wtf #backtick #golang #postgresql #search


Now that I've built the crawler that spiders the tildeverse at a regular interval, along with the database behind it, it's time to create the API component. I didn't want the community to be bound to a WWW-only interface for the search data (though I will be creating a WWW frontend myself in the near future), so it makes sense to first create an API for searching the index. That way the community can build whatever frontends to the data they'd like, whether for IRC, Gemini, or something else.



API


The API is now up and running and serves parseable JSON with the results for your query. You can access it via HTTPS:


https://search.tilde.wtf/search?q=tilde
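
For anyone wanting to build a frontend on top of this, here is a minimal client sketch in Go. The endpoint and the q parameter come from the URL above; the function names buildSearchURL and search are made up for illustration, and since the shape of the JSON isn't described in this post, the response is decoded into a generic value rather than a typed struct.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"os"
	"time"
)

// baseURL is the search endpoint given in the post.
const baseURL = "https://search.tilde.wtf/search"

// buildSearchURL (hypothetical helper) escapes the query so that spaces
// and symbols survive the trip through the URL.
func buildSearchURL(query string) string {
	v := url.Values{}
	v.Set("q", query)
	return baseURL + "?" + v.Encode()
}

// search (hypothetical helper) fetches the results for one query. The
// response schema is an unknown here, so the JSON is decoded into a
// generic value instead of a typed struct.
func search(ctx context.Context, query string) (any, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, buildSearchURL(query), nil)
	if err != nil {
		return nil, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var out any
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	return out, nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	results, err := search(ctx, "tilde")
	if err != nil {
		fmt.Fprintln(os.Stderr, "search failed:", err)
		return
	}
	pretty, _ := json.MarshalIndent(results, "", "  ")
	fmt.Println(string(pretty))
}
```

Once the real field names are known, swapping the generic value for a struct would make the results easier to work with.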



It supports pagination with an offset value. Offset pagination is not the best choice for performance on a very large index, because PostgreSQL still has to produce and then discard every skipped row, but the tildeverse is unlikely to grow large enough for that to matter. The index could multiply in size many times over before it becomes a problem.
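
To make the offset arithmetic concrete, here is a small sketch in Go. The page size of 30 matches the LIMIT in the query from the SQL section; the helper names offsetForPage and parseOffset are hypothetical, and the name of the query-string parameter that carries the offset isn't given in this post, so parseOffset just takes the raw string value.

```go
package main

import (
	"fmt"
	"strconv"
)

// pageSize matches the LIMIT 30 used in the post's SQL query.
const pageSize = 30

// offsetForPage (hypothetical helper) converts a 1-based page number
// into the number of rows to skip.
func offsetForPage(page int) int {
	if page < 1 {
		return 0
	}
	return (page - 1) * pageSize
}

// parseOffset (hypothetical helper) reads a raw query-string value and
// falls back to 0 for anything missing, malformed, or negative, so a bad
// value never reaches the database.
func parseOffset(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 0 {
		return 0
	}
	return n
}

func main() {
	for page := 1; page <= 3; page++ {
		fmt.Printf("page %d -> OFFSET %d\n", page, offsetForPage(page))
	}
	fmt.Println(parseOffset("60"), parseOffset("-5"), parseOffset("abc"))
}
```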



SQL


So how does the search work? It uses PostgreSQL's built-in full-text search:


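// query is passed twice because the SQL refers to it as both $1 and $2;
// offset is the number of rows to skip before the 30 returned.
rows, err := db.Query(`SELECT url, title, crawled_on,
	ts_headline(body, plainto_tsquery(' '|| $1 ||' '), 'MaxFragments=0, MinWords=25, MaxWords=60')
	AS headline FROM tildes WHERE searchtext @@ plainto_tsquery(' '|| $2 ||' ')
	LIMIT 30 OFFSET $3;`, query, query, offset)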

Details are available in the Postgres docs:


https://www.postgresql.org/docs/current/textsearch-controls.html


The benefit of all this is: "PostgreSQL provides two predefined ranking functions, which take into account lexical, proximity, and structural information; that is, they consider how often the query terms appear in the document, how close together the terms are in the document, and how important is the part of the document where they occur."


All of this without a bunch of complicated effort on my part. Lovely.
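
The query shown earlier has no ORDER BY and doesn't call a ranking function, so here is a rough sketch of how ts_rank could be wired in. This is an illustration, not what the API runs: it assumes the same tildes table and columns, that searchtext is a tsvector column, and that the Go field types below fit the column types. The names rankedSearchSQL, Result, normalizeQuery, and searchRanked are invented for this example.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
	"time"
)

// rankedSearchSQL computes the tsquery once (as q) and reuses it for
// matching, highlighting, and ranking. The url tiebreaker keeps the
// order stable from one page to the next.
const rankedSearchSQL = `
SELECT url, title, crawled_on,
       ts_headline(body, q, 'MaxFragments=0, MinWords=25, MaxWords=60') AS headline
FROM tildes, plainto_tsquery($1) AS q
WHERE searchtext @@ q
ORDER BY ts_rank(searchtext, q) DESC, url
LIMIT 30 OFFSET $2`

// Result mirrors the selected columns; the Go types are assumptions.
type Result struct {
	URL       string
	Title     string
	CrawledOn time.Time
	Headline  string
}

// normalizeQuery trims the input and collapses runs of whitespace.
func normalizeQuery(q string) string {
	return strings.Join(strings.Fields(q), " ")
}

// searchRanked returns one page of matches, best rank first.
func searchRanked(ctx context.Context, db *sql.DB, query string, offset int) ([]Result, error) {
	query = normalizeQuery(query)
	if query == "" {
		return nil, nil // nothing to search for; skip the round trip
	}
	rows, err := db.QueryContext(ctx, rankedSearchSQL, query, offset)
	if err != nil {
		return nil, fmt.Errorf("ranked search: %w", err)
	}
	defer rows.Close()

	var results []Result
	for rows.Next() {
		var r Result
		if err := rows.Scan(&r.URL, &r.Title, &r.CrawledOn, &r.Headline); err != nil {
			return nil, fmt.Errorf("scan result: %w", err)
		}
		results = append(results, r)
	}
	return results, rows.Err()
}

func main() {
	// No database driver is imported here, so this only exercises the
	// empty-query path, which returns before touching the database.
	res, err := searchRanked(context.Background(), nil, "   ", 0)
	fmt.Println(len(res), err)
}
```

To run it against a real database, open a *sql.DB with whichever PostgreSQL driver you prefer and pass it in. Sorting by rank means PostgreSQL has to rank every matching row before applying LIMIT, which should be fine for an index of this size.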


Next


Next up is a basic WWW frontend that will live as the official tilde.wtf page. I'll be building this component in Go as well, using basic html/template templates. It will be minimal and use zero JavaScript.
