geminispace.info - Gemini Search Engine



Documentation: Indexing


GUS is a search engine for all content served over the Gemini protocol. It can help you track down textual pages (e.g., `text/gemini`, `text/plain`, `text/markdown`) with content containing your search terms, but it can just as easily help you track down binary files (e.g., images, mp3s) which happen to be served over the Gemini protocol.


What does GUS index?


GUS will only index content within Geminispace, and will neither follow nor index links out to other protocols, like HTTP or Gopher. GUS will only crawl outwards by following Gemini links found within `text/gemini` pages. If you return a `text/plain` mimetype for a page, Gemini links within it will not register with GUS (though the content of the `text/plain` page will itself get indexed).
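The link-following rule above can be illustrated with a short Python sketch. This is not GUS's actual crawler code; the function names `resolve_gemini_link` and `crawlable_links` are hypothetical, and the example assumes the page URL starts with `gemini://`. It shows which links a crawler following these rules would pick up from a page: none from a `text/plain` page, and only `gemini://` targets from a `text/gemini` page.

```python
from urllib.parse import urljoin, urlsplit


def resolve_gemini_link(base_url, target):
    """Resolve a link target against base_url (assumed to start with gemini://).

    Returns None for links that lead out of Geminispace.
    """
    scheme = urlsplit(target).scheme
    if scheme:
        # Absolute link: keep it only if it stays inside Geminispace.
        return target if scheme == "gemini" else None
    # urljoin does not resolve relative references for schemes it does not
    # know (such as gemini), so borrow "http" for the join and swap it back.
    joined = urljoin("http://" + base_url[len("gemini://"):], target)
    return "gemini://" + joined[len("http://"):]


def crawlable_links(page_url, mimetype, body):
    """Return the absolute Gemini links a crawler would follow from this page."""
    # Links are only honoured in text/gemini pages, not in text/plain.
    if mimetype.split(";")[0].strip().lower() != "text/gemini":
        return []
    links = []
    for line in body.splitlines():
        if not line.startswith("=>"):
            continue  # not a gemtext link line
        parts = line[2:].split(maxsplit=1)
        if not parts:
            continue  # "=>" with no target
        resolved = resolve_gemini_link(page_url, parts[0])
        if resolved is not None:
            links.append(resolved)
    return links
```

For example, a page containing `=> https://example.com/ Web page` contributes no link, while `=> indexing.gmi` on `gemini://example.org/docs/index.gmi` resolves to `gemini://example.org/docs/indexing.gmi`.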


Textual pages over 5 MB in size will not be indexed.


Please note that GUS' indexing has provisions for manually excluding content, which maintainers will typically use to exclude pages and domains that cause issues with index relevance or crawl success. GUS ends up crawling weird protocol experiments, proofs of concept, and whatever other bizarre bits of technical creativity folks put up in Geminispace, so it is a continual effort to keep the index healthy. Please don't take it personally if your content ends up excluded, and I promise we are continually working to make GUS indexing more resilient and scalable!


Controlling what GUS indexes with a robots.txt


To control crawling of your site, you can use a robots.txt file. Place it in your capsule's root directory such that a request for "robots.txt" will fetch it. It should be returned with a mimetype of `text/plain`.


geminispace.info obeys the following user-agents, listed in descending priority:

gus

indexer

*
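The priority order above can be sketched in Python as follows. This is an illustration rather than the crawler's real logic: `parse_robots` and `is_allowed` are hypothetical helpers, only `User-agent` and `Disallow` lines are handled, and the sketch assumes that the highest-priority group present in the file is used on its own instead of being merged with lower-priority groups.

```python
# User-agents in descending priority, as listed above.
PRIORITY = ("gus", "indexer", "*")


def parse_robots(text):
    """Map each lowercase user-agent to its list of Disallow path prefixes."""
    rules = {}
    current = []      # user-agents of the group currently being read
    in_rules = False  # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rule lines starts a new group.
                current, in_rules = [], False
            current.append(value.lower())
            rules.setdefault(value.lower(), [])
        elif field == "disallow":
            in_rules = True
            if value:  # an empty Disallow means "allow everything"
                for agent in current:
                    rules[agent].append(value)
    return rules


def is_allowed(rules, path):
    """Apply the first group found in priority order (assumed: no merging)."""
    for agent in PRIORITY:
        if agent in rules:
            return not any(path.startswith(prefix) for prefix in rules[agent])
    return True  # no applicable group: nothing is disallowed


EXAMPLE = """\
User-agent: *
Disallow: /private/

User-agent: indexer
Disallow: /drafts/
"""
```

With `EXAMPLE`, the `indexer` group outranks `*`, so under the no-merging assumption `/drafts/post.gmi` is blocked while `/private/` is governed by the `indexer` group alone.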


How can I recognize GUS requests?


You can identify GUS requests by looking for any requests to your site made from the following IP addresses:


IPv6: 2a03:4000:53:f82:b8f1:ff:fe15:5ec9

IPv4: 202.61.246.155
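A capsule operator could check an incoming connection against these addresses with Python's standard `ipaddress` module. This is a minimal sketch; the function name `is_gus_request` is an assumption for illustration, and the addresses are the two listed above as of this page's fetch date.

```python
import ipaddress

# The crawler addresses listed above.
GUS_ADDRESSES = {
    ipaddress.ip_address("2a03:4000:53:f82:b8f1:ff:fe15:5ec9"),
    ipaddress.ip_address("202.61.246.155"),
}


def is_gus_request(remote_addr):
    """Return True if remote_addr (a string) is one of the GUS addresses."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a valid IP address at all
    # Dual-stack servers may report IPv4 peers as ::ffff:a.b.c.d.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return addr in GUS_ADDRESSES
```

Comparing parsed address objects rather than raw strings means differently written forms of the same address (upper case, leading zeros in IPv6 groups) still match.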


Does GUS keep my content forever?


No. After repeated failed attempts to connect to a page (e.g., because it moved, because the capsule was taken down, or because of a server error on your host), GUS will invalidate that page in its index after 1 month of unavailability, thus removing it from search results.
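The retention rule can be pictured as a simple time comparison. The sketch below is only an illustration: the name `should_invalidate` is hypothetical, "1 month" is approximated as 30 days, and how GUS actually tracks failed attempts is not described here.

```python
from datetime import datetime, timedelta

# Assumption: "1 month" is approximated as 30 days.
UNAVAILABLE_LIMIT = timedelta(days=30)


def should_invalidate(last_success, now):
    """Return True once a page has been unreachable for the whole limit.

    last_success is the time of the last successful fetch; every attempt
    since then is assumed to have failed.
    """
    return now - last_success >= UNAVAILABLE_LIMIT
```

A page last fetched successfully on 1 July would, under these assumptions, drop out of search results by the end of the month if every later attempt failed.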

-- Page fetched on Fri Aug 6 04:24:27 2021