In the Hall of the Broken Things


gemini://michaelnordmeyer.com/gemlog/2024-02-24-why-i-blocked-bing-yandex-and-other-high-profile-crawlers.gmi


Blocking is rather common on the modern web, especially if you stray from the norms. A typical DuckDuckGo search results in:


                                      DuckDuckGo

                        Unfortunately, bots use DuckDuckGo too.
     Please complete the following challenge to confirm this search was made by a
                                        human.
                         Select all squares containing a duck:
                                          [ ]
                                          [ ]
                                          [ ]
                                          [ ]
                                          [ ]
                                          [ ]
                                          [ ]
                                          [ ]
                                          [ ]
                                        Submit

                                [ ] Images not loading?
                          Please email the following code to:

                               error-lite@duckduckgo.com

duckduckgo.png


If you select no squares (on account of none of them containing a duck) you get an error and a "three strikes" warning. The actual workaround (at present; I'm sure that they will get around to making it worse) is to hit "reload" and the nag screen goes away. Doubtless the behavior has something to do with my choice of User-Agent header:


        /* Append the User-Agent header line to the request Str s. */
        Strcat_charp( s,
          "User-Agent: I met a traveller from an antique "
          "land, Who said--\"Two vast and trunkless legs "
          "of stone Stand in the desert. . . . Near them, "
          "on the sand, Half sunk a shattered visage lies, "
          "whose frown, And wrinkled lip, and sneer of "
          "cold command, Tell that its sculptor well those "
          "passions read Which yet survive, stamped on "
          "these lifeless things, The hand that mocked "
          "them, and the heart that fed; And on the "
          "pedestal, these words appear: My name is "
          "Ozymandias, King of Kings; Look on my "
          "User-Agent, ye Mighty, and despair! Nothing "
          "beside remains. Round the decay Of that "
          "colossal Wreck, boundless and bare The lone and "
          "level sands stretch far away.\r\n");

which was set on account of DuckDuckGo starting to clutch their pearls over the old agent string (MSIE 8 on Windows 8, yeah baby yeah!). So some poor slob has the job of squinting at agent strings and doing random things with the code in response. Well, there are north of eight billion humans on the planet, so you have to keep them productive doing something, I guess?
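
For the curious, the header itself is nothing special: free-form text on one line of the request, ended by CRLF. What follows is a minimal sketch in C of how a client might assemble such a request. It is not w3m's actual code; build_request is a hypothetical helper, and the host and path passed to it are placeholders.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of w3m): write a minimal HTTP/1.1 GET
 * request into buf, using whatever User-Agent text the caller supplies.
 * Returns the request length, or -1 if buf is too small or the agent
 * string contains CR or LF (which would break the header framing). */
int build_request(char *buf, size_t cap, const char *host,
                  const char *path, const char *agent)
{
    /* A header value must stay on one line. */
    if (strpbrk(agent, "\r\n") != NULL)
        return -1;

    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "User-Agent: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host, agent);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

Calling build_request(buf, sizeof buf, "example.org", "/", "Ozymandias") fills buf with a request whose third line is the User-Agent header. The server sees whatever text was put there, and nothing obliges that text to be truthful.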


Anyways, one might suspect that the competent bot writers are using headless Chrome agents wired up to mechanical turk wage-slaves to fill in any annoying CAPTCHA that their automation has difficulty solving. This leaves ordinary users as the ones who suffer through the CAPTCHAs. Probably not a good experience, but then again it's been a while since those "don't be evil" days, and the internet has evolved.


Cloudflare is another problem; probably the blocking there comes from them trying to run JavaScript to "check the security" of the SSL connection, which w3m cannot do. Hence the new moniker for them, ClownFlare: w3m is rather lacking in JavaScript support, and the notion of having to run JavaScript, a source of security issues, to check on who knows what with SSL strikes me as being, at best, silly. At present this isn't much of a problem, as one can usually find some other page that hasn't been clownflared. A rather larger problem, for anyone after academic papers, is the iron curtain of JavaScript on the paper-serving sites. The ones that do not use Cloudflare, that is.


Some may view the modern web (big, bloated, slow, and annoying) as an outgrowth of Windows (big, bloated, slow, and annoying); naturally, one may expect that Windows has similar annoyances. Apparently it does, as entire articles are posted on how to make Windows less terrible:


https://arstechnica.com/gadgets/2024/02/what-i-do-to-clean-up-a-clean-install-of-windows-11-23h2-and-edge/


> I've used computers daily since the 80s, and feel like they have peaked in real usability along the way. While I used to look forward to new releases, now I dread them.

>

> What always baffles me, while fighting with entirely avoidable hassles, is how the engineers and designers who created those hassles can live with their own products, away from work. Do they not also encounter the same soul-grinding annoyances and time-consuming corrections required to actually use what they have built?

> — dlux, in the comments section


Probably the blocking will get worse as everyone circles their wagons and the old web hardens into scar tissue. On the other hand, there are new engines like Kagi or marginalia.nu that provide alternatives for the probably more geek-minded out there.


Grieg: Peer Gynt Suite No. 1, "In the Hall of the Mountain King"

-- Page fetched on Tue May 21 20:02:31 2024