Link Rot and Permanent Redirects

Over the lifespan of this blog I’ve changed “engines” several times, changed TLDs a couple of times, moved from HTTP to HTTPS, and reworked the URL scheme several times. I was generating heaps of link rot: links from wikis and Stack Overflow, from blogs like Atwood’s Coding Horror, and from media sites and countless message boards, all going to 404s.

In my most recent move, from a very efficient blogging engine that I wrote (primarily so it could run on the tiniest server imaginable) to a WordPress blog for its content management advantages, I leveraged a significant number of nginx rewrite rules to try to avoid this problem: rules crossing domains, mapping between differing URL structures, redirecting RSS feed users, and so on. Some of the rules leveraged Perl to decompose URLs and recast them in their more contemporary form.

For all of these deprecated URLs I served up a permanent redirect (HTTP 301), saying to every caller “The URL you actually want is over there. Use it from this point forward.”
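To make the idea concrete, here is a minimal Python sketch of that kind of rule: decompose an old URL, recast it in the new form, and answer with a permanent redirect. It is not the nginx and Perl rules I actually ran. The legacy shape `/archive/YYYY/MM/slug.html`, the new shape, the host `blog.example.com`, and the function name `redirect_for` are all hypothetical, chosen only for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical legacy URL shape: /archive/2012/03/some-post.html
LEGACY_POST = re.compile(
    r"^/archive/(?P<year>\d{4})/(?P<month>\d{2})/(?P<slug>[\w-]+)\.html$"
)

# Hypothetical canonical host for the current site.
NEW_BASE = "https://blog.example.com"


def redirect_for(path: str) -> Optional[Tuple[int, str]]:
    """Return (status, Location) for a deprecated path, or None if no rule matches."""
    match = LEGACY_POST.match(path)
    if match is None:
        return None
    # Recast the decomposed parts into the (assumed) contemporary form.
    location = f"{NEW_BASE}/{match['year']}/{match['month']}/{match['slug']}/"
    # 301 tells the caller that the move is permanent.
    return 301, location
```

A request for `/archive/2012/03/some-post.html` would be answered with a 301 pointing at `https://blog.example.com/2012/03/some-post/`, while a path that matches no legacy pattern falls through to normal handling.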

But of course the static links on other sites (e.g. in comments on Stack Overflow) never change, forever pointing to the original, time-limited URL, awaiting the day they inevitably turn to link rot. In an ideal world these static content sites would have a perpetual bot validating links, updating them to contemporary forms where appropriate, or flagging them as unavailable when they turn to rot (or when they revert to placeholder pages as domains get scooped up after being abandoned).

The static content is a bit more of an issue, but what really surprises me is the number of RSS readers and other automated consumers that completely ignore permanent redirects, following each one and then immediately forgetting it. Literally years after I switched to this platform, having served up 301 permanent redirects to millions of requests in the meantime, these readers still keep hammering the no-longer-available URLs.
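What a well-behaved reader could do instead is not complicated. The following is a rough Python sketch, not the code of any particular RSS reader: it follows a redirect chain and reports which URL the client should store for next time. The `fetch` callable, the function name `refresh_subscription`, and the `max_hops` limit are assumptions made for the example; a real client would wrap its own HTTP library with automatic redirect-following turned off.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

PERMANENT = {301, 308}
TEMPORARY = {302, 303, 307}

# Hypothetical interface: fetch(url) -> (status, Location header or None).
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def refresh_subscription(
    stored_url: str, fetch: Fetch, max_hops: int = 5
) -> Tuple[str, str]:
    """Follow redirects starting at stored_url.

    Returns (url_to_store, final_url). url_to_store only advances across an
    unbroken run of permanent redirects from the start of the chain, so a
    temporary redirect never overwrites the subscription.
    """
    url_to_store = stored_url
    current = stored_url
    still_permanent = True
    for _ in range(max_hops):
        status, location = fetch(current)
        if location and status in PERMANENT:
            # Location may be relative, so resolve it against the current URL.
            current = urljoin(current, location)
            if still_permanent:
                url_to_store = current
        elif location and status in TEMPORARY:
            current = urljoin(current, location)
            still_permanent = False
        else:
            break
    return url_to_store, current
```

A client would persist `url_to_store` after each poll, so the old address is requested at most once after the move rather than for years.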

No longer available because I finally dropped the rewrite rules. They added complexity and risk to the ruleset, and required a module that I didn’t want to install now that I’ve switched to the mainline version of nginx (primarily to have some fun with the HTTP/2 functionality).

If you’ve hit the root page of this blog because you were sent to one of these very dated links, I apologize. The web’s mechanism for dealing with link change is far from ideal.