Cool URLs

A sound idea about preserving web content, though the author acknowledges how hard it is to maintain original URLs.

A great idea

According to the people who think about web standards, the World Wide Web Consortium (W3C), Cool URIs don’t change. It’s a basic idea: once a page on the internet is created, it should – in theory and with enough money – stay at the same address forever. Always there and available for reference. I think the idea is sound. We have all come across links that no longer work, a problem that became known as link rot. I wrote about it in June 2004, in a post entitled Learning from Others.

Harder in practice

I might like and support the idea, but as I enter my 29th year writing on the web, I know I’ve been unable to honour the concept. For example, the link above is not, strictly, the right one for the ‘Learning from Others’ post. I might argue that the version here, which is on the domain I used to use for blogging, is more accurate. The content is the same, they’re still my words, but that domain was the original home.

That’s still not the original URL, however. Sometime in the mid-2000s I archived the site when I switched blogging platform. I imported posts into the new tool without much thought. I wasn’t sure I was going to keep the old site around. I also copied some of the posts into so that I would keep a copy even if I killed off the other site. The closest to the original URL is now at the wonderful Internet Archive (or Wayback Machine), and is a snapshot from July 2004: Archived: Learning from Others.

If you didn’t know about the archived version and tried to go to the original post it would generate a ‘page not found’ type of error; 404, in internet speak. Even worse, there would be almost no clue that it’s still possible to read the original words. I could do something clever on the server to rewrite the links. Maybe I’ll get to that when I have time to write some code.
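The ‘something clever on the server’ could be as small as a lookup table from old addresses to new ones. What follows is a minimal sketch, not the author’s code: the table name REDIRECTS, the function resolve and the example.com paths are all hypothetical and stand in for the real URLs, which are not given here.

```python
from typing import Optional, Tuple

# Hypothetical table: old path on the retired site -> where the post lives now.
# Both the key and the value are made-up placeholders.
REDIRECTS = {
    "/old-blog/2004/06/learning-from-others":
        "https://example.com/2004/06/learning-from-others/",
}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, Location header value) for a requested path."""
    # Treat "/post" and "/post/" as the same address.
    key = path.rstrip("/") or "/"
    target = REDIRECTS.get(key)
    if target is None:
        # Nothing known about this address: the familiar 'page not found'.
        return 404, None
    # 301 tells browsers and search engines the move is permanent.
    return 301, target
```

A web server or framework would call something like resolve for each request and send the status and Location header it returns. The point of the design is that the old address keeps working and simply forwards the reader on.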

Correcting link rot

Those original posts were not updated when I mothballed the site into the new platform. As a result, they had quite a bit of internal link rot. Occasionally, I look back and read something old and decide to correct the internal links. Eventually, I will finish that task and everything will be properly linked.

While I am in ‘correction mode’, I also check other outbound links on those old posts. If they no longer work, I update them. If I can find an online version of the original text at a different URL, I link to that instead. If I can’t, I try the Internet Archive. If I can find neither, I leave the broken link.
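That order of preference can be written down as a small function. This is only a sketch of the rule described above: the name choose_link and its parameters are hypothetical, and the checking itself (whether a link is live, whether a copy exists elsewhere) is assumed to have been done by hand or by some other tool.

```python
from typing import Optional


def choose_link(original: str,
                is_live: bool,
                relocated: Optional[str] = None,
                archived: Optional[str] = None) -> str:
    """Pick the URL an old post should point at.

    original  - the link as first published
    is_live   - whether the original still works
    relocated - the same text at a different URL, if one was found
    archived  - an Internet Archive snapshot, if one was found
    """
    if is_live:
        return original      # nothing to fix
    if relocated:
        return relocated     # first choice: the text at its new home
    if archived:
        return archived      # second choice: the archived snapshot
    return original          # neither found: leave the broken link
```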

Last summer, James Cridland wrote about Fixing 404 errors and link rot, while maintaining authenticity. He took a different approach to updating dead links. I think his path is more in line with the ‘cool URL’ concept, but I’m happy with my compromise.

My weeknotes

When I started my weeknotes, I decided to prepare for future link rot and preemptively included a reference to the Internet Archive version of all the things I’d linked to in that week’s note. That way, I knew there would be a snapshot taken around the time I wrote a note and, in the future, it would be easier to navigate to the archive if a link was broken.
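Wayback Machine addresses follow a predictable pattern, a timestamp followed by the original URL, so an archive reference like this can be generated rather than typed. A small sketch, assuming a hypothetical helper called wayback_url; it relies on the Wayback Machine’s usual behaviour of redirecting a dated address to the nearest snapshot it holds, and it does not check that a snapshot actually exists.

```python
from datetime import date


def wayback_url(url: str, when: date) -> str:
    """Build a Wayback Machine address for `url` near the date `when`."""
    # The timestamp is written as YYYYMMDD; the archive is expected to
    # redirect to the closest capture it has to that date.
    return f"https://web.archive.org/web/{when:%Y%m%d}/{url}"


# Example with a placeholder URL and the date a note was written.
print(wayback_url("https://example.com/some-article", date(2023, 12, 31)))
```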

I have been reviewing my 2023 weeknotes. It’s an interesting exercise to understand my year. But I think the ‘Archive’ section that includes the Wayback Machine links makes reading a series of notes harder than it need be.

So, while I’m going to make sure all the links are added to the Internet Archive whenever I post a new weeknote, I’m dropping that section.

My URLs, however, will stay cool (perhaps the only thing I do that is).