Moving Targets, and Why They’re Bad

Executive summary: if you write anything or show off any images, you should make a real website, both for yourself and for others.

We’ve been updating the Real-Time Rendering site (take a peek – you might at least enjoy the 4th edition cover). Today I’ve been grinding through updating URLs for the references in the book. Even though the book’s not yet out, you can see what articles we reference and jump to the article from this page.

Most of the articles can be found with Google or Google Scholar. A few are trickier to find, or have several relevant URLs – that’s the value I feel I’m adding by doing this laborious task. Another reason is to help avoid link rot – I’ll explain that in a minute. A third is virus protection. For example, one blog URL, for the article “Render Color Spaces” by Anders Langlands, had its domain anderslanglands.com (DON’T GO THERE (REALLY)) taken over by some evil entity in May 2018, and it now leads to a page full of nastiness.

Going through our reference page today and adding links reminds me how tenuous our storage of knowledge is for some resources on the internet. Printed journals at least have a bunch of copies around the world, vs. one point of failure. I’ve noted this before. My point today is this: if you publish anything, go buy yourself a domain and host it somewhere (I like bluehost, as do Morgan McGuire and Andrew Glassner, but there are no doubt cheaper ways). All told, this will cost you maybe around $110 a year. Do it, if you care about sharing your work or are at all serious about your career (e.g., lose your job or want another? You now have a website holding your CV or work, ready to show). URLs on your own domain have a permanence to them, vs. company-specific semi-hosting schemes such as Github or Dropbox, where the rules can and do change. For example, I just found a Github-based blog entry from Feb. 2017 that’s now gone (luckily still on archive.org). With some poking around, I found that the blog entry is in fact still on Github, but appeared to be gone because Github had changed its URL scheme and did not redirect from the old URL to the new one.

Once you have a hosted URL, look at how others arrange their resources, e.g., Morgan McGuire recently moved all his content from the Williams College website to his own personal site. Grab a free template, say from W3 Schools, or copy a site you like. Put any articles or presentations or images or whatever that you want people to find on that site. Me, I’m old school; I use basic HTML with a text editor and FileZilla for transfers, end of story. Start a WordPress or other blog, which is then hosted on your site and so won’t die off so easily. Once you have a modest site up, you are done: your contributions to civilization are available to everyone until you forget to renew your domain or pay for web hosting. Assuming you remember, your content is available until you’re dead and no one else keeps up with the payments (another good reason to renew for the longest duration). Setting up your own website isn’t some ego-stroking thing on your part – some of the rest of us want continued access to the content you’ve provided, so please do keep it available. If your goal in writing is to help the graphics community, then allow your work to live as long as possible. “But my blog posts and whatnot have a short freshness ‘read by’ date,” you complain. Let us decide that; as someone who maintains the Graphics Gems repository, a collection of articles from 1990-1995, I know people are still using this code and the related articles, as they report bugs and errata to me. “I have tenure, and my school’s been around for 200 years.” So when you retire, they’re going to keep your site going?

Most of us don’t grab a URL and host it, which is a pity for all concerned. Most of the links I fixed today rotted for one of three reasons: the site itself died (e.g., the company disappeared; I now can’t find this talk from 2012 anywhere, and at least 14 other sites link to it), the subdirectory on the site was deleted (e.g., for a student or faculty member no longer at the institution), or the URLs were reorganized and no redirection was put in place (and if you’re a webmaster, please don’t do this – take the time to put in some redirection, no matter how untidy it may feel to you). Some resources that still work are hanging on by a thread, e.g., three articles on our page are served up by FTP only. FTP!
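For webmasters, the redirection can be as simple as a lookup table from old paths to new ones. Here is a minimal sketch in Python, not tied to any particular web server. The paths and the function name resolve_moved are made up for illustration; a real site would wire the same table into its server’s own redirect mechanism.

```python
# Hypothetical table of moved pages: old path -> new path.
# These paths are invented for illustration only.
MOVED = {
    "/~student/thesis.pdf": "/archive/theses/student-thesis.pdf",
    "/talks/2012/slides.html": "/presentations/2012/slides.html",
}


def resolve_moved(path, moved=MOVED):
    """Return (301, new_path) if `path` is a known old URL, else None.

    HTTP status 301 means "moved permanently", which is the signal
    browsers and search engines use to update their links.
    """
    new_path = moved.get(path)
    if new_path is None:
        return None  # not a known old URL; serve the request normally
    return 301, new_path


if __name__ == "__main__":
    print(resolve_moved("/talks/2012/slides.html"))
    print(resolve_moved("/index.html"))
```

The point of keeping the table, however untidy it looks, is that every old URL anyone ever linked to still lands somewhere useful.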

BTW, people have worked on how to have their sites outlive them, but so far I don’t know of a convincing system, one where the service itself is likely to outlast its participants. Some blog and presentation content does outlive its creator, or at least its original URL, as much of the internet gets archived by The Wayback Machine. So, for the virus-ridden anderslanglands.com site, the article I wanted to link to is available on archive.org. Jendrik Illner does something for his (wonderful) summary posts that I hadn’t seen before: each link also has a “wayback-archive” link for convenience, in case the link no longer works. You can also easily try such links yourself on any dead site by using this Chrome extension. With this extension active, a dead page will by default cause the extension to offer to look it up on archive.org. Links have an average life of 9.3 years before they rot, and that’s just the average. You’re likely to live longer, so do your future older self a favor by saving them some time and distress: make a nice home for your resources now so you don’t have to later.
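If you want to pair links with archive fallbacks in your own link lists, here is a small sketch in Python. It assumes the web.archive.org/web/<timestamp>/<url> address form, where, as far as I know, a partial timestamp such as a year asks for the capture nearest to it. The function names and the example.com link are mine, purely for illustration, and I have no idea how Jendrik actually generates his links.

```python
# Assumed Wayback Machine address form: prefix + timestamp + "/" + original URL.
WAYBACK_PREFIX = "https://web.archive.org/web/"


def wayback_link(url, timestamp):
    """Return a Wayback Machine address for `url` near `timestamp` (e.g. "2018")."""
    return f"{WAYBACK_PREFIX}{timestamp}/{url}"


def with_archive_links(urls, timestamp):
    """Pair every original link with its archive fallback."""
    return [(url, wayback_link(url, timestamp)) for url in urls]


if __name__ == "__main__":
    # example.com stands in for a real reference link.
    links = ["http://example.com/post.html"]
    for original, archived in with_archive_links(links, "2018"):
        print(original)
        print("  wayback-archive:", archived)
```

This only builds the addresses; whether a capture actually exists for a given page is something you would still need to check by following the link.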

If you’re too busy or poor to host your own content, at least paste your important URLs into archive.org’s site (you can also use the “Save Page Now” option in the Chrome extension, if you have a lot of pages) and your content will get archived (though if it’s a large PDF, maybe not). However, content on archive.org is not included in Google searches, so articles there effectively disappear unless the searcher happens to have the original URL and thinks to use the Wayback Machine. Also, people may stop looking when they try your original URL and find, for example, a porn site (e.g., this archive.org graphics site’s original URL goes to one now). This won’t happen if you have your own URL and maintain it.
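If you have a lot of pages to submit, a short script can at least build the list of addresses to visit. This sketch assumes the web.archive.org/save/<url> address form for “Save Page Now”; the function name and the example.com pages are made up for illustration. Nothing here touches the network, and archive.org may limit how fast it accepts requests, so treat this as a starting point only.

```python
# Assumed "Save Page Now" address form: prefix + original URL.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_page_now_urls(pages):
    """Return one "Save Page Now" address per page, skipping duplicates.

    Order is preserved so the output lines up with the input list.
    """
    seen = set()
    result = []
    for page in pages:
        if page in seen:
            continue  # no need to request the same capture twice
        seen.add(page)
        result.append(SAVE_PREFIX + page)
    return result


if __name__ == "__main__":
    # example.com pages stand in for your own important URLs.
    my_pages = [
        "https://example.com/papers/paper.html",
        "https://example.com/talks/slides.html",
    ]
    for address in save_page_now_urls(my_pages):
        print(address)
```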

For longer-term storage of your best ideas, don’t just blog about a topic; submit it to a journal (for example, JCGT takes practical articles) or an article collection book (e.g., the GPU Zen series, Ray Tracing Gems) so that it stays accessible for a good long while. It is possible and reasonable to take good blog content and rework it into an article. Going through peer review and copy editing will polish your idea that much more.

These ramblings reflect my (limited) view of the world. If you know other approaches or resources to combat any aspect of link rot, please do let me know and I’ll fold them in here and credit you. Me, I hate seeing information get lost. Fight entropy now. Oh, and please put a date on any page you put up, so the rest of us can know if the page is recent or ancient history. Blog entries all have dates; everything else should, too.

Update: see my next post for some followups.

  1. isonno

    I’d modify your suggestion about submitting ideas to journals with -open- journals. A lot of the old-school academic publications (CAD, for example) are run by old-school greedy academic publishers. They’re more than happy to put your paper behind a $30 paywall (of which the author doesn’t get a dime).

    The sooner these are displaced by open publications like the JCGT, the better.

  2. richgel999

    I’ve had a blog on blogspot for around 6 years now. I don’t pay anything for it, and it seems very stable.

    I guess it could go down eventually. Google did shut down Google Code.
