On the dead and gone web

June 3rd, 2010

404 Error!

I first got a glimpse of how non-permanent (as Buddhists would say) the Web is while compiling my list of Newton-related sites. Maybe 40 percent of all Newton sites are now dead and gone.

It’s not just archival, dead-platform sites that suffer from 404-itis. Relatively modern blogs leave a trail of links that are, today, dead ends.

For fun, I like to browse through John Gruber’s Daring Fireball Linked List archives, just to see what life was like in the Mac world before 2005, the year I switched. Most of the links back to Dan Benjamin’s Hivelogic blog are gone. And one, an explanation of FTP from Panic’s Steven Frank, is a non-starter. Searching for these posts is an exercise in futility. The only available option is archive.org’s Wayback Machine (where I finally found Frank’s post – love his old blog design).
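If you would rather not click through the Wayback Machine by hand, archive.org also offers a small availability API that reports the closest archived snapshot for a URL. Here is a minimal Python sketch of how a lookup might go. The function names are my own inventions, the example URL is a placeholder, and the response shape (an "archived_snapshots" object holding a "closest" entry) is how I understand the API to answer, so treat this as a starting point rather than a finished tool.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public availability endpoint of the Wayback Machine.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_lookup_url(dead_url, timestamp=None):
    """Build the availability query for a dead link.

    timestamp is optional (YYYYMMDD); when given, the API is asked
    for the snapshot closest to that date.
    """
    params = {"url": dead_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the archived URL from a parsed API response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(dead_url, timestamp=None):
    """Ask the Wayback Machine for a snapshot (makes a network request)."""
    with urlopen(build_lookup_url(dead_url, timestamp), timeout=10) as reply:
        return closest_snapshot(json.load(reply))
```

Calling find_archived_copy("example.com/old-post", "20050101") would return the address of the nearest archived copy if one exists, and None if the page was never captured.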

The Web’s hyperlinks are the key to its success and openness. You find stuff because other people find stuff, so you click a link to find what they found. But when what they found is gone, or missing, it’s frustrating.

For blogs, the switch to a new platform can make all your links, maybe hundreds gathered over the years, non-functional. That’s what I imagine happened with Dan Benjamin’s Hivelogic, or with Steven Frank switching to Tumblr. I, too, switched to Tumblr for my personal site, leaving behind a Blogger-hosted weblog. All my old links still work because the Blogger blog is still around, an abandoned building in a shoddy neighborhood. If there were an easy way to transfer all those blog posts to Tumblr, I would do it in a heartbeat. But even then, if I shut down the old Blogger blog, all my old hyperlinks would become dead ends.

WordPress makes it a little easier, with XML exports and domain name serving. I exported the WordPress.com-hosted Newton Poetry and imported it into the new, self-hosted version. A lot of my pictures were left behind, but the text and links work decently (Thomas Brand’s words still haunt me to this day).

Now, if you write regularly, maybe you produce so much content that your old posts don’t matter as much. There’s plenty of new content to overwhelm the old stuff. But it seems to me, as a writer, that the old stuff – the really good stuff – is just as important and should be preserved in some form.

For instance, I (foolishly) kept a Myspace blog and wrote a ton of material for a few years. But when I left Myspace and deleted my account, all that material disappeared. To prevent a total erasure of memory, I copied and pasted all of those posts into my Blogger site. Not blog to blog, but post by post, individually. It was such a chore. But I felt that a lot of the material was too good to let go. What’s a real shame is that I had no choice but to leave the comments behind.

There’s no easy way to take your written material with you when you make a switch. There are ways to do it, but usually they’re incomplete or, like my Myspace-to-Blogger example, a mind-numbing project.

And it’s not just the words that are the problem. The missing or incorrect hyperlinks will still be out there in the ethernet ether somewhere, a collapsed barn in some weed-riddled field. If you don’t keep your domain name maintained, or if you stop paying your web hosting bill, kiss your links goodbye.

This seems like the perfect project for Google, or for the Smithsonian. It would be a heckuva lot more useful than archiving Twitter. The problem would be the server space to host all those images, videos, text, and PDFs. But if anyone has the muscle to tackle a Web-wide archive, it’s Google.

The Web is too democratic to be under a for-profit business’s lock and key, however. It needs to stay public, whatever – and however – that means.