Data preservation

I was going through my computers yesterday trying to find some data I hadn’t seen in a long time. I finally found it on a server I hadn’t powered up in over a year (the server’s name is Fyre, if you care). The data consists of everything I did on GNU/Linux computers up through two years ago, both on that server itself and on the school’s GNU/Linux server in high school. The cache comes to about 3.5 GiB and contains dozens of programs I wrote back when I was still learning, say, C++. In addition to all of this data from GNU/Linux computers, I still have all of the data from my family’s long line of Windows desktops, going back about eleven years. Yeah, I’m a data rat.

My data preservation strategies aren’t really clever or well thought out. I’ve just always known that I didn’t want to lose this stuff, so I’ve always taken small steps towards that goal. Every time we got a new computer, I’d copy the My Documents folder from the previous computer onto it and burn a copy to CD (or DVD). More recently, I started packing those folders into archives so they were more manageable. Now I have about twenty archives on my current Windows desktop that contain just about anything of note I’ve done on any computer system in the past eleven years (the sole exception being the programming I did in middle school; I may have saved it all to floppies a long time ago, but I haven’t located them yet). Heck, I’ve even put all of my old websites back online.

So what do I do with all of this data? Well, I simply save it in case I ever end up needing it, which does happen on occasion (especially when I’m going back through my old stuff looking for something I wrote years ago). It’s also fun to just go through the stuff I was writing for school in, say, seventh grade. Man, has my writing ever improved since then. I have to fight the urge to edit the documents as I’m reading through them, because I know it’s my work, and the poor quality of it bugs me.

I’ve asked around among my friends and nobody else seems to still have digital copies of the work they did so long ago. It seems a pity. I put lots of effort into all of that homework and coding, and I wouldn’t want to lose it. If that makes me a digital pack rat, so be it. My main worry now is data formats. The data itself is on multiple hard drives and various backup DVDs, but that’s all meaningless if the file formats are so antiquated that contemporary applications can’t read them. All of my work from sixth grade, for instance, is in Microsoft Works format, and the filenames are strangely antiquated, consisting of eight characters, all upper case (DOS limitations?). A lot of my programming is in whatever format we used on the school computers, and really old Borland workspace files, for instance, don’t age well. At least the source code is mostly stored in text files, even if it won’t compile.
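If I ever get serious about the format problem, a first step would be finding out which formats I actually have. Here is a rough Python sketch of how that survey might look. The function name `survey_formats` and the `DOS_83` pattern are my own inventions for illustration, and the pattern is only a simplified approximation of DOS-style 8.3 names, not the full rule set.

```python
import re
from collections import Counter
from pathlib import Path

# Simplified guess at a DOS-style 8.3 name: up to eight upper-case
# characters, then an optional extension of up to three.
# (An assumption for illustration; real DOS naming rules allow more.)
DOS_83 = re.compile(r"^[A-Z0-9_~-]{1,8}(\.[A-Z0-9]{1,3})?$")


def survey_formats(root):
    """Walk root and return (extension counts, files with 8.3-style names)."""
    counts = Counter()
    dos_names = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        # Tally extensions case-insensitively; files without one go under "(none)".
        counts[path.suffix.lower() or "(none)"] += 1
        if DOS_83.match(path.name):
            dos_names.append(path)
    return counts, dos_names
```

Pointing it at a folder of unpacked archives and printing `counts.most_common()` would show at a glance how much of the pile sits in formats that may need converting while something can still open them.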

I suppose it might be fun to go through some of my old writing, post it here, and analyze it, not just for style but also for content. Unlike so many others, I at least have that opportunity, so I may as well make use of it.
