October 22, 2009

Backing Up Is Hard To Do

I'm sure everyone has a story of losing valuable data: not the stolen kind, the destroyed kind.

There are so many ways things can go wrong: drive motors fail, platters become unbalanced, chips go bad, electrical noise, communication failures, programs go awry, viruses... the list goes on... and on... and on. I can relate a story for each of these; however, I have never found any of them to be as bad as letting a human near a computer.

I have done many stupid things to my data: deleted the wrong folder, formatted the wrong drive, thought I had backed things up, not saved often enough, etc. The worst thing I have ever done was on a shared web host, where I ran "rm -rf /" instead of "rm -rf ./" and, as a lowly user, I actually had permission to do it (for all the non-*nix'ers out there, that removes all files and folders recursively from the root of the file system, which can include all the disks and remotely mounted file systems, without prompting). Stories like this come from everywhere: friends, relatives, or even giant companies like T-Mobile and Microsoft, with the user data loss on the T-Mobile SideKick caused by a server failure in the Microsoft/Danger cloud.

So this gets me to the meat of the story: I'm a bit of a bit freak (haha, yeah, I know, soul-crushing pun). The recent story about the SideKick data loss got me thinking about more robust backup solutions, so I first thought about what questions need to be answered in order to decide which approach best suits the need. Here are the five most important I could come up with:
  • Is user error a concern? - Unintentionally deleted a file/directory
  • Do old versions and/or copies need to be kept? - Overwrote important calculations
  • Is timeliness a factor? - The contract only allows an hour of down time a year
  • How many people and devices are there? - One computer, a small office with a few computers, or a large number
  • Is security a concern? - Should certain people see only certain pieces of data, or is it OK for everyone to see everything
User error is the easiest to solve, as all it requires is making frequent backups; in doing so the likelihood of complete data loss is reduced to near zero and data is recoverable to some relatively recent point in time. There is software built right in to most operating systems, and then there is software like Acronis True Image® which will make bootable images of a system, making recovery as simple as: throw in a DVD, reboot, and go.
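
As a rough sketch of how simple "frequent backups" can be on a Linux or Mac box (the paths and schedule here are just placeholders, not a recommendation), a single rsync line in cron will quietly copy a home folder every night:

    # crontab entry (edit with "crontab -e"): every night at 2 AM,
    # copy the home folder to a backup drive in archive mode.
    # /mnt/backup is a placeholder path - point it at your own drive.
    0 2 * * * rsync -a /home/yourname/ /mnt/backup/yourname/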

Versioning can be solved using a version control system. These are often used by programmers to manage source code so many developers can work on it at the same time; systems like Subversion, Git, and Mercurial exist for this purpose. This makes the process slightly more complicated, as it is no longer just a file system to deal with. The first new piece is a central repository, which is similar to a file server, just a bit more advanced. Users ask the repository for a local copy (a.k.a. check out) which they then use to view and make changes; once they are done making changes they send their changes back to the repository (a.k.a. check in) so others have them available. Once a system like this is in place it will keep each check-in of a file as a separate version, so old versions can be looked at and used. These systems can grow to be very large because of all the copies of data, so keep that in mind if more than text files are going to be placed within.
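
For anyone curious what that check-out/check-in cycle looks like in practice, here is a minimal sketch using Git (the server address, repository name, and file name are all invented for illustration):

    # one-time: grab a local copy from the central repository (the "check out")
    git clone ssh://server/srv/git/documents.git
    cd documents

    # ...edit some files...

    # record the change locally, then send it back so others can get it (the "check in")
    git add budget.xls
    git commit -m "Update the Q4 numbers"
    git push origin master

Every commit pushed this way becomes a separate version that can be pulled back out later, which is exactly the "old versions can be looked at and used" part.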

Timeliness can only be solved by having spares of everything on hand, or better yet a replicated hot-spare system ready to go. RAID sounds good at first but is a pretty poor solution: if power goes out the write fails, if corrupt data is written it's still corrupt, plus RAID cards are expensive and you NEED to always have a spare on hand.

The number of users affects whether a server is warranted. If only one computer is used, it is senseless to go any further than a drive or folder on that system. A small number of computers may indicate a basic NAS is a good, inexpensive solution, but they often lack the sophisticated controls of a full-fledged server. So on a larger scale, where additional services (e.g. FTP, Windows file sharing, SSH, a web server, etc.) or access restrictions are needed, a server may be the best choice.

The last issue, security, falls in with the answer to the number of users: a NAS may have some access controls to prevent users from getting at data they shouldn't, but a server running Windows, Linux, or Unix will have the full set of controls.
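
As a trivial example of the kind of control I mean (the group, user, and directory names below are made up for illustration), a Linux server can restrict a shared folder to one group in a handful of commands:

    # create a group, add a user to it, and lock a share down to that group
    groupadd accounting
    usermod -a -G accounting alice
    mkdir -p /srv/shares/accounting
    chown root:accounting /srv/shares/accounting
    chmod 770 /srv/shares/accounting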

In my case I am implementing a solution for my home; my concerns are user error, versioning, and the number of devices (we have several computers, media servers, and other connected devices). My solution is to set up a server to deal with the number of devices, running Git to version my documents and code, and a NAS with a large amount of storage for automatic backups.
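
Roughly speaking (the repository name, paths, and schedule below are just placeholders for my setup), the server side boils down to a bare Git repository for the versioned stuff plus a nightly rsync of the server's data over to the NAS:

    # on the server: create a shared, bare repository that the other machines push to
    git init --bare /srv/git/documents.git

    # nightly crontab entry on the server: mirror its data onto the NAS mount at 3 AM
    0 3 * * * rsync -a /srv/ /mnt/nas/server-backup/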

Be safe - protect your bits ;-P
