fedops blog

Privacy in Computing

Sun 15 April 2018

Backup and Restore

Posted by fedops in Howto   

Anyone creating and working with data on computers should be aware that data is but a configuration of electrical or magnetic memory cells. A single mistake, a software malfunction, or a hardware problem can erase or corrupt literally millions of bytes in a heartbeat. By far the most likely cause, though, is plain and simple human error. Erase the wrong directory and all your documents are gone.

Luckily, digital files can be easily copied to another place, and storage for such copies has become very cheap. So why are people still losing data? Because for all the advances we have made in computing over the last decades, unobtrusive, easy-to-use, and reliable backup programs are still few and far between.

Years ago Apple introduced what is still the gold standard of such software, Time Machine. As long as there's a backup disk connected to the computer or available over the network, the software lurks in the background and copies new and changed files over every hour. Backup sets are maintained until the disk fills up, at which point the software intelligently starts removing the oldest backups. No user intervention is required, and the whole configuration dialog is famously minimal, essentially limited to letting the user include and exclude certain directory trees. Getting data back is trivially simple as well.
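The "remove the oldest backups when the disk fills up" idea is easy to sketch. The following Python is not Apple's algorithm, just a minimal illustration under loud assumptions: every backup is a self-contained directory directly below the backup root, directory names sort chronologically (ISO dates, say), and the function name prune_oldest and its parameters are made up for this example.

```python
import shutil
from pathlib import Path


def prune_oldest(backup_root, min_free_bytes, keep_at_least=1, free_bytes=None):
    """Delete the oldest snapshot directories until enough space is free.

    free_bytes is a callable returning the free space in bytes. It defaults
    to the real free space of the backup disk and is only a parameter so
    that the logic can be tried out without filling up a disk.
    """
    root = Path(backup_root)
    free_bytes = free_bytes or (lambda: shutil.disk_usage(root).free)

    # Assumption: snapshot directory names sort chronologically.
    snapshots = sorted(p for p in root.iterdir() if p.is_dir())

    removed = []
    while free_bytes() < min_free_bytes and len(snapshots) > keep_at_least:
        oldest = snapshots.pop(0)
        shutil.rmtree(oldest)  # irreversible, so try it on a scratch directory first
        removed.append(oldest.name)
    return removed
```

Note that this only makes sense if each snapshot can be restored on its own. With a layout where later snapshots hold only the changed files, deleting the oldest directory would throw away data that exists nowhere else.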

Luckily for us, Sergio Costas Rodríguez from Spain liked Time Machine enough to write a FOSS version of it, called anaCronoPete after Enrique Lucio Eugenio Gaspar y Rimbau's time traveling novel "El anacronópete". The software uses rsync underneath a user interface so reminiscent of the Apple product it makes you wonder what the folks in Cupertino are thinking:

[Screenshot: anaCronoPete settings]

The restore dialog is also very similar to Time Machine's.

The use of rsync means there's extremely robust and flexible code doing the heavy lifting, and the backup destination can be examined using all the normal implements in the Unix toolbox. So even if the restore software should develop a problem, or you need to restore data on a Linux system that doesn't have anaCronoPete installed, you can do that very easily:

[Screenshot: backups in the filesystem]

In this example anaCronoPete has created a directory on April 4th with the full dump of /home, and another directory on its next run on the 15th of April with just the changed or added files and directories. If you were interested in obtaining a list of all changes to a given file, a simple ls -l */path/to/file in the root directory of your backup disk would give you that. Likewise, a diff could be performed between any of those files to see what has changed in the file's contents between those dates.
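The ls and diff steps described above can also be scripted. Here is a minimal Python sketch, assuming each backup run sits in its own directory directly below the root of the backup disk. The function names file_versions and diff_versions are invented for illustration and are not part of anaCronoPete.

```python
import difflib
from pathlib import Path


def file_versions(backup_root, rel_path):
    """Return every copy of rel_path found one level below backup_root, oldest first."""
    rel_path = str(rel_path).lstrip("/")  # treat /home/me/x like home/me/x
    versions = []
    for snapshot in Path(backup_root).iterdir():
        candidate = snapshot / rel_path
        if candidate.is_file():
            versions.append(candidate)
    # Sort by the modification time of each copy rather than by directory
    # name, since no particular snapshot naming scheme is assumed here.
    return sorted(versions, key=lambda p: p.stat().st_mtime)


def diff_versions(old, new):
    """Return a unified diff between two text files as a single string."""
    old_lines = Path(old).read_text(errors="replace").splitlines(keepends=True)
    new_lines = Path(new).read_text(errors="replace").splitlines(keepends=True)
    return "".join(difflib.unified_diff(old_lines, new_lines, str(old), str(new)))
```

With a hypothetical mount point, file_versions("/media/backup", "home/me/notes.txt") would list the copies of that file across all backup runs, and passing any two of them to diff_versions shows what changed between those dates. This is meant for text files; for binaries a checksum comparison is the better tool.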

While we're on the subject, here's a quick rundown of my backup strategy:

  1. backups are safer in your cupboard than they are connected to the network. This is especially true when your failure scenario accounts for operator error. You can't delete files from a powered-off disk. This is why my favorite backup media are USB disk drives.
  2. USB drives of the same capacity are cheaper than servers or NAS systems. Once you've bought one drive, it's not a big deal to buy a second drive and copy the first to the second at regular intervals. Besides adding redundancy, this also alerts you to read errors or corrupted files that otherwise might go unnoticed until the day you attempt a restore.
  3. once you have those two drives, take the second one with you to work and store it in a cupboard there. Or give it to a friend or relative for safekeeping. Bring it home with you every now and then, say once a month, only to copy your primary drive to it, then take it away again. That way, should you have a break-in or even a fire in your house, your data will still be safe. Remember: to a thief, your $50 drive is just $20 for the next meth pill; for you, it's your digital life.
  4. depending on your off-site storage location, consider encrypting the disk drive, and make sure to keep the secret key safe. I personally store all sensitive files encrypted anyway, so there is no need to encrypt the file system.

Personally, I don't like the added complexity of encrypted drives in an emergency restore situation, and encrypted drives do not protect against the same threats as encrypted files. A notable difference is that an encrypted drive that is mounted on your system does not protect its files against access from that system, e.g. by malware processes lurking in the background. Single-file encryption, with files being decrypted only as and when the need arises, is somewhat less convenient but much more robust, in my mind anyway.

In the relatively short time I have been using Sergio's software, I have not noticed any oddities. This is an extremely valuable utility to have running. Now all that remains is to make sure the backup disk is connected frequently. Also, don't forget to execute test restores every now and then to make sure everything works as it should.
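For a test restore, it helps to compare the restored files against the originals instead of just eyeballing the directory listing. Below is a minimal Python sketch of such a check using SHA-256 checksums. The helper names sha256_of and compare_trees are my own invention for this example, and it assumes both directory trees are readable from the same machine.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(original, restored):
    """Return (missing, different) lists of paths relative to original.

    missing:   files present in original but absent from restored
    different: files present in both whose contents do not match
    """
    original, restored = Path(original), Path(restored)
    missing, different = [], []
    for source in sorted(original.rglob("*")):
        if not source.is_file():
            continue  # directories carry no content of their own
        rel = source.relative_to(original)
        copy = restored / rel
        if not copy.is_file():
            missing.append(str(rel))
        elif sha256_of(source) != sha256_of(copy):
            different.append(str(rel))
    return missing, different
```

If both lists come back empty, everything in the original tree made it through the restore intact. Files you changed after the backup ran will of course show up as different. The same comparison works for checking the second USB drive against the first after the periodic copy from point 2 of the list above.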

Because, as the Admin Zen says:

Nobody wants backup, everybody wants restore.
To restore, you need a backup.