How to secure data backup

Financial-services firms are among the many organizations that have reported losing backup tapes with sensitive customer data. In this tip, W. Curtis Preston explains how to secure data backup and the pros and cons associated with the three basic methods: source encryption, backup software encryption, and hardware encryption.

It's a security mishap that seems to happen all too often: an organization loses backup tapes containing sensitive customer information. The problem plagues every industry, including the security-conscious financial-services industry.

Last year, the Bank of New York Mellon reported that third-party couriers lost unencrypted backup storage tapes on two occasions, potentially exposing the data of approximately 4.5 million people. Around the same time, Boston-based State Street Corp. said a contractor hired to conduct data analysis lost a disk drive containing the personal information of 5,500 employees and 40,000 customer accounts. The data was initially encrypted, but the contractor decrypted the information and stored it on computer equipment, which was stolen from its facility.

At the Bank of New York Mellon, the breach led the firm to boost its security by requiring that confidential data that must be written to tapes or CDs for transport be encrypted, or transported with added controls.

Storage encryption is the key to avoiding the fallout from a breach, which includes both the direct costs of notifying customers of the breach and the indirect costs of brand damage. No matter what your vendor tells you, all unencrypted backup tapes are readable by determined criminals. Don't fall for the claim some vendors make: that their backup format is proprietary and can't be read without their database and software. Backup formats are irrelevant to laws such as California's SB 1386; if you lose control of unencrypted personal information, you must notify the affected customers. Most state data breach laws do not require notification if data is encrypted.

It's a clear business case for encrypting tapes that are going to leave a company's physical location. Secure data backup could save your organization millions of dollars if a tape is lost, and will ensure that any damage to your brand is minimal.

Risks of storage encryption

When you consider encrypting backup tapes, you need to consider the risks of storage encryption. They're very different from the risks associated with encrypting data in flight. If you have a problem encrypting data in flight, you know it right away and can fix it. For example, if an application is sending data across an encrypted channel and something happens to the encryption, the application will typically fail immediately. A root-cause analysis will then determine that encryption failure was the culprit.

However, if you have a problem with encrypted data at rest, you may not know it for weeks, months or years--and probably not until it's too late. This is because, unlike encryption of data in flight, the read step only happens when you verify or restore from a tape. Therefore, unless you're verifying every tape after you've written it, the only time you're going to test your encryption system is when you perform a restore--the worst possible time to find out your encryption system doesn't work.
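
One way to avoid discovering a broken encryption setup during a restore is to read back and check every backup right after it is written. The following is a minimal Python sketch of that idea, not any vendor's implementation. The function names `write_backup` and `verify_backup` are hypothetical, the tape is simulated with an in-memory dictionary, and the XOR routine is only a stand-in for a real cipher so the example runs with the standard library alone; it provides no actual security.

```python
import hashlib
from itertools import cycle


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Stand-in for a real cipher (XOR with a repeating key). NOT secure."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def write_backup(tape: dict, name: str, plaintext: bytes, key: bytes) -> str:
    """Encrypt plaintext onto the simulated tape and return its SHA-256 digest."""
    tape[name] = toy_cipher(plaintext, key)
    return hashlib.sha256(plaintext).hexdigest()


def verify_backup(tape: dict, name: str, key: bytes, expected_digest: str) -> bool:
    """Read the backup back, decrypt it, and compare digests."""
    restored = toy_cipher(tape[name], key)  # XOR is its own inverse
    return hashlib.sha256(restored).hexdigest() == expected_digest


tape = {}  # hypothetical in-memory "tape"
good_key = b"correct-key"
digest = write_backup(tape, "db-full", b"customer records", good_key)

# Verification succeeds with the right key and fails with any other key.
verified_ok = verify_backup(tape, "db-full", good_key, digest)
verified_with_wrong_key = verify_backup(tape, "db-full", b"some-other-key", digest)
print(verified_ok, verified_with_wrong_key)
```

The point of the design is that the read-and-decrypt step runs at backup time, so a lost or mismatched key shows up as a failed verification rather than as a failed restore.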

The biggest risk with encrypting backup tapes is that you can make them so secure that even you can't read them. If you lose your keys, or if your processes break down, you end up with unreadable backup tapes. Unfortunately, you might not discover this has happened until the moment of truth: when you absolutely need to read that tape.

This is why encryption key management is so important. Keys will be needed any time you attempt to read encrypted data, such as when performing a restore, and access to keys by the wrong person could render your encryption system useless.

Consequently, keeping track of the keys used to encrypt data is paramount. A database is critical for tracking what key was used on what day to encrypt what data. Any time you change the key, the database must be updated accordingly. You must also control and monitor access to the key database to prevent unauthorized access.
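
The tracking requirement above can be sketched as a small catalog that maps the date a tape was written to the key that was active on that date. This is an illustrative Python sketch under stated assumptions: the class name `KeyCatalog`, its methods, the key IDs and the dates are all hypothetical. It stores key identifiers only, not key material, and it leaves out the access control and monitoring that a real key database would need.

```python
import bisect
from dataclasses import dataclass, field
from datetime import date


@dataclass
class KeyCatalog:
    """Records which key ID became active on which date (hypothetical design)."""
    _dates: list = field(default_factory=list)    # activation dates, kept sorted
    _key_ids: list = field(default_factory=list)  # key ID active from each date

    def rotate(self, active_from: date, key_id: str) -> None:
        """Register a new key; old entries are kept so old tapes stay readable."""
        pos = bisect.bisect_left(self._dates, active_from)
        if pos < len(self._dates) and self._dates[pos] == active_from:
            raise ValueError(f"a key is already registered for {active_from}")
        self._dates.insert(pos, active_from)
        self._key_ids.insert(pos, key_id)

    def key_for(self, written_on: date) -> str:
        """Return the ID of the key that was active when a tape was written."""
        pos = bisect.bisect_right(self._dates, written_on) - 1
        if pos < 0:
            raise LookupError(f"no key was active on {written_on}")
        return self._key_ids[pos]


catalog = KeyCatalog()
catalog.rotate(date(2009, 1, 1), "key-A")  # illustrative dates and IDs
catalog.rotate(date(2009, 7, 1), "key-B")

old_tape_key = catalog.key_for(date(2009, 3, 15))
new_tape_key = catalog.key_for(date(2009, 8, 2))
print(old_tape_key, new_tape_key)
```

Because a rotation adds an entry instead of overwriting the previous one, a tape written before the key change can still be matched to the key that encrypted it.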

Although secure key management systems already exist for encryption of data in flight, they don't support multiple keys associated with a single system or tape drive over different periods of time. As a result, key management systems designed for encrypting data at rest are often much less mature.

Another risk with encrypting backup tapes is failure to strike a balance between security and usability. This is a constant battle in security circles. If you make a system too secure, it becomes unusable or difficult to manage. If it's easy to manage, it's usually not very secure. The goal here is to find a balance--making the system more secure without significantly increasing management costs, or changing the user experience.

There are three basic ways to encrypt backup data:


  • Source encryption.

  • Backup software encryption.

  • In-line hardware encryption.


All three will encrypt data before it's potentially lost. However, each option will have a different impact on usability and cost, which must be taken into account. For example, many would consider a method that slows all backups by 40 percent, or cuts the effective capacity of a tape library in half, to have too great an impact on usability to be a viable option.

Source Encryption

Source encryption systems encrypt the data at its source. A file system, such as Windows Encrypting File System, or a database encrypts the data stored therein. If data is stored encrypted, and it's not decrypted when backed up, it would meet the encryption requirement of various breach notification laws; you wouldn't need to notify customers if you lost a tape with personal data on it. These systems have the added benefit of encrypting data while it's being transferred across the network. For those concerned about malicious insiders, this can be a real plus.

Source encryption systems suffer from a number of drawbacks. First, they typically impact the performance of the file system or database in question. Every file or database record must be encrypted when written, and then decrypted when read, which can significantly impact performance.

Another challenge is key management, because each file system or database type would typically have its own encryption system and set of encryption keys to manage. Since key management is a top priority when storing data at rest, this is a big concern. If you've got several data types to encrypt, this may become a show stopper. Managing one key system is challenging enough; managing multiple key systems would be even harder.

Data classification systems

If an organization wants to encrypt only the data that would force a public disclosure in the event of a breach, then it needs to focus on personal information. The challenge is locating everywhere such information is stored.

Some of the obvious locations are customer information databases, as well as imaging systems where contracts are stored. Other locations may be less obvious and very difficult to find.

The best way to ensure you've found all that personal data is to use a data classification system, or appliance. Without one, data classification is nearly impossible. These appliances crawl your file systems, databases, Web pages and even backup tapes, looking for metadata to classify information based on sensitivity--and always seem to find data in places you didn't think it resided.

The cost of these data classification systems ranges from a few thousand dollars to tens of thousands of dollars, and they support several different ways to access your files, such as HTTP, NFS and CIFS. They typically are very easy to use and install.

A final drawback of source encryption is that it defeats the compression feature of any backup system, since properly encrypted data is effectively incompressible. This would result in a 25 percent to 50 percent loss in capacity and performance in Unix, Windows and MacOS backup environments. (Hardware compression increases performance as it reduces the number of bytes actually written to tape.)

As a result, source encryption is mainly applicable to encrypting very small amounts of data, such as a single file system or database storing personal information. It's also appropriate if you're concerned about encrypting data before it's transmitted across an unsecure network.
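
The compression penalty described above can be illustrated with a short Python sketch. It makes two assumptions that are not from the article: random bytes from `os.urandom` stand in for ciphertext (output from a sound cipher should look random to a compressor), and the quoted 25 to 50 percent capacity loss is read as what happens when a compression ratio of roughly 1.33:1 to 2:1 is no longer achieved. The helper names and sample record are hypothetical.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


def capacity_loss(ratio: float) -> float:
    """Fraction of effective capacity lost when ratio:1 compression stops working."""
    return 1 - 1 / ratio


# Repetitive records stand in for typical unencrypted backup data.
plain = b"name=Jane Doe;account=0000000000;balance=100.00\n" * 2000
# Random bytes stand in for ciphertext of the same length (an assumption).
ciphertext_like = os.urandom(len(plain))

plain_ratio = compression_ratio(plain)
cipher_ratio = compression_ratio(ciphertext_like)
print(f"plain: {plain_ratio:.1f}:1, ciphertext-like: {cipher_ratio:.2f}:1")
print(f"loss at 2:1 = {capacity_loss(2.0):.0%}, at 1.33:1 = {capacity_loss(4 / 3):.0%}")
```

The repetitive data shrinks dramatically while the random data does not shrink at all, which is why encryption has to happen after compression if the tape drive's capacity is to be preserved.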

Backup Software Encryption

With backup software encryption, the backup application encrypts the data as it's stored on tape. Most backup software products have encryption options, and a number of vendors have beefed these up in recent months.

While this solves the multiple key problem with source encryption by employing a single key management system, the key management systems employed by many backup software applications are antiquated. A few vendors have updated their key management techniques, and some have partnered with other companies to do so. Others, however, are stuck in the '80s and use systems that are easily defeated.

For example, they use a single key that has no concept of access control; if you have that key, you can read the tape. If a rogue employee gains access to the tape and the key, he or she will be able to read the tape. If you change the key because of that employee, he or she will still be able to read the stolen tape that was encrypted with the old key, but you won't be able to read backup tapes that were written before the date you changed the key--you would have to temporarily put the old key back in place to read old tapes.

Backup software encryption will also impact backup performance since encryption done in software is very slow. Although faster CPUs and more efficient code will help, software encryption will probably always lose the speed battle. Like source encryption, backup software encryption will also remove compression from most backup systems, unless the customer uses client-side software compression that slows the backup even more.

As a result, backup software encryption, like source encryption, is mainly applicable to encrypting small amounts of data. For instance, if you have a single database that stores personal information, you could encrypt the backups of just that database. However, it can be quite difficult to identify all databases and file systems that store personal information. If you can't be sure you've identified all such databases, you'd have to encrypt all backups to make sure you don't have to notify any customers if you lose a tape. If that were the case, this option would probably not be viable due to its impact on performance and capacity. Backup software encryption is appropriate, however, for backing up systems across unsecure networks.

Hardware Encryption

The relatively new option that has increased interest in encryption is a hardware appliance that sits in the physical data path and encrypts data on its way to tape. Because the encryption is done in hardware, it can be done much faster and does not slow down the backup. In addition, encryption appliances designed for tape compress the data before it is encrypted.

These hardware encryption systems typically have very sophisticated key management systems that cannot be defeated by a single malicious employee. For example, they often separate the keys used to encrypt the data from the keys used to authenticate and authorize systems and personnel. They also offer features that ensure keys never get lost, such as replication and key vaulting. These systems are advanced because they were all developed within the last five years, and take advantage of decades of lessons in data security.

The first encryption appliances available were single-purpose appliances with a few ports in and out. As of late last year, these systems owned the lion's share of the backup encryption market. At the same time, encryption functionality is now being included inside tape libraries, intelligent switches and tape drives. This leads to the question: where should hardware encryption reside? As long as the hardware system compresses, encrypts and has a strong key management system, it doesn't really matter.

Hardware encryption is the most viable option for anyone wishing to encrypt a large amount of data on its way to tape. Customers can compress their backup data without any performance or capacity loss; they just need to buy enough appliances to handle their backup bandwidth requirements. The only drawback is cost; these appliances typically start at $20,000. However, their costs pale in comparison to the costs of a public breach disclosure.
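
Sizing "enough appliances" is simple arithmetic: divide the throughput needed to finish the backup inside its window by what one appliance can sustain, and round up. The Python sketch below shows the calculation. The workload figures and the per-appliance throughput are hypothetical placeholders, not vendor specifications; only the $20,000 starting price comes from the text above.

```python
import math


def appliances_needed(backup_tb: float, window_hours: float, appliance_mb_per_s: float) -> int:
    """Smallest number of appliances whose combined throughput fits the backup window."""
    # Decimal units: 1 TB = 1,000,000 MB.
    required_mb_per_s = backup_tb * 1_000_000 / (window_hours * 3600)
    return math.ceil(required_mb_per_s / appliance_mb_per_s)


# Hypothetical workload: 20 TB nightly, 8-hour window, 200 MB/s per appliance.
count = appliances_needed(backup_tb=20, window_hours=8, appliance_mb_per_s=200)
minimum_cost = count * 20_000  # starting price per appliance quoted in the article
print(count, minimum_cost)
```

With these placeholder numbers the backup needs about 694 MB/s, so four appliances and a minimum outlay of $80,000; substituting real measurements changes the result but not the method.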

Which One?

What matters most to your business operations and which problems you're trying to solve will determine the best approach for you. If you want sensitive data to always be encrypted, then you'll want to choose source encryption. If you just want to make sure data is encrypted as it's leaving a system on its way to backup, you'll want to select backup software encryption. Just keep in mind that both of these methods have serious performance and capacity drawbacks. If you don't want to figure out what should be encrypted, and simply want to encrypt everything on the way to tape--plus avoid any performance or capacity loss in the backup system--then you'll want hardware encryption.

Whatever encryption method you pick, it should provide some peace of mind if a backup tape goes missing. Ultimately, it's a decision an enterprise can't afford to avoid in this age of data privacy regulation.

About the author:
W. Curtis Preston has focused on data backup and recovery for more than 15 years. He is an executive editor and independent backup expert at TechTarget, webmaster of, and author of The Storage Security Handbook and Using SANs and NAS.
