Are Your Backups Quantum Safe?
Introduction
In November 2024, Erik Crim at WWT published The Time for Post Quantum Encryption is Now, Not Later. In that blog, he addressed the threat of Harvest Now, Decrypt Later (HNDL), in which encrypted data can be copied and stored today and remains at risk of exposure once quantum computing is able to decrypt it.
This blog is intended to address an additional level of the HNDL threat: targeted harvesting of backup data to be decrypted later.
All backup architectures have four primary components: the clients being protected, the control plane that manages the backups, the data movers that interface with the clients, and the target storage that holds the backups for their retention period. If disaster recovery or cyber recovery is implemented, there may also be fifth and sixth components.
Backup storage targets
Nearly all backup target storage falls into one of the following categories:
- Tape
- Block Storage on a Storage Area Network (SAN)
- Block Storage on Direct Attached Storage (DAS)
- Network Attached Storage (NAS), such as NFS or CIFS/SMB
- Object Storage, such as S3 or Azure Blob
Backup data
There are countless types of data being protected with backup and recovery software, including databases, unstructured data (documents, spreadsheets, etc.), email, and virtual machines. Given that backup storage could have one or more copies of every bit of data in an enterprise, exploiting it would be a potential windfall for any malicious user or group.
Just accessing the backup catalog can provide enormous amounts of metadata that can be used to augment a threat actor's understanding of an enterprise's network and topology.
For this reason, backup data must be protected at the same or an even higher level of security than the original production data.
Q-Day
Q-Day, sometimes referred to as Y2Q, is the hypothetical day when quantum computing will be available and capable of breaking current encryption methods. Q-Day presents a potential global security threat due to the likelihood that quantum computers will be used to break widely deployed encryption algorithms.
Many current cryptography methods rely on mathematical problems (like integer factorization, elliptic curves, and discrete logarithms) that are considered intractable for today's computers. While current cryptography isn't considered unbreakable, the large quantity of resources required to break it makes an attack financially infeasible, even for most nation-states.
However, these encryption methods are believed to be breakable by sufficiently powerful quantum computers.
While Q-Day is some day in the future, "Harvest now, decrypt later" (HNDL) is a strategy attackers use to collect encrypted data with the intention of decrypting it in the future, likely waiting for advances in quantum computing. This means even if encryption is strong today, it may not be strong enough to defend against the upcoming quantum threat.
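The timing argument behind HNDL can be made concrete with a small calculation. The sketch below is illustrative only: the Q-Day year is an assumed input rather than a forecast, and the class and function names are hypothetical. It flags a backup as exposed when its contents must still be confidential after the assumed Q-Day.

```python
from dataclasses import dataclass


@dataclass
class BackupSet:
    name: str
    created_year: int
    confidentiality_years: int  # how long the contents must stay secret


def exposed_to_hndl(backup: BackupSet, assumed_q_day_year: int) -> bool:
    """Return True if the data still needs to be secret at the assumed Q-Day."""
    secret_until = backup.created_year + backup.confidentiality_years
    return secret_until > assumed_q_day_year


# Hypothetical inputs: 2035 is a placeholder, not a prediction.
hr_records = BackupSet("hr-records", created_year=2025, confidentiality_years=25)
build_logs = BackupSet("build-logs", created_year=2025, confidentiality_years=2)

print(exposed_to_hndl(hr_records, assumed_q_day_year=2035))  # True
print(exposed_to_hndl(build_logs, assumed_q_day_year=2035))  # False
```

Data with a short confidentiality lifetime ages out before it can be decrypted, while long-lived records harvested today remain valuable well past the assumed date.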
Quantum hardware alone isn't enough to break encrypted files; it also needs algorithms designed to take advantage of it. There has been extensive research on quantum algorithms that can be applied to break encryption. The two most cited are Shor's Algorithm and Grover's Algorithm.
Shor's Algorithm
Shor's algorithm is a quantum algorithm for finding the prime factors of an integer; a variant of it also solves the discrete logarithm problem. It is one of the few known quantum algorithms with compelling potential applications. Its importance is that it could recover the private keys of public-key algorithms such as RSA, which relies on the difficulty of factoring the product of two large primes, and ECC, which relies on the difficulty of the elliptic-curve discrete logarithm problem (see Classical Cryptography, below).
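To show why factoring reduces to a problem a quantum computer is good at, here is a toy, purely classical sketch of the number theory that Shor's algorithm builds on. It is not a quantum implementation: the order-finding step is done by brute force, which is exactly the part a quantum computer would accelerate. The function names are hypothetical, and the example covers only factoring, not the discrete logarithm case.

```python
from math import gcd
from typing import Optional, Tuple


def find_order(a: int, n: int) -> int:
    """Smallest r > 0 with a**r % n == 1, found by brute force.

    This is the step a quantum computer would accelerate; the loop here
    takes time exponential in the bit length of n.
    """
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r


def factor_from_order(a: int, n: int) -> Optional[Tuple[int, int]]:
    """Classical post-processing: turn an even order into two factors of n."""
    r = find_order(a, n)
    if r % 2 == 1:
        return None  # odd order: retry with a different a
    half = pow(a, r // 2, n)
    if half == n - 1:
        return None  # trivial square root: retry with a different a
    return gcd(half - 1, n), gcd(half + 1, n)


print(factor_from_order(7, 15))  # (3, 5)
```

For a 2048-bit RSA modulus the brute-force loop is hopeless on classical hardware, which is why the security of RSA depends on no efficient order-finding method being available.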
Grover's Algorithm
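Unlike Shor's Algorithm, Grover's Algorithm uses quantum computing to speed up brute-force key searches. The speedup is quadratic: searching N possible keys takes on the order of the square root of N steps, which in the idealized case halves the effective key length in bits. This provides the potential capability to break an encrypted file in a fraction of the time conventional computers would require.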
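As a rough illustration of what a quadratic speedup means for key sizes, the sketch below applies the idealized bound: a search over 2^k keys takes about 2^(k/2) Grover iterations. This is a best case for the attacker and ignores practical limits such as the difficulty of running Grover's algorithm in parallel; the function name is hypothetical.

```python
def effective_bits_under_grover(key_bits: int) -> int:
    """Idealized Grover bound: searching 2**k keys takes about 2**(k/2) steps."""
    return key_bits // 2


for name, bits in [("AES-128", 128), ("AES-192", 192), ("AES-256", 256)]:
    print(f"{name}: ~{effective_bits_under_grover(bits)}-bit effort under Grover")
```

Under this bound, a 256-bit key still leaves an attacker with roughly 128 bits of work, which is why longer symmetric keys are the usual response to Grover's algorithm.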
Classical cryptography
Most of the top backup and recovery software uses one or more of the following encryption methods. The details of each method aren't necessarily relevant here beyond the belief that it can be broken or weakened with quantum computing.
- Rivest–Shamir–Adleman (RSA)
- Elliptic Curve Cryptography (ECC)
- Triple Data Encryption Standard (3-DES)
- Blowfish
- Twofish
- GOST
- Serpent
AES-256
While the list above may be concerning, there is good news for most backup deployments; every major backup vendor today supports AES-256.
Of all the encryption algorithms currently in wide use, the Advanced Encryption Standard using 256-bit keys (AES-256) is considered quantum-safe, making it suitable for a Post-Quantum Cryptography (PQC) strategy.
According to the National Institute of Standards and Technology (NIST), "… it is quite likely that Grover's algorithm will provide little or no advantage in attacking AES, and AES 128 will remain secure for decades to come. Furthermore, even if quantum computers turn out to be much less expensive than anticipated, the known difficulty of parallelizing Grover's algorithm suggests that both AES 192 and AES 256 will still be safe for a very long time."
Post Quantum Cryptography
While AES-256 is considered post-quantum safe, NIST also approved several dedicated post-quantum cryptography algorithms as standards in August 2024. Those currently include:
- ML-KEM (CRYSTALS-Kyber): A lattice-based key-encapsulation mechanism for general encryption and key establishment
- ML-DSA (CRYSTALS-Dilithium): A lattice-based algorithm for digital signatures
- SLH-DSA (SPHINCS+): A stateless hash-based signature scheme
For a full description of the algorithms supported and timelines, please refer to the NSA's Commercial National Security Algorithm Suite 2.0 and Quantum Computing FAQ.
Key management
Key management for backup and recovery servers is critical because these systems often store the most sensitive form of data: entire snapshots of production environments. If encryption keys are mishandled, an attacker could either decrypt stolen backups or cause permanent data loss by destroying keys.
There are a variety of deployment models for key management, each with its own merits:
- Local Key Store (simple, but higher risk): Keys stored in the backup software's internal database, protected by admin credentials.
- Integrated Key Management Service (KMS): Backup vendors integrate with enterprise KMS platforms.
- Hardware Security Module (HSM) backed KMS: High-assurance key storage inside HSMs validated to FIPS 140-3.
Entropy
An encryption key is only as strong as the randomness used to generate it. Entropy is a measure of randomness in a system, and when applied to random numbers, especially for cryptography, it describes how unpredictable those numbers are.
High-entropy random number generators create keys that are considered more difficult to break than lower-entropy generators. It's an important consideration when you determine what key management system you plan to implement.
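A minimal sketch of both ideas follows: generating a 256-bit key from the operating system's cryptographically secure random source, and measuring the empirical Shannon entropy of a byte string. The helper name is hypothetical. Note that this measurement only describes a sample; a high score on a short sample does not prove that a generator is cryptographically sound.

```python
import math
import secrets
from collections import Counter


def shannon_entropy_bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy of a byte string, from 0.0 to 8.0."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# 256-bit key drawn from the operating system's CSPRNG.
key = secrets.token_bytes(32)

print(len(key) * 8)  # 256
print(shannon_entropy_bits_per_byte(b"\x00" * 32))  # 0.0
print(shannon_entropy_bits_per_byte(bytes(range(256))))  # 8.0
```

A key made of a single repeated byte carries no entropy at all, while a sample in which every byte value is equally likely reaches the maximum of 8 bits per byte.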
Backup targets
Let's examine some of the backup storage targets and highlight some of the HNDL risks associated with them.
Tape
Tape can be considered the least likely target for HNDL. Harvesting tape data involves physically accessing tapes or taking over a tape drive in a tape library. The tapes would need to be read and then stored for later decryption. Although possible, the effort involved is generally unattractive for capturing the data.
Even though tape is an unattractive medium for HNDL, it's imperative to encrypt the data going to tape. Since tape is generally used to store data for longer retention periods than disk, this expands the available window of time for harvesting the data.
Lastly, existing backup tapes that may have been written without encryption or using weak encryption should be cataloged for their contents and have a proper risk determination performed.
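The cataloging step can be sketched as a simple filter over a tape inventory. Everything here is hypothetical: the record fields, the barcodes, and the policy that only AES-256 is accepted. A real inventory would come from the backup catalog, and the approved list should reflect your own risk determination.

```python
from dataclasses import dataclass
from typing import List, Optional

# Policy assumption for this sketch: only AES-256 is accepted as quantum-safe.
APPROVED_ALGORITHMS = {"AES-256"}


@dataclass
class TapeRecord:
    barcode: str
    algorithm: Optional[str]  # None means the tape was written unencrypted
    retention_end_year: int


def needs_risk_review(tape: TapeRecord) -> bool:
    """Flag tapes that are unencrypted or use an algorithm outside the policy."""
    return tape.algorithm not in APPROVED_ALGORITHMS


def review_queue(tapes: List[TapeRecord]) -> List[TapeRecord]:
    """Tapes needing review, longest remaining retention first."""
    flagged = [t for t in tapes if needs_risk_review(t)]
    return sorted(flagged, key=lambda t: t.retention_end_year, reverse=True)


# Hypothetical inventory for illustration only.
inventory = [
    TapeRecord("A00001L4", "AES-256", 2032),
    TapeRecord("A00002L4", None, 2040),
    TapeRecord("A00003L4", "3DES", 2030),
]

for tape in review_queue(inventory):
    print(tape.barcode, tape.algorithm, tape.retention_end_year)
```

Sorting by retention end date puts the tapes with the longest harvesting window at the top of the review list.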
Hardware encryption
Many tape drives support encryption at the drive level. The most popular tape technology today is Linear Tape-Open (LTO). LTO has supported hardware-based encryption since LTO-4 was introduced in 2007. LTO drives use AES with 256-bit keys in Galois/Counter Mode (AES-256-GCM) to encrypt tapes, making the data resistant to quantum decryption.
The key benefit to using hardware encryption is that the encryption is performed post-compression of the data. This improves density and tape utilization and can even improve the performance of writes to tape in most instances.
Software encryption
If hardware encryption isn't available, the backup and recovery software can encrypt the data before writing it to tape. This is generally considered secondary to hardware encryption since it consumes CPU cycles on a host server and, because encrypted data is effectively incompressible, nullifies the data compression capabilities of modern tape drives.
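The effect of encrypting before compressing is easy to demonstrate. In the sketch below, zlib stands in for the drive's hardware compression and random bytes stand in for ciphertext, since well-encrypted data is designed to be indistinguishable from random data. Both stand-ins are assumptions made for illustration.

```python
import os
import zlib

size = 100_000

# Repetitive plaintext, typical of logs or sparse database pages.
plaintext = b"backup record 0001;" * (size // 19)

# Random bytes stand in for ciphertext; good ciphertext looks random.
ciphertext_like = os.urandom(len(plaintext))

plain_ratio = len(zlib.compress(plaintext)) / len(plaintext)
cipher_ratio = len(zlib.compress(ciphertext_like)) / len(ciphertext_like)

print(f"plaintext compresses to {plain_ratio:.1%} of original")
print(f"ciphertext-like data compresses to {cipher_ratio:.1%} of original")
```

The repetitive plaintext shrinks dramatically, while the random data does not shrink at all, which is why compression has to happen before encryption to be useful.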
Double data encryption
There are rare instances where both hardware and software encryption are employed. Double data encryption applies two independent encryption layers (software then hardware) to sensitive data on tapes, providing an extra layer of security beyond single encryption. As with software encryption, there is no compression benefit from the tape drive.
Storage area networks (SANs)
Storage Area Networks are dedicated, high-speed networks that connect servers to storage devices, such as disk arrays and tape libraries, to create a pool of shared block-level storage LUNs. SAN disk arrays can be very large, scaling to petabytes of storage and hundreds to thousands of LUNs. This makes SAN disk storage an ideal target for backup data, where large quantities of reasonably priced storage are required.
SAN disk arrays used for backup consolidate data from multiple sources, making them a concentrated, high-value target for attackers. If an attacker can breach the backup target storage, they could potentially access all the data in the enterprise from one point.
SAN disk array security
SAN disk array security relies on a multi-layered approach, including network access controls like zoning and LUN masking, strong authentication, network isolation to segment traffic, continuous monitoring for threats, regular firmware/software updates, and strict physical security of the storage hardware.
These measures collectively protect data by controlling who can access it and ensuring the data is kept confidential.
If all SAN security controls are bypassed, encrypting the data with PQC encryption will not stop the intruder from copying it and holding it for later decryption attempts. For example, a snapshot of the desired LUNs can be created and mounted to another host, allowing the backup data to be harvested.
Self-encrypting drives
Many arrays today use Self-Encrypting Drive (SED) technology with built-in hardware encryption capabilities that automatically encrypt and decrypt data as it is written to or read from the drive. SEDs use an internally generated and secure encryption key. This encryption provides strong security for data at rest for individual drives that may be compromised.
SEDs generally don't provide any protection if the SAN array is compromised, since the data is read and written at the LUN (Logical Unit Number) level above the SEDs. However, they do provide the protection needed when failed drives are routinely swapped out.
Host-based encryption
Host-based encryption (HBE) is implemented at the operating system or hypervisor level of a host server or virtual machine. It uses software-based encryption solutions like full disk encryption (FDE) or file-level encryption, often integrated with the host's operating system, such as BitLocker for Windows. HBE provides two advantages: it protects data at rest on a disk array even if the array itself is compromised, and it offers a common encryption method that is portable across differing disk arrays. Depending on the software, it can be used to encrypt devices such as a LUN, a disk volume or a file system.
The chief disadvantages of HBE are the additional CPU overhead and the impact on array-based compression or deduplication.
Application-based encryption
Application-based encryption (ABE) is performed entirely by the application itself. Common implementations of ABE are Transparent Data Encryption (TDE) performed by Microsoft SQL Server or Oracle Database Server.
Most backup and recovery software applications can perform ABE when writing backups to target storage. The benefit of using the backup software for encryption is that a compromise of the underlying storage target exposes only data that is already protected by quantum-safe encryption. As with HBE, ABE consumes CPU and impacts target storage deduplication and compression.
The impact on the target dedupe and compression is often alleviated by the backup software performing its own compression and dedupe.
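The ordering matters: deduplication and compression have to run on the plaintext, with encryption applied last. The sketch below shows the first two stages using fixed-size chunks and SHA-256 digests. It is a simplification and not any vendor's implementation; production software typically uses more sophisticated chunking, and the function names here are hypothetical.

```python
import hashlib
import zlib
from typing import Dict, List, Tuple


def dedupe_and_compress(
    data: bytes, chunk_size: int = 4096
) -> Tuple[List[str], Dict[str, bytes]]:
    """Split data into fixed-size chunks and store each unique chunk once, compressed.

    Returns the ordered list of chunk digests (the "recipe") and the chunk store.
    Encryption of the stored chunks would happen after this step.
    """
    recipe: List[str] = []
    store: Dict[str, bytes] = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        recipe.append(digest)
        if digest not in store:
            store[digest] = zlib.compress(chunk)
    return recipe, store


def restore(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data from the recipe and the chunk store."""
    return b"".join(zlib.decompress(store[d]) for d in recipe)


# Hypothetical backup stream: one 4 KiB block repeated nine times, plus one unique block.
block_a = b"A" * 4096
block_b = b"B" * 4096
stream = block_a * 9 + block_b

recipe, store = dedupe_and_compress(stream)
print(len(recipe), len(store))  # 10 2
```

Ten chunks are referenced but only two are stored, and only those two would then need to be encrypted and written to the target.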
Direct-attached storage (DAS)
DAS is used as a backup target in two ways. In the first, a data mover contains DAS that acts as its own backup target. This is not common practice in large enterprises.
The second method for using DAS is to deploy hyperscale servers (also known as nodes) that are clustered to form a single backup system. Each node manages its own DAS, while the entire cluster works together to form a larger storage target by combining the nodes.
In both cases, encryption at rest on the drives should be performed in the event of disk replacement or theft. However, much like SAN disk storage, gaining access using administrator or elevated credentials will allow visibility to the storage. Each backup vendor has best practices to harden the servers to prevent that access.
Network-attached storage (NAS)
NAS storage is a very common backup target in enterprise backups. For purposes of this article, we'll limit the scope of NAS to NFS and CIFS and discuss object storage later.
Just as with SAN and DAS, data at rest should be encrypted to prevent compromise if drives need to be replaced or are stolen. Also, like SAN storage, encryption at rest doesn't prevent an attack if credentials are compromised. That leaves host-based or application-based encryption as the best option.
Object storage
Object storage is today's fastest-growing backup target. Whether it's Simple Storage Service (S3), Azure Blob or any other service, object storage provides significant advantages as a backup target, including nearly unlimited scalability, reliability, low cost and immutability.
Like other storage targets, there are various ways to encrypt object storage, each with benefits.
Using the default encryption that comes with the object storage guarantees encryption at rest. However, the data is still readable by anyone who obtains administrator access to the storage.
For increased security, the object storage can utilize customer-provided keys that are applied when objects are written or read. By separating the keys from the object storage, an additional level of security can be created.
Lastly, client-side encryption, such as application-based encryption described previously, provides the strongest level of encryption protection. Even if an intruder has complete access to the object storage, the objects will be safe from being decrypted.
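As one concrete instance of the customer-provided key option, Amazon S3 supports server-side encryption with customer-provided keys (SSE-C), where each request carries the key in three headers and the service does not store it. The sketch below only builds those headers; the function name is hypothetical, the key is generated inline purely for illustration, and requests must be sent over TLS. Other object stores use different mechanisms.

```python
import base64
import hashlib
import secrets
from typing import Dict


def sse_c_headers(key: bytes) -> Dict[str, str]:
    """Build the three SSE-C request headers for a 256-bit customer key."""
    if len(key) != 32:
        raise ValueError("SSE-C requires a 256-bit (32-byte) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        # The MD5 digest is an integrity check on the key, not a security control.
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode("ascii"),
    }


# In practice the key would come from a KMS or HSM, not be generated inline.
customer_key = secrets.token_bytes(32)
headers = sse_c_headers(customer_key)
for name in headers:
    print(name)
```

Because the same key must accompany every read, losing it makes the objects unrecoverable, which ties this option directly back to key management.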
Artificial intelligence
Given the complex nature of how backup software stores backups, especially when you factor in daily incremental backups, synthetic full backups and deduplicated storage, HNDL attacks would require an extraordinary effort to make sense of the backups, making the work required more than the resulting data is worth. Simply put, attackers would focus on data that is easier to steal or of higher value.
Artificial intelligence (AI) changes that calculation. AI's ability to scan an entire backup set and determine the patterns within it could make HNDL against backups viable. Once the patterns are determined, AI can reassemble the data into a usable form. Protecting the data with PQC encryption is now the long-term focus for preventing HNDL.
Conclusion
Quantum computing poses a future threat to current encryption methods, making it essential to evaluate the quantum safety of backup data. The concept of "Harvest Now, Decrypt Later" (HNDL) highlights the risk of attackers collecting encrypted data today to decrypt once quantum capabilities mature. Protecting backups with post-quantum cryptography and robust key management is critical to securing sensitive enterprise data.
Even with strong post-quantum cryptography, robust security controls must be in place to prevent access to backup data in plaintext. Post-quantum cryptography is essential for data at rest, but backups also need mature and well-defined security safeguards.