Between deciding how best to secure data and meeting growing data demand, organizations need solutions that help them keep pace in an age of shifting threat landscapes and technology disruption.

In this article, we will discuss where we see our technology partners focusing their efforts regarding alternate hypervisor backups, security, artificial intelligence and next-generation solutions in the data protection market.

At WWT, we primarily, but not exclusively, evaluate and recommend solutions from the top six data protection vendors according to leading market analysts. Those vendors include Cohesity, Commvault, Dell, Rubrik, Veeam and Veritas. Our observations and conclusions are largely focused on those top vendors.

Emergence of 3-2-1-1-0 backup rule

Since our last trends report, we're seeing the traditional 3-2-1 backup strategy being enhanced to 3-2-1-1-0 where backups include:

  • 3 Copies of Data: You need three copies of your data: the original data and two backup copies.
  • 2 Different Storage Media types: You should store backup copies on at least two different types of media, such as disk, tape, cloud, etc.
  • 1 Offsite Copy: At least one copy of your backup should be kept offsite, meaning in a different location from your primary data and primary backup.

We now add a further '1-0' for:

  • 1 Immutable or Air-Gapped Copy: At least one copy, whether on-site, off-site or both, should be immutable, meaning it cannot be modified, deleted or overwritten for a defined period. 'Air-gapped' or offline copies should be inaccessible to attackers even in the event of a fully compromised production environment.
  • 0 Errors After Backup Verification: Finally, verify backups and resolve every error so that restores will be successful.

The '1-0' is a response to ransomware and other cyberattacks. If attackers compromise your primary environment, they won't be able to tamper with or encrypt the immutable copy, ensuring you always have a version of your data from which to recover.
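As a rough illustration of the rule, the sketch below checks an inventory of data copies against each part of 3-2-1-1-0. The `DataCopy` fields and the `check_3_2_1_1_0` function are hypothetical names chosen for this example and do not correspond to any vendor's product or API. The sketch also assumes that a backup with no recorded verification result counts as failing the "0 errors" check.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataCopy:
    """One copy of the data (hypothetical model for illustration only)."""
    name: str
    media: str                      # e.g. "disk", "tape", "cloud"
    is_backup: bool                 # False for the original production data
    offsite: bool = False
    immutable_or_air_gapped: bool = False
    verification_errors: Optional[int] = None  # None = never verified


def check_3_2_1_1_0(copies: List[DataCopy]) -> Dict[str, bool]:
    """Return a pass/fail result for each part of the 3-2-1-1-0 rule."""
    backups = [c for c in copies if c.is_backup]
    return {
        # Three copies in total, counting the original data.
        "3 copies": len(copies) >= 3,
        # Backup copies spread over at least two media types.
        "2 media types": len({c.media for c in backups}) >= 2,
        "1 offsite": any(c.offsite for c in backups),
        "1 immutable or air-gapped": any(
            c.immutable_or_air_gapped for c in backups
        ),
        # Every backup must have been verified and report zero errors.
        "0 errors": bool(backups) and all(
            c.verification_errors == 0 for c in backups
        ),
    }


inventory = [
    DataCopy("production", "disk", is_backup=False),
    DataCopy("local-backup", "disk", is_backup=True, verification_errors=0),
    DataCopy("cloud-vault", "cloud", is_backup=True, offsite=True,
             immutable_or_air_gapped=True, verification_errors=0),
]
report = check_3_2_1_1_0(inventory)
```

In this example every check passes. Leaving `verification_errors` unset on either backup would fail the "0 errors" check, which reflects the point that an unverified backup should not be assumed to be restorable.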

Alternate hypervisor backup and restore

The proliferation of cloud computing, along with Broadcom's acquisition of VMware, has heightened interest in data protection products that can back up and restore virtual machines (VMs) across a range of hypervisors. We see this trend being driven by several factors:

  • Moving from VMware to another hypervisor: Many of VMware's existing customers are considering a move for financial reasons.
  • Migration to and from the cloud: Moving workloads to the cloud or repatriating them on-premises is significantly easier if the backup tool can restore them in the format the target hypervisor requires. This can include adjusting hardware configurations, disk storage, CPUs and memory allocations.
  • Disaster recovery: The ability to stand up a disaster recovery (DR) site in an alternate hypervisor, such as a public cloud provider, provides flexibility and potential cost savings by eliminating the need for a full-time disaster recovery site. Note: An alternate hypervisor restore for DR will have a significant impact on Recovery Time Objective (RTO).

Introduction of artificial intelligence in data protection

The last year brought substantial investments in artificial intelligence (AI) use cases for data protection. The use cases vary widely but focus primarily on ransomware detection and remediation, data governance, and data discovery. The top AI implementations are Cohesity GAIA, Rubrik Ruby, Commvault Arlie, Veeam AI Assistant and Veritas Alta Copilot. Dell's implementation of CyberSense from Index Engines uses machine learning to detect 99.99% of ransomware corruption.[i]

For more information on AI in Data Protection, please read How Machine Learning and Generative AI will Affect Data Protection in 2024 and Beyond.

The most immediate benefit of AI for data protection is the use of chatbots that backup administrators use for assistance in managing backup and recovery environments. Many data protection vendors have trained AI on their documentation library and support case history. Customers can query these AI models using natural language and the AI returns the appropriate solution and cites the source documentation. This allows data protection administrators to quickly and accurately implement solutions that could otherwise consume hours of searching.

Data protection for artificial intelligence assets

In the race to bring AI solutions to market, the need to protect those solutions is frequently overlooked. A final AI model that is ready to deploy can take weeks or months to develop, with potential costs in the millions of dollars. It makes sense to protect those assets like any other valuable computing resource. Some areas to consider include:

  • Training data: Training data sets are usually large, often unstructured, and grow rapidly as models become more sophisticated.
  • Finished AI models: The finished models themselves, consisting of all the information, nodes and weightings that comprise the neural networks that generate the responses, are high-value assets that are expensive to reproduce.
  • Prompts and response records: The user interactions with the model, including what inputs were received and what answers were generated from those inputs. This information is useful for evaluating the accuracy of the model and identifying areas for additional training.
  • Infrastructure: The needed compute, application and network configurations along with the cluster configuration and state. Although the infrastructure can be rebuilt, having a robust data protection strategy can save hours or even days over rebuilding a very large cluster with dozens to hundreds of nodes.

Data security 

Throughout the last several decades, one of the major goals of data protection has been the ability to use backup data for purposes beyond simple recovery. Adding security features to backup products to aid in detection and remediation of threats is a logical step. 

Anomaly detection is by far the most common security feature added to data protection products. Anomaly detection monitors backups for suspicious activity and unusual patterns to help identify potential threats such as ransomware attacks. Over the last year, all six of our primary data protection partners have added or enhanced their detection capabilities. The important decision for our customers is when and where detection should take place, as some products do it in-line with backups and others perform post-backup detection in the cloud.
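To make the idea concrete, here is a minimal sketch of one simple form of anomaly detection: flagging a backup whose changed-data volume deviates sharply from a rolling baseline. This is an illustrative assumption rather than any vendor's method. Commercial products typically combine many more signals, and the function name, 7-day window and threshold of 3 standard deviations are hypothetical choices for this example.

```python
from statistics import mean, stdev
from typing import List, Sequence


def flag_anomalies(daily_changed_gb: Sequence[float],
                   window: int = 7,
                   threshold: float = 3.0) -> List[int]:
    """Return the indexes of backups whose changed-data volume is unusual.

    Each value is compared with the mean and sample standard deviation
    of the previous `window` values (a rolling z-score).
    """
    if window < 2:
        raise ValueError("window must be at least 2 to compute a deviation")

    flagged = []
    for i in range(window, len(daily_changed_gb)):
        history = daily_changed_gb[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        value = daily_changed_gb[i]
        if spread == 0:
            # A perfectly flat history: any change at all is unusual.
            is_anomaly = value != baseline
        else:
            is_anomaly = abs(value - baseline) / spread > threshold
        if is_anomaly:
            flagged.append(i)
    return flagged


# Changed data per nightly backup in GB; the spike on day 7 could
# indicate mass encryption of files by ransomware.
changed_gb = [100, 102, 98, 101, 99, 100, 103, 250, 101]
suspicious_days = flag_anomalies(changed_gb)
```

With this sample data, only index 7 (the 250 GB spike) is flagged. Note that the spike then inflates the baseline for the following days, which is one reason real products rely on more robust statistics than a plain rolling mean.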

With the acquisition of Laminar, Rubrik is introducing Data Security Posture Management (DSPM) into its data protection product in order to help organizations facing increasing cyber threats and stricter compliance requirements. By embedding DSPM as part of the backup process, data classification rules can be enforced at an additional point in the data life cycle. We expect other data protection vendors to follow and integrate similar technologies.

While it's not a product feature, backup data governance is being reevaluated at nearly every organization. Backup data governance helps define data retention, disposal, security and lifecycle management. Nearly every client WWT works with is evaluating how long to retain backups, how to protect them from data exfiltration and how to reduce their windows for legal discovery.

Evolution of isolated recovery environments

An isolated recovery environment (IRE) is a secure, segregated environment designed to restore critical systems and data following a cyberattack or disaster. The primary goal is to enable recovery in a space that is disconnected from potential threats. 

Initial IREs were deployed as cyber recovery "vaults" where data was completely isolated and required the use of intricate firewall rules and separate hardware. IREs today include traditional vaults but now encompass cloud-native recovery environments that are more flexible and scalable. Cloud-native IREs allow organizations to deploy to cloud infrastructures (e.g., AWS, Azure) for better recovery agility. 

Improvements in granular recovery are allowing affected organizations to recover specific files, databases or applications to a point in time of their choosing. This significantly reduces downtime, ensuring quicker resumption of business operations after an attack.
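The selection step behind a point-in-time recovery can be sketched as follows: given a catalog of recovery points, pick the most recent one taken at or before the chosen time, skipping any that have been flagged as suspicious. The `RecoveryPoint` class, its `flagged_suspicious` field and the `select_recovery_point` function are hypothetical names for illustration, not part of any product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class RecoveryPoint:
    """A single restorable backup (hypothetical model for illustration)."""
    taken_at: datetime
    flagged_suspicious: bool = False


def select_recovery_point(points: Iterable[RecoveryPoint],
                          target: datetime) -> Optional[RecoveryPoint]:
    """Return the newest unflagged point taken at or before `target`.

    Returns None when no recovery point qualifies.
    """
    candidates = [
        p for p in points
        if p.taken_at <= target and not p.flagged_suspicious
    ]
    return max(candidates, key=lambda p: p.taken_at, default=None)


catalog = [
    RecoveryPoint(datetime(2024, 1, 1, 2, 0)),
    RecoveryPoint(datetime(2024, 1, 2, 2, 0)),
    # Taken after the attack began, so it was flagged by detection.
    RecoveryPoint(datetime(2024, 1, 3, 2, 0), flagged_suspicious=True),
]
chosen = select_recovery_point(catalog, datetime(2024, 1, 3, 12, 0))
```

Here the flagged 3 January backup is skipped and the 2 January backup is chosen, trading one extra day of data loss for a copy that is believed to be clean.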

Defense in depth for data protection

The National Institute of Standards and Technology (NIST) defines Defense in Depth (DiD) as "The application of multiple countermeasures in a layered or stepwise manner to achieve security objectives. The methodology involves layering heterogeneous security technologies in the common attack vectors to ensure that attacks missed by one technology are caught by another."[ii] 

We are helping our customers implement DiD at a macro level for their data protection by applying multiple layers of security controls to ensure the protection, integrity, and availability of backups. Key areas for data protection DiD are:

  • Implementing role-based access control (RBAC) to limit who can access, modify or restore backups.
  • Enforcing multi-factor authentication (MFA) for anyone accessing backup systems, especially for users who have admin rights.
  • Enabling multi-user authorization for deleting or modifying backups, roles and policies to ensure rogue administrators cannot damage backups.
  • Segmenting your backup network from your primary production network to limit the impact of malware, ransomware or intrusions spreading across the organization.
  • Adhering strictly to the 3-2-1-1-0 rule, particularly for immutable backups and zero errors.

Other areas that can be added include:

  • Backup anomaly detection
  • Event logging and alerts
  • Regular backup testing
  • Data retention policies and compliance

Implementing defense in depth for backup and recovery requires a layered approach that protects backups at every stage to ensure data remains safe from physical disasters, cyberattacks and insider threats.
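Of the controls listed above, multi-user authorization is the one most easily shown in a few lines. The sketch below combines it with a basic role check: a destructive request only becomes authorized once enough administrators other than the requester have approved it. The class name, the `backup_admin` role and the default of one approval are assumptions made for this example and do not reflect any specific product's implementation.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DestructiveRequest:
    """A request to delete or modify backups, roles or policies."""
    action: str
    requested_by: str
    required_approvals: int = 1
    approvals: Set[str] = field(default_factory=set)

    def approve(self, approver: str, approver_roles: Set[str]) -> None:
        """Record an approval from a second, suitably privileged user."""
        if approver == self.requested_by:
            # The core of multi-user authorization: no self-approval.
            raise PermissionError("requester cannot approve their own request")
        if "backup_admin" not in approver_roles:
            # Role-based access control: only backup admins may approve.
            raise PermissionError("approver lacks the backup_admin role")
        self.approvals.add(approver)

    def is_authorized(self) -> bool:
        return len(self.approvals) >= self.required_approvals


request = DestructiveRequest("delete backup set 'finance-2023'", "alice")
try:
    request.approve("alice", {"backup_admin"})
except PermissionError:
    pass  # A single (possibly rogue or compromised) admin is blocked.

request.approve("bob", {"backup_admin"})
```

After Bob's approval, `request.is_authorized()` returns `True`. Storing approvals in a set means repeated approvals from the same person count only once.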

Mergers and acquisitions

We're seeing an acceleration of mergers and acquisitions (M&A) in the data protection space, starting with the upcoming merger of Cohesity and Veritas. M&A is being driven by a variety of factors including expanding security roles within data protection and innovations in public cloud backups. Veeam has acquired Alcion, an AI-powered security platform, and Coveware, a cyber incident response company, as it looks to expand the scope of its offerings beyond just backup and recovery.

Adding to the security-oriented M&A, Rubrik acquired Laminar, a leader in Data Security Posture Management. On the cloud front, we saw Commvault acquire Appranix to help create its new Cloud Rewind product for cyber resilience. Commvault also recently acquired Clumio to bolster its footprint and technical capabilities within AWS.

The most surprising acquisition was Salesforce's purchase of Own Company to provide the capability to natively protect customer data. 

In the last decade, we saw the rise of Rubrik and Cohesity and witnessed a dramatic shift to hyperscale architectures for data protection. In this decade, we believe we will see solutions that are not predicated on hardware-based requirements, and that these solutions will be acquired by market leaders who see opportunities to improve their existing capabilities or develop a new go-to-market with innovative technologies. We anticipate further consolidation in this space, particularly where security and cyber resilience features can be added to help enterprises withstand cyberattacks.
 


[i] https://www.indexengines.com/cybersense

[ii] https://csrc.nist.gov/glossary/term/defense_in_depth