Nutanix Benefit 4: Granular and Efficient Snapshots

View all current content in this series and make sure you don’t miss upcoming installments: Nutanix Top 10 Benefits Series.

In our previous posts we showed how the Nutanix distributed architecture is well-suited for business-critical apps and databases. The architecture is the foundation that all other features are built on. In this next entry, we’ll focus on how granular and efficient Nutanix snapshots speed clone creation times and make restores a breeze. First, let’s define what a snapshot is and what a snapshot is not. A snapshot is a reference to the state of a system at a given time. While a snapshot can be used to take a backup of a system, it is not itself a backup.

Granular and efficient snapshots are the foundation of Nutanix Data Protection. Nutanix provides VM-centric snapshots at the scope of a single vDisk instead of the larger LUN or container level. To understand the advantages of Nutanix snapshots, you must first understand the different types of snapshots available today.

The two most commonly used types of snapshots in enterprise IT are copy-on-write (CoW) and redirect-on-write (RoW); however, these two snapshot implementations are not created equal. Each implementation has a set of pros and cons. Nutanix chose RoW snapshots for several important reasons.

First, RoW reduces the number of reads and writes. It redirects updates to protected blocks to a new location and then updates a pointer in metadata to reference that location, so each update costs one write operation. When restoring from a RoW snapshot, the system does a lookup to see where the data is located and reads it directly.

CoW copies any protected blocks to be updated to a separate snapshot space, incurring one read operation and two write operations. When a restore operation is performed, the system needs to examine each snapshot in the chain until it finds the data to restore. This adds overhead and lengthens snapshot restores.
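The difference in I/O cost can be illustrated with a small counting model. This is a minimal sketch, not how any real storage system is implemented: the class names (`CowVolume`, `RowVolume`, `IoCounter`) and the in-memory dictionaries are assumptions made purely for illustration, and each dictionary or list access stands in for one disk operation.

```python
from dataclasses import dataclass


@dataclass
class IoCounter:
    reads: int = 0
    writes: int = 0


class CowVolume:
    """Toy copy-on-write volume: live blocks are updated in place."""

    def __init__(self, blocks):
        self.live = dict(blocks)   # block id -> current data
        self.snapshot_area = {}    # block id -> preserved pre-snapshot data
        self.protected = set()     # blocks not yet copied aside
        self.io = IoCounter()

    def snapshot(self):
        self.protected = set(self.live)
        self.snapshot_area = {}

    def write(self, block_id, data):
        if block_id in self.protected:
            old = self.live[block_id]           # read the protected block
            self.io.reads += 1
            self.snapshot_area[block_id] = old  # write it to snapshot space
            self.io.writes += 1
            self.protected.discard(block_id)
        self.live[block_id] = data              # write the new data in place
        self.io.writes += 1

    def read_snapshot(self, block_id):
        if block_id in self.snapshot_area:
            return self.snapshot_area[block_id]
        return self.live[block_id]


class RowVolume:
    """Toy redirect-on-write volume: data is never overwritten in place."""

    def __init__(self, blocks):
        self.extents = []        # append-only stand-in for physical locations
        self.block_map = {}      # block id -> index into self.extents
        self.snapshot_map = {}   # frozen copy of block_map (metadata only)
        self.io = IoCounter()
        for block_id, data in blocks.items():
            self.extents.append(data)
            self.block_map[block_id] = len(self.extents) - 1

    def snapshot(self):
        self.snapshot_map = dict(self.block_map)

    def write(self, block_id, data):
        self.extents.append(data)                         # one write, new location
        self.io.writes += 1
        self.block_map[block_id] = len(self.extents) - 1  # metadata pointer update

    def read_snapshot(self, block_id):
        return self.extents[self.snapshot_map[block_id]]


if __name__ == "__main__":
    initial = {"a": "a0", "b": "b0"}
    for volume in (CowVolume(initial), RowVolume(initial)):
        volume.snapshot()
        volume.write("a", "a1")
        print(type(volume).__name__, volume.io, volume.read_snapshot("a"))
```

In this model, overwriting one protected block costs the CoW volume one read and two writes, while the RoW volume records a single write plus a metadata pointer change. Both still return the original data when the snapshot is read.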

Second, RoW snapshots are more suitable for long-term snapshot storage. When stacking snapshots over time, it takes less overhead to traverse a metadata chain than it does to traverse full copies. A system using RoW snapshots therefore spends less CPU and I/O than a similar system using CoW snapshots.
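To make the restore-time difference concrete, the sketch below counts lookups when reading one block as of the oldest snapshot. It is a simplified model under stated assumptions: the function names and data layout are hypothetical, each CoW snapshot area holds only the blocks overwritten while that snapshot was the newest one, and each RoW snapshot keeps its own complete block map.

```python
def cow_read_at(snapshot_index, block_id, snapshot_areas, live_blocks):
    """Read a block as of a CoW snapshot; return (data, lookups performed)."""
    lookups = 0
    # Walk from the requested snapshot toward the newest one.
    for area in snapshot_areas[snapshot_index:]:
        lookups += 1
        if block_id in area:
            return area[block_id], lookups
    # Never overwritten since the snapshot: the live volume still holds it.
    lookups += 1
    return live_blocks[block_id], lookups


def row_read_at(snapshot_index, block_id, snapshot_maps, extents):
    """Read a block as of a RoW snapshot using that snapshot's own block map."""
    location = snapshot_maps[snapshot_index][block_id]  # one metadata lookup
    return extents[location], 1


if __name__ == "__main__":
    # Three snapshots; block "b" was overwritten after the second one.
    cow_areas = [{}, {"b": "b0"}, {}]
    cow_live = {"a": "a0", "b": "b1"}

    row_extents = ["a0", "b0", "b1"]
    row_maps = [{"a": 0, "b": 1}, {"a": 0, "b": 1}, {"a": 0, "b": 2}]

    print(cow_read_at(0, "a", cow_areas, cow_live))
    print(row_read_at(0, "a", row_maps, row_extents))
```

In this toy model the CoW lookup count grows with the number of snapshots taken after the one being restored, while the RoW lookup count stays constant.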

The last aspect of snapshot implementations to consider is the granularity at which the system protects and restores data. Storage arrays typically work at a LUN or volume level and have no understanding of the data being stored. In a virtualized environment, this broad scope results in a simultaneous snapshot of dozens of virtual machines. This leads to inefficient practices and workarounds, such as creating a LUN for every VM. In addition, expert-level knowledge of LUNs, volumes, fan-in/fan-out ratios, queue depths, and more becomes very important to manage all of this complexity.

How Nutanix Implements Snapshots and Clones

There is a better way. Nutanix implements an application-centric approach with vDisk-based snapshots leveraging RoW. When AOS takes the initial snapshot of a VM or volume group, it creates a read-only, zero-space clone of the metadata and makes the underlying VM data immutable. These snapshots take only a few seconds to create, shrinking application and VM backup windows. As the system continues to take snapshots of changed data, updates and new writes are redirected to new locations.

The original data in the snapshot remains unchanged and the system shares this data across the snapshots and active VM. AOS handles the snapshot process transparently, so there is no change to how applications and the virtualization stack access the VM. 
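The behavior described above can be sketched in a few lines. This is an illustrative model only, not AOS code: `ExtentStore` and `VDisk` are hypothetical names, and a Python list stands in for physical storage. It shows a snapshot as a read-only copy of metadata, with unchanged data shared between snapshots and the active vDisk.

```python
from types import MappingProxyType


class ExtentStore:
    """Append-only store standing in for physical data locations."""

    def __init__(self):
        self._extents = []

    def put(self, data):
        self._extents.append(data)
        return len(self._extents) - 1

    def get(self, location):
        return self._extents[location]

    def __len__(self):
        return len(self._extents)


class VDisk:
    """Toy vDisk: a block map of pointers into a shared extent store."""

    def __init__(self, store):
        self.store = store
        self.block_map = {}   # block id -> extent location
        self.snapshots = []   # read-only block maps, oldest first

    def write(self, block_id, data):
        # Redirect-on-write: new data always lands in a new location.
        self.block_map[block_id] = self.store.put(data)

    def read(self, block_id):
        return self.store.get(self.block_map[block_id])

    def take_snapshot(self):
        # Copies pointers only; no data extents are duplicated.
        frozen = MappingProxyType(dict(self.block_map))
        self.snapshots.append(frozen)
        return frozen


if __name__ == "__main__":
    store = ExtentStore()
    disk = VDisk(store)
    for block_id in ("b0", "b1", "b2"):
        disk.write(block_id, f"{block_id}-v1")

    first = disk.take_snapshot()
    disk.write("b1", "b1-v2")
    second = disk.take_snapshot()

    print(len(store))                     # extents physically stored
    print(store.get(first["b1"]), disk.read("b1"))
    print(first["b0"] == second["b0"])    # unchanged block is shared
```

After three initial writes and one overwrite, the store holds four extents even though two snapshots and the active vDisk each see three blocks. The first snapshot still resolves the overwritten block to its original data.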

There are two levels of consistency for snapshots. Crash-consistent snapshots are instantaneous and help workloads recover from operating system (OS) or VM crashes. Application-consistent snapshots leverage the Nutanix Guest Tools and Microsoft Volume Shadow Copy Service (VSS) to complete open transactions, roll transaction logs, flush caches, and freeze the file system prior to taking the snapshot. This results in a snapshot where the data is in a state the application understands and can be easily restored. Using application-consistent snapshots on the Nutanix scale-out architecture shortens quiescence times, resulting in more consistent performance.

Based on your RPOs and retention needs, you can schedule both VM and volume group snapshots and configure their retention locally and remotely on a per-minute, hourly, daily, weekly, or monthly basis. Additionally, you can group multiple VMs and volume groups in a Nutanix protection domain, which allows you to operate them as a single entity with one RPO. This structure is particularly useful when you’re protecting complex applications such as Microsoft SQL Server–based applications or Microsoft Exchange.
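As a rough illustration of how a schedule and retention count interact for a group of entities, consider the sketch below. It does not use the Nutanix API or CLI: `Schedule`, `ProtectionDomain`, their fields, and the entity names are all hypothetical, and only local retention is modeled.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Schedule:
    interval: timedelta      # how often a snapshot is due (the RPO)
    local_retention: int     # number of most recent snapshots to keep

    def __post_init__(self):
        if self.local_retention < 1:
            raise ValueError("local_retention must be at least 1")


@dataclass
class ProtectionDomain:
    """Hypothetical grouping of VMs and volume groups under one schedule."""
    name: str
    entities: list
    schedule: Schedule
    snapshot_times: list = field(default_factory=list)

    def is_due(self, now):
        if not self.snapshot_times:
            return True
        return now - self.snapshot_times[-1] >= self.schedule.interval

    def take_snapshot(self, now):
        """Record one snapshot for all entities and return the expired ones."""
        self.snapshot_times.append(now)
        keep = self.schedule.local_retention
        expired = self.snapshot_times[:-keep]
        self.snapshot_times = self.snapshot_times[-keep:]
        return expired


if __name__ == "__main__":
    domain = ProtectionDomain(
        name="sql-app",
        entities=["sql-vm-1", "sql-vm-2", "sql-logs-vg"],
        schedule=Schedule(interval=timedelta(hours=1), local_retention=3),
    )
    start = datetime(2024, 1, 1)
    for hour in range(5):
        now = start + timedelta(hours=hour)
        if domain.is_due(now):
            print(now, "expired:", domain.take_snapshot(now))
```

With an hourly interval and a retention of three, the domain always holds the three most recent snapshots, each covering every entity in the group at the same point in time.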

From a restore perspective, administrators can recover data as granular as an individual file, a VM or vDisk, or as large as a storage container. This flexibility allows you to restore with the exact scope you need without wasting time and resources to restore a LUN. You can either replace the existing active VM with the crash-consistent snapshot copy or create a separate clone of a snapshot, preserving the active VM. Application owners can rest easy at night knowing that their supported applications are in a consistent state and ready to restore. All of this takes place quickly and efficiently, resulting in faster restores.

Application-granular clones build on the Nutanix snapshot capability, giving them the same space efficiency and performance characteristics. You can use cloning for a variety of purposes, including VM deployment and recovery. Integrating the virtualization stack with functionalities like VMware vStorage APIs for Array Integration (VAAI) enables administrators to simplify VM deployment using cloning. Nutanix offers the same native functionality for AHV and also provides a quick clone plug-in for Hyper-V. You can also clone individual volume groups as rollback support before application upgrades.
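The two restore options and the clone behavior can be sketched with plain block maps. This is a simplified model with hypothetical function names, not the Nutanix implementation: a block map is a dictionary of pointers, and a list stands in for physical extents.

```python
def write_block(block_map, extents, block_id, data):
    """Redirect-on-write: store data in a new extent and repoint the block."""
    extents.append(data)
    block_map[block_id] = len(extents) - 1


def restore_in_place(active_map, snapshot_map):
    """Replace the active VM's pointers with the snapshot's pointers."""
    active_map.clear()
    active_map.update(snapshot_map)


def clone_from_snapshot(snapshot_map):
    """Create a new writable block map that initially shares every extent."""
    return dict(snapshot_map)


if __name__ == "__main__":
    extents = ["os-v1", "app-v1"]
    snapshot = {"os": 0, "app": 1}   # treated as read-only in this sketch
    active = dict(snapshot)

    # The active VM keeps running and changes a block after the snapshot.
    write_block(active, extents, "app", "app-v2")

    # A clone for upgrade testing diverges without touching the active VM.
    clone = clone_from_snapshot(snapshot)
    write_block(clone, extents, "app", "app-test")

    print(active, clone, snapshot, len(extents))

    # Roll the active VM back to the snapshot state.
    restore_in_place(active, snapshot)
    print(active == snapshot)
```

Creating the clone copies only pointers, so it consumes new space only for the blocks it later changes. The in-place restore is likewise a metadata operation in this model.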

Why Does This Matter? 

Nutanix believes that enterprise infrastructure should be powerful yet simple to manage. We decided that snapshots and DR should be integrated into the core of the platform from day one. As a result, your applications and VMs are protected both locally and remotely by a high-performance, flexible snapshot architecture that is:

  • Granular – Our snapshots can be as granular as the vDisk level, allowing for single-file restores, or as broad as a storage container, allowing for broad-brush restores without compromise.
  • Efficient – Our snapshots and clones are space efficient as they are metadata pointers with individual vDisk block maps. This allows for fast snapshots and restores with no need to traverse the disk chain.
  • Performant – By leveraging RoW snapshots, clones and individual vDisk block maps, we give the full performance of the platform to the applications that drive your business.
  • Effortless – Easy-to-configure policies automate the protection and restore of your VMs and volume groups based on your RPO and desired recovery locations.

In the next blog, we’ll dig deeper into how replication and disaster recovery build on top of our granular and efficient snapshots and clones. You will learn how Nutanix can help simplify the protection and recovery of your applications no matter where they run.

© 2024 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and all Nutanix product, feature and service names mentioned herein are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. Other brand names mentioned herein are for identification purposes only and may be the trademarks of their respective holder(s). This post may contain links to external websites that are not part of Nutanix.com. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such a site. Certain information contained in this post may relate to or be based on studies, publications, surveys and other data obtained from third-party sources and our own internal estimates and research. While we believe these third-party studies, publications, surveys and other data are reliable as of the date of this post, they have not been independently verified, and we make no representation as to the adequacy, fairness, accuracy, or completeness of any information obtained from third-party sources.