What is Storage Snapshot Technology?
Storage snapshots have become prevalent in the data storage field because they support data protection as well as data mining and data cloning. Most vendors of storage hardware and related software now support snapshot technology, since it provides the advanced data protection that mission-critical businesses require. Snapshots allow frequent backups with minimal or zero application downtime and little impact on running workloads. They also reduce data recovery time: large volumes of data can be captured efficiently, and data can be restored almost instantly from a snapshot.
However, when selecting a vendor's snapshot technology, one must carefully consider the requirements and environment of the enterprise that will deploy it. This article provides an overview of the technology and explores the capabilities of the main snapshot techniques.
What is a Snapshot?
A snapshot is the state of a storage device at a particular point in time, preserved so that data can be restored if the storage fails. In other words, a storage snapshot is a backup copy created at a particular point in time. Snapshots are typically available almost instantly to other applications involved in data analysis, data protection and data replication. The original data remains continuously and consistently available to the applications that use it, while the snapshot serves as a backup copy on which other functions can be performed.
Storage snapshots are an excellent means of data protection in situations where disaster recovery is required. Besides protecting data when continuity is demanded, snapshots improve application availability, simplify backup management for large volumes of data and virtually eliminate backup gaps, all of which help lower the total cost of ownership.
Implementation of Snapshot
Vendors implement snapshots using different techniques, and each technique has its own benefits and drawbacks. It is therefore important to understand how storage snapshots work before tailoring an effective data protection solution for an enterprise.
The sections below explain the popular techniques used in storage snapshot technology.
Copy-on-Write – A storage snapshot is created in pre-designated space allocated to it. When the snapshot is first created, only the meta-data describing the original data is stored. No physical copy of the data is made, so creation of the snapshot is almost instantaneous. From then on, as writes arrive at the original volume, the snapshot tracks the blocks that change.
Before a block on the original volume is overwritten, its original contents are copied into the storage pool allocated to the snapshot, and only then is the block overwritten; hence the name “copy-on-write”. Moving the original block to the snapshot storage before the write keeps the point-in-time snapshot consistent. Read requests to the snapshot for unchanged data are redirected to the original volume, while read requests for changed data are directed to the “copied” blocks in the snapshot. The snapshot's meta-data records which blocks have changed since the snapshot was taken. Note that each block is copied into the snapshot only once, on the first write to that block.
The main drawback of copy-on-write is its performance impact on the original volume: a write to a block can complete only after the original data has been copied to the snapshot. Reads of unchanged data are still served from the original volume. Copy-on-write is space efficient, because the snapshot needs only enough storage to hold the data that changes. However, the snapshot is valid only as long as the original copy is available.
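The copy-on-write behaviour described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, assuming a volume is a list of blocks; the names CowVolume and CowSnapshot are hypothetical and do not correspond to any vendor's product or API.

```python
class CowSnapshot:
    """Point-in-time view that stores only blocks changed since creation."""

    def __init__(self, volume):
        self._volume = volume
        self.saved = {}  # block index -> original contents (copied only once)

    def read(self, index):
        # Changed blocks are served from the snapshot area,
        # unchanged blocks are read through to the original volume.
        if index in self.saved:
            return self.saved[index]
        return self._volume.blocks[index]


class CowVolume:
    """Hypothetical original volume that supports copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        # No data is copied here, so creation is effectively instant.
        snap = CowSnapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # First write to a block: preserve the original contents, then overwrite.
        for snap in self.snapshots:
            if index not in snap.saved:
                snap.saved[index] = self.blocks[index]
        self.blocks[index] = data


volume = CowVolume(["a0", "b0", "c0"])
snap = volume.snapshot()
volume.write(1, "b1")
volume.write(1, "b2")      # second write to the same block copies nothing
print(snap.read(1))        # original contents, served from the snapshot area
print(snap.read(0))        # unchanged block, served from the original volume
print(volume.blocks)
```

Note how every first write to a block costs an extra copy, which is the performance penalty mentioned above, while the snapshot itself holds only the blocks that changed.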
Note – Meta-data is data that describes the data stored on the disk. Take, for example, an image stored in a block. Its meta-data describes the image's color information, its dimensions, its resolution and the date and time the image was created and modified. For a text document, the meta-data covers the creation date and time, the author, when the document was modified and a short summary of its contents.
Redirect-on-Write – This technique is similar to copy-on-write, but it avoids the double write and so provides snapshots that are efficient in both storage space and performance. New writes to the original volume are redirected to another location that is allocated separately for snapshots. The main advantage is that each change requires only one write, whereas copy-on-write requires two: one to copy the original data into the snapshot storage and one to write the changed data onto the original volume.
With redirect-on-write, the original volume holds the point-in-time data, which is the snapshot, and changed data is redirected to the snapshot storage. When a snapshot is deleted, the data in the snapshot storage must be reconciled back into the original volume. Multiple snapshots add complexity, because accessing the original data, tracking data across the snapshots and the original volume, and reconciling on snapshot deletion all become more involved. Because the snapshot relies on the original copy of the data, the original data set can quickly become fragmented.
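A minimal Python sketch of redirect-on-write with a single snapshot follows. It assumes an in-memory list of blocks and a dictionary standing in for the separately allocated snapshot storage; the class RowVolume and its methods are hypothetical names chosen for illustration.

```python
class RowVolume:
    """Hypothetical volume using redirect-on-write with one snapshot."""

    def __init__(self, blocks):
        self.base = list(blocks)   # original blocks; frozen while a snapshot exists
        self.redirected = {}       # block index -> data written after the snapshot
        self.has_snapshot = False

    def snapshot(self):
        # The original blocks themselves become the point-in-time image.
        self.has_snapshot = True

    def write(self, index, data):
        if self.has_snapshot:
            self.redirected[index] = data   # one write; original block untouched
        else:
            self.base[index] = data

    def read(self, index):
        # The live view prefers redirected data and falls back to the original.
        return self.redirected.get(index, self.base[index])

    def read_snapshot(self, index):
        if not self.has_snapshot:
            raise ValueError("no snapshot exists")
        return self.base[index]

    def delete_snapshot(self):
        # Reconcile redirected blocks back into the original volume.
        for index, data in self.redirected.items():
            self.base[index] = data
        self.redirected.clear()
        self.has_snapshot = False


vol = RowVolume(["a0", "b0"])
vol.snapshot()
vol.write(1, "b1")
print(vol.read(1), vol.read_snapshot(1))
vol.delete_snapshot()
print(vol.base)
```

The reconciliation loop in delete_snapshot is where the extra cost of this technique appears; with several snapshots, each would need its own redirect map and the merge order would matter.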
Split-Mirror – A split-mirror snapshot is a physical clone of a storage entity, which can be a file system, a volume or a LUN (logical unit number), held on another entity of the same kind and size. The entire contents of the original volume are copied onto a separate volume, which can reside on different storage. Because all the data must be copied, a split-mirror snapshot cannot be created instantaneously. A clone can, however, be made available immediately by splitting a pre-existing mirror of the volume in two, at the cost that the original volume is left with one fewer synchronized mirror. This method requires as much storage space as the original data, and it carries the performance overhead of writing to the mirror copy at the same time as the original.
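The idea of splitting a pre-existing mirror can be illustrated with the following Python sketch. It assumes mirrors are simple in-memory lists kept in sync on every write; MirroredVolume and its methods are hypothetical names, not a real volume manager interface.

```python
class MirroredVolume:
    """Hypothetical volume that keeps several synchronized mirrors."""

    def __init__(self, blocks, mirror_count=2):
        # Every mirror is a full physical copy, so space use scales with mirror_count.
        self.mirrors = [list(blocks) for _ in range(mirror_count)]

    def write(self, index, data):
        # Each write goes to every mirror, which is the write overhead of this method.
        for mirror in self.mirrors:
            mirror[index] = data

    def read(self, index):
        return self.mirrors[0][index]

    def split(self):
        # Detach one mirror; it becomes an independent point-in-time clone.
        if len(self.mirrors) < 2:
            raise ValueError("no spare mirror to split off")
        return self.mirrors.pop()


vol = MirroredVolume(["a0", "b0"], mirror_count=2)
clone = vol.split()          # available immediately, no copying at split time
vol.write(0, "a1")           # later writes no longer reach the clone
print(clone, vol.read(0), len(vol.mirrors))
```

After the split the volume has one fewer synchronized mirror, so a new mirror would have to be created and resynchronized to restore the original redundancy.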
Log structure file architecture – This technique uses log files to track writes to the original volume. Every write request to the original volume is logged, much as in the transaction log of a relational database. When data needs to be restored, the transactions recorded in the log are run in reverse.
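As a rough illustration of restoring by running the log in reverse, the Python sketch below records the old and new contents of every write. The in-memory log and the names LoggedVolume, mark and rollback are assumptions made for this example only.

```python
class LoggedVolume:
    """Hypothetical volume that logs every write so it can be undone."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.log = []  # entries of (block index, old contents, new contents)

    def write(self, index, data):
        # Log the write before applying it, as a database would.
        self.log.append((index, self.blocks[index], data))
        self.blocks[index] = data

    def mark(self):
        # A "snapshot" here is just a position in the log.
        return len(self.log)

    def rollback(self, mark):
        # Undo logged writes newest-first until the marked position is reached.
        while len(self.log) > mark:
            index, old, _new = self.log.pop()
            self.blocks[index] = old


vol = LoggedVolume(["a0", "b0"])
point = vol.mark()
vol.write(0, "a1")
vol.write(0, "a2")
vol.write(1, "b1")
vol.rollback(point)
print(vol.blocks)
```

Undoing in reverse order matters when the same block is written more than once: the oldest logged value for the block is the last one restored.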
Copy-on-Write with background copy – Some vendors offer copy-on-write with a background copy, which produces a full copy of the snapshot data. This approach combines the benefits of the copy-on-write and split-mirror methods. An instant snapshot is first created using copy-on-write, and a background process then performs a block-level copy of the data, in sequence, from the original volume (the ‘Source Volume’) to the snapshot storage (the ‘Target Volume’), eventually creating an additional mirror of the original volume.
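The combination of an instant copy-on-write snapshot and a sequential background copy might look like the following Python sketch. It is a simplified single-threaded model in which the background process is driven by explicit calls; BackgroundCopySnapshot and its methods are hypothetical names.

```python
class BackgroundCopySnapshot:
    """Hypothetical snapshot: copy-on-write plus a sequential background copy."""

    def __init__(self, source):
        self.source = source                      # the Source Volume (list of blocks)
        self.target = [None] * len(source)        # the Target Volume, filled over time
        self.copied = [False] * len(source)
        self._cursor = 0

    def _preserve(self, index):
        # Copy a block to the target once, before it can change on the source.
        if not self.copied[index]:
            self.target[index] = self.source[index]
            self.copied[index] = True

    def write_source(self, index, data):
        self._preserve(index)                     # the copy-on-write step
        self.source[index] = data

    def background_step(self):
        """Copy the next uncopied block; return False once nothing is left."""
        while self._cursor < len(self.source) and self.copied[self._cursor]:
            self._cursor += 1
        if self._cursor == len(self.source):
            return False
        self._preserve(self._cursor)
        return True

    def read(self, index):
        # Until a block is copied, the snapshot reads through to the source.
        return self.target[index] if self.copied[index] else self.source[index]


source = ["a0", "b0", "c0"]
snap = BackgroundCopySnapshot(source)
snap.write_source(2, "c1")        # handled by copy-on-write
while snap.background_step():     # remaining blocks copied in sequence
    pass
print(snap.target, source)
```

Once the background copy finishes, the target no longer depends on the source, which is the property borrowed from the split-mirror method.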
Continuous data protection – Also known as CDP, this is a popular technique offered by most vendors. CDP performs a continuous backup: every change to the protected data is captured automatically and stored in a separate location. CDP thereby maintains an electronic journal of complete storage snapshots.
CDP differs from the other snapshot techniques in that it effectively creates a snapshot every time the data is modified.
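The journal kept by continuous data protection can be sketched as follows in Python. This model assumes a base image plus an ordered in-memory journal of writes, with a simple sequence number standing in for a timestamp; CdpJournal and its methods are hypothetical names.

```python
class CdpJournal:
    """Hypothetical CDP store: a base image plus a journal of every change."""

    def __init__(self, blocks):
        self.base = list(blocks)      # image at the time protection started
        self.current = list(blocks)   # live data
        self.journal = []             # entries of (sequence, block index, data)

    def write(self, index, data):
        # Every modification is captured automatically in the journal.
        sequence = len(self.journal) + 1
        self.journal.append((sequence, index, data))
        self.current[index] = data
        return sequence

    def restore(self, sequence):
        # Rebuild the image as it was just after the given journal entry.
        image = list(self.base)
        for seq, index, data in self.journal:
            if seq > sequence:
                break
            image[index] = data
        return image


cdp = CdpJournal(["a0", "b0"])
first = cdp.write(0, "a1")
second = cdp.write(1, "b1")
third = cdp.write(0, "a2")
print(cdp.restore(0), cdp.restore(second), cdp.restore(third))
```

Because every write adds a journal entry, any earlier state can be rebuilt, at the cost of journal storage that grows with the rate of change.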
Snapshot and Storage Stack
The term storage stack refers to the hardware and software components that provide physical storage media to the applications running on the host operating system. Snapshot solutions can be implemented at each individual layer of the storage stack.
Typically, snapshots can be created in either software-based or hardware-based layers, and they can be categorized as controller-based snapshots or host-based snapshots.
Controller-based snapshots are usually managed by the hardware vendors of data storage subsystems and are integrated into disk arrays. Because these snapshots are created at the block level of a logical unit number (LUN), they do not depend on operating systems or file systems. Host-based snapshots do not depend on the underlying hardware; instead they depend on the file system and the volume manager software. Most host-based snapshots operate on the logical view of the data, rather than on the physical layout used by controller-based snapshots.