Business Continuity with Veeam Replication, Failover & Failback
Veeam replication enables you to set up business continuity, avoid data loss, and fail over to ready-to-go VM replicas on a secondary DR site. StoneFly enhances Veeam replication with integrated turnkey solutions and advanced data services. In this blog, we take a closer look at Veeam’s replication, failover, and failback capabilities.
Business Continuity with Veeam’s Built-in Data Replication
When you think business continuity, one of the first things that come to mind is how to successfully get your data offsite to a remote location and to protect it against a large-scale outage such as a natural disaster or a data center failure. This is where data backup and data replication come in.
Backup and replication are both built into Veeam’s product. Replication does much more than simply copy backup data to another storage target. Veeam’s replication functionality takes a source workload and generates a fully hydrated, fully functioning copy at a remote location, standing by and ready to go at a moment’s notice should you find yourself facing one of these large-scale issues within the data center.
Differentiating Veeam Backup Copy from Veeam Replication
In the above diagram, we’ve got our production site with a basic VMware environment on the left side and a disaster recovery site with a similar basic VMware environment on the right. In the middle, there’s a connection between these sites, which may be a WAN, an MPLS network, or a VPN tunnel.
How Veeam Backup Copy Works
When Veeam software creates a backup copy, it simply takes a backup that’s already compressed and deduplicated and transfers it from site A to site B. The transferred copy remains compressed and deduplicated in the form of a Veeam backup file, not a VMDK or VHDX file.
The backup copy is not something a native hypervisor would run directly and it simply lives on the repository storage.
How Veeam Replication Works
In contrast, replication jobs are entirely different. Veeam replication looks at a source virtual machine (VM) and creates a fully hydrated and functional copy at a remote location. This copy is registered in the inventory and simply powered off when it’s not in use.
The idea behind replication is to enable a failover and failback scenario. If you face a large site-wide issue such as a datacenter failure, an ISP outage, or an incoming natural disaster, you can fail over proactively to avoid the worst of it and minimize downtime.
There are a certain number of similarities between the backup and replication process.
The way this process works is:
- Veeam software sends a VSS command to the VM (if application-aware processing is enabled)
- After the VSS quiescing is completed, Veeam takes a snapshot of the VM.
- Once the snapshot is created, the backup software starts processing the data.
- Similar to the backup process, the data is compressed and deduplicated. Compression and deduplication are done to reduce the volume of data to be transferred from site A to site B using WAN, MPLS, or VPN.
Note: Up to this point, the backup and replication processes are identical.
- Once the data is transferred to site B, it’s rehydrated and written natively as a fully hydrated VM ready-to-go at a moment’s notice.
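The steps above can be illustrated with a toy sketch. It is not Veeam's implementation: the block size, the SHA-256 fingerprinting, the use of `zlib`, and the names `prepare_for_transfer` and `rehydrate` are all assumptions chosen for illustration. It only shows the general idea that data is deduplicated and compressed before crossing the link, then written out in full on the target side.

```python
import hashlib
import zlib

BLOCK_SIZE = 4  # hypothetical, tiny block size so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut a disk image into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def prepare_for_transfer(disk: bytes):
    """Source side: keep one compressed copy of each distinct block."""
    order = []    # block fingerprints in original order
    unique = {}   # fingerprint -> compressed block, sent over the link once
    for block in split_blocks(disk):
        digest = hashlib.sha256(block).hexdigest()
        order.append(digest)
        if digest not in unique:
            unique[digest] = zlib.compress(block)
    return order, unique


def rehydrate(order, unique) -> bytes:
    """Target side: decompress and write every block out in original order."""
    return b"".join(zlib.decompress(unique[digest]) for digest in order)


disk = b"AAAABBBBAAAAAAAACCCC"          # 5 blocks, only 3 distinct
order, unique = prepare_for_transfer(disk)
replica = rehydrate(order, unique)
print(len(order), len(unique), replica == disk)  # 5 3 True
```

A backup copy would stop after `prepare_for_transfer` and keep the compact form on the repository, while a replica corresponds to the output of `rehydrate`: a full-size copy the hypervisor can run directly.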
Storage-Level Replication versus Host-Level Replication
Furthermore, as you’ll notice in the above diagram, the two environments are hosted on SANs. Veeam does not replicate between those arrays directly; it replicates data at the host level rather than the storage level.
However, storage-level replication has some benefits over host-level replication. Veeam’s native replication communicates with the host and takes snapshots of VMs. With StoneFly, Veeam users can instead leverage storage-level replication, which bypasses those steps and replicates a logical unit number (LUN) directly.
The primary advantage of storage-level replication is that it’s quicker than host-level replication. The usual drawback is that it requires identical or matching arrays. However, StoneFly’s patented storage OS (StoneFusion™) enables you to set up storage-level replication as needed by your projects.
Snapshot Retention on the DR Site
On the disaster recovery (DR) site, Veeam offers retention for snapshots. This retention is different from the point-in-time backup retention. Backup retention enables you to use any incremental backup and recover different versions of data.
Because data on the DR site is stored as a native VM rather than in Veeam format, the only way to retain earlier versions is with snapshots.
In a VMware environment, you can hold up to 28 snapshots; in Hyper-V, up to 47 (at current versions). While 28 snapshots may be more than you need, they give your replicas several restore points, so you can spin up different versions of the data when you fail over.
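As a minimal sketch of how a capped snapshot retention behaves, the example below keeps a rolling window of restore points and drops the oldest once the limit is reached. The limits are the ones quoted above; the names `MAX_RESTORE_POINTS` and `new_replica_history` are hypothetical and not part of any Veeam API.

```python
from collections import deque

# Snapshot limits quoted in the text; actual limits depend on product version.
MAX_RESTORE_POINTS = {"vmware": 28, "hyperv": 47}


def new_replica_history(hypervisor: str) -> deque:
    """Return a rolling window that silently drops the oldest restore point."""
    return deque(maxlen=MAX_RESTORE_POINTS[hypervisor])


history = new_replica_history("vmware")
for run in range(1, 31):                 # 30 replication runs
    history.append(f"restore-point-{run}")

print(len(history), history[0], history[-1])
# 28 restore-point-3 restore-point-30
```

After 30 runs against a 28-point limit, the two oldest restore points are gone, so the earliest version you could fail over to is the third one.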
Schedule Replication Jobs
Similar to backup frequency options, you can schedule hourly, daily, or even monthly replication jobs. You can also set up replication frequencies on a more granular level and get into minutes or even set it to continuous/real-time replication.
Continuous replication is not to be confused with continuous data protection (CDP). Even with the replication frequency set as continuous, the Veeam software still has to:
- Send the VSS call(s)
- Take a snapshot of the VM(s)
- Process the data by compressing and deduplicating it
- Transfer and write it on the target DR site
Once the above processes are completed, the VM is hydrated on the DR site, the snapshot is deleted, and the replication job is complete.
If you set the replication frequency to continuous, all it means is that the Veeam software automatically restarts the replication process as soon as it completes a job. Each cycle could take five minutes, twenty minutes, or an hour, depending on how fast your environment can complete the process.
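To make the difference from CDP concrete, here is a small simulation of a "continuous" schedule, assuming each job simply starts the moment the previous one ends. The durations are made-up sample values and `continuous_schedule` is a hypothetical helper, not a Veeam function.

```python
def continuous_schedule(job_durations_min):
    """Return (start, end) minute marks when each job restarts
    immediately after the previous one finishes."""
    schedule = []
    clock = 0
    for duration in job_durations_min:
        schedule.append((clock, clock + duration))
        clock += duration
    return schedule


runs = continuous_schedule([5, 20, 60])   # sample cycle lengths in minutes
print(runs)                               # [(0, 5), (5, 25), (25, 85)]

# The gap between consecutive snapshots equals the length of each cycle,
# so the longest cycle bounds how stale the replica can become.
longest_gap = max(end - start for start, end in runs)
print(longest_gap)                        # 60
```

In this sample, the replica can lag production by up to an hour even though the job is set to continuous, which is why the setting should not be mistaken for real-time protection.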
Permanent Failover to DR Site
An important thing to remember is that failover to a replicated VM is an intermediate step, much like the instant VM recovery feature. It’s not a click-and-you’re-done scenario. After failover, you need to finalize it and decide, as a business, what to do next.
Once you have failed over, you have the ability to do a permanent failover. If you fail over permanently, the Veeam software assumes that the production environment is gone. The permanent failover feature is reserved for when the datacenter suffers a major issue and you need to run your VMs at the DR site for an extended period of time.
For example, suppose the production datacenter was flooded or experienced a hardware failure. You know that you’re going to repair it, but you’ll need to run your VMs on the DR site in the meantime. That’s when you do a permanent failover, restore business operations, and run your workloads at the DR site.
After the production environment has been rebuilt, you will need to fail back, which means reverse replication from the DR site to the new production environment.
Failback from the DR Site Using Replicated VMs
Failback assumes that you’ve failed over to the DR site. Once your production site is ready, the Veeam software resyncs all the changes tracked at the DR site back to the production site.
The original VMs are still at the production site, but they’re outdated because the changes made since failover exist only at the DR site. Failback syncs those changes and updates the source-site VMs with only the delta.
If the original VMs are corrupted or no longer present on the production side, the failback will restore the entire VM as it exists at the DR site.
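The two failback cases described above (delta sync versus full restore) can be sketched as follows. This is an illustrative model only: comparing per-block SHA-256 digests is an assumption standing in for whatever change tracking the product uses, and `blocks_to_send` is a hypothetical name.

```python
import hashlib


def fingerprint(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()


def blocks_to_send(replica_blocks, original_blocks=None):
    """Return the indexes of blocks that must travel from DR to production.

    original_blocks=None models an original VM that is missing or corrupted,
    in which case every block of the replica has to be restored.
    """
    if original_blocks is None:
        return list(range(len(replica_blocks)))
    changed = []
    for i, block in enumerate(replica_blocks):
        unchanged = (
            i < len(original_blocks)
            and fingerprint(block) == fingerprint(original_blocks[i])
        )
        if not unchanged:
            changed.append(i)
    return changed


original = [b"os", b"app-v1", b"data-1"]            # outdated production VM
replica = [b"os", b"app-v2", b"data-1", b"data-2"]  # VM as it runs at DR

print(blocks_to_send(replica, original))  # [1, 3]
print(blocks_to_send(replica))            # [0, 1, 2, 3]
```

With the original VM still in place, only the modified and newly added blocks cross the link; without it, the whole VM does.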
In the worst-case scenario, all the VMs in the production environment are gone. Let’s say the outage is short, perhaps a day or two, and you had to rebuild everything. Rather than doing a permanent failover, you can stay in the failover state, get the production site running again, and then fail back to the new production environment. In that instance, Veeam rebuilds all the VMs at the production site.
An alternate option would be to physically transfer the backups of VMs from the DR site to production instead of transferring them over the WAN, MPLS, or VPN connection. Then, we use these backups as a seed and sync the delta changes to rebuild the VMs.
Committing After Failback
After a successful failback, once you’re running in production again and have verified that all services are working properly, the final step is to commit the failback.
This step is crucial: until you commit the failback, you’re still running in a limbo state. The protective snapshots remain at the DR site, and Veeam still treats the VMs as being in an intermediate phase. Committing the failback removes the protective snapshots on your replicas and returns you to a normal operational state at the production location.
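The failover, permanent failover, failback, and commit steps described in this article can be summarized as a small state machine. The sketch below is a simplified model built only from the steps named here; the state and action names are hypothetical labels, not Veeam API identifiers, and real deployments offer further operations that are not modeled.

```python
from enum import Enum


class ReplicaState(Enum):
    READY = "ready"                            # replica powered off, standing by
    FAILOVER = "failover"                      # intermediate: running at DR
    PERMANENT_FAILOVER = "permanent failover"  # DR replica is now production
    FAILBACK = "failback"                      # intermediate: back on production


# Only the transitions described in the text are modeled.
TRANSITIONS = {
    (ReplicaState.READY, "failover"): ReplicaState.FAILOVER,
    (ReplicaState.FAILOVER, "permanent_failover"): ReplicaState.PERMANENT_FAILOVER,
    (ReplicaState.FAILOVER, "failback"): ReplicaState.FAILBACK,
    (ReplicaState.FAILBACK, "commit_failback"): ReplicaState.READY,
}

INTERMEDIATE = {ReplicaState.FAILOVER, ReplicaState.FAILBACK}


def apply(state: ReplicaState, action: str) -> ReplicaState:
    """Move to the next state, or raise if the action is not valid here."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} from state {state.value}") from None


state = ReplicaState.READY
for action in ("failover", "failback"):
    state = apply(state, action)
print(state in INTERMEDIATE)   # True: failback has not been committed yet

state = apply(state, "commit_failback")
print(state.value)             # ready
```

The model makes the "limbo" point visible: after failing back, the replica stays in an intermediate state until `commit_failback` is applied.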
Effective Replication, Failover & Failback with Veeam and StoneFly
As a longstanding Veeam technology alliance partner (TAP), cloud service provider (CSP), and reseller, StoneFly provides Veeam-ready solutions integrated, tested, and purpose-built to support Veeam-native replication, failover, and failback features in addition to all Veeam backup and restore features.
Furthermore, StoneFly appliances, cloud products, and services not only support Veeam capabilities but also enhance them with an array of advanced data services that deliver data security, optimization, and real-time monitoring.
StoneFly’s integrated Veeam-ready products include:
- Veeam-ready backup and DR appliance (DR365V) – 4 to 60-bay iSCSI/Fibre Channel turnkey appliances with integrated data services such as air-gap, WORM, snapshots, replication, direct VM spin up, and more.
- Veeam Cloud Connect to Azure – Complete cloud backup and recovery solution with Veeam software, management gateway, and Azure cloud storage.
- Veeam Backup for Office 365 – Turnkey data protection for Microsoft Office 365 data powered by Veeam with support for on-premises and cloud storage.
- StoneFly miniBackup™ – Budget-friendly turnkey appliances with terabytes of on-premises storage for hot-tier and unlimited integrated cloud storage for long-term archiving best fit for remote office, branch office, and small business backup.
Business continuity requires you to transfer your data offsite and protect it from large-scale threats such as cyber-attacks, ransomware, natural disasters, and hardware failure.
Veeam’s native built-in replication feature enables you to:
- Create snapshots of your VMs on the production environment
- Process the data by compressing and deduplicating it
- Transfer it to the DR site and rehydrate it so that the VMs are ready-to-go at a moment’s notice on the DR site
- In the event of a datacenter outage and production failure, fail over to the DR site for quick recovery and business continuity
- After the production datacenter is restored, fail back by reverse replicating from the DR site
As a longstanding Veeam TAP, CSP partner, and reseller, StoneFly provides solutions that are integrated, tested, and purpose-built to support all Veeam replication, backup, and restore features. StoneFly solutions also enhance the Veeam experience with advanced data services such as air-gap, WORM, immutable snapshots, encryption, and more.
Looking to set up replication, failover and failback for your business-critical VMs?
Contact us today to discuss your requirements and projects with our Veeam engineers:
Phone: +1 510 265 1616