Last week I received an invitation to the latest in the vRetreat series of events. These events bring together IT vendors and a selected group of tech bloggers- usually in venues like football clubs and racetracks, but in the current circumstances we were forced online. The second of the two briefings at the May 2020 event came from Datrium.
To paraphrase their own words, Datrium was founded to take the complicated world of Disaster Recovery and make it simpler and more reliable, they call this DR-as-a-service. The focus of this vRetreat presentation was around their ability to protect an on-premises VMware virtual environment using a VMware Cloud on AWS Software-Defined-Data-Centre as the DR target.
These days the idea of backing up VMs to a cloud storage provider and then being able to quickly restore them is fairly commonplace in the market. Datrium, however, take this a step further and integrate the VMware-on-AWS model to reduce RTO but also ensure reliability by enabling easy, and automated, test restores.
When Disaster Strikes
In the event of a disaster Datrium promises a 1-click failover to the DR site through it’s ControlShift SaaS portal. One of the great benefits here is the DR site– or at least the compute side of it- doesn’t exist until that failover is initiated. This means the business isn’t paying for hardware to sit idly by just in case there’s a disaster.
The backup data is pushed up to “cheap” AWS storage and at the point the failover runbook is activated a vSphere cluster is spun up and the storage is mounted directly as an NFS datastore. VMs can then start to be powered on as soon as the hosts come online – with Datrium handling any required changes in IP addresses etc.
Whilst the system is running in this DR state, changes are monitored so that when the on-premises environment is restored failback only requires the delta change to be synchronised back from the cloud. And at this point the VMware environment on AWS is removed until the next time one is required.
Testing – Practice Makes Perfect
This ability to spin-up and decommission the entire DR site on demand enables realistic testing to be performed without risk to the production workloads. Test restores can be run, and workload-specific tests run on the test environment, but the SDDC built on AWS only exists for the duration of the test.
The Datrium platform contains runbooks, and these are not just restricted to disaster events, but can be used to automate testing. The system will, on a schedule, spin up some or all of the VMware environment in a temporary SDDC then run some specified tests and shutdown and destroy the test infrastructure when complete. The results of this testing are compiled into an audit report.
As I’ve alluded to at the top of this post, there are plenty of “Backup” and “DR” products out there servicing Enterprise IT and leveraging the public cloud to do so. Of those, I think Datrium is worth considering particularly if you are focussed on protecting a vSphere environment with a short RTO, and are interested in using VMware on AWS as a DR solution but not that keen on the not-insubstantial costs of running that DR SDDC 24/7.
Please read my standard Declaration/Disclaimer and before rushing out to buy anything bear in mind that this article is based on a sales discussion at a sponsored event rather than a POC or production installation. I wasn’t paid to write this article or offered any payment, aside from being entered in a prize draw of delegates to win a chair (I was not a winner).