Disaster Recovery Orchestration | The Importance of Orchestration

The global DRaaS (Disaster Recovery as a Service) market has grown significantly in recent years, and is predicted to grow at a compound annual rate of 36% between 2016 and 2024. This growth has predictably attracted new market entrants and changed the market dynamics.

Uptime is an operational imperative. Any form of downtime, from an Exchange crash to a site-wide disaster (tornado, hurricane, flood) to a ransomware infection, can cost an organization dearly in lost revenue and productivity. The right DRaaS solution with well-tested orchestration can dramatically reduce the downtime and stress associated with these incidents.

That’s why it’s increasingly important to find objective measures to separate the contenders from the pretenders. One of the key differentiators is how solution providers deliver orchestration: the orderly recovery of a server environment during an outage. Orchestration ensures that critical servers, applications, and their dependencies come online without incident. It’s important to understand exactly how your vendor plans to fail over your applications and then fail back, as well as how much customization and control you have over the orchestration process.

When it comes to unplanned downtime, an ounce of prevention is worth several pounds of cure.

When disaster strikes or critical systems crash, IT administrators have to be thoughtful about how, and in what order, they restore applications. The order of operations is crucial for a seamless system restoration. For example, if your environment uses a DHCP server to manage leases on your machines, that server should be among the first systems brought online, because it assigns IP addresses and provides configuration information. You may also want your Active Directory (AD) server to come online shortly thereafter, if not concurrently, to automate network management of user data, security, and distributed resources.

After you resuscitate these core systems, you will want to restore your production workloads such as SQL Server, Exchange, and other mission-critical apps. Then you can boot your secondary applications. Order clearly matters, and orchestration sequencing is the means by which DRaaS solutions restore applications in a predetermined order.
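
The ordering described above can be sketched as a dependency graph that is resolved into boot "waves." This is only a minimal illustration, not how any particular DRaaS product works: the server names and the dependency map are hypothetical, and it assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (an assumption for illustration):
# each server lists the servers that must be online before it boots.
DEPENDENCIES = {
    "dhcp": set(),
    "active-directory": {"dhcp"},
    "sql-server": {"active-directory"},
    "exchange": {"active-directory"},
    "intranet-wiki": {"sql-server"},  # a secondary application
}


def boot_waves(dependencies):
    """Group servers into waves; servers in the same wave can boot together."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose prerequisites are up
        waves.append(ready)
        sorter.done(*ready)  # mark the wave as booted so dependents unlock
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(boot_waves(DEPENDENCIES), start=1):
        print(f"Wave {number}: {', '.join(wave)}")
```

With this map, DHCP boots first, AD second, the production workloads third, and the secondary application last. If you prefer AD to start concurrently with DHCP, removing its dependency puts both in the first wave.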

Not all vendors treat orchestration equally; you have to find out whether, and how, your DRaaS vendor can deliver this functionality. There are four core components of orchestration:

  • Runbooks: Most cloud recovery providers offer a simple disaster recovery runbook that describes the order in which your systems (VMs) should recover. The runbook defines a group of machines that are powered on (simultaneously) with a single command.  The real power of orchestration, however, is the ability to determine the actual order (not just a group of apps that boot simultaneously).  This is where scripting comes into play.
  • Scripting: The other half of orchestration is scripting. IT can create simple, customized scripts (basic commands) that perform the more complex configuration their runbooks need, covering everything required to execute a complete recovery. Scripts can also ensure that machines in environments without DHCP servers reboot with their proper network configuration (such as IP and MAC addresses).
  • Testing: Another key component of orchestration is the ability to test the failover process and ensure the runbook and scripts work as expected. Unfortunately, many DRaaS vendors charge for DR tests and/or require formal disaster declarations to perform them. Increasingly, IT administrators are looking for a self-service failover solution that puts control back in their hands. Testing is not a one-and-done activity: system variables change continuously (e.g., when you deploy new service packs), so you’ll want to retest your orchestration periodically after the initial setup.
  • Failback: Once your production servers are running virtually, IT is free to rebuild your hardware in anticipation of application failback. After the hardware has been properly configured (post disaster), it’s time to restore applications and their operating systems. For a physical machine, you can use a USB drive or disk to recover from a preinstallation environment (PE). For a virtual machine, you can simply push the guest back to its corresponding host. All of this can be done while capturing any changes users made while working with the ‘booted’ image during the outage.
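
To make the runbook, scripting, and testing components concrete, here is a minimal sketch of a runbook whose steps power on groups of machines and then run scripts against them, plus a dry-run mode for rehearsing the sequence. Every name in it (`Step`, `run_runbook`, `set_static_network`, the machine names) is hypothetical and chosen for illustration; it does not represent any vendor's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """One runbook step: a group of machines plus scripts to run after boot."""
    name: str
    machines: List[str]
    post_boot_scripts: List[Callable[[str], str]] = field(default_factory=list)


def set_static_network(machine: str) -> str:
    # Hypothetical script: a real one would apply the saved IP and MAC settings.
    return f"{machine}: apply static IP and MAC configuration"


def dry_run_power_on(machine: str) -> str:
    # Test mode: record the action instead of booting anything.
    return f"{machine}: power on (dry run)"


def run_runbook(steps: List[Step], power_on: Callable[[str], str]) -> List[str]:
    """Execute steps strictly in order and return a log of every action."""
    log = []
    for step in steps:
        # Power on the whole group first, then run the scripts per machine.
        for machine in step.machines:
            log.append(power_on(machine))
        for machine in step.machines:
            for script in step.post_boot_scripts:
                log.append(script(machine))
    return log


RUNBOOK = [
    Step("core", ["dhcp-01", "ad-01"]),
    Step("production", ["sql-01", "exchange-01"], [set_static_network]),
]

if __name__ == "__main__":
    for line in run_runbook(RUNBOOK, dry_run_power_on):
        print(line)
```

Passing the power-on action in as a function is a deliberate choice in this sketch: the same runbook can be rehearsed with a dry-run action and later executed with a real one, which mirrors the idea of testing the orchestration repeatedly before an actual failover.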

At Infrascale, we’ve invested in orchestration with the aim of making ours the easiest and most customizable DRaaS solution on the market. We enable runbooks to boot up specific VMs and groups of VMs, as well as custom/canned scripts to manage the boot sequence of applications, all based on your specific environment. But we’re taking this a step further. We’ve built a simple “drag and drop” interface that lets you build out your orchestration sequencing: users drag and drop applications and custom/canned scripts from a network tree view to create the desired workflow. We also offer unlimited testing so you can test and retest your orchestration with impunity.

As you give DRaaS solutions a closer look, it’s imperative to ask any prospective vendor how they manage the orchestration process. It’s important to go beyond simple DR runbooks to create a more comprehensive disaster recovery playbook. When orchestration is well planned, coordinated, and tested, it can dramatically reduce downtime for any type of micro- or macro-disaster. Just as important, it will dramatically lower your stress level by giving you the confidence that you can recover from anything thrown your way.
