
Disaster Recovery Reality Check

Survey sheds light on the state of disaster recovery efforts in the enterprise.

Gary Audin

December 13, 2018


Disaster recovery (DR) services based on the cloud have grown significantly over recent years. Enterprises have to study their operations and determine what is important to store for disaster recovery purposes, how often things need to be backed up, and how fast they will be able to recover.

 

Not only is each enterprise a bit different, but DR for each IT function within that enterprise may be different. The enterprise has to evaluate how to budget for DR and when to use the cloud (see "Ready for a Disaster?").

 

Disaster Recovery Report 2018

CloudEndure’s Disaster Recovery Survey Report resulted from a March online survey of 375 IT professionals from around the globe who are using or looking to implement disaster recovery. Although the survey covers a range of responses, this blog focuses on two measurement goals, RPO and RTO, and on how well enterprises meet them. The graphics in this blog are from the report.

 

Recovery Point Objective (RPO)

The RPO is the maximum age of the files that must be recovered from the DR site for normal operations to resume after an outage caused by a hardware, program, power, AC, or communications failure. RPO can be set for the entire enterprise or for specific applications, and it’s measured backward in time from the instant at which the failure occurs. Once the RPO is defined, it dictates the minimum frequency with which backups must be made.
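
The link between RPO and backup frequency can be sketched in a few lines of Python. This is a minimal illustration, not a method from the survey: the function names are hypothetical, and it assumes periodic backups that capture the data as of the moment each backup starts, so the worst case is a failure just before the next backup finishes.

```python
from datetime import timedelta


def worst_case_data_age(backup_interval: timedelta,
                        backup_duration: timedelta = timedelta(0)) -> timedelta:
    """Age of the newest usable backup in the worst case.

    Assumption: a backup reflects the data as of its start time, so a
    failure just before the next backup completes leaves data that is one
    full interval old plus the time a backup takes to finish.
    """
    return backup_interval + backup_duration


def meets_rpo(rpo: timedelta,
              backup_interval: timedelta,
              backup_duration: timedelta = timedelta(0)) -> bool:
    """True if the backup schedule keeps worst-case data age within the RPO."""
    return worst_case_data_age(backup_interval, backup_duration) <= rpo


def max_backup_interval(rpo: timedelta,
                        backup_duration: timedelta = timedelta(0)) -> timedelta:
    """Longest interval between backup starts that still satisfies the RPO."""
    interval = rpo - backup_duration
    if interval <= timedelta(0):
        raise ValueError("RPO is not longer than the backup duration; "
                         "periodic backups alone cannot meet it")
    return interval


if __name__ == "__main__":
    rpo = timedelta(hours=4)             # illustrative RPO
    duration = timedelta(minutes=30)     # illustrative time to finish one backup
    print(max_backup_interval(rpo, duration))              # 3:30:00
    print(meets_rpo(rpo, timedelta(hours=4), duration))    # False
```

An RPO near zero falls outside this model: no periodic schedule can satisfy it, which is why such systems rely on continuous replication or per-transaction checkpoints instead.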

 

About one-fifth (21%) of this year’s survey respondents report RPOs of less than one minute. Compared to the 2017 survey, the number of companies expecting zero RPO has increased. RPOs of four hours or less were expected by 74% of enterprises. Unfortunately, 8% of respondents reported that they have not determined RPOs at all.

Audin_DR_1.png

In one of my consultancy projects, the customer’s RPO had to be effectively zero. The system, built for a Federal Reserve Bank, had to recover all the money and security transfers (no matter what state each transaction was in) as of the point the outage occurred. We checkpointed every stage of a transaction to avoid creating any financial errors upon recovery.

 

Another project provided business information that changed only daily. In this case we could tolerate an RPO of hours, as long as it stayed under 24 hours.

 

Recovery Time Objective (RTO)

RTO is the maximum allowable length of time that an IT function can be down after an outage occurs. The RTO is a measure of the extent to which the outage disrupts normal operations, and thus it must be weighed against the amount of revenue/profit/fines/fees lost per unit time as a result of the outage. The chosen RTO will vary based on the functions and applications affected.
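
As a rough illustration of weighing an RTO against losses per unit time, the sketch below uses a simple linear cost model. The model, the function names, and every figure in it are assumptions made for illustration; real outage costs also include the long-term and intangible effects discussed later.

```python
def outage_cost(downtime_hours: float,
                loss_per_hour: float,
                fixed_penalty: float = 0.0) -> float:
    """Estimated cost of an outage under a linear model (an assumption).

    loss_per_hour stands for revenue/profit lost per hour of downtime;
    fixed_penalty stands for one-time fines or fees triggered by the outage.
    """
    if downtime_hours < 0 or loss_per_hour < 0 or fixed_penalty < 0:
        raise ValueError("inputs must be non-negative")
    return downtime_hours * loss_per_hour + fixed_penalty


def max_tolerable_rto_hours(loss_budget: float,
                            loss_per_hour: float,
                            fixed_penalty: float = 0.0) -> float:
    """Longest downtime whose estimated cost stays within loss_budget."""
    if loss_per_hour <= 0:
        raise ValueError("loss_per_hour must be positive")
    remaining = max(loss_budget - fixed_penalty, 0.0)
    return remaining / loss_per_hour


if __name__ == "__main__":
    # Hypothetical figures: 10,000 lost per hour plus a 5,000 one-time fee.
    print(outage_cost(4, 10_000, 5_000))                   # 45000
    print(max_tolerable_rto_hours(45_000, 10_000, 5_000))  # 4.0
```

Running the calculation per application, rather than once for the whole enterprise, reflects the point that the RTO varies with the functions affected.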

 

In the CloudEndure report, the majority of respondents (69%) report an RTO of four hours or less, with 6% of survey respondents having an RTO of zero. An additional 6% have an RTO goal of under one minute. A surprising 13% of respondents report either being able to accept an RTO of more than 24 hours or having no determined RTO at all.

Audin_DR_2.png

In the case of the Federal Reserve Bank, the RTO was one minute so that transactions would be delayed but not lost. The business information service could tolerate an RTO of up to 10 minutes without any loss of revenue or reduced profit.

 

Many studies have been conducted to determine the cost of downtime for various applications in enterprise operations. They indicate that the outage cost depends on long-term and intangible effects as well as on immediate, short-term, and tangible factors (see “Cloud Out”).

 

Meeting Goals

Of course, every business wants to meet its DR goals, but it may be hampered by budget and/or poor design and execution. Fewer than 43% of enterprises surveyed could meet their RPO consistently. Even fewer (37%) could meet their RTO consistently. Setting the goals is easy; meeting them is hard and may not be possible.
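
One way to find out whether goals are being met consistently is to record the outcome of each drill or real recovery and compare it with the objectives. The sketch below is an assumption about how that bookkeeping might look; the `RecoveryEvent` record, the `attainment` function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryEvent:
    """Outcome of one DR drill or real recovery (hypothetical record)."""
    data_loss: timedelta   # age of the data that was actually recovered
    downtime: timedelta    # time until the function was back in service


def attainment(events: list[RecoveryEvent],
               rpo: timedelta,
               rto: timedelta) -> dict[str, float]:
    """Fraction of recorded events that met the RPO and the RTO."""
    if not events:
        raise ValueError("no recovery events recorded; run a drill first")
    total = len(events)
    rpo_met = sum(e.data_loss <= rpo for e in events)
    rto_met = sum(e.downtime <= rto for e in events)
    return {"rpo": rpo_met / total, "rto": rto_met / total}


if __name__ == "__main__":
    # Made-up drill history, for illustration only.
    history = [
        RecoveryEvent(timedelta(minutes=30), timedelta(hours=2)),
        RecoveryEvent(timedelta(hours=2), timedelta(hours=3)),
        RecoveryEvent(timedelta(minutes=10), timedelta(hours=5)),
        RecoveryEvent(timedelta(minutes=45), timedelta(hours=6)),
    ]
    print(attainment(history, rpo=timedelta(hours=1), rto=timedelta(hours=4)))
    # {'rpo': 0.75, 'rto': 0.5}
```

The function refuses an empty history on purpose: an enterprise with no recorded drills has no evidence either way.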

Audin_DR_3.png

Other Survey Results

  • 47% of the enterprises use disaster recovery for at least half of their systems.

  • In 2018, 15% of enterprises aim for five-nines (99.999%) availability or better.

  • Almost half (47%) of all those surveyed use a public cloud as their disaster recovery target site. Only 15% use physical systems, and 39% use private clouds.

  • Sadly, only 7% perform a monthly disaster recovery drill, while 28% conduct drills quarterly. Some enterprises (15%) admitted that they never conduct disaster recovery drills, so they don’t really know if the DR works as planned.
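
To put the five-nines target from the list above in perspective, the short sketch below converts an availability percentage into the downtime it allows. The function name is hypothetical, and a 365-day year is assumed.

```python
def allowed_downtime_minutes(availability_percent: float,
                             period_days: float = 365.0) -> float:
    """Downtime, in minutes, permitted by an availability target over a period."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * period_days * 24 * 60


if __name__ == "__main__":
    # Five nines allows roughly 5.26 minutes of downtime per year.
    print(round(allowed_downtime_minutes(99.999), 2))  # 5.26
    # Three nines allows roughly 8.76 hours per year.
    print(round(allowed_downtime_minutes(99.9) / 60, 2))  # 8.76
```

With only about five minutes of downtime allowed per year, a single slow failover can exhaust the whole budget.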

In one of my consultancy projects, we were so successful at avoiding outages that the backup system (which was on site) was never called into service. When we finally tried to use it, we found that the personnel necessary to run it had been reassigned without our knowledge, and the backup was no longer functional. A major lesson we learned was to exercise the backup periodically to ensure that it works correctly and that all the resources are available when you need them.

About the Author

Gary Audin

Gary Audin is the President of Delphi, Inc. He has more than 40 years of computer, communications and security experience. He has planned, designed, specified, implemented and operated data, LAN and telephone networks. These have included local area, national and international networks as well as VoIP and IP convergent networks in the U.S., Canada, Europe, Australia, Asia and the Caribbean. He has advised domestic and international venture capital and investment bankers in communications, VoIP, and microprocessor technologies.

For 30+ years, Gary has been an independent communications and security consultant. Beginning his career in the USAF as an R&D officer in military intelligence and data communications, Gary was decorated for his accomplishments in these areas.

Mr. Audin has been published extensively in the Business Communications Review, ACUTA Journal, Computer Weekly, Telecom Reseller, Data Communications Magazine, Infosystems, Computerworld, Computer Business News, Auerbach Publications and other magazines. He has been a keynote speaker at many user conferences and delivered many webcasts on VoIP and IP communications technologies from 2004 through 2009. He is a founder of the ANSI X.9 committee, a senior member of the IEEE, and is on the steering committee for the VoiceCon conference. Most of his articles can be found on www.webtorials.com and www.acuta.org. In addition to www.nojitter.com, he publishes technical tips at www.Searchvoip.com.