IT disaster recovery planning and management
Disaster recovery success begins and ends with the basics
By Dennis C. Brewer
The lessons from the catastrophe of 9/11 are all too often replayed with every new calamity that comes along. If your company's operations are in any way reliant on data stored on computers, you cannot afford to ignore the basic tenets of disaster recovery preparation. If data is not backed up a sufficient distance out of harm's way, your company may not survive the disaster. At some point, maintaining shareholder value and meeting regulatory criteria will become paramount in the disaster recovery discussion. I have distilled the basics of disaster recovery to a baker's dozen. Use this as a checklist to gauge -- and improve -- the effectiveness of your disaster recovery plan.
Rule 0 -- Identify the critical business processes and applications, along with the hardware, software, business and IT support staff that run them, and the local and wide area networks that connect them to the end users. Your business resumption and IT recovery plan must include action items for every element you have identified.
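One way to keep Rule 0 honest is to check mechanically that every identified element has at least one action item in the plan. The sketch below is only an illustration: the function name, the element names and the dictionary layout are assumptions, not part of any standard tool.

```python
def elements_without_actions(identified, plan_actions):
    """Return the identified elements that have no action items in the plan.

    identified   -- iterable of element names (processes, hardware, staff, links)
    plan_actions -- dict mapping element name -> list of action items
    Both structures are hypothetical; adapt them to however your plan is stored.
    """
    # An element is uncovered if it is absent from the plan or its list is empty.
    return sorted(e for e in identified if not plan_actions.get(e))


# Hypothetical inventory and plan contents.
identified = {"order entry app", "database server", "wan link"}
plan_actions = {
    "order entry app": ["restore application from replica"],
    "database server": [],  # listed in the plan, but nothing to do yet
}

missing = elements_without_actions(identified, plan_actions)
print(missing)
```

Any name this check returns is an element the plan does not yet cover.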
Rule 1 -- Own a complete replica (on disk or tape), refreshed daily, of the "digital trio." The digital trio consists of:
- The operating system your applications run on and the current patch levels present in your production environment.
- The critical applications that run on the operating system with their current patches.
- The data.
The federal disaster recovery guidelines call for "no data loss." Bit-by-bit backup of data is expensive, so you may have to settle for the last "committed" transaction. If all you can handle now is close of business yesterday, you probably won't go out of business after an adverse event, but the headaches of reconstructing a day's business may make you wish your IT department had done a better job of instantaneous data backup.
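The gap between the last good replica and the adverse event is the amount of business that has to be reconstructed by hand. A minimal sketch of that arithmetic follows; the function names, the timestamps and the 24-hour tolerance are illustrative assumptions.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup, event_time):
    """Return the span of work that would have to be reconstructed."""
    if event_time < last_backup:
        raise ValueError("event_time precedes last_backup")
    return event_time - last_backup


def meets_target(last_backup, event_time, max_loss):
    """True if the loss window is within the tolerance the business has set."""
    return data_loss_window(last_backup, event_time) <= max_loss


# Hypothetical case: replica taken at close of business, event the next morning.
last_backup = datetime(2008, 1, 16, 18, 0)
event_time = datetime(2008, 1, 17, 11, 30)

window = data_loss_window(last_backup, event_time)
print(window)
print(meets_target(last_backup, event_time, timedelta(hours=24)))
```

Tightening the tolerance from a day to an hour, or to the last committed transaction, is what drives the cost of the backup approach.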
Rule 2 -- Own or have immediate access to "carbon copy" hardware needed to run the digital replicas. The best backup tapes in the world are of little value if you do not have the exact hardware when and where it is needed to quickly restore the digital trio to new or standby equipment.
Rule 3 -- Create written step-by-step procedures on how to recover the digital replicas on the carbon copy hardware. Store this documentation off site. Regrettably, you cannot count on your existing IT support staff to be present after the adverse event occurs.
Rule 4 -- Test everything often. Your tolerance of what is acceptable as a "recent" test should be low. The odds of the replicas, hardware and documentation failing in some way increase with every week since the last complete test. The business value added to the bottom line by the data resources should help define what "recent" is for your company.
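Once your company has decided what "recent" means, the check itself is simple. The sketch below flags any component whose last complete test is older than that limit; the component names, dates and 30-day threshold are assumptions chosen only for illustration.

```python
from datetime import date


def stale_components(last_tested, today, max_age_days):
    """Return names of components whose last test is older than max_age_days.

    last_tested -- dict mapping component name -> date of last complete test
    """
    return sorted(
        name
        for name, tested in last_tested.items()
        if (today - tested).days > max_age_days
    )


# Hypothetical test log for the three things Rule 4 says can fail.
last_tested = {
    "replicas": date(2008, 1, 10),
    "hardware": date(2007, 10, 1),
    "documentation": date(2007, 12, 20),
}

overdue = stale_components(last_tested, today=date(2008, 1, 17), max_age_days=30)
print(overdue)
```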
Rule 5 -- Achieve the maximum practical or affordable separation between the locations used for daily operations and the stored digital replicas, recovery hardware and documentation. Same-city backup locations may no longer be sufficient. Consider going to the limits of the communications methods supported by your backup link strategy.
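To put a number on "separation," you can compute the great-circle distance between the production site and each replica site from their coordinates. The sketch below uses the standard haversine formula with a mean Earth radius of 6,371 km; the function name is an assumption, and the result is a straight-line distance that ignores road or cable routes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def separation_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Sanity check: one degree of latitude along a meridian is roughly 111 km.
one_degree = separation_km(0.0, 0.0, 1.0, 0.0)
print(round(one_degree, 1))
```

Compare the result against the reach of the regional events you are planning for, and against the distance limits of your backup link.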
Rule 6 -- Respond immediately to heightened risk conditions. If Hurricane Katrina taught any lesson at all it is to "get out of town now!" Dispatch trained technical and business process staff to alternate work locations at the first clue that an adverse event may be imminent in your operation's location. The key question is: Where should your human resources be when DHS threat levels in your neighborhood are elevated to orange or red, or if another Katrina-scale weather event is headed your way?
Rule 7 -- Own, or at least identify, alternate connections, data-transmission routes and electrical power sources. Depending on where your business is located, your alternatives to diversify routes and sources may be limited. Study the options at each of your locations. If your business or branch is on the end-leg of the digital pipeline, decide if extremes such as satellite up-links are worth the investment.
Rule 8 -- Apply the "Fort Knox" concept; that is, apply hardened physical security at one or more of your replica sites. If the digital assets are highly valued, then guards, security perimeters, lock-down procedures, and fire- and weather-resistant or earthquake-proof structures may be justified.
Rule 9 -- Document and exercise a total business resumption plan. Test the plan and answer the question: Can your business process staff operate efficiently after an adverse event, or under adverse conditions, once the IT staff has completed its recovery tasks?
Rule 10 -- Own and operate your own alternate sustainable power generation. Consider multi-fuel electric power generators. A power blackout event on the electric power grid, such as the one that began in Ohio on August 14, 2003, coupled with problems at regional gas refineries could render gas- or diesel-only backup generators useless after the first tank of fuel is gone.
Rule 11 -- Establish and test ways to continue operations in quarantine conditions. The risks of "white powder" or a highly communicable disease shutting down a building, an expressway or a city are no longer unthinkable.
Rule 12 -- Apply the resources to acquire and maintain rules 0-11 over the entire lifecycle of the critical applications.
The potential exists that over time we'll see a Sarbanes-Oxley-like regulatory approach where current government-suggested disaster recovery guidelines become stringent and auditable compliance regulations for all public companies. But even without regulation, protecting stockholder value in digital asset categories requires continuous due diligence encompassing all of the basic disaster recovery tenets.
About the author
Dennis C. Brewer is the author of Security Controls for Sarbanes-Oxley Section 404 IT Compliance: Authorization, Authentication and Access published by Wiley. His resume includes a BSBA degree from Michigan Technological University, Novell Network Engineer Certification, and over a dozen years as an information technology specialist with the State of Michigan. He retired from his position as an IT security solutions specialist in January of 2006 from the State of Michigan, Department of Information Technology, Office of Enterprise Security and is now operating his own IT consulting practice in Laurium, Michigan.
17 Jan 2008