The Weakest Link in Disaster Recovery
by Alex Bakman - founder & CEO of Ecora Software
The IT disaster recovery plan has, until recently, been viewed as a static document that sits in a three-ring binder on every mid-level IT manager's shelf, doing little more than providing comfort that the IT department is ready to do its part to ensure business continuity. Collecting configuration information from diverse platforms and "massaging" it into a meaningful form (if that is attempted at all) takes a tremendous number of hours, and most IT departments do not devote the resources needed to keep it current.

As a result, all configuration data collected in these documents rapidly becomes out of date due to the one constant in the IT world: change.

Until recently, most disaster recovery plans assumed the existing IT staff would be involved in the restoration process.

For a fire in a corporate data center, this might be true. However, in a natural disaster such as a tornado or flood, where the area surrounding a data center may also be affected, IT staff will initially be more concerned with their families and homes than with their work responsibilities. Sadly, September 11th taught us that the unthinkable can occur. Even if IT staff are available to assist with recovery, the multitude of IT platforms and the large number of changes that occur daily limit how effectively they can support a backup data center's restoration efforts.

Thus, the IT disaster recovery plan needs to be continuously updated with the latest configuration settings reported in a clear, consistent manner. All changes should be easily identifiable to preserve IT decisions.
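
One way to make changes "easily identifiable" is to compare each new configuration snapshot against the previous one and report only the differences. The following is a minimal sketch, not a description of any particular product: the function name `diff_snapshots`, the snapshot keys, and the sample values are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List


def diff_snapshots(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """Return one readable line per setting that was added, removed, or changed."""
    changes = []
    # Walk the union of keys in sorted order so reports are consistent run to run.
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            changes.append(f"ADDED   {key} = {new[key]}")
        elif key not in new:
            changes.append(f"REMOVED {key} (was {old[key]})")
        elif old[key] != new[key]:
            changes.append(f"CHANGED {key}: {old[key]} -> {new[key]}")
    return changes


# Hypothetical snapshots of the same server taken on two different days.
monday = {"bios_version": "1.04", "swap_mb": 2048, "scsi_id": 3}
tuesday = {"bios_version": "1.05", "swap_mb": 2048, "kernel.shmmax": 33554432}

for line in diff_snapshots(monday, tuesday):
    print(line)
```

Settings that did not change (here, `swap_mb`) are left out of the report, so a reviewer sees only what moved between one snapshot and the next.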

There are three main reasons that detailed configuration data is not collected and kept current in many enterprises:

  1. Almost no company has enough IT staff. According to the Information Technology Association of America (ITAA), of the current US IT workforce requirement of 10 million, there are over 800,000 vacancies that cannot be filled due to the lack of trained talent. The workload increases, but hiring never keeps up.

  2. The technical competence of individual IT staff varies with training and experience. Configuration documentation may seem an "entry-level" task that most professionals seek to move beyond quickly. Different IT staff members often collect different types of information, and the quality of their reports varies greatly. The more senior IT people are assigned to "more critical" tasks, deployed by management where they provide the most perceived value for their salaries, which average $85,000 per year ($75 per hour). The hours needed to assemble, verify, and report configuration settings could amount to tens of thousands of dollars in a larger IT shop.

  3. IT staff turnover ranges from 8 to 17 percent, depending on industry and geographic region. The cost of hiring and training new staff to replace lost employees is nearly triple the IT overhead cost (about $225 per hour). And when IT staff leave, their knowledge of the corporate IT infrastructure leaves with them.
Are Backups Enough?

One of the most common reasons that detailed configuration information is not recorded is the belief that backups contain everything needed to restore systems into production.

The effectiveness of backup tapes depends upon the nature of the disaster. A system that experiences a simple power outage or hardware failure can easily be restored from backups, but restoring following a complete meltdown is another matter.

Critical information not contained in backups includes: hardware specifications for each system, EEPROM settings, specific boot instructions, SCSI ID manipulation, BIOS versions, virtual memory swap space sizes, disk partition slices, space allocation considerations, recovery/re-installation prerequisite considerations, network services provided, network dependencies required for normal functioning, kernel parameters, initial system installation cluster, and configurations that affect storage devices. Typically, volume management software and RAID software are on the tape, yet they are useful for arranging the disks before reinstallation and restoration.
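
As an illustration of recording this kind of information outside the backup itself, the sketch below gathers a handful of system facts with the Python standard library and writes them out as JSON. It is only a starting point under stated assumptions: the function name `collect_snapshot` and the field names are hypothetical, and items such as EEPROM settings, BIOS versions, SCSI IDs, and kernel parameters need platform-specific tools that are not shown here.

```python
import json
import platform
import shutil
import socket
from datetime import datetime, timezone


def collect_snapshot(root_path: str = "/") -> dict:
    """Gather a few basic system facts that a data backup does not record on its own."""
    uname = platform.uname()             # OS name, release, version, architecture
    disk = shutil.disk_usage(root_path)  # total/used/free bytes for one filesystem
    return {
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "hostname": socket.gethostname(),
        "os": uname.system,
        "os_release": uname.release,
        "os_version": uname.version,
        "machine": uname.machine,
        "root_disk_total_bytes": disk.total,
        "root_disk_used_bytes": disk.used,
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be stored away from the machine it describes.
    print(json.dumps(collect_snapshot(), indent=2))
```

Storing each run's output somewhere other than the system being described means the record survives even if that system does not.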

