What Should Businesses Require of Data Protection Solutions?

Data protection today is one of the most innovative and fast-moving areas of Information Technology. Data protection technology developers have, in recent years, engaged in constant, vibrant “solution evolution”, continuously striving to deliver new data protection capabilities for increasingly demanding customers. In what has become a virtuous circle, enhanced data protection technologies have led to more sophisticated data protection requirements from businesses, which in turn have stimulated the development of further enhanced data protection technologies. All of this reflects the intuitive understanding that, in today’s business world, the data is the business.

The major difference between data protection today and data protection, say, three years ago, is the prevalence of disk-based solutions. Disk-based platforms redefined benchmarks and expectations for speed and reliability, and today ensure that top-level, critical applications can access the protected data sets that they need, when they need them. With the two main obstacles to traditional, tape-based backup and recovery – lack of speed and unreliability – essentially solved, end users today have a new data protection focus: application recovery.

By describing their focus in this way, they are indicating that their expectations – and, hence, their requirements – of data protection have moved on: no longer is it sufficient simply to schedule a backup, complete it within the allotted window, and confirm that it has completed satisfactorily. In effect, companies have ceased to dwell on the complexity of the backup, and they now demand from their data protection solutions what storage analyst The Taneja Group describes as verifiable and reliable recovery of the right data. They are clearly focused on recovery.

In 2006 this focus on recovery, as opposed to backup, exists in an environment of widespread understanding, approval and, frequently, adoption of Information Lifecycle Management (ILM). Nominally, ILM is defined as “a strategy comprised of people, processes and technology management to store and tap critical business data throughout their lifespan of value”. In practical, engineering terms, this often means analysing and tiering production data to ensure they are held on the storage medium commensurate with their value. For example: high-value, highly likely to be needed, current data may be stored on high speed disk initially, migrating to low-cost disk subsequently, then to a virtual tape library and, ultimately, to tape and/or optical disk for long-term retention or vaulting.

As a direct result of this way of looking at their data, IT teams have intuitively grasped that what they now require of their data protection solution is the ability to recover verifiably correct data corresponding to any and all of these production tiers; to create, in effect, “recovery tiers”. This is leading businesses to develop data protection strategies that involve aligning the right kind of data protection technology with the appropriate data sets to deliver an optimal recovery schema for the company and all its data. And, in just the same way as for storage tiers, recovery profiles for different recovery tiers vary by criteria that include frequency, speed, granularity, application integration and geography.

The Taneja Group has suggested that this approach may be described as Recovery Lifecycle Management (RLM) – clearly capitalising on the widespread understanding of ILM to generate immediate comprehension of this new approach to data protection. The rapid advances in data protection technologies over the last three years mean that the solutions needed to build the optimal recovery schema for any given business certainly now exist. The technologies include:

  • the relatively new concept of continuous data protection (CDP), for creating point-in-time recovery images of specific mission critical datasets – typically email and databases.
  • virtual tape libraries (VTL), enabling the creation of a high-speed disk-staging capability for current, but not critical data.
  • replication to a disk platform at a secondary site, ready for remote access by the primary site in the event of catastrophic data loss at the primary site.
  • and vaulting, for old data that needs to be archived and retained, perhaps in case it may one day be needed for reference, or simply for compliance reasons.

CDP is the process whereby data is captured and replicated to a separate storage location to ensure that a set of critical data is always available. The point of CDP is to minimise data loss by providing rapid data recovery, to any point in time, with minimal downtime. However, the relatively high cost of CDP means it is only truly appropriate for optimising the backup and recovery of genuinely critical data – for Exchange email servers, for example. The particular value of CDP is its ability to restore “hot” production data – say, up to seven days old – to any point within a specific time interval.

But customers also require of their data protection solutions suitable levels of protection for “warm” data – say, up to six months old.

Space-efficient delta snapshots enable production data to be restored to a pre-defined moment in time with guaranteed data integrity, at a granularity of perhaps every two hours. These snapshots can be retained for, say, three months.

Finally, businesses require protection for their “cold” data – up to a year old, perhaps. Clearly this can be restored from backup tapes, giving a granularity of perhaps a day, week or month, depending on the policy for taking full or incremental backups. The two main problems organisations faced when physical tape predominated are well documented: media failure and robotic failure. The introduction of virtual tape libraries (VTL) eliminated these particular errors and also enabled backups and the corresponding restore processes to take place up to four times more quickly.

The following diagram summarises the Recovery Lifecycle Management data protection strategy, and shows how it may be enacted through a suitably-aligned product portfolio: hot data is protected by CDP, delta snapshots provide protection for warm data and VTL storage takes care of cold data. Ultimately, after a process of Redundant Data Elimination (RDE), a “single instance store” may also be created for long term retention.
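
The hot, warm and cold mapping described above can also be sketched in a few lines of Python. This is only an illustration, not part of any product: the names `RecoveryTier`, `TIERS`, `ARCHIVE` and `recovery_tier_for` are hypothetical, and the age boundaries (7 days, roughly six months taken as 183 days, and 365 days) simply reuse the "say" and "perhaps" figures quoted in the text. A real recovery schema would also weigh the other criteria mentioned earlier, such as granularity, application integration and geography.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryTier:
    """One recovery tier: a label, an upper age bound, and a protection technology."""
    name: str
    max_age: timedelta
    technology: str


# Illustrative boundaries only, taken from the approximate figures in the text.
TIERS = (
    RecoveryTier("hot", timedelta(days=7), "CDP"),
    RecoveryTier("warm", timedelta(days=183), "delta snapshots"),
    RecoveryTier("cold", timedelta(days=365), "VTL"),
)

# Assumption: anything older than the cold tier is treated as long-term retention.
ARCHIVE = RecoveryTier("archive", timedelta.max, "vaulting / single instance store")


def recovery_tier_for(age: timedelta) -> RecoveryTier:
    """Return the first tier whose age bound covers the given data age."""
    if age < timedelta(0):
        raise ValueError("data age cannot be negative")
    for tier in TIERS:
        if age <= tier.max_age:
            return tier
    return ARCHIVE


if __name__ == "__main__":
    for days in (3, 90, 300, 1000):
        tier = recovery_tier_for(timedelta(days=days))
        print(f"{days:>4} days old -> {tier.name} tier, protected by {tier.technology}")
```

Keeping the tiers in an ordered table rather than in hard-coded branches means the boundaries can be adjusted per business without touching the lookup logic, which mirrors the point that recovery profiles vary from one recovery tier to another.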

Although many IT vendors actively promote “return on investment” on storage and data protection purchases, many disk platforms for data protection have been purchased over the last three years in order to guarantee the backup and restore process, with return on investment considered a secondary issue – certainly secondary to ensuring business continuity. However, now that those platforms are installed and operational, IT departments are looking to see a return on investment for these purchases as they would for any other major IT purchase.

As a result, return on investment has become a requirement of installed data protection solutions as much as it has for new purchases. The Taneja Group believes that 2006 and 2007 will be all about businesses trying “to assemble reliable recovery infrastructures that make their disk backup investments worthwhile”, and envisions “recovery tiers [taking up] residence alongside storage tiers in the industry vernacular.” It’s a vision that we at FalconStor share – as might be expected, having developed a comprehensive family of products that enable recovery lifecycle management.

Summary
It could easily be argued that, a few short years ago, it was vendors that set the agenda for data protection: customer requirements were more than likely to be constrained by the solutions available. But the advent of disk-based backup and recovery and the market penetration of disk-based platforms have led to a more symbiotic relationship than usually exists between IT vendors and business customers: today, solutions and customer requirements seemingly advance together. In fact, today, the right solution portfolio and the right recovery schema perform far more than just the age-old function of data protection: they provide real business advantage and return on investment. And that’s probably more than customers actually expect of data protection.
