Industry News
A "Virtual" Storage Revolution: Self-Protecting Storage Systems
Virtualization Signals a Major Shift in the Industry Toward Software Value and Hardware Commoditization
Sep. 12, 2005 04:00 AM
Virtualization is arguably one of the most prominent “buzz” technologies in the computing industry today – with more than $4.2 billion in reported deal values for M&A activity since December 2000. It’s a key component in just about any next-generation architecture due to its ability to break down the sheet metal bonds between systems and base resources and applications on logical instead of physical parameters. It has also signaled a major shift in the industry toward software value and hardware commoditization.
In a storage context, virtualization is usually taken to mean virtualizing primary storage capacity so that it can be shared. But there’s a trend afoot that’s taking virtualization to the next level. Along with advances in disk technology and performance, component cost reductions, and emerging technologies in grid computing, virtualization is helping to drive some traditionally independent data protection applications such as backup, disaster recovery, and hierarchical storage management (HSM) toward evolutionary obsolescence.
This article will examine the mechanics – and more importantly the business potential – of virtualizing data protection and integrating it more tightly with primary storage.
Virtual Standstill Leading to Virtual Extinction
Over the last 50 years, many virtualization concepts have been applied to data storage systems. So why is it that IT organizations are still waiting and searching for data storage and data protection systems that are more shareable, less manually intensive to install, configure, and operate, easier to scale, and simpler to use for storing and protecting data?
Because the storage vendors have been at a “virtual standstill.”
Consider today’s reality.
Storage Area Network (SAN) administrators must still manually create RAID Logical Unit Numbers (LUNs) out of groups of individual disk drives, and manually allocate these to specific servers through arcane Fibre Channel (FC) switch and host bus adapter (HBA) command line scripts. System administrators need to manually configure host-based FC failover systems, manually create host-based volumes, and create or expand file systems. And all this just to deliver primary disk storage to servers.
Virtualization in the context of data protection is almost non-existent. When new volumes or file systems are created, the backup administrator must manually create a backup configuration that includes specifying client schedules, file filters, media sets, retention and rotation schedules, etc. Backup administrators must manually eject multiple tapes from jukeboxes every day and send them by truck off-site for disaster recovery protection and safekeeping. No virtualization here.
For companies that have invested in cold-site disaster recovery schemes, a disaster recovery administrator must manage two or more sites of similar or identical equipment and keep them up-to-date with each other for years, even decades. The likelihood that the disaster recovery systems in a cold site are properly prepared for disaster over this time span is extremely low. For warm or hot-site failover, more complex replication systems and networks need to be in place to support fast failover. Replication of current data doesn’t obviate the need for regular nightly and weekend tape backups, so IT organizations have to do both.
When servers run out of available storage capacity, the SAN administrator and a system administrator must come to the rescue to immediately allocate more storage to the system, whatever the time of day. Alternatively, the archive administrator would have to get involved with determining which collection of files would be considered stale so that these could be written to tapes and deleted from the server, thereby freeing up space for new data.
Archive administrators occasionally archive data that is live and very active by mistake, causing live applications to fail without access to their data. Archive tapes have to be shipped to off-site storage, just like backup tapes. HSM systems and software are costly and complex to manage. And when HSM is combined with magnetic tape as a lower-cost tier of storage, it creates data accessibility/availability exposure because of tape’s inherently lower reliability and poor access times, especially when tapes are stored off-site.
What happens when backup, archive, or HSM tapes are lost or stolen? Most tapes are written in standard formats that encourage readability among multiple backup and archiving applications, making it even easier to gain unauthorized access to data with almost any application.
In the data management software space, backup, replication, archiving, and HSM packages are often unaware of each other’s tape sets and replicas, which leads to inefficiencies in IT operations time, as well as capital equipment costs related to as much as 15x over-replication of the same files. This hasn’t changed in 50 years.
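To make the over-replication point concrete, here is a minimal sketch of how redundant copies could be counted if the separate catalogs were compared by content. The catalog layout, the application names, and the sample file contents are all hypothetical and chosen only for illustration; they do not reproduce the 15x figure above.

```python
import hashlib
from collections import Counter


def copies_per_content(catalogs):
    """Count how many stored copies exist for each unique file content.

    `catalogs` is a hypothetical structure: application name -> list of
    file contents (bytes) that the application has written to its own media.
    """
    counts = Counter()
    for contents in catalogs.values():
        for data in contents:
            # Identical content yields an identical digest, whichever
            # application stored it.
            counts[hashlib.sha256(data).hexdigest()] += 1
    return counts


# Illustrative data only: three independent packages, unaware of each other.
catalogs = {
    "backup": [b"payroll-2005", b"payroll-2005", b"design-spec"],
    "replication": [b"payroll-2005", b"design-spec"],
    "archive": [b"payroll-2005"],
}

counts = copies_per_content(catalogs)
stored_copies = sum(counts.values())   # every copy held on any medium
unique_files = len(counts)             # distinct file contents
replication_ratio = stored_copies / unique_files

print(stored_copies, unique_files, replication_ratio)
```

Because each package keeps its own tape sets and replicas, none of them can see that the same content is already protected elsewhere; a shared view of content is what makes the redundancy visible.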
So, have we achieved the goals of a virtualized data storage environment yet? With the technology that’s been available to date, we’re not even close!
Virtual Tour of the Ideal Storage System
Let’s stop for a moment and just visualize the elements of the ideal storage system:
Imagine a system that provides primary storage to your clients and applications. In a virtual environment, clients and applications would be completely unaware of the specific storage system that gives them storage capacity – and the location of their data could transparently change from time to time to react to I/O performance bottlenecks, hardware failures, and even site disasters!
Imagine a system where all of your data is continually and transparently protected, and its history perfectly maintained over months, years, even decades, both locally and at an off-site location – all without operator intervention; tapes and trucks; independent backup servers, backup software licenses, tape drives, tape jukeboxes; and stacks and stacks of tapes.
Imagine a system where complete recovery from a site disaster could be initiated from anywhere with access to a Web browser. Within minutes of a complete site disaster, all clients and applications regain access to their critical data from live systems at a surviving site.
Imagine a system where clients and applications never get an “out of disk space” error message. In addition, the system automatically knows which files are inactive, and it would automatically migrate inactive files from a more costly, high-performance disk storage tier to a more cost-effective disk storage tier. All data, regardless of its location among virtual tiers of disk storage, would remain transparently accessible to all clients and applications.
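The automatic migration of inactive files described above can be sketched as a simple policy over a file catalog. This is only an illustration under stated assumptions: the `FileRecord` type, the tier names "fast" and "capacity", and the 90-day inactivity threshold are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    path: str
    last_access: datetime
    tier: str  # hypothetical tier names: "fast" or "capacity"


def plan_migrations(files, now, inactive_after=timedelta(days=90)):
    """Return paths on the fast tier that have been idle for at least
    `inactive_after` (an assumed threshold) and could move to the cheaper tier."""
    return [
        f.path
        for f in files
        if f.tier == "fast" and now - f.last_access >= inactive_after
    ]


# Illustrative catalog only.
now = datetime(2005, 9, 12)
catalog = [
    FileRecord("/projects/q3-report.doc", now - timedelta(days=2), "fast"),
    FileRecord("/projects/2003-audit.xls", now - timedelta(days=400), "fast"),
    FileRecord("/archive/old-logs.tar", now - timedelta(days=900), "capacity"),
]

to_migrate = plan_migrations(catalog, now)
print(to_migrate)
```

In the system the article envisions, the path that clients see would stay the same after such a move; only the underlying disk tier would change.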
About Dave Therrien
Dave Therrien is the founder and CTO of ExaGrid Systems (www.exagrid.com). He is also the author of 'Self-Protecting Storage - Simplifying Your Data Storage Infrastructure' (http://www.exagrid.com/pdfs/Self-Protecting_Storage.pdf).