Tape Indexing Breathes Life Into Tape Storage

Here’s an observation that can be tagged “mixed blessings”: foot dragging on the part of techno-lagging attorneys has shielded (and in some cases continues to shield) their clients from the full potential weight of eDiscovery requests. For example, even after years of discussion, the legal profession didn’t formally recognize the obligation to produce metadata in response to discovery requests before the Federal Rules of Civil Procedure amendments adopted at the end of 2006. More outrageously, some attorneys are still gaming to avoid eDiscovery all together, as Magistrate Judge John M. Facciola (U.S. District Court, Washington D.C.) pointed out in his keynote presentation at LegalTech earlier this year.

Only a few years ago certain courts had ruled that data stored on tape could be considered “inaccessible” because it was so expensive to review it, and thus data stored on tape did not always need to be reviewed when answering an eDiscovery request (for example, the Zublake decisions). More recently, however, the legal profession is becoming aware of advances which make tapes faster and cheaper to review, like technology for rapid disaster recovery.

What IT person doesn't look forward to working with historic data?
What IT person doesn't look forward to working with historical data?

There are still a number of fine distinctions being made in this area of law, and the specific tape handling practices of different companies can render their tapes more or less “accessible.” (Ironically, companies that archive backup tapes indefinitely, which sounds like a safe practice, may be exposing themselves to a greater burden in eDiscovery, not to mention the extra cost of storing outdated tapes.) But broadly speaking, few companies storing information on tape can categorically rely on “inaccessibility” to rule out the risk of being required to review their tapes during eDiscovery any more. For more about the law concerning inaccessibility, including California’s burden-shifting rules, I recommend this article by Winston & Strawn attorneys David M. Hickey and Veronica Harris.

Fortunately, two prongs of innovation are shrinking the issues surrounding eDiscovery and tape. The first prong, which happens to be the subject of this blog post, comes in the form of new tape indexing and document retrieval technology. The second solution, which involves substituting hard drives in place of tape, will be the subject of a future post.

To learn more about the current state of eDiscovery technology in the realm of tape, I recently spoke with Jim McGann, Vice President of Marketing at Index Engines. Index Engines’ solution comes in the form of an appliance (a hardware box pre-loaded with their software) that scans a broad variety of tapes and catalogs the content. The appliance indexes tape data and de-duplicates documents within the index using the hash values of the documents. At this point users can cull (selectively retrieve) potentially responsive documents from a batch of ingested tapes without first performing an expensive, resource-intensive full restoration of each tape. And because Index Engines can ingest all of the common tape storage formats, users don’t need to run or even possess the original software used to write to the tapes.

From a longer-term strategic perspective, Index Engines’ users can approach their tape stores incrementally, taking a first pass through their tapes in response to a particular discovery request, then add to their global tape index as new discovery requests are fielded. They can embark upon a proactive tape indexing campaign that will give them enhanced early case assessment capabilities. Users may also opt to extract important data that is not immediately needed but resides on old or degraded tape.

For companies with thousands or tens of thousands of tapes, indexing can allow significant numbers of tapes to be discarded since many individual tapes typically contain data which is almost entirely repeated on other tapes or has lasted past the end of its retention period – not to mention the corrupted or blank tapes which are being carefully stored nonetheless.

All of this makes Index Engines an extremely affordable (at least by Enterprise standards) alternative to restoring and reviewing tapes individually.

I asked Jim McGann whether Index Engines resembled dentists who teach patients good dental hygiene and, if successful, will wind up putting themselves out of a job. If Index Engines’ appliances succeed in indexing, de-duplicating, and extracting all of the stored tape in existence, while ever more affordable hard drive storage replaces tape storage, won’t the company be out of a job?

Jim pointed out that, for certain organizations which currently rely on tape storage, substituting hard drives for tape drives is simply not a viable option. Costs associated with re-routing system data and human work flows, as well as the risk of downtime during a transition, mean that many organizations won’t switch even after disk drives become less expensive. And Index Engines takes away much of the cost incentive for switching that would otherwise be driven by eDiscovery and compliance requirements. Finally, Jim says, Index Engines can be used to index almost all of the information customers have, not just tape data, which enables users to find non-tape information that must be reviewed for eDiscovery.

The other approach to the problem of tape storage (to be explored in more detail in a future blog post) involves near-line hard drive solutions. Leading hard drive storage vendors such as Isilon Systems claim that their “near line” solution is priced nearly as low as tape while offering higher performance and reliability. But advocates of tape, including the Boston-area based Clipper Group (in a whitepaper offered on tape drive vendor SpectraLogic’s web site) claim that the total costs of ownership of disk storage, taking into account factors such as floor space requirements and electricity, is still many, many times higher than tape.

So, as tapes look like they will be around for some time to come, companies with tapes will continue to need technologies like Index Engines’. And most will not be able to avoid discovery of tapes for much longer, if they are even still able to do so, thanks in part to the availability of these technologies.

PREVIOUS POST: The Evolution of eDiscovery Analytics Models, Part II: A Conversation with Nicholas Croce

3 Replies to “Tape Indexing Breathes Life Into Tape Storage”

  1. As you point out technical innovations are shrinking the technical impediments surrounding eDiscovery and backup tapes. Solutions from Index Engines, Renew Data and others have literally out-innovated the law. This is happening because enterprises are seeking and purchasing solutions than can help shrink the smoldering pile of risk known as their backup tapes. This is happening because technologists constantly strive to see if and how something can be done while attorneys and the law tend to be more reactive by nature. These technologies have left a wake in the safe harbor and with more companies backing up to disk I expect that the issue of backup in E-Discovery will continue to be hotly debated.

    Tapes won’t go away, but the ability to claim them as inaccessible has all but disappeared.
    Within the Enterprise Vault customer community we have seen interest in archiving and expiring backup tapes. These customers may have 500, 5,000 or more legacy backup tapes on hold for litigation. However, they recognize that when using backup tapes as a hold mechanism they are retaining more information than necessary which can increase risk and cost. To address this problem customers have started loading backup tapes into the archive. Once the tapes have been ingested into the historical vault they are indexed, future proofed, hashed and single instanced. Customers can then run a discovery search for data that is relevant to the legal hold. The responsive data will be placed under hold while the remaining data is then expired or assigned a retention/destruction policy. Archiving technology is as much about retention as it is disposition. This allows the customer to cull down a complete backup tape to the specific elements required for litigation.

    In the past backup was the only option for discovery in many cases. Today backup is for recovery while archiving is for discovery. Together these technologies should work together to prevent you from having to consider going to backup tapes for E-Discovery.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s