De-mystifying Machine Learning

I was a little surprised to see a post in a respected tech publication just the other day about how unfathomable machine learning is, and how unknown its impact is going to be. Agreed, machine learning is still unfamiliar to many people, and its potential is enormous. But maybe I can help demystify it a little by sharing some of my own experience applying machine learning in a real-life situation.

I really dug into machine learning a few years back working on a marketing campaign concerning the use of analytics during the discovery phase of lawsuits. I got hands-on by downloading the somewhat-famous Enron emails, which I popped into a MySQL database server, and did a little poking around in them using Tableau. But what really helped me understand the power of machine learning was studying emerging e-discovery technology, culminating in a conversation with data scientist and entrepreneur Nicolas Croce (see the interview here).
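For anyone who wants to replicate that kind of hands-on poking around, here is a minimal sketch of the load-and-query step, using SQLite and Python's standard library rather than the MySQL and Tableau setup described above. The raw messages below are invented stand-ins for files from the Enron corpus.

```python
import email
import sqlite3
from email import policy

# Invented stand-ins for raw message files from the Enron corpus.
raw_messages = [
    "From: alice@enron.com\nTo: bob@enron.com\nSubject: Q3 forecast\n\nNumbers attached.",
    "From: alice@enron.com\nTo: carol@enron.com\nSubject: Draft\n\nSee the draft.",
    "From: bob@enron.com\nTo: alice@enron.com\nSubject: Re: Q3 forecast\n\nLooks good.",
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emails (sender TEXT, recipient TEXT, subject TEXT, body TEXT)"
)

# Parse each raw message and store its headers and body.
for raw in raw_messages:
    msg = email.message_from_string(raw, policy=policy.default)
    conn.execute(
        "INSERT INTO emails VALUES (?, ?, ?, ?)",
        (str(msg["From"]), str(msg["To"]), str(msg["Subject"]), msg.get_content()),
    )

# "Poking around": who sends the most mail?
top_senders = conn.execute(
    "SELECT sender, COUNT(*) FROM emails GROUP BY sender ORDER BY 2 DESC"
).fetchall()
print(top_senders)
```

Even this toy version surfaces the kind of question (who emails whom, about what, how often) that analytics tools answer at scale.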


Before I share what I learned, first some background for those who aren’t already familiar with what the legal profession calls “discovery”. Discovery is the process by which lawyers are permitted to obtain evidence, including documents and electronic records, from their opponents. This is permitted under civil and criminal law so that the lawyers for both sides can assemble evidence that courts need to make good decisions. In a major legal action discovery can involve literally millions of documents and equivalent types of records (images, emails, database entries, etc.). Both sides must review these documents to identify which are important and why.

Continue reading “De-mystifying Machine Learning”

Even the biggest law firms mess up—that’s why we need contract automation

One of the innovators that the ABA journal has profiled as part of its “Legal Rebels” project is contract drafting guru Ken Adams. Ken is author of A Manual of Style for Contract Drafting, a book that has become a standard reference work, and he’s a leading advocate of standardizing and automating contract drafting.

The ABA Journal profile includes the following video that shows Ken deconstructing two extracts from the merger agreement used in Oracle’s proposed acquisition of Sun Microsystems. Oracle is represented by the premier law firm Latham & Watkins, but it’s clear from Ken’s analysis that they, like other law firms, draft contracts that are embarrassingly problematic from the standpoint of clarity, efficiency, and risk.

The amount of fat and gristle that Ken trims from just a single page of the contract, which we watch him do in an onscreen markup towards the middle of the video (at minute 4:52 of the 8 1/2 minute video), is simply startling.

Interestingly, Ken clearly doesn’t expect all lawyers to suddenly become better drafters. Instead, he says in the video that a combination of standardized language and automation can transform contract drafting into an inexpensive commodity task. At least in theory, lawyers could outsource most of their drafting workload to either an in-house or (as Ken suggests) an SaaS system offering document-assembly templates for a wide range of business contracts. That would allow lawyers to quickly create high-quality first drafts, leaving them more time to concentrate on those tasks that add real value and are worth their hourly rates, namely devising strategy and assisting in negotiations.
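The document-assembly idea can be illustrated with a toy sketch: standardized clause language with fill-in fields. The clause text and field names below are invented for illustration; real contract-automation systems manage entire clause libraries with conditional logic, not a single template.

```python
from string import Template

# One standardized clause with fill-in fields. The wording and field
# names are invented for illustration.
clause = Template(
    "$party_a shall deliver the Goods to $party_b no later than "
    "$delivery_date. Risk of loss passes to $party_b upon delivery."
)

# Drafting becomes filling in the blanks rather than writing from scratch.
draft = clause.substitute(
    party_a="Seller",
    party_b="Buyer",
    delivery_date="June 30, 2026",
)
print(draft)
```

Because every deal reuses the same vetted language, quality depends on the clause library rather than on each drafter's prose, which is exactly the standardization-plus-automation combination described above.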

Reduce e-discovery cost and disruption with Early Case Assessment

In previous posts (What is Discovery?, Evolution of eDiscovery Analytics, Tape Indexing Breathes Life Into Tape Storage) I’ve talked about early case assessment or “ECA.” ECA happens when a company looks at the information that might be used as evidence in a legal dispute “early,” which is to say, within days of when it appears that there may be a dispute. This tactic may sound like common sense, but it actually runs counter to the tendency of both lawyers and business people to put things off. Traditionally, many lawyers have focused on getting their court papers together first, then doing a detailed investigation of the facts of a case only when they are required to do so by the court. This delay in getting to the bulk of the documents that might be involved hasn’t exactly met with resistance from business people who generally prefer doing the jobs they were hired to do rather than getting sidetracked by a lawsuit.

Photo credit: RBerteig's photostream


ECA has one huge advantage: getting key facts straight early may enable legal counsel either to (1) negotiate a settlement agreement or (2) ask the court to dismiss the case because no evidence exists to support it.

From a business standpoint earlier is usually better when it comes to ending lawsuits because it means lower costs and fewer distractions for the business people involved.

We’re talking tens of thousands of dollars for even the least expensive lawsuits, up to many millions of dollars for bigger ones. And besides attorneys’ fees (typically hundreds of dollars an hour per attorney, sometimes with multiple attorneys billing for months at a time) and a wide variety of expenses, lawsuits suck time away from business people who would otherwise be productively generating value for their companies. They are distracting, too: people get emotionally involved in conflicts, even more than they get sucked into sports or reality TV shows, and start thinking about the dispute instead of how to build a better widget or motivate their teams.

When I was a young associate working for a big San Francisco firm I had the privilege of receiving an informational interview from someone who at the time was one of the senior in-house counsel at a giant multi-national engineering firm. He told me that his ideal outside counsel was someone who could settle a case just weeks after it reached his desk, because cutting to the chase, settling a dispute for what it was worth in business terms, and eliminating the attorneys’ fees and distraction were what mattered to him.

Early case assessment is ideally like getting to checkmate at the very beginning of the game: you find out early whether you win or you lose, and then you settle. A little-known fact among non-attorneys (contrary to the TV shows) is that very few lawsuits get as far as a trial; most settle before trial. You show the other side why you are going to win, or you say, “hey, we’re not going to admit anything, but here’s a big check so that we can both move on.” The real question is how soon you can settle, and the answer is usually “when both sides think they have all of the facts.”

So ECA saves money to the extent that it can authoritatively establish facts – including not only “smoking guns” but the presence or absence of evidence and the quality of that evidence – which enables early, appropriate settlements. When ECA works properly the side producing documents can say with authority: these are the facts; let’s settle on this basis.

So what can a company do to improve its ability to do ECA? Use technology, of course. The good news is that these days most of what must be reviewed is electronically stored information, such as emails, spreadsheets, word-processing documents, and databases, which lends itself to automated review. And although there is a substantial upfront cost to automate the process, once in place automated document handling is faster and less expensive than manual review, and it can be more accurate.

A company that is serious about ECA will put in place two key technology components.

The first piece of technology is search, or what is sometimes called “document discovery” technology, which systematically checks or “crawls” the company’s computer systems for stored information, and categorizes it, so that searches can be run to find all relevant documents.
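As a rough illustration of what “crawl and categorize” means in practice, here is a minimal sketch that walks a directory tree and records each file's location and type so that later searches can target the right documents. The extension-based mapping is a simplifying assumption of mine; real document-discovery tools also inspect content, metadata, and custodians.

```python
import os

def crawl(root):
    """Walk a directory tree and build a simple index of path + category."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            # Toy categorization by file extension only.
            category = {
                ".eml": "email",
                ".xlsx": "spreadsheet",
                ".docx": "word-processing",
            }.get(ext, "other")
            index.append({"path": os.path.join(dirpath, name),
                          "category": category})
    return index
```

Once every document has a known location and category, a search for, say, all spreadsheets touched by a particular custodian becomes a lookup instead of a manual hunt.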

This identification and categorization process is also useful to meet a company’s duty to preserve information that might be relevant to the lawsuit once the possibility of a lawsuit is known. A court may rule against a company in a lawsuit solely because the company failed to quickly find and protect key information before it was altered or deleted in the ordinary course of business.

The second piece of technology is conceptual search or clustering which enables vast amounts of documents to be quickly analyzed with a minimum of costly human effort. It’s not “early” case assessment if hundreds or thousands of attorney hours must be put in before meaningful conclusions can be drawn about the strength of a lawsuit and the ability to settle or file for dismissal.
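To make the clustering idea concrete, here is a toy sketch that groups documents whose vocabularies overlap, so a reviewer can examine one representative per cluster instead of reading everything. Real conceptual-search engines use far richer models; the similarity measure, threshold, and sample documents below are my own inventions for illustration.

```python
def terms(doc):
    """Normalize a document into a set of lowercase terms."""
    return {w.strip(".,").lower() for w in doc.split()}

def jaccard(a, b):
    """Overlap between two term sets, 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b)

def cluster(docs, threshold=0.5):
    """Greedily group each document with the first cluster it resembles."""
    clusters = []  # each cluster is a list of document indices
    for i, doc in enumerate(docs):
        placed = False
        for c in clusters:
            if jaccard(terms(doc), terms(docs[c[0]])) >= threshold:
                c.append(i)
                placed = True
                break
        if not placed:
            clusters.append([i])
    return clusters

docs = [
    "meeting tomorrow about the merger agreement",
    "merger agreement meeting moved to tomorrow",
    "lunch order for friday",
]
print(cluster(docs))  # the two merger emails land in the same cluster
```

Even this crude grouping shows the economics: if a cluster's representative is clearly irrelevant, the whole cluster can be deprioritized without an attorney reading every member.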

Because ECA can save a great deal of time, money, and distraction, organizations expecting significant e-discovery and/or compliance obligations should prioritize search and clustering technologies within their IT roadmaps.

e-Discovery document review: what should counsel outsource?

Earlier this week I blogged about placing the locus of control for e-discovery decisions in the right hands to ensure that the decisions made pass muster in court. To illustrate the potential impact of moving the locus of control for certain decisions to an outsource partner, let’s compare the document review solutions offered by H5 and Inference Data.

Gold standard counsel or expert linguists - who should take the lead? Photo credit: jeffisageek’s photostream


Both H5 and Inference enable users to improve results and potentially save vast amounts of money by teaching sophisticated software to do document review faster and more accurately than human reviewers can. And the more the review process can be reliably automated, the more money is saved down the road, because the amount of manual review is reduced. All of this assumes that the software is trained correctly, of course, which frames a locus of control question: who’s best at training the software?
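The “teach the software” step can be sketched very roughly: reviewers label a small seed set, the system learns which terms signal responsiveness, and then it scores and ranks the unreviewed documents. Commercial predictive-coding products are far more sophisticated; the documents, labels, and scoring scheme below are invented for illustration.

```python
from collections import Counter

def train(labeled):
    """labeled: list of (text, is_responsive) pairs from human reviewers."""
    responsive, other = Counter(), Counter()
    for text, is_responsive in labeled:
        (responsive if is_responsive else other).update(text.lower().split())
    return responsive, other

def score(model, text):
    """Higher score = more terms associated with responsive documents."""
    responsive, other = model
    return sum(responsive[w] - other[w] for w in text.lower().split())

# Seed set labeled by a reviewer (invented examples).
seed = [
    ("merger price negotiation schedule", True),
    ("revised merger agreement attached", True),
    ("office holiday party signup", False),
]
model = train(seed)

# Rank unreviewed documents so likely-responsive ones surface first.
ranked = sorted(
    ["merger agreement comments", "party signup sheet"],
    key=lambda d: score(model, d),
    reverse=True,
)
print(ranked)
```

The locus-of-control question lives in the `seed` list: whoever supplies and curates those labels, whether attorneys or outside linguists, is effectively training the reviewer that handles everything else.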

Last month I attended a webinar presented by H5. One thing that struck me as distinctive about H5 is their standard deployment of a team of linguists to improve detection of responsive documents from among the thousands or millions of documents in a document review. During the webinar I submitted a question asking what it is their linguists do that attorneys can’t do themselves. One of their people was kind enough to answer, more or less saying “These guys are more expert at this query-building process than attorneys.” Ouch.

I’ve long prided myself on my search ability (ask me about the time I deployed a boolean double-negative in a Westlaw search for Puerto Rico “RICO” cases) and I’m sure many of my fellow attorneys are equally proud. However, I know people (or engineers, anyway) who are probably better at search than I am, and I know one or two otherwise blindingly brilliant attorneys who are seriously techno-lagged. More importantly, attorneys typically have a lot on their plates, and search expertise on a nitty-gritty “get the vocabulary exactly right” level is just one of a thousand equally important things on their minds, so it’s not realistically going to be a “core competency.” So I can see the wisdom in H5’s approach, although I wonder how many attorneys are willing to admit right out loud that they are better off outsourcing this competency.

The other side of the H5 coin is represented by Inference Data, which offers a tightly designed software solution which enables attorneys themselves to become the locus of control for search. For counsel with the proper training and technical aptitude, this strikes me as a killer combination, placing the locus of control — teaching the software to find the right documents — in the hands of the attorney who is the “gold standard” subject matter expert.

I can see where, depending on a number of different factors, either solution might be better. I encourage anyone facing this choice to make an informed decision about which approach leads to the best results rather than relying on their knee-jerk reaction.

e-Discovery outsourcing 101: who makes which decisions?

Because e-discovery is complex, and the penalties for screwing it up are significant, the following choice should be considered periodically by attorneys, clients and IT people involved in e-discovery: “Do we do this piece of the project with the people we have already, or do we add people to our payroll who do this, or do we bring in an outside partner to do this?” This is when the IT people reading this post will start muttering the cliché “Build or Buy?”, which means choosing between “do it ourselves” and finding a pre-packaged solution.

In a generalized “leadership” or “management” frame of mind the basic choice is: “Do, Delegate, or Dump.” I am fond of characterizing this choice as the assignment of the locus of control for decision-making, where an important consideration is who will do the best job of making the decisions once given that responsibility.

  • “Do” = Must I make a particular set of decisions myself – are those decisions an essential part of my role in the organization, and am I the one with the right information and motivation to make them?
  • “Delegate” = Can someone else do just as well, or perhaps better, at making this set of decisions, especially once making them becomes an essential part of their role?
  • “Dump” = Should we even be in the business of making these decisions at all, or can we just drop the issue off our plates somehow?

For example, one can dump having a company picnic to save money. One can’t dump bookkeeping, however, even in a very small company. But even in a very small company a leader can usually delegate or outsource primary responsibility for bookkeeping and expect to get good results while focusing on core competencies of the business such as production, customer relationships, and motivating team members.

Ultimately the choice boils down to this: Do I want to possess and maintain expertise in making certain decisions, at a certain level of granularity, as a core competency? If yes, then I must make it a core competency, which means investing the time, attention, and education it takes to do it right. If no, then I should bring in someone else who has that core competency and who is invested in doing it right.

In e-discovery, answering the question of what can be outsourced — or where to place the locus of control for decision-making — gets even more interesting since courts hold attorneys personally responsible not only for delivering high-quality document production results but for understanding and directing the process by which results are achieved. So the question becomes: Will attorneys generate better document production results when they personally control more of the process (for example, by personally, hands on the keyboard, deriving and executing search methodology)? Or, will they generate better results by collaborating more with outsourced experts, directing and supervising but delegating more of the hands-on decisions?

More than a few attorneys reading this might find that the choice is not as cut and dried as they think. In my next post I’ll explore this choice by applying the core competency / locus of control standard to competing document review automation solutions from Inference Data and H5.

IT crapshoot: cost-cutting is costly in disaster recovery, archiving for e-discovery

Disaster recovery and archiving are key zones of interaction for IT and Legal Departments. When a lawsuit is filed and an e-discovery production request is received, a company must examine all of its electronically stored information to find documents that are relevant to that lawsuit. Court battles may arise regarding the comprehensiveness of the examination, the need to lock down potentially important documents and metadata, and the cost of identifying, collecting, preserving, and reviewing documents — all of which are related to the way in which data is stored.

Photo credit: Josie Hill


With this in mind, I recently sought out Jishnu Mitra, President of Stratogent, a specialized application hosting and disaster recovery services provider, to obtain his perspective on disaster recovery best practices and the relationship between disaster recovery and e-discovery. Key points he made include:

  • effective disaster recovery sites are “hot” sites that can be used for secondary purposes rather than remaining idle;
  • “cold” sites are unlikely to get the job done and are not cheap;
  • efforts to keep IT budgets down by delaying or limiting disaster recovery, or by limiting archiving, can backfire;
  • budget-conscious IT departments are more likely to use archiving features built into their software of choice;
  • many IT and Legal personnel have a habit of being disrespectful towards one another and doing a poor job of communicating with one another;
  • more crossover Legal-IT people are needed.

Bruce: Can you provide a little background about Stratogent’s domain expertise?

Jishnu: We offer end-to-end application hosting services, including establishing the hosting requirements and architecture, hardware and software implementation, and proactive day-to-day application management, including responding to any issues that arise. Most of the time we are tasked with building a full data center – not the building itself, of course, but a complete software and hardware hosting framework. We aren’t providers of any specific business application (as salesforce.com is). We design, deploy, and operate all the layers on which modern business applications are hosted, including the application’s framework (e.g., .NET, Java, or SAP Basis).

Our customers include multi-office companies that require applications shared between offices, and web-based SaaS (“software as a service”) companies. The scope is typically quite complex – we don’t build or manage general web sites or blogs; that’s a commodity market and too crowded. We build and manage custom application infrastructures for enterprises, or for complex applications that require a range of IT skills to manage. Our customers hire us because they don’t want to budget for all of the people they would need to do this internally, or because they are deploying a new application that is beyond the current reach of their IT team. For example, if a company wants to start using a new-to-them ERP (“Enterprise Resource Planning”) application like SAP, or (say) a Microsoft-based enterprise landscape that needs to scale, we can multiplex our internal pool of talent to give their application 24-7 attention far more cheaply than the company could hire and retain the specialized employees needed to do it themselves.

Bruce: So you supply the specialized competencies needed to build and operate complex application environments so that your customers can focus on their core competencies? Then their core competencies don’t need to include what you do in order for them to succeed.

Jishnu: Yes. They know what they want, they conceptualize what they want, but not the hardware they need and the infrastructure software. We can go in from the very beginning saying, “Here’s how you set up a highly available, clustered server farm for your social networking app,” and so on and so forth. We know how to customize it and set it up. They also don’t have our expertise in negotiating with hardware vendors, or in capacity planning, etc. Plus there’s the build phase, loading OSes, etc. We essentially give them over the course of our engagement the entire hosting framework on which the app runs and then take care of it for the long run.

Once we get their hosting framework to a steady state, they get to run with it for two, five, or more years with no or limited failure. So their role is conceptualizing on day 1, and then we become a partner organization worrying about how to realize that dream, handling inevitable IT break-fix issues, and managing changes over the entire life span of that system. Disaster recovery usually becomes part of that framework at some point.

Bruce: Can you give me some broad idea of the scope of disaster recovery work that you do?

Jishnu: Disaster recovery is not a separate arm of our business. It’s very integral to the hosting services we provide. We build disaster recovery sites at different levels of complexity. It can go from a small customer up to a really large customer. And over time Stratogent gets into innovative approaches to deal with disaster recovery. The philosophy of Stratogent is that we’re not trying to sell a boxed solution to all the customers. It’s more of a custom solution, not a mass market product. We say we will architect and host your solution – and as architect we always add very specific elements for our customer, not just one solution for everyone.

The basic approach, even for small customers, is to choose a convenient and correct location for the disaster recovery site and use a replication strategy based on whatever they can afford or have tolerance to accept. As much as possible a disaster recovery site should be up and running and ready to go at a flip of the switch. They can use the excess capacity at their disaster recovery site at quarter end to run financial reports or for other business purposes, plus it can be used for application QA and staging systems. They can be smart about it, and keep it on, so that they can have confidence in it.

Of course a disaster recovery solution like this can’t be built in just a month or two – to do it right requires creativity and diligence. In one recent instance, when asked to do it “right now,” we had to go with a large vendor’s standard disaster recovery solution for our customer. Everyone knows that this does not get us anything beyond the checkmark for DR, so the plan is to move to a Stratogent solution over time, build a hot alternate site on the East Coast, and sunset the large vendor’s standard disaster recovery arrangement.

Bruce: Given the importance of disaster recovery for a number of reasons, how seriously are companies taking it?

Jishnu: Everybody needs it, but it suffers from “high priority, low criticality”, and the problem rolls from budget year to budget year. Some unpleasant trigger like an outage, or an impending audit instigates furious activity in this direction, but then it goes on to the back burner again. In the recent instance, although disaster recovery was scheduled for a later phase for technical reasons, for SOX compliance the auditor demanded a disaster recovery solution by year end or our customer would fail their audit. So we went out and obtained a large vendor’s standard disaster recovery solution, which met the auditors’ requirements but isn’t comparable to a “hot” disaster recovery site.

The way disaster recovery solutions from some of the large vendors work is this: they have huge data centers where their customers can use equipment should a disaster happen. Customers pay a monthly fee for this privilege. When a disaster strikes, customers ship their backup tapes out there, fly their people out there, and start building a disaster recovery system from scratch. And by the way, if you have trouble, here’s the menu for emergency support services for which they will charge you more. In 95 out of 100 cases it just doesn’t work; it’s a monumental failure when you need it most. These are “cold” sites that have to be built from the ground up. It takes maybe 72 hours to get them up, or rather, for them to be asserted as “up.” Then, as someone like yourself with application development experience knows, it takes weeks to debug and get everything working correctly. And when you’re not actually using them, standard disaster recovery services charge you an incredibly high amount of money for nothing except the option of bringing your people and tapes to their center, and then good luck.

Bruce: You mentioned running quarterly financials, QA, and staging as valuable uses for the excess capacity of “hot” disaster recovery sites. Could this excess capacity also be used for running e-discovery processes when the company is responding to a document production request?

Jishnu: Possibly, but I haven’t seen it done yet in a comprehensive manner. The problem is you still need to have the storage capacity for e-discovery somewhere. The e-discovery stuff is a significant chunk of storage, maybe tier 2 or 3, which demands different storage anyway, so it makes sense to keep the e-discovery data in the primary data center because it’s easier and faster to copy, etc. That said, it is very useful to employ the capacity available in the secondary site for e-discovery support activities like restoring data to an alternate instance of your application and for running large queries without affecting the live production systems.

Bruce: Do you deploy disaster recovery solutions that protect desktop drive, laptop drive, or shared drive data?

Jishnu: As I have said, our disaster recovery solutions are part of whatever application frameworks we are hosting. We as a company don’t get into the desktop environment, the local LANs that the companies have. We leave that to local teams or whatever partner does classic managed services. We do data centers and hosted frameworks. We don’t have the expertise or organizational structure to have people traveling to local sites, answering desktop-related user queries, etc. But any time it leaves our customer’s office and goes to the internet, from the edge of the office on out it’s ours.

Bruce: But when archiving is part of the customer’s platform hosted by you, it gets incorporated in your disaster recovery solution?

Jishnu: Yes.

Bruce: Is Stratogent involved when your customers must respond to e-discovery and regulatory compliance information retrieval requests?

Jishnu: Yes. For example, we recently went through and did what needed to be done when a particular customer asked for all the documents in response to a lawsuit. We brought in a consultant for that specific archiving system as well. Our administrators collaborated with the consultant and two people from the customer’s IT department. It took a couple of weeks to provide all the documents they asked for.

Bruce: Was the system designed from the outset with minimizing e-discovery costs in mind?

Jishnu: Unfortunately no. In this case archiving for e-discovery was an afterthought, grafted onto the application later, and a push-button experience wasn’t among the criteria when this particular system was designed. But it woke us up. We realized this could get worse.

Bruce: So how do you do it differently now that you’ve had this experience?

Jishnu: Here we recommended that our customer upgrade to the newest version of the archiving solution and begin using untapped features that allow for a more push-button approach. Keep in mind that e-discovery products weren’t as popular or sophisticated then as they are now.

Bruce: Aren’t there third-party archiving solutions also?

Jishnu: There are several third-party products, and you see the regular enterprise software vendors coming out with add-ons. We’re especially looking forward to the next version of Exchange from Microsoft, where for us the salient feature is archiving and retention, simply because email is the number one retrieval request. On most existing setups, getting the information for a lawsuit or another purpose takes us through an antiquated process of restoring mailboxes from tape and loads of manual labor. It’s pretty painful: it takes an inordinate amount of time to find specific emails, it’s not online, it takes days. For this reason we’re looking forward to Exchange 2010, which has these features built INTO the product itself. Yes, some other vendors have add-on products that do this also.

Bruce: And I assume you’re familiar with Mimosa, in the case of Exchange?

Jishnu: Like Mimosa, yes. But when it’s built in, the customer is more likely to use it. By default customers don’t buy add-ons, for budgetary reasons. It’s so much easier if the central product has what we need, and that is in fact happening a lot these days. I won’t be surprised if products in general evolve so that compliance and regulatory features are considered integral parts of the software and not someone else’s problem.

Bruce: Do you have other examples of document retrieval from backups or archives?

Jishnu: Actually there are three scenarios where we do document retrieval. Scenario one, which we discussed, is e-discovery. Scenario two is retrieval requests in mergers and acquisitions, where we had to pretty much get information from all sorts of systems, a huge pain.

Scenario three is SaaS-driven. For many of our customers, the bulk of their systems are either on-premises or hosted by Stratogent, but some of our customers use SalesForce.com or one of many small or industry-specific SaaS vertical solutions. In one recent case, one of these niche vertical SaaS vendors, because of some of the issues in that industry, was about to go out of business. We had to go into emergency mode and create an on-premises mirror, actually more like a graveyard for the data, to keep it for the future and enable us to fetch the data from that service. We figured out how to get all of the customers’ data, replicate it, and keep it continuously up to date in our data center. Fortunately the vendors were cooperative and allowed access through their back door to let us achieve this. I call this “the SaaS fallback” scenario. SaaS is a great way to quickly get started on a new application, but BOY, if anything happens, or if you decide you aren’t happy, it becomes a data migration nightmare, worse than an on-premises solution, because you have no idea how the data is being kept and have to figure out how to retrieve it through an API or some other means.

Bruce: In e-discovery and other legal-driven document recovery scenarios, how important is collaboration between IT and Legal personnel, or should I say, how significant a problem is the lack of this collaboration?

Jishnu: I’ve seen the divide between IT and legal quite often. Calling it a divide is actually being polite; at worst both parties seem to think the others are clueless or morons. It’s a huge, huge gap. And I have also seen it playing out not just in traditional IT outfits, but also product based companies when I was principal architect at Borland. When attorneys came to talk to engineering about IP issues, open source contracts or even patent issues, there was no realization among the techies that it was important. In fact legal issues were labeled “blockers” and the entire legal department was “the business prevention department”. And there is exactly the opposite feeling in the other camp with how engineering leaders don’t “get it” and how talking to anybody in development or IT was like talking to a wall. The psychological and cultural issues between IT and legal have been there for a while. In some of the companies that have surmounted this issue, the key seems to be having a bridge person or team acting as an interpreter to communicate and keep both sides sane. Some technical folks I know have moved on to play a distinctly legal role in their organizations and they play a pivotal role in closing the gap between legal and IT.

Catch-22 for e-discovery standards?

Lawyers are waiting for judges… who are waiting for lawyers. What’s a client to do?

E-discovery law is always going to be out of step with information technology because of the way in which the law develops.

photo credit: Drew Vigal / CC BY-NC 2.0

The legal system is very rules oriented, including many purely procedural rules. This is so partly because the legal system is an adversarial process, in which disputes between opposing parties are resolved by a neutral third party judge. The legal system attempts to create a fair process by giving everyone advance notice of the rules that will be observed, then sticking to those rules. Because the process is considered so important, a great deal of time and money may be spent during a lawsuit to resolve disputes over whether one party or the other has broken the procedural rules. Serious breaches of purely procedural rules by one party can lead a judge to award victory to their opponent for no other reason than failing to follow procedure. A judge may also find that party’s attorneys personally liable for their participation in the rule-breaking.

Legal procedural rules are established by legislatures, like the US Congress and state assemblies; by court administrative bodies, which produce rules like the Federal Rules of Civil Procedure and (initially) the Federal Rules of Evidence; and by judges, by way of the decisions (also known as rulings or opinions) they write to explain how they resolved a dispute between parties to a lawsuit.

Recommendations concerning legal procedures may be made by expert committees, which for e-discovery typically means The Sedona Conference and TREC Legal Track. But although judges regularly mention legal experts’ recommendations in their opinions – such mentions are typically known as “dicta” – both the recommendations and dicta lack the authority of law.

In an ideal world, clearly defined, efficient, and flexible rules would be developed and followed for identifying and disclosing electronic data – like a well-developed API (Application Programming Interface) in the software realm, or a useful IEEE standard, which is akin to what TREC may ultimately develop. Unfortunately, as with many APIs and even internationally recognized standards, the reality in law is quite a bit messier than the ideal.

Legislators and judges are not experts in the realm of electronically stored information. Nor do they wish to be; nor should their role be expanded to require such expertise. Furthermore, judges tend to be reactive rather than proactive. Not only do they have very full agendas already, but the fundamental character of their role is to wait until they are presented with a dispute that clearly and narrowly addresses a particular issue before they make a ruling that takes a position on that issue. This is known as “judicial restraint”.

Two new influences are incrementally changing the legal rules concerning e-discovery. The first influence is the ripple effect from recent changes to the Federal Rules of Civil Procedure and Evidence, which clarify how electronically stored information is to be reviewed and disclosed. Attorneys and judges must attempt to follow these new rules, and judges are now deciding e-discovery cases that interpret these rules for the first time in various situations. The second influence is the rising prominence of two prestigious but unofficial advisory committees, the aforementioned Sedona Conference and TREC Legal Track, which have researched and published detailed findings and recommendations concerning e-discovery best practices. These recommendations are increasingly being cited by judges in their rulings.

Notwithstanding the commendable efforts of those who have been working skillfully and hard to improve e-discovery standards, those standards remain quite broad and subject to a wide range of interpretations. As a result, a wide range of technologies and processes are thriving in the e-discovery realm. And the standards that exist at present do not offer the means by which to compare most of the popular technologies and processes currently in use.

Two change agents are likely to sharpen e-discovery standards and narrow the field of e-discovery technologies and processes.

The first change agent is the quality control movement, which is being promoted by vendors such as H5 and Inference and recommended by both TREC and Sedona (and which I’ve written a little about here). Their thinking is that e-discovery, like any other mass-scale process – and here we’re talking about reviewing documents on the scale of tens of thousands to millions – must adhere to well-defined quality standards, just as manufacturers that produce thousands or more consumer product components must.

The second and most important change agent will be groundbreaking legal rulings by judges. When judges eventually rule that new standards must be observed, then lawyers and their clients will follow (for the most part). Until then, lawyers are almost certain to stick to the status quo. In fact, when I discuss this topic with e-discovery solution vendors, I always hear that their attorney customers aren’t interested in anything that the courts haven’t already accepted.

But therein lies the Catch-22: Judges are waiting for lawyers to present an issue, while the lawyers are waiting for the judges to rule so they don’t have to present the issue. Because of judicial restraint, judges only rule on issues that have been unambiguously presented to them in the course of a lawsuit. But until the courts have ruled that certain technical standards are required, lawyers won’t advise their clients to rely on those standards. And until a lawyer’s client relies on a standard in a situation that puts that standard at issue in a lawsuit, judges can’t rule that the standard is legally valid. So round and round it goes.
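For readers on the software side of the IT–legal divide, this Catch-22 is exactly what programmers call a circular dependency, or deadlock: every actor is waiting on another, so nobody moves first. A minimal sketch (the actor names and wait-for edges are my own illustrative simplification, not anything from the rules themselves):

```python
# Illustrative only: the e-discovery standards stalemate modeled as a
# wait-for graph. Each actor acts only after the one it points to acts.
waits_on = {
    "judges": "lawyers",   # judges rule only on issues lawyers present
    "lawyers": "clients",  # lawyers advise only what clients will risk
    "clients": "judges",   # clients rely only on court-approved standards
}

def find_cycle(graph, start):
    """Follow wait-for edges from `start`; return the cycle if one exists."""
    seen, node = [], start
    while node not in seen:
        seen.append(node)
        node = graph.get(node)
        if node is None:
            return None  # the chain ends: someone can act, no deadlock
    return seen[seen.index(node):] + [node]

print(find_cycle(waits_on, "judges"))
# → ['judges', 'lawyers', 'clients', 'judges'] — a closed loop; nobody moves
```

Breaking the loop means deleting one edge, which is precisely what the log jam breakers discussed below do: rule amendments let judges act without waiting for lawyers, and a “test case” client acts without waiting for the courts.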

The log jam breakers will be 1) further amendments to the procedural rules, based no doubt on the recommendations of TREC and Sedona and the vendors and lawyers that participate in those efforts, and possibly 2) a “civil rights campaign” approach. The latter is a scenario like the one we saw in the 1950s Supreme Court school desegregation decisions, in which clients stepped forward at some personal risk to offer their personal circumstances as “test cases.” By adopting and sticking to a certain principle, even though it created a controversy, they could bring about court rulings that effectively changed the law.

In a corporate setting the latter approach may only be possible with C-level and board support because of the risk involved. Opportunities would have to be identified and pursued in which it appeared that the company could save more in the long run by relying on e-discovery innovations, such as quality measures and automation, than it risked in the short run by going to court. In addition, the innovations would have to be fundamentally sound, albeit untested in court, while controversial enough that an opposing party was unwilling to accept them without a court challenge. (Ironically, whenever a client adopts an e-discovery innovation that doesn’t lead to controversy, it results in no judge’s ruling and thus no legal precedent for other clients’ attorneys to follow.)

Both of these log jam breakers are bound to move slowly. This means years of small, incremental change before definitive technology standards result. Of course, the complexity of information management continues to increase, and the cost of e-discovery is bound to rise along with it. For these reasons, the more each of us does, in our individual and representative roles, to support standards that lead to increased efficiency and reduced cost, the better off we’ll all be.