A Three Page Order Hitting De-Duplication & Litigation Holds

September 28, 2009

From the plains of Kansas comes another short and powerful order by Magistrate Judge David Waxse.  On Constitution Day, Judge Waxse was very busy in White v. Graceland College Ctr. for Prof’l Dev. & Lifelong Learning, Inc., 2009 U.S. Dist. LEXIS 85849 (D. Kan. Sept. 18, 2009).  For an earlier discussion of this case, please see The Express Way to Your Hard Drive and What Happens When the Requesting Party Does Not State a Form of Production?.

The Court ordered the parties to complete the following discovery by October 1, 2009:

1)       Defendants’ production of de-duplicated data.  The Court required the Defendants to “provide their computer expert and technician used for the de-duplication process to explain and describe this process, including the technique and criteria used by the process, to Plaintiff’s expert.” White, 1-2.

2)       A meet and confer between experts on a search protocol for PSTs to find three email messages. White, 2.

3)       A search of the PST data to be performed with both experts present. White, 2.

4)       Defendants were to perform an immediate privilege review. White, 2.

5)       Non-privileged search results were to be produced to Plaintiff’s counsel immediately. White, 2.

6)       Defense production of the litigation hold notice issued at the beginning of the case.  White, 2-3.

De-Duplication: Separating the Clones from the Twins

If you have never heard of de-duplication, or have heard of it but have been afraid to ask what it is, it is defined as “the process of separating duplicative email messages, word processing documents, and other computer files from your electronic file collection.”  Michael R. Arkfeld, Arkfeld on Electronic Evidence, Glossary, G-5 (2nd. Ed.).

As one can imagine, de-duplication requires technology to reduce multiple identical digital copies (imagine a true DNA clone) down to one.  Now imagine electronically stored information such as an Excel file that has been emailed from a Defendant to a Plaintiff.  The emailed file may not be an exact copy (more like a twin), because its metadata can differ, such as who last opened the file, or because other small changes make it a different file.  That is something you might not want to “de-dupe.”

I can imagine the Defendants’ expert explaining they used a specific software tool and “de-duped” by metadata fields or MD5 Hash Values or some other defensible process.  It will be interesting to see what is next reported in this case. 
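To make the clone-versus-twin distinction concrete, here is a minimal Python sketch of de-duplication by MD5 hash value. It is an illustration only: the order does not say what tool or criteria the Defendants actually used, and the file names and contents below are hypothetical. Files whose bytes are identical produce the same hash and collapse to one; a file whose bytes differ in any way produces a different hash and survives as its own item.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def md5_of(content: bytes) -> str:
    """Return the MD5 hash value of a file's raw bytes as a hex string."""
    return hashlib.md5(content).hexdigest()


def dedupe(files: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group file names by the hash of their content.

    Each key is one hash value; the list holds every file sharing it.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, content in files.items():
        groups[md5_of(content)].append(name)
    return dict(groups)


# Hypothetical collection: two byte-for-byte clones and one "twin"
collection = {
    "budget.xls": b"Q3 budget v1",
    "copy_of_budget.xls": b"Q3 budget v1",
    "budget_edited.xls": b"Q3 budget v1 (saved again by Plaintiff)",
}

groups = dedupe(collection)

# Keep the first file in each hash group as the surviving copy
unique = [names[0] for names in groups.values()]
print(unique)
```

Hashing only catches exact copies. Whether the "twins" should also be collapsed, for example by comparing selected metadata fields instead, is a judgment call that the experts would need to explain and defend.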

Litigation Hold Notice Production

A court generally requires production of a litigation hold letter if there have been issues of spoliation (See, The Holding Pattern: Lessons Learned on Litigation Holds).  In this case, the Court required the production of the litigation hold notices, and it was agreed that such production would not be a waiver of the attorney-client privilege or work product doctrine.  White, 3.

Tools that generate and track litigation hold notices are now commercially available and will make such productions easier in future litigation.


The Plumbing of a Motion to Compel

June 15, 2009

In a product defect case about brass plumbing fittings, the Defendants fought a motion to compel electronically stored information relevant to class certification, because of undue burden and cost.  They lost…in large part. In re Zurn Pex Plumbing Prods. Liab. Litig., 2009 U.S. Dist. LEXIS 47636, 1 (D. Minn. June 5, 2009).

Paper Clogging the Drain

At the beginning of the litigation, the Court required the parties to focus only on hard copy documents, out of concern that electronically stored information would drive up costs.  In re Zurn Pex Plumbing Prods. Liab. Litig., 2.  The Court stated:

“ESI may prove to be relevant to the first stage of discovery, we cannot meaningfully make that prediction now, and require the parties to engage in what could be vastly more expensive, and yet utterly futile, discovery.” In re Zurn Pex Plumbing Prods. Liab. Litig., 2.

 

The Court further stated on electronically stored information:

“[S]hould the parties uncover voids in the information disclosed in hard copy form, they are . . . at liberty to press for further discovery including electronically stored information.” In re Zurn Pex Plumbing Prods. Liab. Litig., 2.

The Defendants’ Electronically Stored Information

Deposition testimony revealed the Defendants’ electronically stored information included employee email folders and files on the network system.  Additionally, the ESI of former employees was archived on DVD.  There was also a separate network drive that contained un-segregated folders from over 600 employees.  In re Zurn Pex Plumbing Prods. Liab. Litig., 2-3.

Enter the Litigation Hold and Search Terms

The Defendants produced a list of individuals who were issued a litigation hold. In re Zurn Pex Plumbing Prods. Liab. Litig., 3.

The Plaintiffs in turn requested searches across the different databases with 26 key words.

The Defendants opposed the request as being “not necessary for class certification and that the request was overly broad and would be extremely costly.”  In re Zurn Pex Plumbing Prods. Liab. Litig., 3. 

The Defendants’ Arguments Going Down the Drain

The Defendants argued that 1) the first Court Order did not allow for any ESI productions and 2) producing ESI would be unduly burdensome and costly. 

The first Court order did NOT foreclose electronic discovery.  In re Zurn Pex Plumbing Prods. Liab. Litig., 4.  The Court stated the first Order allowed for electronic discovery if there were “voids in the information disclosed in hard copy form.” Id.  The Court noted that email messages showed possible gaps in the paper documents regarding fitting failures and aggressive water.  Id.

The purpose of civil discovery is to allow parties to “obtain the factual information needed to prepare a case for trial.” In re Zurn Pex Plumbing Prods. Liab. Litig., 5.  Despite discovery being limited to class certification, the Court stated there was a presumption in favor of production of the Defendants’ electronically stored information.  In re Zurn Pex Plumbing Prods. Liab. Litig., 4-5.

Cost Counts

The Defendants were able to stop a complete rout with their cost arguments. 

The Defendants had approximately 61 gigabytes of data, which would have totaled around 27 million pages if printed.  The Defendants estimated searching all of the databases to cost around $1,150,000 and take seventeen weeks to perform.  That did not include collection and processing. In re Zurn Pex Plumbing Prods. Liab. Litig., 6-7.

The Court Order

The Court did not buy the Defendants’ undue burden arguments, especially considering the claim was made by a lawyer who was not an expert on searches. In re Zurn Pex Plumbing Prods. Liab. Litig., 6.

The Court did try to limit false hits by reducing the search terms to 14 specific terms.  The Defendants’ searches were limited to the network drives, custodian emails and the DVDs of the former employees.  In re Zurn Pex Plumbing Prods. Liab. Litig., 6-7.

Bow Tie Thoughts:

The parties would have been better served addressing electronically stored information at the beginning of the lawsuit, instead of delaying it for over a year.  Both sides would have been in a better position to judge class certification or settlement options if they knew the big picture.  Moreover, the Defendants had a company of over 600 people.  Using email would be a fact of life.

The collection, processing, data reduction and review of 61GB would not be cheap, but $1,150,000.00 for the searches alone over 17 weeks sounds inflated. 

The perceived cost might be inflated by thinking of electronically stored information as the same as reviewing 27 million pieces of paper.  ESI can be reduced by targeted collections and by pre-discovery tools that cull the data set, which is then fully searchable in litigation support software.  With paper review, lawyers, contract attorneys and paralegals are digging through boxes of paper only searchable by the human eye. One can see where a seventeen week estimate comes from with such a brute force search.

Processing costs, which include data reduction to cull down electronically stored information, are approximately $600 to $1000 a gigabyte, depending on the vendor’s services.  Attorney review in litigation support software is reduced by having less ESI to review, such as by excluding email newsletters and spam and by focusing on specific date ranges.  With that said, estimating the cost to search 61GB to be over a million dollars, an estimate that did NOT include collection and processing, seems high on the facts presented.
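For a sense of scale, the figures above can be run through some back-of-the-envelope arithmetic. The short Python sketch below uses only numbers already stated in this post; applying the per-gigabyte processing rate to the full 61GB is an assumption made for comparison, not a quote from any vendor or from the opinion.

```python
DATA_GB = 61                    # volume reported in the opinion
PAGES_IF_PRINTED = 27_000_000   # page estimate reported in the opinion
SEARCH_ESTIMATE = 1_150_000     # Defendants' search-only estimate, in dollars
PROCESSING_LOW = 600            # per-GB processing range cited above
PROCESSING_HIGH = 1000

# Assumption: the per-GB rate is applied to every gigabyte, with no culling first
processing_low_total = DATA_GB * PROCESSING_LOW      # 36,600
processing_high_total = DATA_GB * PROCESSING_HIGH    # 61,000

# What the Defendants' search estimate works out to per gigabyte
search_cost_per_gb = SEARCH_ESTIMATE / DATA_GB

# How many printed pages each gigabyte represents under the page estimate
pages_per_gb = PAGES_IF_PRINTED / DATA_GB

print(f"Processing 61 GB: ${processing_low_total:,} to ${processing_high_total:,}")
print(f"Search estimate per GB: ${search_cost_per_gb:,.0f}")
print(f"Printed pages per GB: {pages_per_gb:,.0f}")
```

On these numbers, the search-only estimate comes to roughly $18,852 per gigabyte, many times the $600 to $1000 per gigabyte processing range.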


The e-Discovery Chase: Strategies to Reduce Electronically Stored Information in Discovery Requests

December 30, 2008

All of our tools and toys that generate electronically stored information can frustrate law firms with high e-Discovery costs.  Lawyers who choose to have their discovery productions printed can have nearly 10,000 times more paper than 10 years ago.[1]

It is difficult to visualize how “big” ESI can be. WIRED magazine recently gave examples of data size most people can comprehend[2]:

1 Terabyte: a Hard Drive with 260,000 songs

20 Terabytes: All the photos uploaded to Facebook each month

120 Terabytes: All the data from the Hubble Space Telescope

530 Terabytes: All the videos on YouTube

1 Petabyte: Data processed by Google’s servers every 72 minutes

12 Exabytes: All human produced information[3]

The data explosion can drive up e-Discovery costs for collection, processing and review to rival that of fielding an America’s Cup team.  And like the America’s Cup, if only billionaires can afford access to Federal Courts, then justice is not being served.

Danger of Costs from Overly Broad Requests for Production

Given the sheer volume of ESI, discovery requests need to be focused to avoid opening a Pandora’s Box of data.  For example, a third party request for production demanded the following:

The content of any and all electronic files, e-mail messages (with attachments), Instant Message communications and/or other communication created any time between August 20, 2001 to July 20, 2007 and maintained by Yahoo! related to account holder Jacqueline Hone’s subscription with Yahoo!, Yahoo! mail and/or Yahoo! Messenger.[4]

Even if the court had allowed production of this electronically stored information, how is it a victory to review SIX YEARS worth of email and instant messages?  The cost to collect, process and review the information would make the venture questionable.

Lawyers need to employ strategies to reduce the volume of ESI to control costs.  These strategies include requesting what you need, utilizing trusted vendors for targeted collection, and using technology for pre-processing analysis, effective processing and intelligent review.

The Discovery Request

There is no discovery production to review without first a discovery request.  Lawyers should avoid discovery requests such as, “All email to or from the Plaintiff from 1999 to 2008.”  Such requests are rarely granted when challenged because they are overbroad, unduly burdensome, and often amount to a mass fishing expedition.[5] Moreover, getting 800,000 email messages from one individual would be a Pyrrhic victory considering review costs.

Lawyers should consider timeframes, form of production, specific individuals and other factors to create a narrowly tailored request. This may reduce motion practice and help control production and review costs.

A party responding to such an overly broad request should object and force the demanding party to focus their requests for “all email” or “all information related to the website.”  The petabyte explosion of ESI has not eliminated the discovery requirement that requests be narrowly tailored and reasonably calculated.

Targeted Collection

Information defensibly collected off hard drives should be focused on relevant or responsive electronically stored information.  If representing a contractor in a construction defect case, targeting collection to the housing project at issue will collect less information to review than a mass copying of hard drives.  However, care should be taken not to under-collect ESI, which creates a risk of re-collection if something is missed.  Moreover, this should not undercut any preservation duties.  One might have to mirror image a hard drive to preserve all the data if the facts call for it, but what is collected for review should be focused.

Pre-Processing Analysis

Software tools have been developed for “non-linear” review for “pre-processing.”[6] What this translates into is looking at email messages before they are processed[7] or prepared for a review system[8].

“Pre-Processing” allows a look at email strings and a determination of what needs to be processed for linear review in a product like CT Summation iBlaze or Lexis Nexis Concordance.  This stage can eliminate email messages that are spam or newsletters based on their domains, such as “ebay.com” or “nytimes.com” to reduce what is ultimately loaded into a review platform.
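As a rough illustration of domain-based culling, the Python sketch below drops messages whose sender domain is on an exclusion list. It is a simplified sketch under assumptions: the message records and the helper names `sender_domain` and `is_excluded` are hypothetical, and commercial pre-processing tools will work differently under the hood.

```python
from email.utils import parseaddr

# Domains to cull before processing (the two examples come from the text above)
EXCLUDED_DOMAINS = {"ebay.com", "nytimes.com"}


def sender_domain(from_header: str) -> str:
    """Extract the lower-cased domain from a From header value."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def is_excluded(from_header: str) -> bool:
    """True if the sender's domain is an excluded domain or a subdomain of one."""
    domain = sender_domain(from_header)
    return any(domain == d or domain.endswith("." + d) for d in EXCLUDED_DOMAINS)


# Hypothetical messages pulled from a custodian's mailbox
messages = [
    {"from": "eBay <deals@ebay.com>", "subject": "Daily deals"},
    {"from": "Jane Doe <jane@example.com>", "subject": "Fitting failure report"},
    {"from": "news@email.nytimes.com", "subject": "Morning briefing"},
]

# Only messages from non-excluded domains move on to processing
kept = [m for m in messages if not is_excluded(m["from"])]
print([m["subject"] for m in kept])
```

Matching subdomains as well as the bare domain matters because newsletters are often sent from addresses like "email.nytimes.com" rather than "nytimes.com" itself.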

Effective Processing

Processing of electronically stored information is the extraction of metadata and full text into a format readable in a review platform.  ESI processing software can handle in excess of 500 to 700 native file formats, and the number is growing.

Processing can be “brute force” and literally just include everything that was collected.  However, just because preservation duties may require the mirror image of a hard drive, that does not require an entire hard drive be processed for production.  Processing can be focused with key words, such as names of parties or witnesses, date ranges and other criteria to narrow the information for review.  Such “smart processing” can reduce costs by narrowing what is processed, which in turn shortens review time.
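The idea of focusing processing by key words and date ranges can be sketched in a few lines of Python. This is a simplified illustration, not how any particular processing product works: the `Document` record, the `in_scope` helper, the sample documents and the search terms are all hypothetical, and the keyword test is a plain case-insensitive substring match.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Document:
    doc_id: str
    sent: date
    text: str


def in_scope(doc: Document, keywords: List[str], start: date, end: date) -> bool:
    """Keep a document only if it falls in the date range and mentions a keyword."""
    if not (start <= doc.sent <= end):
        return False
    text = doc.text.lower()
    return any(k.lower() in text for k in keywords)


# Hypothetical collected documents
docs = [
    Document("DOC001", date(2007, 3, 1), "Brass fitting failed at the Smith house."),
    Document("DOC002", date(2003, 5, 9), "Brass fitting order confirmation."),
    Document("DOC003", date(2007, 6, 2), "Lunch on Friday?"),
]

# Hypothetical criteria: party name or key term, within the relevant period
to_process = [
    d.doc_id
    for d in docs
    if in_scope(d, ["fitting", "Smith"], date(2006, 1, 1), date(2008, 12, 31))
]
print(to_process)
```

Here only DOC001 passes both tests: DOC002 mentions a key word but falls outside the date range, and DOC003 is in range but mentions no key word.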

Making the Most of Review: Not just a box of paper

Reviewing ESI is not the same war room experience as digging through boxes of paper.  Conference rooms do not need to be commandeered for months with contract attorneys plowing through fields of paper for the smoking gun document.  Technology can help reduce such backbreaking work.

Law firms can avoid the haphazard approach to review by assigning reviewers “review sets.”  A “review set” is a saved database search, which can be based on a DOCID range, date range and keyword(s).  The litigation team generally knows some of the basic terms and dates relevant to the lawsuit, and searching on these terms helps focus review.  Review sets can also be built for specific individuals, document types or almost any term appearing in the database.

Review tools allow for issue coding, though the features vary by product.  Coding for causes of action, specifically by elements of causes of action, allows for thoughtful review and case preparation.  A reviewer can also consider admissibility issues during the coding stage, identifying any problems before trial.

These are just a few strategies for review.  There are many others.  The main point is not to treat fully searchable electronically stored information as a box of paper.  Searching and organizing based on key words can help focus review on what is relevant.

Don’t Drown in Petabytes: Search to Reduce ESI

All of our cool iPhones, laptops and Wiis are engines of electronically stored information.  Technology has created an excessive amount of ESI in litigation, but technology can also solve the problem by reducing ESI so law firms can focus on practicing law, not drowning in data.


[1] John Bringardner, WIRED, “Winning the Lawsuit,” page 112, July 2008

[2] “The Petabyte Age,” WIRED, July 2008, pp 106-107

[3] http://en.wikipedia.org/wiki/Exabyte

[4] Hone v. Presidente U.S.A., Inc., 2008 U.S. Dist. LEXIS 55722 (N.D. Cal. July 21, 2008)

[5] See, Hone v. Presidente U.S.A., Inc., 2008 U.S. Dist. LEXIS 55722 (N.D. Cal. July 21, 2008), Quinby v. WestLB AG, 2006 WL 59521, 1 (S.D.N.Y. Jan. 11, 2006), and Thompson v. Jiffy Lube International, Inc., 2006 WL 1174040, 3 (May 1, 2006).

[6] Companies such as Metalincs make products for this purpose.

[7] Processing products include IPRO’s eCapture, Needle Finder, LAW Pre-Discovery from Lexis or CT Summation Discovery Cracker, to name a few.

[8] Review platforms include hosted solutions such as Concordance FYI and CT Summation CaseVantage, and desktop solutions such as CT Summation iBlaze and Concordance.