Excel-ing at eDiscovery (Guest Post on Everlaw)

Many litigation support professionals think Excel files are the bane of discovery.

Attorneys fixated on having a Bates stamp on electronically stored information demand that Excel files be converted from their native application to a static image, causing one Excel file to explode into a 500-page PDF.

To learn more about a recent case with an Excel file that would not open, and advantages of reviewing Excel files in native file format, check out my guest post on Everlaw.

Even a Judge Questioned Why Ask for Permission to Use Predictive Coding

I do not normally want to high-five Federal judges, but Judge Ronald Buch, a Tax Judge in Texas, sure deserved one after his Dynamo Holdings opinion.

The discovery dispute can be summed up as a battle over backup tapes that had confidential information. The Requesting Party wanted the tapes; the Producing Party wanted to use predictive coding to produce what was relevant, because manually reviewing the material for privilege and relevancy would cost $450,000. Dynamo Holdings v. Comm’r, 2014 U.S. Tax Ct. LEXIS 40 (Docket Nos. 2685-11, 8393-12. Filed September 17, 2014.)

The Requesting Party wanted the backup tapes to analyze metadata on when ESI was created. Moreover, the Requesting Party called “Predictive Coding” an “unproven technology.” The Requesting Party attempted to address the Producing Party’s cost concern with a clawback agreement. Dynamo Holdings, at *3.

After an evidentiary hearing with experts on the use of predictive coding, the Court granted the Producing Party’s motion to use predictive coding. Judge Buch had a “dynamo” quote on the entire issue of asking to use predictive coding:

“And although it is a proper role of the Court to supervise the discovery process and intervene when it is abused by the parties, the Court is not normally in the business of dictating to parties the process that they should use when responding to discovery. If our focus were on paper discovery, we would not (for example) be dictating to a party the manner in which it should review documents for responsiveness or privilege, such as whether that review should be done by a paralegal, a junior attorney, or a senior attorney. Yet that is, in essence, what the parties are asking the Court to consider–whether document review should be done by humans or with the assistance of computers. Respondent fears an incomplete response to his discovery. If respondent believes that the ultimate discovery response is incomplete and can support that belief, he can file another motion to compel at that time. Nonetheless, because we have not previously addressed the issue of computer-assisted review tools, we will address it here.”

Dynamo Holdings, at *10-11.

It is so refreshing to see a Judge address the issue of requesting to use a specific technology. No one files a motion asking for permission on which lawyers should do document review. Moreover, no moving party asks permission to use visual analytics, de-duplication, or any of the other outstanding technology available to conduct eDiscovery.

The opinion ends by noting that if the Requesting Party believed the discovery response was incomplete, a motion to compel could be filed, which is exactly the way the process should work. The issue should not be “can we use this technology,” but whether the production is adequate, which requires evidence of production gaps or other evidence that not all responsive information was produced.

Well done, Judge Buch.

 

Don’t Stipulate to Not Follow the Form of Production Rules

Here is my advice: NEVER agree to a stipulation to produce native files when “it is more practical to do so” and agree to productions in PAPER, PDF’s, or TIFFs. Melian Labs v. Triology LLC, 2014 U.S. Dist. LEXIS 124343 (N.D. Cal. Sept. 4, 2014).

That is what happened in Melian Labs v. Triology LLC. It reads like a personal Sum of All Fears for anyone who has spent years working with ESI, because the Court denied motions to compel email and spreadsheets in native files with metadata, because of the parties’ Rule 26(f) stipulation.

The Requesting Party challenged the Producing Party’s email production, because the production was “7 large PDF image documents, which each appear to be a compilation of ESI improperly collected and produced” and the collection was not forensically sound. Melian Labs, at *4.

The Producing Party claimed it did not need to forensically collect discovery. Further, the Producing Party claimed the emails were printouts directly from Gmail or Microsoft. Id.

The Court stated that the Requesting Party was complaining about the form of production and not that the production was incomplete. Because the parties had agreed to a form of production of paper, PDF’s, or TIFFs, the parties’ stipulation controlled instead of Fed. R. Civ. P. 34(b)(2)(E). Melian Labs, at *4-5.

The Court stated the following on the stipulation and the discovery dispute:

Instead, it states that ESI may be produced in paper, PDF or TIFF. That producing the documents in a searchable format would ease Triology’s review does not render Melian’s production deficient. Triology fails to articulate why metadata is important to emails, when every email should contain the information sought on the face of the document. To the extent that emails have this information cut off or it is not apparent from the face of the email (i.e. as may be the case with BCC), Triology is entitled to the complete email with the agreed upon metadata, and Melian must provide it upon request.

Melian Labs, at *5 (Emphasis added).

Based on the above, the Court DENIED the moving party’s request to compel the production of all emails in a searchable or native format. Id.

Things get equally frustrating for spreadsheets.

The Producing Party admitted that some of the spreadsheets were difficult to read and produced those in native file format upon request. Melian Labs, at *5-6. However, the Producing Party stood its ground that it had agreed to produce native files “when it is more practicable to do so.” Id.

The Court held again that the joint stipulation controlled and that the Producing Party could NOT be compelled to produce spreadsheets in native file format. Melian Labs, at *6 (emphasis added). However, the Court did end with a subtle suggestion the Producing Party would produce spreadsheets that were “easily readable without seeking court intervention” in the event of any disputes over the readability/legibility of spreadsheets. Id. 

Bow Tie Thoughts

I hate stipulations like this one. Producing in native file format is the only practical option when it comes to ESI. If I were a Federal Judge, I would summarily deny any Rule 26(f) stipulation that called for the production of ESI as paper, PDFs, or TIFFs, on the grounds that the agreement would violate Federal Rule of Civil Procedure 1 by unnecessarily driving up costs. The fact the parties ended up shackled to a bad form of production agreement should serve as a warning label to never let this happen again.

I worked on a case where over 2,000 documents were produced as 8 non-searchable PDF’s. To say it was a mess is an understatement. We were able to correct the production gamesmanship with a useable production, but it made document review very difficult until it was corrected.

There is value in native files, because they provide objective coding for databases and expedite review, both of which further the goals of Rule 1. Moreover, there is amazing review technology that empowers lawyers to identify communication patterns, key players, and other useful information. Printing Gmail as PDF’s dramatically undercuts the ability to make use of these tools.
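To make “objective coding” concrete, here is a minimal sketch of the fields a review database can read straight out of a native email. The message, names, and addresses are invented for illustration, and the `objective_coding` helper is my own hypothetical function, not anything from the case or from a particular review platform. It uses only Python’s standard `email` package.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical raw message; the people and addresses are invented.
RAW_EMAIL = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Bcc: Carol Example <carol@example.com>
Date: Thu, 04 Sep 2014 09:30:00 -0700
Subject: Q3 spreadsheet

See attached.
"""

def objective_coding(raw: str) -> dict:
    """Pull sender, recipients, BCC, sent date, and subject from a native email."""
    msg = message_from_string(raw)

    def addrs(field: str) -> list:
        # getaddresses handles "Name <addr>" pairs and multiple recipients.
        return [addr for _name, addr in getaddresses(msg.get_all(field, []))]

    return {
        "from": addrs("From"),
        "to": addrs("To"),
        "bcc": addrs("Bcc"),
        "sent": parsedate_to_datetime(msg["Date"]),
        "subject": msg["Subject"],
    }

coded = objective_coding(RAW_EMAIL)
for field, value in coded.items():
    print(f"{field}: {value}")
```

A printout of the same message would typically not show the BCC line at all, which is the very gap the Court acknowledged in Melian Labs.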

That being said, the Requesting Party could have very forceful deposition questions on what was done to preserve ESI, the steps to identify responsive discovery, and what methodology was employed to ensure substantive and embedded metadata was not destroyed after the duty to preserve was triggered. Additional questions could be asked to identify formulas in printed Excel spreadsheets. This might force the production of un-produced native file spreadsheets.

Discovery must be collected in a defensible manner. Many people call this process “forensically sound.” This might not mean making a mirror image of every computer; a targeted collection can be enough. Targeted collection tools can also be used in a defensible manner, with many great technology options available. That being said, I would question the adequacy of a production where the collection process was printing email as PDF’s. It might be justified in a small case, but in Federal Court with high stakes, I would strongly encourage having a mirror image made of the subject computers.

Nebraska, Where Proportionality is Alive and Well in Discovery

One lesson from United States v. Univ. of Neb. at Kearney is that maybe you should take depositions of key parties and use interrogatories to find out relevant information to your case before asking for over 40,000 records that contain the personal information of third parties unrelated to a lawsuit.

The case is a Fair Housing Act suit involving claims that students were prohibited or hindered from having “emotional assistance animals in university housing when such animals were needed to accommodate the requesting students’ mental disabilities.” United States v. Univ. of Neb. at Kearney, 2014 U.S. Dist. LEXIS 118073, 2 (D. Neb. Aug. 25, 2014).

A protracted battle over the scope of discovery broke out between the parties. The Defendants argued the search, retrieval, and review for responsive discovery was too expansive and would have been unduly burdensome. Kearney, at *5-6. As the Government’s search requests included “document* w/25 policy,” you can see the Defendant’s point on having broad hits to search terms. Kearney, at *20.

The Government’s revised search terms would have 51,131 record hits, which would have cost $155,574 for the Defendants to retrieve, review, and produce the responsive ESI. Kearney, at *5-6. This would have been on top of the $122,006 already spent for processing the Government’s requests for production. Kearney, at *7.

The Court noted that the Government’s search terms would have required production of ESI for every person with a disability, whether they were students or contractors. Kearney, at *6-7. The Government argued the information was necessary, and justified, in order to show discriminatory intent by the Defendants. Id.

The Defendants wanted the scope of the discovery requests narrowed to the “housing” or “residential” content, which would have resulted in 10,997 responsive records. Kearney, at *7.
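The proportionality math is worth doing by hand. Below is a back-of-the-envelope sketch using only the figures from the opinion as summarized above. The one assumption, and it is mine rather than the Court’s, is that review cost scales linearly with the number of records, so the estimate for the narrowed set is illustrative only.

```python
# Figures reported in the opinion, as summarized above.
HITS_BROAD = 51_131       # records hit by the Government's revised terms
COST_BROAD = 155_574      # dollars to retrieve, review, and produce those hits
ALREADY_SPENT = 122_006   # dollars already spent processing the requests
HITS_NARROW = 10_997      # records under the Defendants' narrowed scope

cost_per_record = COST_BROAD / HITS_BROAD
total_if_broad = ALREADY_SPENT + COST_BROAD

# ASSUMPTION (not in the opinion): cost scales linearly with record count.
est_cost_narrow = cost_per_record * HITS_NARROW

print(f"Cost per record (broad terms): ${cost_per_record:,.2f}")
print(f"Total spend if broad terms are ordered: ${total_if_broad:,}")
print(f"Rough estimate for the narrowed set: ${est_cost_narrow:,.0f}")
```

On these numbers the broad search would push total spending past a quarter of a million dollars, which is the kind of figure a Court wants compared against the amount at issue.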

The Government did not want to limit the scope of discovery and recommended producing all the ESI subject to a clawback agreement [notice not a protective order] for the Government to search the ESI. The Defendants argued such an agreement would violate the Family Educational Rights and Privacy Act by disclosing students’ personally identifiable information without their notice and consent. Kearney, at *8.

Motion practice followed with the Defendants requesting cost shifting to the Government for conducting searches, the use of predictive coding software, and review hosting fees. Kearney, at *8-9.

The Court ordered the parties to answer specific discovery questions, which the Government did not answer, on “information comparing the cost of its proposed document retrieval method and amount at issue in the case, any cost/benefit analysis of the discovery methods proposed, or a statement of who should bear those costs.” Kearney, at *9.

The Court was not keen on the Government outright searching the personal data of others unrelated to the case. As the Court stated:

The public and the university’s student population may be understandably reluctant to request accommodations or voice their concerns about disparate or discriminatory treatment if, by doing so, their private files can be scoured through by the federal government for a wholly unrelated case. The government’s reach cannot extend that far under the auspices of civil discovery; at least not without first affording all nonparties impacted with an opportunity to consent or object to disclosure of information from or related to their files.

Kearney, at *18-19.

The Court stated it would not order the production of over 51,000 files with a clawback order. Moreover, the cost to review all of the ESI exceeded the value of the request. Kearney, at *19.

The Court did not accept the Government’s claim that it needed to conduct an expansive search. Kearney, at *19-20. The Court stated the following on the fundamentals of civil discovery:

Searching for ESI is only one discovery tool. It should not be deemed a replacement for interrogatories, production requests, requests for admissions and depositions, and it should not be ordered solely as a method to confirm the opposing party’s discovery is complete. For example, the government proposes search terms such as “document* w/25 policy.” The broadly used words “document” and “policy” will no doubt retrieve documents the government wants to see, along with thousands of documents that have no bearing on this case. And to what end? Through other discovery means, the government has already received copies of UNK’s policies for the claims at issue.

Kearney, at *20.
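To see why a term like “document* w/25 policy” sweeps so broadly, here is a small sketch of one common reading of that syntax: any word beginning with “document” that falls within 25 words of “policy.” Search platforms differ in how they implement wildcards and proximity, so treat the `proximity_hit` function as a hypothetical illustration rather than any vendor’s actual behavior. The sample sentences are invented.

```python
import re

def proximity_hit(text: str, prefix: str, term: str, window: int = 25) -> bool:
    """True if a word starting with `prefix` is within `window` words of `term`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    prefix_positions = [i for i, w in enumerate(words) if w.startswith(prefix)]
    term_positions = [i for i, w in enumerate(words) if w == term]
    return any(abs(i - j) <= window
               for i in prefix_positions for j in term_positions)

# Invented sample records: only the first has anything to do with housing.
samples = [
    "The attached documents describe the assistance animal policy for residence halls.",
    "Please document any complaints about the new parking policy by Friday.",
    "Lunch is at noon in the student union.",
]

hits = [proximity_hit(s, "document", "policy") for s in samples]
print(hits)
```

The parking memo hits just as readily as the housing record, which is the Court’s point about retrieving thousands of documents with no bearing on the case.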

The Court further stated that “absent any evidence that the defendants hid or destroyed discovery and cannot be trusted to comply with written discovery requests, the court is convinced ESI is neither the only nor the best and most economical discovery method for, and depositions should suffice—and with far less cost and delay.” Kearney, at *21.

Bow Tie Thoughts

This case has significant privacy interests, but at its core the issue is one of proportionality. What was the cost of discovery and its benefit? In the end, the cost of expansive search terms that impacted the rights of third parties outweighed the benefit of the discovery to the case.

The fact we have amazing search technology that can search electronic information does not mean we can forget how to litigate. The use of “search terms” cannot swallow the actual claims of a case.

It is heartening to see a Court say no to the data of unrelated third parties being enveloped into a discovery production. While there are many ways to show discrimination, requesting the electronically stored information, protected by Federal and most likely state law, of third parties should give any Court pause.

Predictive coding to focus the scope of discovery, visual analytics to identify relevant information, and clustering to organize similar information are fantastic technologies to expedite review. However, the fact that the technology exists does not excuse lawyers from using requests for admissions and interrogatories, or from narrowly tailoring requests for responsive ESI.

 

Stuck in the Predictive Coding Pipeline

ExxonMobil Pipeline had a problem in discovery: their discovery responses were overdue. The requests for production were served in November 2013 and due, after one extension, in January 2014. The Plaintiffs rightly brought a motion to compel.

The Defendants had enough discovery to give most eDiscovery attorneys a migraine with a nosebleed: 16 separate lawsuits, with 165 discovery requests in one case, a total of 392 requests in all the related cases, and 83 custodians with approximately 2.7 million electronic documents. Other discovery going back to 1988 had over 63,000 paper documents that were scanned and to be searched with keywords. Additionally, there were approximately 630,000-800,000 documents that had to be reviewed for responsiveness, confidentiality, and privilege. The Defendants had produced 53,253 documents consisting of over 191,994 pages. United States v. ExxonMobil Pipeline Co., 2014 U.S. Dist. LEXIS 81607, 5-8 (E.D. Ark. June 9, 2014).

The Defendants suggested using predictive coding in light of the large volume of discovery, but the Plaintiff, the United States, did not agree with the use of predictive coding (at least since the filing of the motions). ExxonMobil Pipeline, at *6. Moreover, the parties did not seek relief from the Court on the use of predictive coding, other than to order the parties to meet and confer. ExxonMobil Pipeline, at *6-7.

The Defendants explained that, using traditional review with 50 attorneys, document review could be completed by the end of June 24 and production by the end of August 2014. ExxonMobil Pipeline, at *6.

The United States disagreed with the Defendants’ assumption of lawyers only reviewing 250 documents/files a day. Moreover, the Defendants did not raise concerns about document review when they entered an agreed-upon scheduling order in October 2013. ExxonMobil Pipeline, at *6-7.
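The fight over review speed comes down to simple arithmetic. Here is a rough sketch using the 50 attorneys, the 250 documents a day, and the 630,000 to 800,000 document range from the opinion as summarized above. The 500-a-day figure is purely hypothetical on my part, included only to show how sensitive the timeline is to the assumed rate, and the calculation ignores weekends, second-level review, and privilege logging.

```python
import math

ATTORNEYS = 50
DOCS_LOW, DOCS_HIGH = 630_000, 800_000   # review population from the opinion

def review_days(total_docs: int, attorneys: int, docs_per_attorney_per_day: int) -> int:
    """Working days needed for first-pass review, rounded up."""
    daily_capacity = attorneys * docs_per_attorney_per_day
    return math.ceil(total_docs / daily_capacity)

# 250/day is the Defendants' assumption; 500/day is a HYPOTHETICAL faster rate.
for rate in (250, 500):
    low = review_days(DOCS_LOW, ATTORNEYS, rate)
    high = review_days(DOCS_HIGH, ATTORNEYS, rate)
    print(f"{rate} docs/attorney/day: {low} to {high} working days")
```

Doubling the assumed rate roughly halves the schedule, which is exactly why these disputes turn into arguments about how fast lawyers can read.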

The Court acknowledged that the Defendants had a large volume of discovery to review. Moreover, it was unclear if the parties had agreed to a review methodology before the Court issued its order. Regardless, the Court ordered the Defendants to complete their review and production by July 10, 2014, absent good cause. ExxonMobil Pipeline, at *7-8.

Bow Tie Thoughts

Most attorneys do not think about document review strategies at the beginning of a case. They should. Discovery is the backbone of civil litigation. Unless you know the information you have to review, the strategies to maximize efficiency, and the claims or defenses you are reviewing for, document review can be a nightmare experience.

This case does not go into why the Defendants sought agreement from the Plaintiff on the use of predictive coding. I do not agree with that strategy, unless a specific review protocol was ordered at the Rule 16 conference that the producing party wanted to change.

The issue with a document production is whether or not the production is adequate. Lawyers should agree to the subject matter of the case, custodians, date ranges, and other objective information that goes to the merits of the lawsuit. When lawyers start asking each other for permission on whether they can use predictive coding, visual analytics, clustering, email threading, or any other technology, civil litigation becomes uncivil. Case in point: the Plaintiffs argued the Defendants could review more than 250 documents a day in this case. Such disputes turn into an academic fight over how much lawyers can read and analyze in a 9-hour workday. The end result of such motion practice would be a Judge ordering lawyers to read faster.

My advice is to focus on the merits and not derail the case with a fight over what review technology can be used. Fight over whether the production is adequate, not whether you can use predictive coding.

How Not to Be Progressive: Court Rejects Predictive Coding Not Agreed to By Parties

Fighting over discovery search methodology makes me think of President Richard Nixon’s resignation speech: “Always remember, there are those who hate you. And the only way to keep them from winning is to hate them right back. And then you destroy yourself.”

Attacking a party that used predictive coding to save review time and money, only to end up with a larger collection of ESI produced to the requesting party, means the requesting party now has more to review. This is the end result of Progressive Cas. Ins. Co. v. Delaney.

I have never been a fan of agreeing to the use of predictive coding in ESI Protocols. I do not believe such agreements are required at all by the Federal Rules of Civil Procedure. The issue is whether or not a production is adequate. That requires the requesting party to offer facts showing that the production is somehow deficient. That discussion does not happen without first having a production.

Parties do not have a veto power over which review application is used by a reviewing party and what features they may or may not use.  The requesting party should not attack a review methodology, unless there is a production to challenge. The entire discussion is premature and the issue is not ripe for the Court.

In the case at bar, the original dataset of 1.8 million documents was narrowed by search terms to 565,000 “hits.” This search term methodology had been agreed to in an ESI Protocol by the parties. After one month of document review, the producing party realized that it could take 6 to 8 months to manually review the narrowed dataset for responsiveness or privilege. The party unilaterally decided to use predictive coding instead. After the party told the other side about its change in technology, motion practice followed. Progressive Cas. Ins. Co. v. Delaney, 2014 U.S. Dist. LEXIS 69166.

The Requesting Party wanted the Producing Party to either 1) produce the 565,000 documents culled from the 1.8 million data set using the parties’ agreed-upon search terms, subject to a clawback for privileged documents, or 2) apply the predictive coding methodology to the entire 1.8 million record dataset. Progressive, at *12 and *15.

The Producing Party did not want to do either approach, specifically concerned that searching the entire dataset would result in a larger privilege review.

The Court noted what has been judge-made law on using technology assisted review: Courts require the producing party to provide the requesting party with “full disclosure about the technology used, the process, and the methodology, including the documents used to “train” the computer.” Progressive, at *27-28, citing Da Silva Moore, 2012 U.S. Dist. LEXIS 23550 (S.D.N.Y. Feb. 24, 2012).

The Court ordered the Producing Party to follow the originally agreed-to protocol and produce the “hit” documents to the Requesting Party within fourteen days without further review. Progressive, at *30.

The Court stated that following the Requesting Party’s protocol, the cost of review would be shifted to them. Progressive, at *31. Moreover, the Requesting Party believed they had the manpower to complete the review within one month. Id. 

The Producing Party could apply privilege filters before production and produce a privilege log. Id.

Bow Tie Thoughts

I do not encourage clients to have ESI Protocols that limit their ability to review discovery efficiently. I also strongly argue against the idea you need approval from the opposing party on what search methodology you can use to identify responsive discovery. It is the producing party’s right to use keywords, concept search, visual analytics, email threading, clustering, find similar, or any other form of technology-assisted review. That includes predictive coding. The requesting party does not get a veto power over what technology the producing party can use. The requesting party has “the burden of proving that a discovery response is inadequate.” Abt v. Jewell, 2014 U.S. Dist. LEXIS 50766, 12-14 (D.D.C. Apr. 11, 2014), citing Barnes v. D.C., 289 F.R.D. 1, 6 (D.D.C. 2012) and Equal Rights Ctr. v. Post Props., Inc., 246 F.R.D. 29, 32 (D.D.C. 2007).

You cannot prove a discovery response is inadequate if you do not have a discovery response. The entire idea of attacking the use of predictive coding is premature if there is no production to be reviewed.

The ugly wrinkle in this case is the search and production methodology mandated in the ESI Protocol agreed to by the parties. I strongly encourage parties to not agree to the actual technological steps to be used, because doing so limits the ability to conduct discovery in a cost-effective manner. These agreements are often entered into without the advice of an eDiscovery Consultant who knows how the actual technology works.

What parties should discuss and codify in an ESI Protocol is the subject matter of the lawsuit. Who are the key players? What are the relevant date ranges? What are the terms of art used by the parties? What is the data actually making up the case? Those, and many others, are the topics parties need to agree to, not whether you can use visual analytics to identify date ranges or data clustering to determine what was relevant.

The use of predictive coding has been twisted because lawyers think they need permission to use it. They do not. The burden is on the opposing side to prove a production was inadequate, not the fact the producing party used one form of technology-assisted review over another.

 

Lessons From Drafting Overly Broad Requests

Drafting discovery is an art. While painting in oils or pastels is certainly more colorful than drafting requests in Times New Roman or Arial, both require thought. And like any masterpiece, drafting a request for production can have its challenges.

A Requesting Party demanded an opposing party produce “[a]ll email and text messages sent or received on Mayo email and text messaging accounts.”

The Magistrate Judge found the request to be overly broad.

However, the Producing Party had produced responsive discovery with redactions, thus the Requesting Party challenged the overbroad ruling as unwarranted and moot. Elkharwily v. Mayo Holding Co., 2014 U.S. Dist. LEXIS 53890, 8-10 (D. Minn. Apr. 18, 2014).

The Producing Party countered that discovery produced with redactions and claims of privilege were subject to the future consideration of the Magistrate Judge. Id.

The Court stated that the Magistrate Judge had “expressly excluded from his order any determination regarding redactions or assertions of privilege, reserving consideration of those topics. As a result, notwithstanding the magistrate judge’s conclusion of overbreadth, it appears that Elkharwily has no actual objection to a ruling by the magistrate judge regarding Requests for Production 1 or 2.” Id. 

The Requesting Party also challenged an overbroad ruling for a request for production of, “[a]ll documents, notes, communications, emails and text messages relating to or to any of the claims or defenses in this action prepared, sent or received by [various entities, departments and individuals].” Elkharwily, at *9-10.

The Producing Party stated that they had produced all responsive discovery, as such, the Court found there was not an objection to the Magistrate Judge’s ruling. Elkharwily, at *10.

Bow Tie Thoughts

Reviewing the requests for production and challenging the objections make me think of two different issues: drafting a narrowly tailored request and being able to prove a production was inadequate.

As to the first issue, a request for production should include more than identifying the sender and all communications. That could be overly broad on its face, given the number of emails and texts sent in a day. Attorneys are well served to ask for specific communications between individuals, date ranges, and on specific subjects. The trick is not having the request become too narrow, but you cannot simply ask for everything, because you might actually get everything. That could make document review a nightmarish quagmire, because you actually got what you asked for.

The second issue requires a different strategy. If you think a production is inadequate, because there “should be more” emails or text messages, you have to prove it. This is not a metaphysical discussion of whether something exists or not, but you need some evidence of production gaps. This can include missing files that the requesting party has that should have been produced (such as part of an email chain), if not outright days or weeks of missing email. If there is such a production gap, a party can demonstrate by affidavit that the production is not complete.
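Finding “days or weeks of missing email” is something a reviewer can check objectively once the sent dates are in a database. Below is a minimal sketch, assuming you already have the sent date of every produced email. The `production_gaps` function, the three-day threshold, and the sample dates are all hypothetical choices of mine, and the sketch does not distinguish weekends or holidays from true gaps.

```python
from datetime import date, timedelta

def production_gaps(sent_dates, start, end, min_gap_days=3):
    """Return (first_missing_day, last_missing_day) runs with no produced email.

    Only runs of at least `min_gap_days` consecutive empty days are reported.
    """
    have = set(sent_dates)
    gaps, run_start = [], None
    day = start
    while day <= end:
        if day not in have:
            if run_start is None:
                run_start = day          # a run of empty days begins
        elif run_start is not None:
            if (day - run_start).days >= min_gap_days:
                gaps.append((run_start, day - timedelta(days=1)))
            run_start = None             # the run ended on a day with email
        day += timedelta(days=1)
    # Close out a run that extends to the end of the date range.
    if run_start is not None and (end - run_start).days + 1 >= min_gap_days:
        gaps.append((run_start, end))
    return gaps

# Invented example: email on March 3-5 and March 17-18, nothing in between.
produced = [date(2014, 3, 3), date(2014, 3, 4), date(2014, 3, 5),
            date(2014, 3, 17), date(2014, 3, 18)]
gaps = production_gaps(produced, date(2014, 3, 3), date(2014, 3, 18))
for first, last in gaps:
    print(f"No produced email from {first} through {last}")
```

A gap like this proves nothing by itself, but paired with an email chain the requesting party already holds from the missing window, it is the kind of evidence that can go into an affidavit.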