Stuck in the Predictive Coding Pipeline

ExxonMobil Pipeline had a problem in discovery: their discovery responses were overdue. The requests for production were served in November 2013 and, after one extension, were due in January 2014. The Plaintiffs rightly brought a motion to compel.

The Defendants had enough discovery to give most eDiscovery attorneys a migraine with a nosebleed: 16 separate lawsuits, with 165 discovery requests in one case, a total of 392 requests in all the related cases, and 83 custodians with approximately 2.7 million electronic documents. Other discovery going back to 1988 had over 63,000 paper documents that were scanned and to be searched with keywords. Additionally, there were approximately 630,000-800,000 documents that had to be reviewed for responsiveness, confidentiality, and privilege. The Defendants had produced 53,253 documents consisting of over 191,994 pages. United States v. ExxonMobil Pipeline Co., 2014 U.S. Dist. LEXIS 81607, 5-8 (E.D. Ark. June 9, 2014).

The Defendants suggested using predictive coding in light of the large volume of discovery, but the Plaintiff, the United States, did not agree with the use of predictive coding (at least since the filing of the motions). ExxonMobil Pipeline, at *6. Moreover, the parties did not seek relief from the Court on the use of predictive coding, other than an order that the parties meet and confer. ExxonMobil Pipeline, at *6-7.

The Defendants explained that, using traditional review with 50 attorneys, document review could be completed by the end of June 2014 and production by the end of August 2014. ExxonMobil Pipeline, at *6.

The United States disagreed with the Defendants' assumption that lawyers could review only 250 documents/files a day. Moreover, the Defendants did not raise concerns about document review when they entered an agreed-upon scheduling order in October 2013. ExxonMobil Pipeline, at *6-7.
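The fight over review speed is ultimately arithmetic. Here is a rough sketch in Python using the figures reported in the opinion (50 attorneys, 250 documents per attorney per day, and 630,000 to 800,000 documents). The function name and the 500-per-day comparison rate are illustrative assumptions, not figures from the case:

```python
import math

def review_days(total_docs: int, reviewers: int, docs_per_reviewer_per_day: int) -> int:
    """Working days needed for a linear review, rounded up to whole days."""
    daily_capacity = reviewers * docs_per_reviewer_per_day
    return math.ceil(total_docs / daily_capacity)

# Figures from the opinion: 50 attorneys at 250 documents each per day.
low_estimate = review_days(630_000, 50, 250)
high_estimate = review_days(800_000, 50, 250)
print(low_estimate, high_estimate)  # 51 64

# Hypothetical comparison: the same team at 500 documents per day (assumed rate).
print(review_days(800_000, 50, 500))  # 32
```

At the Defendants' assumed rate, the review alone runs roughly ten to thirteen working weeks, which is why the per-day number became a point of contention.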

The Court acknowledged that the Defendants had a large volume of discovery to review. Moreover, it was unclear if the parties had agreed to a review methodology before the Court issued its order. Regardless, the Court ordered the Defendants to complete their review and production by July 10, 2014, absent good cause. ExxonMobil Pipeline, at *7-8.

Bow Tie Thoughts

Most attorneys do not think about document review strategies at the beginning of a case. They should. Discovery is the backbone of civil litigation. Unless you know the information you have to review, have strategies to maximize efficiency, and are reviewing for claims or defenses, document review can be a nightmare experience.

This case does not go into why the Defendants sought agreement from the Plaintiff on the use of predictive coding. I do not agree with that strategy, unless a specific review protocol was ordered at the Rule 16 conference that the producing party wanted to change.

The issue with a document production is whether or not the production is adequate. Lawyers should agree to the subject matter of the case, custodians, date ranges, and other objective information that goes to the merits of the lawsuit. When lawyers start asking each other for permission on whether they can use predictive coding, visual analytics, clustering, email threading, or any other technology, civil litigation becomes uncivil. Case in point: the Plaintiffs argued the Defendants could review more than 250 documents a day in this case. Such disputes turn into an academic fight over how much lawyers can read and analyze in a 9-hour workday. The end result of such motion practice would be a Judge ordering lawyers to read faster.

My advice is to focus on the merits and not derail the case with a fight over what review technology can be used. Fight over whether the production is adequate, not whether you can use predictive coding.

How Not to Be Progressive: Court Rejects Predictive Coding Not Agreed to By Parties

Fighting over discovery search methodology makes me think of President Richard Nixon’s resignation speech: “Always remember, there are those who hate you. And the only way to keep them from winning is to hate them right back. And then you destroy yourself.”

Attack a party for using predictive coding to cut review time and cost, and the result can be a larger collection of ESI produced to the requesting party, who now has more to review. This is the end result of Progressive Cas. Ins. Co. v. Delaney.

I have never been a fan of agreeing to the use of predictive coding in ESI Protocols. I do not believe such agreements are required at all by the Federal Rules of Civil Procedure. The issue is whether or not a production is adequate. That requires the requesting party to offer facts showing that the production is somehow deficient. That discussion does not happen without first having a production.

Parties do not have a veto power over which review application is used by a reviewing party and what features they may or may not use.  The requesting party should not attack a review methodology, unless there is a production to challenge. The entire discussion is premature and the issue is not ripe for the Court.

In the case at bar, the original dataset was narrowed by search terms to 565,000 “hits” from the original 1.8 million dataset. This search term methodology had been agreed to in an ESI Protocol by the parties. After one month of document review, the producing party realized that it could take 6 to 8 months to manually review the narrowed dataset for responsiveness or privilege. The party unilaterally decided to use predictive coding instead. Moreover, after telling the other side about their change in technology, motion practice followed. Progressive Cas. Ins. Co. v. Delaney, 2014 U.S. Dist. LEXIS 69166.

The Requesting Party wanted the Producing Party to either 1) produce the 565,000 documents culled from the 1.8 million record dataset using the parties’ agreed-upon search terms, subject to a clawback for privileged documents, or 2) apply the predictive coding methodology to the entire 1.8 million record dataset. Progressive, at *12 and *15.

The Producing Party did not want to do either approach, specifically concerned that searching the entire dataset would result in a larger privilege review.

The Court noted what has been judge-made law on using technology assisted review: Courts require the producing party to provide the requesting party with “full disclosure about the technology used, the process, and the methodology, including the documents used to “train” the computer.” Progressive, at *27-28, citing Da Silva Moore 2012 U.S. Dist. LEXIS 23550 (S.D.N.Y. Feb. 24, 2012).

The Court ordered the producing party to follow the original agreed to protocol and produce the “hit” documents to the Requesting Party within fourteen days without further review. Progressive, at *30.

The Court stated that following the Requesting Party’s protocol, the cost of review would be shifted to them. Progressive, at *31. Moreover, the Requesting Party believed they had the manpower to complete the review within one month. Id. 

The Producing Party could apply privilege filters before production and produce a privilege log. Id.

Bow Tie Thoughts

I do not encourage clients to have ESI Protocols that limit their ability to review discovery efficiently. I also strongly argue against the idea you need approval from the opposing party on what search methodology you can use to identify responsive discovery. It is the producing party’s right to use keywords, concept search, visual analytics, email threading, clustering, find similar, or any other form of technology-assisted review. That includes predictive coding. The requesting party does not get a veto power over what technology the producing party can use. The requesting party has “the burden of proving that a discovery response is inadequate.” Abt v. Jewell, 2014 U.S. Dist. LEXIS 50766, 12-14 (D.D.C. Apr. 11, 2014), citing Barnes v. D.C., 289 F.R.D. 1, 6 (D.D.C. 2012) and Equal Rights Ctr. v. Post Props., Inc., 246 F.R.D. 29, 32 (D.D.C. 2007).

You cannot prove a discovery response is inadequate if you do not have a discovery response. The entire idea of attacking the use of predictive coding is premature if there is no production to be reviewed.

The ugly wrinkle in this case is the search and production methodology mandated in the ESI Protocol agreed to by the parties. I strongly encourage parties not to agree to the actual technological steps to be used, because doing so limits the ability to conduct discovery in a cost-effective manner. These agreements are often entered into without the advice of an eDiscovery Consultant who knows how the actual technology works.

What parties should discuss and codify in an ESI Protocol is the subject matter of the lawsuit. Who are the key players? What are the relevant date ranges? What are the terms of art used by the parties? What is the data actually making up the case? Those, and many others, are the topics parties need to agree to, not whether you can use visual analytics to identify date ranges or data clustering to determine what was relevant.

The use of predictive coding has been twisted because lawyers think they need permission to use it. They do not. The burden is on the opposing side to prove a production was inadequate, not merely to point out that the producing party used one form of technology-assisted review over another.

 

Lessons From Drafting Overly Broad Requests

Drafting discovery is an art. While painting in oils or pastels is certainly more colorful than drafting requests in Times New Roman or Arial, both require thought. And like any masterpiece, drafting a request for production can have its challenges.

A Requesting Party demanded an opposing party produce “[a]ll email and text messages sent or received on Mayo email and text messaging accounts.”

The Magistrate Judge found the request to be overly broad.

However, the Producing Party had produced responsive discovery with redactions, thus the Requesting Party challenged the overbroad ruling as unwarranted and moot. Elkharwily v. Mayo Holding Co., 2014 U.S. Dist. LEXIS 53890, 8-10 (D. Minn. Apr. 18, 2014).

The Producing Party countered that discovery produced with redactions and claims of privilege were subject to the future consideration of the Magistrate Judge. Id.

The Court stated that the Magistrate Judge had “expressly excluded from his order any determination regarding redactions or assertions of privilege, reserving consideration of those topics. As a result, notwithstanding the magistrate judge’s conclusion of overbreadth, it appears that Elkharwily has no actual objection to a ruling by the magistrate judge regarding Requests for Production 1 or 2.” Id. 

The Requesting Party also challenged an overbroad ruling for a request for production of, “[a]ll documents, notes, communications, emails and text messages relating to or to any of the claims or defenses in this action prepared, sent or received by [various entities, departments and individuals].” Elkharwily, at *9-10.

The Producing Party stated that they had produced all responsive discovery. As such, the Court found there was no objection to the Magistrate Judge’s ruling. Elkharwily, at *10.

Bow Tie Thoughts

Reviewing the requests for production and the challenges to the objections makes me think of two different issues: drafting a narrowly tailored request and being able to prove a production was inadequate.

As to the first issue, a request for production should include more than identifying the sender and all communications. That could be overly broad on its face, given the number of emails and texts sent in a day. Attorneys are well served to ask for specific communications between individuals, date ranges, and on specific subjects. The trick is not having the request become too narrow, but you cannot simply ask for everything, because you might actually get everything. That could make document review a nightmarish quagmire, because you actually got what you asked for.

The second issue requires a different strategy. If you think a production is inadequate, because there “should be more” emails or text messages, you have to prove it. This is not a metaphysical discussion of whether something exists or not, but you need some evidence of production gaps. This can include missing files that the requesting party has that should have been produced (such as part of an email chain), if not outright days or weeks of missing email. If there is such a production gap, a party can demonstrate by affidavit that the production is not complete.
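One way to find evidence of "days or weeks of missing email" is to look for stretches of calendar days with no produced messages at all. The sketch below is a minimal illustration, not a forensic method. The function name, the seven-day threshold, and the sample dates are all hypothetical:

```python
from datetime import date, timedelta

def find_gaps(sent_dates, min_gap_days=7):
    """Return (first_missing_day, last_missing_day) ranges with no email
    that span at least min_gap_days calendar days."""
    unique_days = sorted(set(sent_dates))
    gaps = []
    for earlier, later in zip(unique_days, unique_days[1:]):
        missing = (later - earlier).days - 1  # days strictly between the two
        if missing >= min_gap_days:
            gaps.append((earlier + timedelta(days=1), later - timedelta(days=1)))
    return gaps

# Hypothetical sent dates taken from a production's metadata.
produced = [date(2014, 1, 2), date(2014, 1, 3), date(2014, 1, 20), date(2014, 1, 21)]
for start, end in find_gaps(produced):
    print(f"No email produced from {start} through {end}")
```

A gap is not proof that anything was withheld, since a custodian may simply have been on vacation, but it gives the affidavit something concrete to point to.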

Social Media Request for Production That Got It Right

Requesting social media relevant to a lawsuit should be done as standard operating procedure now. However, some attorneys have a difficult time narrowing their requests beyond, “Produce your Facebook profile.” Such fishing expeditions are summarily denied. See, Tompkins v. Detroit Metro. Airport, 278 F.R.D. 387 (E.D. Mich. 2012), Salvato v. Miley, 2013 U.S. Dist. LEXIS 81784, 3-4 (D. Fla. 2013) and Potts v. Dollar Tree Stores, Inc., 2013 U.S. Dist. LEXIS 38795, at *6-7 (M.D. Tenn. Mar. 20, 2013).

Here is a case where the requesting party got it right. The Court ordered the producing party to respond to the following request for production:

“[A]ny notes, diaries, logs, journals, letters, electronic mail, text messages, calendars, Facebook postings, tweets, or other social media messages that relate or refer to your employment with the GDRTA, your alleged serious health condition, or your activities on days when you requested FMLA leave.”

Wilkinson v. Greater Dayton Reg’l Transit Auth., 2014 U.S. Dist. LEXIS 64522, 9 (S.D. Ohio May 9, 2014).

The Court stated that the request sought information about health conditions that gave rise to the FMLA requests. Id.

In my opinion, the request was narrowly tailored for the issues that are the subject of the lawsuit. This is the sort of thought attorneys need to exercise when requesting social media. A party cannot simply ask for the opposing side’s password to Tumblr or Instagram. A request has to be for information relevant to the lawsuit, not a fishing trip across the adverse party’s Facebook Wall for whose status updates they “liked.”

 

Native File for Cloud Email

A producing party claimed that it was “impossible” to produce Gmail in Native File Format because, “Gmail account user and Google does not permit its users to copy e-mails and documents in native format.” As such, the Defendant produced the Gmail emails by forwarding them to the Plaintiff. Sexton v. Lecavalier, 2014 U.S. Dist. LEXIS 50787, 5-8 (S.D.N.Y. Apr. 11, 2014).

The Plaintiff disagreed and countered that the emails could be produced by either 1) downloading the messages to a program like Outlook or 2) displaying the emails in their “original” format by clicking “show original” on the Gmail website and then saving them as PDF files. Sexton, at *6.

The Plaintiff’s declaration explained in detail methods for exporting email. Id.

The Court stated that even though producing “cloud” email might be difficult, that did not “absolve him of his obligation to produce documents in a reasonably useable format.” Sexton, at *6.

The Court proceeded to recount the value of metadata and production methods to prevent the degradation of valuable metadata. Sexton, at *7, citations omitted. The Court stated the following on the Defendant’s production obligations:

In the context of a request for native e-mails stored with a third-party provider, a functionally native format that preserves relevant metadata, such as those formats that can be generated through an e-mail client like Microsoft Outlook or through e-discovery collection software, should generally suffice.

Sexton, at *7-8.

Bow Tie Thoughts

Attorneys have to get used to the reality that collecting cloud email will require the use of a computer forensic expert. There are multiple ways to collect cloud email, with one method being to download the email to Outlook, followed by the email then being defensibly collected. Other applications can collect directly from the cloud. Regardless of the method used, the custodian should not do the collection. Self-collection often results in self-selection. Self-selection can end with relevant email not being produced.
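To see why forwarding is a poor substitute for a native production, consider what the original message text carries. A forwarded copy arrives with the forwarder's own sender, date, and message ID, while the original raw message keeps its own headers. The sketch below uses Python's standard email library to read header metadata from raw message text. The sample message and the function name are made up for illustration, and this is not a substitute for a defensible collection by a forensic expert:

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical raw message text, of the kind shown by a "show original" view.
RAW = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Subject: Quarterly numbers
Date: Fri, 11 Apr 2014 09:30:00 -0400
Message-ID: <abc123@example.com>

See attached.
"""

def header_metadata(raw_message: str) -> dict:
    """Pull the basic header fields a reviewer would sort and filter on."""
    msg = message_from_string(raw_message)
    return {
        "from": msg["From"],
        "to": msg["To"],
        "subject": msg["Subject"],
        "sent": parsedate_to_datetime(msg["Date"]).isoformat(),
        "message_id": msg["Message-ID"],
    }

for field, value in header_metadata(RAW).items():
    print(f"{field}: {value}")
```

These are the fields that let a review platform sort by date, thread conversations, and de-duplicate, which is the practical value of the metadata the Court was protecting.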

 

Don’t Argue Gmail is Not Reasonably Accessible

Can a producing party argue for cost-shifting to the defendant based on proportionality for the production of work-related emails on a plaintiff’s Gmail account? Owens v. Clear Wireless LLC, 2014 U.S. Dist. LEXIS 26698, 4-6 (D. Minn. Mar. 3, 2014).

The short answer is no, because the emails are not inaccessible to the producing party.

The general rule is that the cost of producing discovery falls on the producing party. Owens, at *4, citing Oppenheimer Fund, Inc. v. Sanders, 437 U.S. 340, 358, 98 S. Ct. 2380, 57 L. Ed. 2d 253 (1978).

Discovery can be limited if “the burden or expense of the proposed discovery outweighs its likely benefit, considering the needs of the case, the amount in controversy, the parties’ resources, the importance of the issues at stake in the action, and the importance of the discovery in resolving the issues.” Owens, at *5, citing Fed. R. Civ. P. 26(b)(2)(C)(iii). Cost-shifting can be imposed if the data sought by the requesting party is not reasonably accessible. Id., citing Zubulake v. UBS Warburg LLC, 216 F.R.D. 280, 283-84 (S.D.N.Y. 2003).

The Plaintiff’s emails in a Gmail account were accessible by the Plaintiff. Data must not be reasonably accessible in order for any cost-shifting to be considered. Owens, at *6.

Bow Tie Thoughts 

Email messages stored in a Cloud, whether it is Yahoo, Gmail, or some other service provider, are not “not reasonably accessible” to the producing party.

Computer forensic experts have determined multiple ways to capture cloud-based email. These range from downloading the email messages to a computer in Outlook to using applications that preserve the email directly from the cloud. The fact that a lawyer needs to retain someone to perform the work does not make the email “not reasonably accessible.”

If You Agree to Produce Excel as Native Files, Don’t Produce Tiffs

Judge Paul Cherry wrote a masterful opinion involving an EEOC motion to compel production in agreed-upon production formats. The case is an excellent example of the live issues over the form of production: understanding technology, educating the Court with expert affidavits, and the value of the meet and confer.

The EEOC requested in writing that ESI be produced as Tiffs with Concordance Load Files (strangely referenced as near-native) and spreadsheets/databases in native format. The Defendants agreed to the production formats. EEOC v. SVT, LLC, 2014 U.S. Dist. LEXIS 50114, 3-4 (N.D. Ind. Apr. 10, 2014).

The Defendant produced spreadsheets in Tiff format and employment applications as single-page, non-unitized PDF and TIFF format without a load file. SVT, LLC, at *4.

The Defendants claimed that they had produced their discovery “pursuant to industry standards.” SVT, LLC, at *4.

The EEOC took the very brilliant step of having their Litigation Support Manager explain by affidavit the issues with the Defendant’s production. The Litigation Support Manager explained that the static images of the spreadsheets were “unusable because they cannot be searched or manipulated for analysis.” SVT, LLC, at *4-5.

The Litigation Support Manager further explained how exporting the data from the Defendant’s Kronos system as spreadsheets was not unduly burdensome, as the application had the built-in functionality to run searches, reports and export the data. SVT, LLC, at *5-6.

The Litigation Support Manager further explained that the static images of policies, handbooks, and contracts were “bulk scanned” without any logical unitization or a load file. The production lacking any document breaks made the production unusable. SVT, LLC, at *5-6.
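Logical unitization is easier to picture with a small example. A load file, at its simplest, tells the review platform where each document begins and ends within a run of page images. The sketch below groups pages into documents from a document-break flag. The page identifiers, the flag, and the function name are hypothetical, and the output is a simplified stand-in rather than the actual Concordance load file format:

```python
def unitize(pages):
    """Group (page_id, starts_new_document) pairs into
    (first_page, last_page) ranges, one range per document."""
    documents = []
    for page_id, starts_new_document in pages:
        if starts_new_document or not documents:
            documents.append([page_id, page_id])  # open a new document
        else:
            documents[-1][1] = page_id  # extend the current document
    return [tuple(doc) for doc in documents]

# Hypothetical bulk scan: five page images, with breaks marked on pages 1 and 4.
pages = [
    ("DEF000001", True),
    ("DEF000002", False),
    ("DEF000003", False),
    ("DEF000004", True),
    ("DEF000005", False),
]
for first_page, last_page in unitize(pages):
    print(f"{first_page} - {last_page}")
```

Without the break information, the same five pages are just five loose images, and the reviewer cannot tell a three-page handbook excerpt from a two-page contract.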

The Court explained that the EEOC could request that the Defendants produce in specific forms pursuant to Federal Rule of Civil Procedure 34(b)(1)(C). In this case, that meant TIFFs and a load file for some ESI and native files for spreadsheets. SVT, LLC, at *8.

The Defendants had agreed to produce the ESI with a load file. That did not happen.

The Court rejected the Defendants’ argument that the third-party cloud application used by the Defendant was outside of their “control.” The Court noted testimony from Defendant’s HR representatives on how they could generate reports in Excel from the application. SVT, LLC, at *9-11.

The Defendant did produce the spreadsheets as native files during the briefing of the motion. However, the Court granted the EEOC’s motion AND further ordered the parties to conduct an in-person meet and confer. All discussions over the production had been by written communications, without an in-person meeting. Given the disputes that came up, the Court went so far as saying the Litigation Support Manager was “uniquely qualified” to participate in a meet and confer with the opposing side. SVT, LLC, at *19-20.

Bow Tie Thoughts

Reviewing Excel spreadsheets converted to non-searchable TIFFs is about as fun as looking at the sun with binoculars. Heck, if it happened to me, you would think I was hunting bear from the motion to compel I would write.

Battles over the form of production and Rule 26(f) conferences will not go away, even as we debate amending the proposed amendments to the Federal Rules of Civil Procedure.

Form of production disputes tend to happen because 1) the lawyers are simply fighting for the sake of fighting or 2) they do not know how to conduct eDiscovery. When both reasons occur at the same time, things can get very unpleasant.

The meet and confer can be a very effective time for the parties to come to an understanding about the tech issues in the case. This almost always requires having an eDiscovery attorney or litigation support professional at the meeting to help focus on the solutions to the technology issues.

The EEOC should be commended on having their Litigation Support Manager provide an affidavit that educated the Judge on the form of production and technology to review ESI. Such individuals are extremely important internally to successfully manage cases and can be mission critical in working with the other side.