Proportionality in Email Searches

Proportionality analysis is making its way into judicial opinions without outright citing the Federal Rules of Civil Procedure. US District Court Judge Robert Bryan in Washington limited additional searches of email based on proportionality and overbroad search requests. Moore v. Lowe’s Home Ctrs., LLC (W.D.Wash. Feb. 19, 2016, No. 14-1459 RJB) 2016 U.S. Dist. LEXIS 20630, at *14-16.


The Plaintiff argued that the Defendant’s email searches were inadequate, because the Plaintiff had emails in their possession that the Defendant did not produce. Moore, at *14. The Plaintiff made the next logical argument that 1) the Defendant should demonstrate that it has done a diligent search; and 2) Defendant should conduct additional searches using terms requested by Plaintiff. Id.

The Plaintiff argued that the new searches should have been conducted without the Plaintiff’s first and last name on each of the witnesses’ email accounts. Id.

That sounds like a totally normal course of action. Unfortunately, the Court did not agree.

The Defendant argued they had reviewed 21,000 emails from 17 custodians at a cost of $48,074. The Defendant further argued that the relevant emails from 2012 were likely deleted. Furthermore, the Defendant had conducted the searches, review, and interviews to find similar claims to that of the Plaintiff. Moore, at *14-15.

Here is the crux of the Defendant’s argument: Using 88 new search terms that included annoy*, bull, click*, dad, date*, hand, rack, rod, box, would result in hundreds of thousands of irrelevant emails. Moore, at *15.

Judge Bryan held:

Plaintiff’s request for email searches is overly broad and not proportional to the case. Plaintiff refers to a multi-plaintiff case to support her assertion that Defendant should conduct searches without using Plaintiff’s name. Searching without the use of Plaintiff’s name would not be proportional in this single-plaintiff case. While the additional search terms could possibly yield some relevant results, Plaintiff has not provided specifics about what Plaintiff reasonably expects such a search to show, and Plaintiff has not shown that this information could not be found through other means. For example, Plaintiff has not shown that she would be unable to uncover the same information by asking additional questions of witnesses already scheduled to be deposed. As to this discovery issue, Plaintiff’s motion should be denied.

Moore, at *15-16.

Bow Tie Thoughts

I understand the Judge’s order, but would have recommended a meet and confer over search terms. In my experience, having ONLY 21,000 emails from 17 custodians is a very tiny amount of data. I have worked on cases with 6 to 10 people generating over 300,000 emails on more than a few cases. Given the fact the case involved a single plaintiff, it is very likely that is a reasonable amount of email. A lot would turn on the amount of time the email communications covered.

It is not clear if either side had experts explaining the search terms in affidavits. I agree that 88 search terms that include terms like “hand” would generate overly broad results unless they were limited, such as by specific dates or individuals. It might have been a better plan to bring a motion to compel search strings as opposed to individual terms. The results could have been validated with a search term efficiency report, so the attorneys would know how many results would be generated by each search. From there, the parties could negotiate down to what would be reasonable search string requests. However, it is tough to know if that would be the right plan without more information on the case.
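To make the idea of a search term efficiency report concrete, here is a minimal sketch in Python. It is not how any particular review platform works; the sample messages are invented, and the `term_to_regex` and `hit_report` helpers are hypothetical names of my own. It simply counts how many emails each proposed term would hit, treating a trailing `*` as a wildcard.

```python
import re
from collections import Counter

def term_to_regex(term: str) -> re.Pattern:
    """Turn a keyword such as 'annoy*' into a whole-word, case-insensitive regex."""
    if term.endswith("*"):
        # Trailing * is treated as "any word starting with this stem".
        pattern = r"\b" + re.escape(term[:-1]) + r"\w*"
    else:
        pattern = r"\b" + re.escape(term) + r"\b"
    return re.compile(pattern, re.IGNORECASE)

def hit_report(emails: list[str], terms: list[str]) -> Counter:
    """Count how many emails each term hits (documents, not occurrences)."""
    report = Counter({term: 0 for term in terms})
    for term in terms:
        rx = term_to_regex(term)
        report[term] = sum(1 for body in emails if rx.search(body))
    return report

# Hypothetical sample messages, invented for illustration only.
emails = [
    "Please hand me the schedule for the rack installation.",
    "He continues to annoy the team; I reported it on that date.",
    "On the other hand, the box arrived late.",
    "Lunch on Friday?",
]
report = hit_report(emails, ["hand", "annoy*", "box", "rod"])
for term, hits in report.most_common():
    print(f"{term:8s} {hits} of {len(emails)} emails")
```

A report like this, run on the real collection, is the kind of evidence that lets the parties see which terms are the broad ones before arguing about them.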


Proportionality in Asymmetrical Discovery

The Plaintiffs and Defendants in a SEC case highlight the importance of proportionality between asymmetrical parties. In such cases, one side has all of the electronically stored information for discovery requests; the other side does all the requesting. However, the smaller party can have an extremely high burden reviewing any produced ESI, especially if searchable features have been removed.

Magistrate Judge Leda Dunn Wettre in City of Sterling Heights Gen. Emples. Ret. Sys. v. Prudential Fin., (an opinion not for publication) did a great job balancing the proportionality interests between a motion to add additional search terms and custodians to the dispute.

Balancing Custodians

The Plaintiffs sought to add between 22 and 45 additional custodians to the Defendants’ discovery search. The Requesting Party made a strong argument for the additional custodians, including a chart of the custodians with the factual basis for expanding the scope of discovery. The parties had already agreed to 66 custodians. City of Sterling Heights Gen. Emples. Ret. Sys. v. Prudential Fin., 2015 U.S. Dist. LEXIS 110712, *4-5; 8.

The Court denied adding the majority of the additional custodians, explaining, “Neither plaintiffs nor the Court can know with certainty, of course, whether searches of the additional custodians’ ESI will yield unique, noncumulative documents.” Sterling Heights, at *9. The Court further explained that it was satisfied that the vast majority of the custodians likely had duplicate information. Id.

Judge Wettre drove home her proportionality analysis with the following: “the Court is cognizant of the sizeable costs to Prudential involved in the searching, review and production of information from each additional custodian. Although Prudential is a large corporation with substantial resources, the Court should not be – and is not – insensitive to these costs.” Id.

The Court found that the Plaintiffs had provided enough information that there was a “fair inference” that there could be more custodians with additional information. Id. Judge Wettre explained in terms of proportionality that:

Although the number of agreed custodians is already substantial, the resources and personnel at Prudential devoted to the Verus audit and related issues also seem to have been immense. Therefore, it is not surprising that more than 66 Prudential employees may have been heavily involved in the issues relating to this case and may thus have relevant, noncumulative information. Moreover, allowing plaintiffs a moderate number of additional custodians does not seem disproportionate to the size and scale of this action. The Court understands that there are electronic de-duping tools that may be utilized to limit defendants’ review and production of duplicative documents, reducing some of the burden on Prudential of producing information from additional custodians.

Sterling Heights, at *10.
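The de-duping the Court mentions can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the custodian names and messages are invented, `dedupe` is a hypothetical helper, and real review platforms apply their own rules for deciding when two files are duplicates. The sketch hashes each document's text and keeps one copy, while remembering every custodian who held it.

```python
import hashlib

def dedupe(documents):
    """Keep the first copy of each document; record which custodians held it."""
    unique = {}
    for custodian, text in documents:
        # Normalise whitespace so trivially different copies hash the same.
        normalised = " ".join(text.split())
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        entry = unique.setdefault(digest, {"text": text, "custodians": []})
        entry["custodians"].append(custodian)
    return list(unique.values())

# Hypothetical collection: (custodian, document text) pairs.
docs = [
    ("custodian_a", "Audit status update for Q3."),
    ("custodian_b", "Audit status update for Q3."),
    ("custodian_c", "Draft response to the auditors."),
]
unique_docs = dedupe(docs)
print(len(docs), "collected,", len(unique_docs), "unique")
```

The point for proportionality is that adding a custodian does not add that custodian's full mailbox to the review if much of it duplicates what is already there.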

The Court explained that permitting the plaintiffs to select an additional 10 custodians “would balance fairly plaintiffs’ rights to relevant discovery against the costs and burden to defendants of providing that discovery.” Sterling Heights, at *10-11.

Search Terms

The question of adding four new search terms was decided swiftly. The Defendants challenged adding more search terms, claiming they had already produced 1.5 million pages of discovery. Sterling Heights, at *11. The Plaintiffs responded that over half of the 1.5 million pages were “completely unusable redacted pdfs of Excel spreadsheets.” Id.

The Court held that the four search terms appeared designed to target relevant information. Moreover, the Plaintiff noted that if the Defendant had produced a hit count that showed the terms had an “egregiously large” number of “hits,” the Plaintiffs would have considered narrowing the terms. Id.

Judge Wettre allowed the additional terms and explained:

The Court does not have before it information on which it is persuaded that it should deny these four additional terms because they would produce an unduly large number of results likely to be irrelevant to this case. While the Court does recognize that defendant has agreed to a large number of search terms, that is not sufficient basis in and of itself to deny plaintiffs the four additional search terms they seek.

Sterling Heights, at *12.

Bow Tie Thoughts

If proportionality cases were rock concerts, this case has a few “gavel drop” moments. It is great to see a Judge who incorporated proportionality throughout the entire opinion.

Proportionality arguments should not be made out of thin air. The Plaintiffs made a noble effort providing a chart with each additional custodian explaining the factual basis for expanding the scope of discovery for each individual. While they only got 10 additional custodians, this was an excellent way to explain to a Court the value of adding custodians to decide the merits of a case.

The search term arguments for both parties could have been stronger if the proposed search terms were supported by affidavits from expert witnesses. To be fair, the arguments might have been, but the tone of the opinion sounds like the arguments did not include expert affidavits. There are many lawyers who think that because they can conduct case law research that “search terms” in discovery are the same thing. That is a dangerous assumption.

I strongly encourage lawyers to work with eDiscovery consultants to help identify the concepts to identify electronically stored information. Advanced analytics from clustering of similar files to email threading, to visual analytics, to predictive coding all can help identify responsive files. Lawyers should think beyond “search terms” to concepts in order to search discovery.

Finally, I feel greatly for the Plaintiffs who had to review non-searchable PDF’s of Excel files. There are ways to redact Excel files that would require agreements between the parties, but it can be done. This would keep Excel files in native format and avoid spreadsheets exploding into multiple page nightmares.

In closing, a hat tip to the Judge and both parties on their well argued positions and the opinion.

 

Focus on the Merits to Find What is Relevant, Not Search Terms Alone

Responding to a discovery request marries the practice of law to search technology. Rule 26 Conferences in Federal Court often have parties spending a significant amount of time exchanging “search terms” to determine the most effective discovery protocol for a case.

I think focusing on “search terms” alone is the wrong focus. Parties in a meet and confer should focus on search concepts, such as who are the relevant individuals, date ranges, and core terms. However, the focus should be on how to identify what is relevant to the claims and defenses in a lawsuit and not the minutiae of “search terms.”


Consider the case of In re Lithium Ion Batteries Antitrust Litig. The parties negotiated and developed a search protocol using search terms. The Court summarized the ESI protocol as follows[1]:

  1. The producing/responding party will develop an initial list of proposed search terms and provide those terms to the requesting party;
  2. Within 30 days, the requesting party may propose modifications to the list of terms or provide additional terms (up to 125 additional terms or modifications); and
  3. Upon receipt of any additional terms or modifications, the producing/responding party will evaluate the terms, and
  4. Run all additional/modified terms upon which the parties can agree and review the results of those searches for responsiveness, privilege, and necessary redactions (Proposed Search Term Protocol § B4), or
  5. For those additional/modified terms to which the producing/responding party objects on the basis of overbreadth or identification of a disproportionate number of irrelevant documents, that party will provide the requesting party with certain quantitative metrics and meet and confer to determine whether the parties can agree on modifications to such terms. Among other things, the quantitative metrics include the number of documents returned by a search term and the nature and type of irrelevant documents that the search term returns. In the event the parties are unable to reach agreement regarding additional/modified search terms, the parties may file a joint letter regarding the dispute.

The Plaintiffs recommended if there were disputed search terms after “quantitative metrics evaluation, the parties would then conduct a randomized qualitative sampling.” This would be done by a “random number generator” that would “generate a statistically valid number of ordinal positions of the identified documents,” and the “randomly selected documents can be viewed by the Requesting Party immediately after the appropriate privilege check.”[2]
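The random draw the Plaintiffs described can be pictured with a short Python sketch. The numbers below (12,000 hits, a 400-document sample, the seed) are invented for illustration, and `sample_positions` is a hypothetical helper; the protocol itself left the sample size to a later meet and confer.

```python
import random

def sample_positions(total_hits: int, sample_size: int, seed: int) -> list[int]:
    """Pick 1-based ordinal positions at random, without replacement."""
    # A recorded seed lets both sides reproduce exactly the same draw.
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, total_hits + 1), sample_size))

# Hypothetical figures: 12,000 documents hit by a disputed term, 400 sampled.
positions = sample_positions(total_hits=12_000, sample_size=400, seed=2015)
print(positions[:10])
```

The documents sitting at those positions in the hit list are the ones that would go through the privilege check and then to the Requesting Party.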


The Defendants were not keen on this plan, because the sampling protocol would result in production of irrelevant information.[3]

The Court agreed with the Plaintiff, stating that the “point of random sampling is to eliminate irrelevant documents from the group identified by a computerized search and focus the parties’ search on relevant documents only.”[4]

Judge Donna Ryu grounded the Court’s holding in the fact that keywords can be “overinclusive” and can find large numbers of irrelevant documents in addition to relevant ones.[5] Invoking the Moore v. Publicis Groupe opinion, the Court stated the goal of “quality control test[ing]” is “to assure accuracy in retrieval and elimination of ‘false positives.’”[6]

The Court pointed out the Plaintiff’s common sense argument: a random sample that shows that a search is returning a high proportion of irrelevant documents is a bad search and needs to be modified to improve its precision in identifying relevant documents.[7]
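The precision argument is simple arithmetic. As a rough sketch, with invented numbers and a hypothetical `sample_precision` helper, the share of relevant documents in the sample estimates the precision of the search, and a standard normal-approximation margin of error shows how much to trust that estimate:

```python
import math

def sample_precision(relevant: int, sample_size: int, z: float = 1.96):
    """Estimate search precision from a random sample, with a ~95% margin of error."""
    p = relevant / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, margin

# Hypothetical review result: 60 of 400 sampled documents were relevant.
p, margin = sample_precision(relevant=60, sample_size=400)
print(f"estimated precision {p:.1%} +/- {margin:.1%}")
```

A low estimate is the signal the Court had in mind: the term is pulling in mostly irrelevant documents and should be narrowed.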

The Court further stated that the proposed sampling procedure was designed to prevent irrelevant documents from being reviewed and would obviate motion practice over search terms.[8]

The Defendants’ concerns did not fall on deaf ears. The Court explained that the Defendants could remove any irrelevant files from the random qualitative sample for any reason as long as the removed files were replaced with an equal number of randomly generated files.[9]

The Court order became very specific on the following points[10]:

The parties agreed that the procedure for qualitative sampling shall apply only after exhaustion of the quantitative evaluation process.

Irrelevant documents in the sample shall be used only for the purpose of resolving disputes regarding search terms in this action, and for no other purpose in this litigation or in any other litigation; those irrelevant documents, as well as any attorney notes regarding the sample, shall be destroyed within fourteen days of resolution of the search term dispute, with such destruction confirmed in an affidavit by counsel. 

In addition, the court held that access to the random sample shall be limited to one attorney from each law firm designated co-lead class counsel for Direct Purchaser Plaintiffs and Indirect Purchaser Plaintiffs (total of six attorneys).

Plaintiffs could invoke the random sampling process with respect to no more than five search terms per defendant group. 

A defendant family would run one combined search for up to five disputed terms, rather than creating separate samples for each disputed term. The parties were ordered to meet and confer regarding the sample size, as well as the overall limit on the number of sample documents generated per defendant family.

Bow Tie Thoughts 

This search protocol was very specific on sampling. Moreover, it also highlights how complex “search” can be in litigation. However, it also highlights the danger of only using “search terms” in discovery.

“Search terms” are recognized as easily being both over and under inclusive. As such, there is no meet and confer that will ever determine every possible search term. If there were, ESI Protocols would look like the Napoleonic Code of Discovery.

The issue with discovery is determining how to find the ESI that is relevant to the claims and defenses of a lawsuit. The first steps include determining the key players in the litigation, the date ranges, how they communicated, and terms of art that they used. The context of communications should go beyond “search terms,” to what are the concepts at issue in the lawsuit so today’s eDiscovery software can truly be used as “technology-assisted review” to help lawyers find responsive ESI.

 Footnotes 

[1] In re Lithium Ion Batteries Antitrust Litig., 2015 U.S. Dist. LEXIS 22915, at *48-49 (N.D. Cal. Feb. 24, 2015).

[2] Id., at *49.

[3] Id., at *55.

[4] Id., at *54.

[5] Id., at *54, citing Moore v. Publicis Groupe, 287 F.R.D. 182, 191 (S.D.N.Y. 2012).

[6] Id., at *54, citing Moore at 191, citing William A. Gross Constr. Assocs., Inc. v. Am. Mfrs. Mut. Ins. Co., 256 F.R.D. 134, 136 (S.D.N.Y. 2009).

[7] Id., at *55.

[8] Id.

[9] Id., at *55.

[10] Id., at *55-56.

Search Term Fights Over Proportionality

Attorneys have been fighting over search terms for years. Many times this fight is without expert advice, search efficiency reports, or any evidence to support arguments for or against proportionality.

I also think fighting over “search terms” is actually the wrong fight. The focus should be on search concepts and leveraging advanced analytics to identify relevant ESI.


Take the recent whistleblower case of Lutzeier v. Citigroup Inc. The Plaintiff sought an order to compel additional searches with the following terms:

(1) “Executive training” and/or “leadership development training program”; 

(2) “PEP” and/or “program expenditure proposal” and/or “internal control”,

(3) “OCC,” “office of comptroller of currency,” “FRB,” “federal reserve board,” and/or “consent order”;

(4) “Insufficient assurance”; and

(5) “Whistleblower,” “retaliate,” “retaliation,” “SOX,” “Sarbanes Oxley,” and/or “Dodd Frank.”

Lutzeier v. Citigroup Inc., 2015 U.S. Dist. LEXIS 11727, at *20-21 (E.D. Mo. Feb. 2, 2015).

The Defendant argued that the search terms were “so common and generic” that the results would include a “significant volume of irrelevant documents that it is not sufficient to justify the additional burden.” Lutzeier, at *21. To be fair to the Defendant, they had a strong argument that on their face the Plaintiff’s search terms look broad.

The Defendant argued that the search terms “Fred,” “Lutzeier,” “LOIS,” “COSMOS,” and “Champney” would produce the relevant ESI. Moreover, adding the proposed terms would add an additional 555,909 files, therefore the burden “greatly outweighs the likelihood that these searches will yield additional documents not already captured by Defendants’ search protocol.” Lutzeier at *21, citing Fed.R.Civ.P. 26(b)(2)(C)(iii).

The Court agreed with the Defendants that the search terms were generic, except for the search term “consent order.” As such, the Court denied the Plaintiff’s request for additional searches, with the exception of one additional search term. Lutzeier at *21-22.

I do not like second guessing Courts, but I really do not like this result. We have many types of advanced analytics for searching that go far beyond “search terms.” We can see communication patterns, identify date ranges, and find the key players making relevant communications. Moreover, the issue of finding responsive information is not one of “search terms,” but of “search concepts.” What sort of information supports a party’s claims or defense? What is relevant to that case? This goes beyond determining specific words to use, to identifying the specific concepts that find relevant ESI.

One of the challenges I see from discussing eDiscovery with litigators is that many lawyers think that because they can perform legal research, they are competent to conduct advanced searches of electronically stored information. While a podiatrist is competent to treat an ingrown toenail, that doctor is not competent to perform brain surgery (and the brain surgeon is not the right doctor to perform surgery on a broken ankle). Each is an expert in their respective fields. The same applies to lawyers and eDiscovery experts.

My friends who are eDiscovery experts would have suggestions on the Plaintiff’s search terms and counter arguments to the Defense objections (after sufficient education on the facts of the case). I’d wager they would develop search strings to narrow the Plaintiff’s search terms based on the Defendant’s search terms.

Maybe the parties had expert reports supporting their positions, but I cannot tell from the Court order (the fact the Defendant had an exact count of ESI files shows a factual argument was made to the Court, perhaps based on a search efficiency report). Regardless, there are ways to leverage the technology we have to find relevant and responsive ESI. I believe other options could have been used in this case, but would need more information to recommend a strategy.

18 Missing Email Messages is not like 18 Minutes of Missing Watergate Tape

Rose Mary Woods was not involved in this document production.

In motion practice over the adequacy of a production, the Plaintiffs were able to show that the Producing Party did not produce 18 email messages that were produced by a third-party.

As such, the Plaintiff sought production of search efficiency reports from an audit and an investigation conducted by two law firms for the Producing Party. The goal was to identify the discovery missing from the production, as opposed to the specifics of the Producing Party’s discovery efforts. Freedman v. Weatherford Int’l, 2014 U.S. Dist. LEXIS 133950, at *7-8 (S.D.N.Y. Sept. 12, 2014).

Judge Francis stated that “suggested remedy is not suited to the task,” because only three of the eighteen email messages would have been identified by the search terms used in the audit investigation. Freedman, at *8.

The Plaintiff’s argument focused on the fact that the information produced by third parties should have been produced by the Producing Party, as opposed to whether the requested searches would have identified the missing information. Freedman, at *9. According to the Defendants, only one unproduced document would have been identified from the requested searches. Id.

The Producing Party had reviewed “millions of documents” and produced “hundreds of thousands” of documents that totaled nearly 4.4 million pages. Citing to the “proportionality rule” from the Federal Rules of Civil Procedure, the Court invoked the maxim the Rules “do not require perfection.” Freedman, at *9, citing Moore v. Publicis Groupe, 287 F.R.D. 182, 191 (S.D.N.Y. 2012).

The Court stated it was “unsurprising that some relevant documents may have fallen through the cracks;” however, the remedy sought would unlikely cure any production defects. As such, the Court denied the production of the search report. Freedman, at *9-10.

Bow Tie Thoughts

Judge Francis’ opinion takes a very proportional view of challenging production adequacy: you first have to demonstrate a production is inadequate and then demonstrate the remedy makes sense. Proving three email messages were missing out of thousands does not justify conducting searches that would not produce any missing information. The remedy has to be proportional to the problem.

Proportionality is a balancing of interests. Do 18 missing emails out of thousands justify new searches? Well, if those 18 emails were the key smoking gun messages, maybe. Ask Rose Mary Woods or Lois Lerner about missing information.

The message I take away from Judge Francis’ opinion, is that if you have emails from a third-party that show a production deficiency, you have to offer a proportional remedy. If the offer is for additional searches that would only produce one or two of the missing emails, that is not a proportional remedy. Alternatively, if you proffer search terms that would have generated “hits” on 17 out of 18, maybe that would tip the scales in favor of running additional searches. At the end of the day, the effort sought has to have value to the case and not be an experiment in futility.
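One way to test a proposed remedy before going to Court is to run the proposed terms against the emails you already know are missing. Here is a minimal Python sketch of that check; the emails and terms are invented stand-ins, the `would_have_hit` and `coverage` helpers are hypothetical, and plain substring matching is a simplification of what a review platform would do.

```python
def would_have_hit(email_text: str, terms: list[str]) -> bool:
    """True if at least one proposed term appears in the email (case-insensitive)."""
    text = email_text.lower()
    return any(term.lower() in text for term in terms)

def coverage(missing_emails: list[str], terms: list[str]) -> tuple[int, int]:
    """Return (number of known-missing emails the terms would hit, total missing)."""
    found = sum(1 for email in missing_emails if would_have_hit(email, terms))
    return found, len(missing_emails)

# Hypothetical stand-ins for emails received from a third party
# that were absent from the production.
missing = [
    "Re: revised budget forecast",
    "Lunch next week?",
    "Fwd: questions on the vendor contract",
    "Call me when you land",
]
found, total = coverage(missing, ["budget forecast", "vendor contract"])
print(f"proposed terms hit {found} of {total} known-missing emails")
```

If the number is 3 of 18, expect the result in Freedman. If it is 17 of 18, the argument for additional searches looks very different.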

Nebraska, Where Proportionality is Alive and Well in Discovery

One lesson from United States v. Univ. of Neb. at Kearney is that maybe you should take depositions of key parties and use interrogatories to find out information relevant to your case before asking for over 40,000 records that contain the personal information of third parties unrelated to a lawsuit.

The case is a Fair Housing Act suit involving claims that students were prohibited or hindered from having “emotional assistance animals in university housing when such animals were needed to accommodate the requesting students’ mental disabilities.” United States v. Univ. of Neb. at Kearney, 2014 U.S. Dist. LEXIS 118073, at *2 (D. Neb. Aug. 25, 2014).

A protracted battle over the scope of discovery broke out between the parties. The Defendants argued the search, retrieval, and review for responsive discovery was too expansive and would have been unduly burdensome. Kearney, at *5-6. As the Government’s search requests included “document* w/25 policy,” you can see the Defendant’s point on having broad hits to search terms. Kearney, at *20.

The Government’s revised search terms would have 51,131 record hits, which would have cost $155,574 for the Defendants to retrieve, review, and produce the responsive ESI. Kearney, at *5-6. This would have been on top of the $122,006 already spent for processing the Government’s requests for production. Kearney, at *7.

The Court noted that the Government’s search terms would have required production of ESI for every person with disability, whether they were students or contractors. Kearney, at *6-7. The Government argued the information was necessary, and justified, in order to show discriminatory intent by the Defendants. Id.

The Defendants wanted the scope of the discovery requests narrowed to the “housing” or “residential” content, which would have resulted in 10,997 responsive records. Kearney, at *7.
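The numbers in the opinion lend themselves to a quick back-of-the-envelope comparison. The Python sketch below uses only the figures reported above, plus one loud assumption of mine that is not in the opinion: that review cost scales linearly with the number of records.

```python
# Figures reported in the opinion.
broad_hits = 51_131        # records hit by the Government's revised terms
broad_cost = 155_574       # dollars to retrieve, review, and produce them
already_spent = 122_006    # dollars already spent on prior requests
narrowed_hits = 10_997     # records under the Defendants' narrowed scope

cost_per_record = broad_cost / broad_hits
total_if_broad = already_spent + broad_cost

# ASSUMPTION (not from the opinion): cost scales linearly with record count.
narrowed_estimate = cost_per_record * narrowed_hits

print(f"cost per record:           ${cost_per_record:,.2f}")
print(f"total spend if broad:      ${total_if_broad:,}")
print(f"rough narrowed-scope cost: ${narrowed_estimate:,.0f}")
```

This is the kind of cost/benefit comparison the Court asked the parties for, and it is much easier to argue proportionality with numbers like these on the table.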

The Government did not want to limit the scope of discovery and recommended producing all the ESI subject to a clawback agreement [notice not a protective order] for the Government to search the ESI. The Defendants argued such an agreement would violate the Family Educational Rights and Privacy Act by disclosing student personal identifiable information without their notice and consent. Kearney, at *8.

Motion practice followed with the Defendant requesting cost shifting to the Government for conducting searches, the use of predictive coding software, and review hosting fees. Kearney, at *8-9.

The Court ordered the parties to answer specific discovery questions, which the Government did not answer, on “information comparing the cost of its proposed document retrieval method and amount at issue in the case, any cost/benefit analysis of the discovery methods proposed, or a statement of who should bear those costs.” Kearney, at *9.

The Court was not keen on the Government outright searching the personal data of others unrelated to the case. As the Court stated:

The public and the university’s student population may be understandably reluctant to request accommodations or voice their concerns about disparate or discriminatory treatment if, by doing so, their private files can be scoured through by the federal government for a wholly unrelated case. The government’s reach cannot extend that far under the auspices of civil discovery; at least not without first affording all nonparties impacted with an opportunity to consent or object to disclosure of information from or related to their files.

Kearney, at *18-19.

The Court stated it would not order the production of over 51,000 files with a clawback order. Moreover, the cost to review all of the ESI exceeded the value of the request. Kearney, at *19.

The Court did not accept the Government’s claim that it needed to conduct an expansive search. Kearney, at *19-20. The Court stated the following on the fundamentals of civil discovery:

Searching for ESI is only one discovery tool. It should not be deemed a replacement for interrogatories, production requests, requests for admissions and depositions, and it should not be ordered solely as a method to confirm the opposing party’s discovery is complete. For example, the government proposes search terms such as “document* w/25 policy.” The broadly used words “document” and “policy” will no doubt retrieve documents the government wants to see, along with thousands of documents that have no bearing on this case. And to what end? Through other discovery means, the government has already received copies of UNK’s policies for the claims at issue.

Kearney, at *20.
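To see why a term like “document* w/25 policy” is so broad, it helps to picture what a proximity search does. The following Python sketch is my own simplified reading of that syntax (a word starting with “document” within 25 words of “policy”, in either order); the `within_words` helper is hypothetical, the sample sentences are invented, and actual review platforms define proximity operators in their own ways.

```python
import re

def within_words(text: str, left_prefix: str, right_word: str, distance: int) -> bool:
    """True if a word starting with left_prefix appears within `distance`
    words of right_word, in either direction (a simplifying assumption)."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    left = [i for i, t in enumerate(tokens) if t.startswith(left_prefix)]
    right = [i for i, t in enumerate(tokens) if t == right_word]
    return any(abs(i - j) <= distance for i in left for j in right)

# Invented examples: neither has anything to do with assistance animals.
parking = "Please see the attached document describing our new parking policy."
desk = "The documents are on my desk."

print(within_words(parking, "document", "policy", 25))  # parking memo is a hit
print(within_words(desk, "document", "policy", 25))     # no 'policy' nearby
```

Any routine memo that mentions a document and a policy in the same paragraph becomes a hit, which is exactly the Court's point about thousands of documents with no bearing on the case.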

The Court further stated that “absent any evidence that the defendants hid or destroyed discovery and cannot be trusted to comply with written discovery requests, the court is convinced ESI is neither the only nor the best and most economical discovery method for, and depositions should suffice—and with far less cost and delay.” Kearney, at *21.

Bow Tie Thoughts

This case has significant privacy interests, but at its core the issue is one of proportionality. What was the cost of discovery and its benefit? In the end, the cost of expansive search terms that impacted the rights of third parties outweighed the benefit of the discovery to the case.

The fact we have amazing search technology that can search electronic information does not mean we can forget how to litigate. The use of “search terms” cannot swallow the actual claims of a case.

It is heartening to see a Court say no to the data of unrelated third parties being enveloped into a discovery production. While there are many ways to show discrimination, requesting the electronically stored information, protected by Federal and most likely state law, of third parties should give any Court pause.

The use of predictive coding to focus the scope of discovery, or visual analytics to identify relevant information, or clustering to organize similar information is fantastic technology to expedite review. However, even with that technology, lawyers still have to use requests for admissions and interrogatories, and have to narrowly tailor their requests for responsive ESI.

 

How Not to Be Progressive: Court Rejects Predictive Coding Not Agreed to By Parties

Fighting over discovery search methodology makes me think of President Richard Nixon’s resignation speech: “Always remember, there are those who hate you. And the only way to keep them from winning is to hate them right back. And then you destroy yourself.”

Attack a party for using predictive coding to save review time and money, and the result can be a larger collection of ESI produced to the requesting party, who now has more to review. This is the end result of Progressive Cas. Ins. Co. v. Delaney.

I have never been a fan of agreeing to the use of predictive coding in ESI Protocols. I do not believe such agreements are required at all by the Federal Rules of Civil Procedure. The issue is whether or not a production is adequate. That requires the requesting party offering evidence that the production is somehow delinquent with facts. That discussion does not happen without first having a production.

Parties do not have a veto power over which review application is used by a reviewing party and what features they may or may not use.  The requesting party should not attack a review methodology, unless there is a production to challenge. The entire discussion is premature and the issue is not ripe for the Court.

In the case at bar, the original dataset was narrowed by search terms to 565,000 “hits” from the original 1.8 million dataset. This search term methodology had been agreed to in an ESI Protocol by the parties. After one month of document review, the producing party realized that it could take 6 to 8 months to manually review the narrowed dataset for responsiveness or privilege. The party unilaterally decided to use predictive coding instead. Moreover, after telling the other side about their change in technology, motion practice followed. Progressive Cas. Ins. Co. v. Delaney, 2014 U.S. Dist. LEXIS 69166.

The Requesting Party wanted the Producing Party to either 1) produce the 565,000 documents culled from the 1.8 million record dataset using the parties’ agreed-upon search terms, subject to a clawback for privileged documents, or 2) apply the predictive coding methodology to the entire 1.8 million record dataset. Progressive, at *12 and *15.

The Producing Party did not want to take either approach, and was specifically concerned that searching the entire dataset would result in a larger privilege review.

The Court noted what has become judge-made law on using technology-assisted review: Courts require the producing party to provide the requesting party with “full disclosure about the technology used, the process, and the methodology, including the documents used to ‘train’ the computer.” Progressive, at *27-28, citing Da Silva Moore v. Publicis Groupe, 2012 U.S. Dist. LEXIS 23550 (S.D.N.Y. Feb. 24, 2012).

The Court ordered the Producing Party to follow the originally agreed-to protocol and produce the “hit” documents to the Requesting Party within fourteen days without further review. Progressive, at *30.

The Court stated that, under the Requesting Party’s proposal, the cost of review would shift to the Requesting Party. Progressive, at *31. Moreover, the Requesting Party believed it had the manpower to complete the review within one month. Id.

The Producing Party could apply privilege filters before production and produce a privilege log. Id.

Bow Tie Thoughts

I do not encourage clients to have ESI Protocols that limit their ability to review discovery efficiently. I also strongly argue against the idea that you need approval from the opposing party on what search methodology you can use to identify responsive discovery. It is the producing party’s right to use keywords, concept search, visual analytics, email threading, clustering, find similar, or any other form of technology-assisted review. That includes predictive coding. The requesting party does not get a veto power over what technology the producing party can use. The requesting party has “the burden of proving that a discovery response is inadequate.” Abt v. Jewell, 2014 U.S. Dist. LEXIS 50766, at *12-14 (D.D.C. Apr. 11, 2014), citing Barnes v. D.C., 289 F.R.D. 1, 6 (D.D.C. 2012) and Equal Rights Ctr. v. Post Props., Inc., 246 F.R.D. 29, 32 (D.D.C. 2007).

You cannot prove a discovery response is inadequate if you do not have a discovery response. The entire idea of attacking the use of predictive coding is premature if there is no production to be reviewed.

The ugly wrinkle in this case is the search and production methodology mandated in the ESI Protocol agreed to by the parties. I strongly encourage parties not to agree on the actual technological steps to be used, because it limits the ability to conduct discovery in a cost-effective manner. These agreements are often entered into without the advice of an eDiscovery consultant who knows how the actual technology works.

What parties should discuss and codify in an ESI Protocol is the subject matter of the lawsuit. Who are the key players? What are the relevant date ranges? What are the terms of art used by the parties? What is the data actually making up the case? Those, and many others, are the topics parties need to agree to, not whether you can use visual analytics to identify date ranges or data clustering to determine what was relevant.

The use of predictive coding has been twisted because lawyers think they need permission to use it. They do not. The burden is on the opposing side to prove that a production was inadequate, not merely to point out that the producing party used one form of technology-assisted review over another.

 

Conferring on Keywords & A Musical Judge

You know someone is taking a hit when a judge works in references to Pink Floyd’s “Sorrow” and Simon and Garfunkel’s “Sounds of Silence.” Sound the Division Bell, because there is a Bridge Over Troubled Water.

Magistrate Judge Jonathan Goodman did exactly that in Procaps S.A. v. Patheon Inc., 2014 U.S. Dist. LEXIS 35225 (S.D. Fla. Mar. 18, 2014), after an attorney responded with cryptic messages to opposing counsel’s emails about search terms and their Spanish translations, which were needed because one party is located in Colombia. Some messages were simply not answered. As such, a motion to compel was filed over search terms. At the hearing, the non-moving party represented that it had been in contact with its custodians about search terms.

The short story of what happened is that the lead attorney did not get any input on search terms from his client’s custodians before the moving party filed its motion to compel. Procaps S.A., at *10-11.

Even though the attorney broke their silence over conferring on search terms, the Court ordered the following:

Nevertheless, to the extent that there is any doubt about whether this Court is imposing the requirement and to generate a ruling on which a fees award can be based, the Court grants Patheon’s motion and requires Procaps to have its counsel obtain search word input from all the ESI custodians. The Court may enter further orders on the search term methodology and/or the specifics of the search term list after receiving additional information from Procaps about the search term protocol.

Procaps S.A., at *14.

The Court awarded the prevailing party $3,750 in costs (down from $5,000). Procaps S.A., at *16. The Court ordered the lead attorney to pay $1,000 of the award, because “his non-responsive and/or vague email responses triggered this discovery dispute.” Procaps S.A., at *17. The Court further noted statements on the firm’s website about attorneys with eDiscovery experience who were involved in the case, and the firm was “urged (though not required) to explore which other attorneys (besides lead trial counsel) caused, or helped cause, this discovery motion and to determine whether those other attorneys (rather than the firm itself) should pay all or some of the $2,750 fees award now allocated for law firm responsibility (or some of the $1,000 awarded against lead trial counsel).” Procaps S.A., at *17-18.

Bow Tie Thoughts

I respect Jonathan Goodman’s command of the law and use of song references to make his point on an attorney’s silence in conferring over search terms.

Attorneys cannot be another brick in the wall of un-cooperation over Rule 26(f) conferences. eDiscovery requires parties to both 1) confer with the opposing side on the subject matter of the case and 2) conduct client/custodian interviews to determine how technology is used, terms of art, methods of communication and anything else relevant to determine “search terms.”

Discussing “search terms” is not dead because of “predictive coding.” Attorneys still have to know the subject matter of the case in order to use ANY type of advanced analytics so the analysis has content. Lawsuits are about specific issues which have to be defined in order for “technology assisted review,” “visual analytics” or any other form of data clustering to make sense. These topics must be discussed in order to identify relevant ESI, so efforts are not lost on the dark side of the Moon.

Guess What? Cooperation Does Not Mean Privilege or Relevancy Are Dead

Here is the big lesson from the latest Biomet opinion over predictive coding:

The Steering Committee wants the whole seed set Biomet used for the algorithm’s initial training. That request reaches well beyond the scope of any permissible discovery by seeking irrelevant or privileged documents used to tell the algorithm what not to find. That the Steering Committee has no right to discover irrelevant or privileged documents seems self-evident.

United States District Court Judge Robert Miller, In re Biomet M2a Magnum Hip Implant Prods. Liab. Litig., 2013 U.S. Dist. LEXIS 172570, at *3 (N.D. Ind. 2013).

One word: Good.

Cooperation does not mean attorney work product is eviscerated when discussing predictive coding. Moreover, if ESI is not relevant, why drive up discovery costs in reviewing it? Furthermore, Federal Rule of Civil Procedure 26(b)(1) does not allow a requesting party to find out how the producing party used ESI before its production. Biomet, at *4.

The opinion goes on to discuss Biomet’s position that it had produced all discoverable documents to the Steering Committee. However, this is where Judge Miller made a judicial warning: Biomet did not need to identify its seed set, but the “unexplained lack of cooperation in discovery can lead a court to question why the uncooperative party is hiding something, and such questions can affect the exercise of discretion.” Biomet, at *5-6.

The Court held it would not order Biomet to disclose its seed set, but did “urge” Biomet to “re-think its refusal.” Biomet, at *6.

Bow Tie Thoughts

There is no good answer to the issue in this case. Technology issues should be worked out by experts in a non-combative way when it comes to production formats, scope of data, date ranges, custodians and other objective factors in conducting a search. Courts really do not want to get sucked into it. However, one issue since Da Silva Moore v. Publicis Groupe & MSL Group is the idea that parties need to have a transparent process that both sides agree to for predictive coding. I do not think the Federal Rules of Civil Procedure require such disclosures at all. Moreover, it intrudes into attorney work product.

What is the answer? I would suggest that a requesting party demonstrate there is a production gap or otherwise show how the production is deficient. However, this can easily escalate into a quagmire over discovery about discovery. Nobody wins when that happens.

As for a producing party, I would not take a position that could incur the wrath of a Court if the requesting party later demonstrates a production was deficient.

Triangulating Discovery Productions

Judge William Orrick summed up a basic truth of eDiscovery: In the age of electronically-stored information (“ESI”), production of all relevant, not privileged and reasonably accessible documents in a company’s custody and control is easier said than done. Banas v. Volcano Corp., 2013 U.S. Dist. LEXIS 144139, at *5 (N.D. Cal. Oct. 4, 2013).

The Defendant in Banas had produced 225,000 documents in a rolling production through “triangulation.”

This methodology involved selecting subsets of employees likely to have relevant information, those who sent or received information, those who could have been involved in the case and those “most likely” to have relevant information. Banas at *4.

The Court stated this approach could have been reasonable, but two problems emerged: 1) The methodology was never discussed or agreed to with the Plaintiff; and 2) Multiple deponents did not have their email searched prior to their depositions. Banas at *4-5.

The Plaintiff also had a hard drive that contained ESI that was not produced by the Defendant. Id. 

The Court ordered the supplemental search and production of ESI from the deponents whose ESI had not been searched. Banas at *6.

The Court stated that the Defendant’s search methodology was not unreasonable or designed to conceal information. Banas at *7. However, as the production was conducted on a rolling basis, the Plaintiff could not have been immediately aware of any production gaps. Banas at *6.

As such, supplemental discovery was reasonable.

The Court also highlighted the Northern District of California’s model order requiring parties to meet and confer over the search of ESI prior to responding to a discovery request. This is one of the first opinions to reference the model order. Following the model order is highly advisable for anyone in the Northern District of California. It also has very good best practices for any attorney to consider in a case with electronically stored information.

Bow Tie Thoughts

Identifying relevant custodians and their electronically stored information requires using technology and strategy. To many, it can feel like trying to find a teardrop in the ocean.

Analyzing communication patterns, clustering email based on domain names or conducting searches based on date ranges and subject matter are just a handful of ways to identify ESI that could support a party’s claims or defenses. Running searches based on discovery requests is another.
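As a rough illustration of two of the approaches just mentioned, grouping email by sender domain and filtering by date range, here is a minimal Python sketch. The messages, addresses, field names and helper functions are all hypothetical, and real review platforms do far more than this.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample messages; every address, date and subject is invented.
messages = [
    {"sender": "alice@example.com", "sent": date(2012, 3, 1), "subject": "Contract draft"},
    {"sender": "bob@vendor.example", "sent": date(2012, 6, 15), "subject": "Invoice"},
    {"sender": "carol@example.com", "sent": date(2013, 1, 9), "subject": "Contract signed"},
]


def domain_of(address: str) -> str:
    """Return the lowercased domain portion of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def cluster_by_domain(msgs):
    """Group messages by the domain of the sender."""
    clusters = defaultdict(list)
    for msg in msgs:
        clusters[domain_of(msg["sender"])].append(msg)
    return dict(clusters)


def in_date_range(msgs, start: date, end: date):
    """Keep only messages sent between start and end, inclusive."""
    return [msg for msg in msgs if start <= msg["sent"] <= end]


clusters = cluster_by_domain(messages)
sent_in_2012 = in_date_range(messages, date(2012, 1, 1), date(2012, 12, 31))

for domain, group in sorted(clusters.items()):
    print(domain, len(group))
print("Messages sent in 2012:", len(sent_in_2012))
```

Even a simple grouping like this can show which outside domains a custodian corresponded with during the relevant period, which is the kind of lead that helps decide whose email to search next.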

I recently had a product demonstration of Kroll Ontrack’s eDiscovery.com Review, and the features of this product can help search for responsive ESI.

Banas v. Volcano Corp. also has a very important message about the meet and confer process. Parties really should discuss what information is subject to the lawsuit, relevant custodians and search methodologies. While I do not agree with the idea of discussing what tools parties should use, because it can cause unnecessary fighting, agreeing on how ESI will be identified certainly does not hurt between educated attorneys.