What Did We Learn About eDiscovery in 2014?

As 2014 draws to a close, it is time to reflect on the cases from this year in eDiscovery. One of the biggest trends I took away from case law in 2014 is that more Judges have a greater understanding of eDiscovery, resulting in practical opinions.

Here are the practice areas I found to be the most interesting in 2014, which can be heard in full on my 2014 eDiscovery Year in Review on iTunes or Buzzsprout (Presented by Paragon):

Application of Proportionality Analysis

Judges Questioning Why The Court Was Asked Permission to Use Predictive Coding

We still have Form of Production issues eight years after the 2006 eDiscovery Amendments to the Federal Rules of Civil Procedure

The Importance of Documenting Services for Taxation of Costs

What will 2015 hold for us in the world of electronic discovery? I think we will see proportionality analysis focus on the value of the information sought in relation to the case and not solely on the cost of the discovery. Parties will have to explain how the information is useful, such as how it relates to a claim, as opposed to merely saying, “It is expensive.” This will require counsel to focus on the merits of the case and how the requested discovery will help advance the litigation.

I personally hope litigants stop asking Judges for permission to use predictive coding. No one asks, “Can I de-dup the data? Is it ok to use clustering? May I please use conceptual search in addition to keywords?”

The issue with all productions is whether or not the production is adequate. In my view, parties going to war over predictive coding as a means to review electronically stored information are asking the Court to issue an advisory opinion. The time to fight is when there actually is a dispute because a production is lacking, instead of engaging in arguments over how much a human being can read in an hour compared to a computer algorithm.

To learn more on the issues from the past year, please check out my 2014 eDiscovery Year in Review audio podcast on iTunes or Buzzsprout.

I want to thank Paragon for sponsoring the 2014 eDiscovery Year in Review. Please check out their website and recent blog post on the Convergence of eDiscovery and Information Security to learn more about their services.

Why Deviate from Native Files in a Case Management Order?

There are Case Management Orders that show parties spent a lot of time considering eDiscovery issues. There are the ones that show a lack of thought. There are ones that are mixed.

This one shows a lot of forethought, but I am puzzled by the form of production.

Technology Assisted Review is Good for You and Me

There is nothing magical about using Technology Assisted Review. There is also no rule requiring specific technology to find responsive electronically stored information. The issue is always one of whether a production was adequate.

The Case Management Order in Green v. Am. Modern Home Ins. Co. states the following on Technology Assisted Review:

  1. Technology Assisted Review in Lieu of Search Terms. In lieu of identifying responsive ESI using the search terms and custodians/electronic systems as described in Sections II.C & II.D above, a party may use a technology assisted review platform to identify potentially relevant documents and ESI.

Green v. Am. Modern Home Ins. Co., 2014 U.S. Dist. LEXIS 165956, 4 (W.D. Ark. Nov. 24, 2014).

I would argue such a decree in a Case Management Order is unnecessary under the Federal Rules of Civil Procedure and case law, but such a specific order should preemptively end any question on whether predictive coding, data analytics, “find similar,” conceptual search, and any other available search technology can be used in the case.

The Form of Production

I am not a fan of converting native files to TIFFs and running OCR on them, absent the need to redact confidential or privileged information. That is exactly what this order prescribed, except for spreadsheets:

  1. Format. All ESI, other than databases or spreadsheets, shall be produced in a single- or multi-page 300 dpi TIFF image with a Concordance DAT file with standard delimiters and OPT file for image loading. The documents shall also be processed through Optical Character Recognition (OCR) Software with OCR text files provided along with the production. Extracted Text shall be provided for all documents unless it cannot be obtained. To the extent a document is redacted, OCR text files for such document shall not contain text for the redacted portions of the document. Each TIFF image will be assigned a Bates number that: (1) is unique across the entire document production; (2) maintains a constant length across the entire production padded to the same number of characters; (3) contains no special characters or embedded spaces; and (4) is sequential within a given document. If a Bates number or set of Bates numbers is skipped in a production, the Producing Party will so note in a cover letter or production log accompanying the production. Each TIFF image file shall be named with the Bates Number corresponding to the number assigned to the document page contained in that image. In the event a party determines that it is unable to produce in the format specified in this section without incurring unreasonable expense, the parties shall meet and confer to agree upon an alternative format for production.
  1. Metadata. To the extent that any of the following metadata fields associated with all applicable documents are available, the Producing Party will produce those metadata fields to the Requesting Party: file name, file size, author, application date created, file system date created, application date last modified, file system date last modified, date last saved, original file path, subject line, date sent, time sent, sender/author, recipient(s), copyee(s), and blind copyee(s). For emails with attachments, the Producing Party will indicate when a parent-child relationship between the message and the attachment exists. A Producing Party shall also produce a load file with each production with the following fields: Starting Bates; Ending Bates; Begin Attach; End Attach; and Source (custodian/location from which document was collected). If any metadata described in this section does not exist, is not reasonably accessible, is not reasonably available, or would be unduly burdensome to collect or provide, nothing in this ESI Order shall require any party to extract, capture, collect or produce such metadata.

Green, at *4-7.
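The Bates numbering conditions in the order are mechanical enough to sketch in code. The Python below is only an illustration under stated assumptions: the prefix `GREEN`, the seven-digit padding, and the function names `bates_numbers` and `check_bates` are hypothetical and do not come from the order or from any review platform. It generates one number per page of a document and checks a list of numbers for uniqueness, constant length, no special characters or spaces, and sequential order.

```python
import re

# Hypothetical format: an alphanumeric prefix followed by a zero-padded page counter.
BATES_PATTERN = re.compile(r"[A-Za-z0-9]*?(\d+)")


def bates_numbers(prefix, start, page_count, width=7):
    """Return one Bates number per page, zero-padded to a constant width."""
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + page_count)]


def check_bates(numbers):
    """Check one document's page numbers against the four conditions in the order."""
    matches = [BATES_PATTERN.fullmatch(n) for n in numbers]
    # Condition 3: letters and digits only, so no special characters or spaces.
    if not numbers or not all(matches):
        return False
    # Condition 1: unique (here only within the list that was passed in).
    unique = len(set(numbers)) == len(numbers)
    # Condition 2: every number has the same length.
    constant_length = len({len(n) for n in numbers}) == 1
    # Condition 4: the numeric parts increase by one from page to page.
    digits = [int(m.group(1)) for m in matches]
    sequential = all(b - a == 1 for a, b in zip(digits, digits[1:]))
    return unique and constant_length and sequential


pages = bates_numbers("GREEN", 1, 3)
print(pages, check_bates(pages))
```

A real production would need the uniqueness check run across every document in the production, not just one document's pages, and would need to log any skipped numbers as the order requires.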

The order does include extracted text, but why go to the trouble of requiring production as TIFFs in the first place? The statement about OCR could be misconstrued as requiring OCR of the TIFFs when the searchable information is already available in the form of extracted text, which would make OCR both redundant and an added cost. The only reason to OCR a TIFF is that it has been redacted, because producing extracted text would inadvertently produce the redacted content.

Most review applications today do a great job of ingesting native files and allowing users to review in near-native. If the native file needs to be accessed, most applications allow for reviewing the native within the review application or a copy downloaded for review in the native application.

Requiring conversion to static images is not the default of Federal Rule of Civil Procedure Rule 34. I do not recommend requiring conversion to TIFF for production, unless there is a substantial number of redactions that must be made.

There are many types of metadata, from embedded, to substantive, to system. The above order treats metadata as if it were objective coding, seeking specific fields of information. While that is all useful information, I would encourage parties to think more in terms of types of metadata, in addition to how the information should appear in a review application.

Spreadsheets in Native File Format

The order stated the following on spreadsheets:

  1. Spreadsheets. Absent special circumstances, Excel files, .csv files and other similar spreadsheet files will be produced in native format (“Native Files”). Native Files will be provided in a self-identified “Natives” directory. Each Native File will be produced with a corresponding single-page TIFF placeholder image, which will contain language indicating that the document is being produced as a Native File. Native Files will be named with the beginning Bates number that is assigned to that specific record in the production. A “NativeLink” entry for each spreadsheet will be included in the .DAT load file indicating the relative file path to each native file on the Production Media. Native Files will be produced with extracted text and applicable metadata fields if possible and consistent with Section III.A.2 above. For documents that contain redacted text, the parties may either apply the redactions directly on the native file itself or produce TIFF image files with burned-in redactions in lieu of a Native File and TIFF placeholder image. Each Producing Party will make reasonable efforts to ensure that Native Files, prior to conversion to TIFF, reveal hidden data from redacted Native Files that are produced as TIFF image files and will be formatted so as to be readable. (For example, column widths should be formatted so that numbers do not appear as “#########”.) Under these circumstances, all single-page TIFF images shall include row and column headings.

Green, at *8-9.

I am glad the default for spreadsheets did not deviate from Rule 34. I am curious if any of my case manager friends would agree with the order requiring TIFF placeholders and renaming the native files.

The past year has seen parties become more detailed in their case management orders regarding electronically stored information. This is a good thing. However, I strongly encourage parties to not deviate from the Federal Rules of Civil Procedure without reason, leverage the search abilities of their review applications, and make sure the case management order helps the case comply with Federal Rule of Civil Procedure Rule 1.

Even A Judge Questioned Why Ask for Permission to Use Predictive Coding

I do not normally want to high five Federal judges, but Judge Ronald Buch, a Tax Judge in Texas, sure deserved one after his Dynamo Holdings opinion.

The discovery dispute can be summed up as a battle over backup tapes that had confidential information. The Requesting Party wanted the tapes; the Producing Party wanted to use predictive coding to produce what was relevant, because reviewing the material for privilege and relevancy would cost $450,000 with manual review. Dynamo Holdings v. Comm’r, 2014 U.S. Tax Ct. LEXIS 40 (Docket Nos. 2685-11, 8393-12. Filed September 17, 2014.)

The Requesting Party wanted the backup tapes to analyze metadata on when ESI was created. Moreover, the Requesting Party called “Predictive Coding” an “unproven technology.” The Requesting Party attempted to address the Producing Party’s cost concern with a clawback agreement. Dynamo Holdings, at *3.

After an evidentiary hearing with experts on the use of predictive coding, the Court granted the Producing Party’s motion to use predictive coding. Judge Buch had a “dynamo” quote on the entire issue of asking to use predictive coding:

 “And although it is a proper role of the Court to supervise the discovery process and intervene when it is abused by the parties, the Court is not normally in the business of dictating to parties the process that they should use when responding to discovery. If our focus were on paper discovery, we would not (for example) be dictating to a party the manner in which it should review documents for responsiveness or privilege, such as whether that review should be done by a paralegal, a junior attorney, or a senior attorney. Yet that is, in essence, what the parties are asking the Court to consider–whether document review should be done by humans or with the assistance of computers. Respondent fears an incomplete response to his discovery. If respondent believes that the ultimate discovery response is incomplete and can support that belief, he can file another motion to compel at that time. Nonetheless, because we have not previously addressed the issue of computer-assisted review tools, we will address it here.”

Dynamo Holdings, at *10-11.

It is so refreshing to see a Judge address the issue of requesting to use a specific technology. No one files a motion asking for permission on which lawyers should do document review. Moreover, no moving party asks permission to use visual analytics, de-duplication, or any of the other outstanding technology available to conduct eDiscovery.

The opinion ends by noting that if the Requesting Party believed the discovery response was incomplete, then a motion to compel could be filed, which is exactly the way the process should work. The issue should not be “can we use this technology,” but whether the production is adequate or not, which requires evidence of production gaps or other evidence that not all responsive information was produced.

Well done, Judge Buch.

 

Nebraska, Where Proportionality is Alive and Well in Discovery

One lesson from United States v. Univ. of Neb. at Kearney is that maybe you should take depositions of key parties and use interrogatories to find out relevant information to your case before asking for over 40,000 records that contain the personal information of unrelated third parties to a lawsuit.

The case is a Fair Housing Act suit involving claims that students were prohibited or hindered from having “emotional assistance animals in university housing when such animals were needed to accommodate the requesting students’ mental disabilities.” United States v. Univ. of Neb. at Kearney, 2014 U.S. Dist. LEXIS 118073, 2 (D. Neb. Aug. 25, 2014).

A protracted battle over the scope of discovery broke out between the parties. The Defendants argued the search, retrieval, and review for responsive discovery was too expansive and would have been unduly burdensome. Kearney, at *5-6. As the Government’s search requests included “document* w/25 policy,” you can see the Defendants’ point on having broad hits to search terms. Kearney, at *20.

The Government’s revised search terms would have 51,131 record hits, which would have cost $155,574 for the Defendants to retrieve, review, and produce the responsive ESI. Kearney, at *5-6. This would have been on top of the $122,006 already spent for processing the Government’s requests for production. Kearney, at *7.

The Court noted that the Government’s search terms would have required production of ESI for every person with a disability, whether they were students or contractors. Kearney, at *6-7. The Government argued the information was necessary, and justified, in order to show discriminatory intent by the Defendants. Id.

The Defendants wanted the scope of the discovery requests narrowed to the “housing” or “residential” content, which would have resulted in 10,997 responsive records. Kearney, at *7.
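The dollar figures above are easier to weigh with the arithmetic written out. The short Python sketch below uses only the numbers reported in the opinion as summarized here. The per-record rate, and the idea of applying that same rate to the narrowed set of 10,997 records, are my own back-of-the-envelope assumptions and not findings of the Court.

```python
# Figures as reported in the post's summary of the opinion.
already_spent = 122_006      # processing the Government's requests so far
proposed_cost = 155_574      # retrieve, review, and produce the revised-term hits
proposed_hits = 51_131       # records hit by the Government's revised terms
narrowed_hits = 10_997       # records under the Defendants' "housing"/"residential" scope

# Total outlay if the revised search terms had been ordered.
total_cost = already_spent + proposed_cost

# Assumption: cost scales linearly with the number of records.
cost_per_record = proposed_cost / proposed_hits
narrowed_estimate = cost_per_record * narrowed_hits

print(f"Total if ordered: ${total_cost:,}")
print(f"Implied cost per record: ${cost_per_record:.2f}")
print(f"Rough estimate for narrowed scope: ${narrowed_estimate:,.0f}")
```

The linear scaling is a simplification, since fixed costs such as hosting do not shrink in step with record counts, but it shows why the narrower scope mattered to the proportionality argument.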

The Government did not want to limit the scope of discovery and recommended producing all the ESI subject to a clawback agreement (notice: not a protective order) for the Government to search the ESI. The Defendants argued such an agreement would violate the Family Educational Rights and Privacy Act by disclosing students’ personally identifiable information without their notice and consent. Kearney, at *8.

Motion practice followed with the Defendants requesting cost shifting to the Government for conducting searches, the use of predictive coding software, and review hosting fees. Kearney, at *8-9.

The Court ordered the parties to answer specific discovery questions, which the Government did not answer, on “information comparing the cost of its proposed document retrieval method and amount at issue in the case, any cost/benefit analysis of the discovery methods proposed, or a statement of who should bear those costs.” Kearney, at *9.

The Court was not keen on the Government outright searching the personal data of others unrelated to the case. As the Court stated:

The public and the university’s student population may be understandably reluctant to request accommodations or voice their concerns about disparate or discriminatory treatment if, by doing so, their private files can be scoured through by the federal government for a wholly unrelated case. The government’s reach cannot extend that far under the auspices of civil discovery; at least not without first affording all nonparties impacted with an opportunity to consent or object to disclosure of information from or related to their files.

Kearney, at *18-19.

The Court stated it would not order the production of over 51,000 files with a clawback order. Moreover, the cost to review all of the ESI exceeded the value of the request. Kearney, at *19.

The Court did not accept the Government’s claim that it needed to conduct an expansive search. Kearney, at *19-20. The Court stated the following on the fundamentals of civil discovery:

Searching for ESI is only one discovery tool. It should not be deemed a replacement for interrogatories, production requests, requests for admissions and depositions, and it should not be ordered solely as a method to confirm the opposing party’s discovery is complete. For example, the government proposes search terms such as “document* w/25 policy.” The broadly used words “document” and “policy” will no doubt retrieve documents the government wants to see, along with thousands of documents that have no bearing on this case. And to what end? Through other discovery means, the government has already received copies of UNK’s policies for the claims at issue.

Kearney, at *20.

The Court further stated that “absent any evidence that the defendants hid or destroyed discovery and cannot be trusted to comply with written discovery requests, the court is convinced ESI is neither the only nor the best and most economical discovery method for, and depositions should suffice—and with far less cost and delay.” Kearney, at *21.

Bow Tie Thoughts

This case has significant privacy interests, but at its core the issue is one of proportionality. What was the cost of discovery and its benefit? In the end, the cost of expansive search terms that impacted the rights of third parties outweighed the benefit of the discovery to the case.

The fact we have amazing search technology that can search electronic information does not mean we can forget how to litigate. The use of “search terms” cannot swallow the actual claims of a case.

It is heartening to see a Court say no to the data of unrelated third parties being enveloped into a discovery production. While there are many ways to show discrimination, requesting the electronically stored information, protected by Federal and most likely state law, of third parties should give any Court pause.

The use of predictive coding to focus the scope of discovery, or visual analytics to identify relevant information, or clustering to organize similar information is fantastic technology to expedite review. However, even with that technology available, lawyers still have to use requests for admissions and interrogatories, and narrowly tailor their requests for responsive ESI.

 

Does Proportionality Disappear If a Lawyer Says “Predictive Coding” Three Times?

The In re Bridgepoint Education case is not one about the merits of predictive coding, but one of proportionality over expanding the scope of discovery by nine months. In re Bridgepoint Educ., 2014 U.S. Dist. LEXIS 108505, 10-11 (S.D. Cal. Aug. 6, 2014). While the cost of document review and the use of predictive coding have a starring role in the opinion, let’s not forget the second discovery dispute in the case ultimately is about proportionality.

The Defendants claimed that expanding the scope of discovery by nine months would increase their review costs by 26% or $390,000 (based on past review efforts in the case). In re Bridgepoint Educ., at *6-7.


The Plaintiffs countered that the review costs would more likely be $11,279, because of the predictive coding system the Defendants would use instead of manual review. In re Bridgepoint Educ., at *7.
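The gap between the two estimates is worth working through. The Python sketch below uses only the figures quoted above. The "implied baseline" assumes the 26% increase was measured against the Defendants' total review spend to date, which is my reading of the summary and not a number stated in the opinion.

```python
# Figures from the parties' competing estimates.
defendants_estimate = 390_000   # added review cost claimed by the Defendants
increase_fraction = 0.26        # the same increase expressed as a percentage
plaintiffs_estimate = 11_279    # Plaintiffs' estimate relying on predictive coding

# Assumption: 26% is relative to total past review cost in the case.
implied_baseline = defendants_estimate / increase_fraction

# How far apart the two sides were.
ratio = defendants_estimate / plaintiffs_estimate

print(f"Implied past review cost: ${implied_baseline:,.0f}")
print(f"Defendants' estimate is about {ratio:.1f} times the Plaintiffs' estimate")
```

Whatever the right number is, the size of the gap explains why the dispute turned on whether predictive coding replaces attorney review or only prioritizes it.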

The Defendants countered that “predictive coding” did not make “manual review” for relevance elective, because the predictive coding software assigned a percentage estimate to each record on the record’s probability of being relevant. Id. As such, attorney review is still required for relevance and privilege review. Id.

The Court denied expanding the scope of discovery by nine months based on the “proportionality” rule of Federal Rule of Civil Procedure Rule 26(b)(2)(C). The Rule states a Court can limit discovery if the “burden or expense of the proposed discovery outweighs the likely benefit.” In re Bridgepoint Educ., at *9-10.

The Court found expanding the scope to be unduly burdensome. Moreover, while there might have been relevant information in the expanded timeframe, the Court agreed with the Defendants that relevant information would be in the originally agreed timeframe. In re Bridgepoint Educ., at *10-11.

Predictive Coding was also at the center of the fourth discovery dispute. The Plaintiffs argued discovery produced from three Individual Defendants should be added to the Defendants’ predictive coding software. In re Bridgepoint Educ., at *12. According to the Plaintiffs, the Defendants “unilaterally-selected search terms” to identify the original production. Id.

The Defendants argued their review process for the original production was reasonable. Moreover, adding the original production to the predictive coding process could “negatively impact the reliability of the predictive coding process.” In re Bridgepoint Educ., at *12-13. However, the Defendants were willing to run additional searches on the Individual Defendants’ production. In re Bridgepoint Educ., at *13.

The Court noted that the Defendants’ linear search methodology for the three Individual Defendants had been approved by the Court. As such, the Court ordered the parties to meet and confer on additional search terms on the original production for the Individual Defendants. Id.

Bow Tie Thoughts

In re Bridgepoint Education is an interesting spin on predictive coding cases, because effectively the REQUESTING party is arguing for the producing party to use predictive coding to reduce proportionality issues.

First things first: Saying “predictive coding” has no magical properties. Nor will review costs decline by getting the opposing party to say Rumpelstiltskin. Even if an attorney is somehow tricked into saying Mister Mxyzptlk’s name backwards in a hearing, a Court will always be concerned about proportionality before expanding the scope of discovery.

Proportionality will always be concerned with cost of review, but discovery review does not exist independent of the case. The scope of discovery should not be expanded solely because the cost of review can be reduced by leveraging advanced search capabilities. The issue is whether there are relevant records in the expanded universe and if the “burden or expense of the proposed discovery outweighs the likely benefit.”

Finally, just because one search methodology was used to identify records over another does not devalue the responsiveness of the production. If a requesting party is concerned with the adequacy of a production, challenge it accordingly by showing production gaps or other evidence to demonstrate the production is inadequate.

Stuck in the Predictive Coding Pipeline

ExxonMobil Pipeline had a problem in discovery: their discovery responses were overdue. The requests for production were served in November 2013 and due, after one extension, in January 2014. The Plaintiffs rightly brought a motion to compel.

The Defendants had enough discovery to give most eDiscovery attorneys a migraine with a nosebleed: 16 separate lawsuits, with 165 discovery requests in one case, a total of 392 requests in all the related cases, and 83 custodians with approximately 2.7 million electronic documents. Other discovery going back to 1988 had over 63,000 paper documents that were scanned and to be searched with keywords. Additionally, there were approximately 630,000-800,000 documents that had to be reviewed for responsiveness, confidentiality, and privilege. The Defendants had produced 53,253 documents consisting of over 191,994 pages. United States v. ExxonMobil Pipeline Co., 2014 U.S. Dist. LEXIS 81607, 5-8 (E.D. Ark. June 9, 2014).


The Defendants suggested using predictive coding in light of the large volume of discovery, but the Plaintiff, the United States, did not agree with the use of predictive coding (at least since the filing of the motions). ExxonMobil Pipeline, at *6. Moreover, the parties did not seek relief from the Court on the use of predictive coding, other than to order the parties to meet and confer. ExxonMobil Pipeline, at *6-7.

The Defendants explained that, using traditional review with 50 attorneys, document review could be completed by the end of June 24 and production by the end of August 2014. ExxonMobil Pipeline, at *6.

The United States disagreed with the Defendants’ assumption of lawyers only reviewing 250 documents/files a day. Moreover, the Defendants did not raise concerns about document review when they entered an agreed upon scheduling order in October 2013. ExxonMobil Pipeline, at *6-7.
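The review-rate dispute comes down to simple throughput math. The Python sketch below uses the figures from the opinion as summarized above (50 attorneys, 250 documents per attorney per day, and 630,000 to 800,000 documents). The function name `review_days` and the alternative rate of 500 documents a day are hypothetical, included only to show how sensitive the timeline is to the assumed reading speed.

```python
import math


def review_days(documents, attorneys, docs_per_attorney_per_day):
    """Working days needed for a linear review, rounded up to whole days."""
    daily_capacity = attorneys * docs_per_attorney_per_day
    return math.ceil(documents / daily_capacity)


for documents in (630_000, 800_000):
    at_250 = review_days(documents, attorneys=50, docs_per_attorney_per_day=250)
    # Hypothetical faster rate, not a figure from the case.
    at_500 = review_days(documents, attorneys=50, docs_per_attorney_per_day=500)
    print(f"{documents:,} documents: {at_250} working days at 250/day, "
          f"{at_500} at 500/day")
```

This counts working days of review only and ignores quality control, privilege logging, and production processing, all of which add time in practice.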

The Court acknowledged that the Defendants had a large volume of discovery to review. Moreover, it was unclear if the parties had agreed to a review methodology before the Court issued its order. Regardless, the Court ordered the Defendants to complete their review and production by July 10, 2014, absent good cause. ExxonMobil Pipeline, at *7-8.

Bow Tie Thoughts

Most attorneys do not think about document review strategies at the beginning of a case. They should. Discovery is the backbone of civil litigation. Unless you know the information you have to review, the strategies to maximize efficiency, and how to review for claims or defenses, document review can be a nightmare experience.

This case does not go into why the Defendants sought agreement from the Plaintiff on the use of predictive coding. I do not agree with that strategy, unless a specific review protocol was ordered at the Rule 16 conference that the producing party wanted to change.

The issue with a document production is whether or not the production is adequate. Lawyers should agree to the subject matter of the case, custodians, date ranges, and other objective information that goes to the merits of the lawsuit. When lawyers start asking each other for permission on whether they can use predictive coding, visual analytics, clustering, email threading, or any other technology, civil litigation becomes uncivil. Case in point: the Plaintiffs argued the Defendants could review more than 250 documents a day in this case. Such disputes turn into an academic fight over how much lawyers can read and analyze in a 9-hour workday. The end result of such motion practice would be a Judge ordering lawyers to read faster.

My advice is to focus on the merits and not derail the case with a fight over what review technology can be used. Fight over whether the production is adequate, not whether you can use predictive coding.

Guess What? Cooperation Does Not Mean Privilege or Relevancy Are Dead

Here is the big lesson from the latest Biomet opinion over predictive coding:

The Steering Committee wants the whole seed set Biomet used for the algorithm’s initial training. That request reaches well beyond the scope of any permissible discovery by seeking irrelevant or privileged documents used to tell the algorithm what not to find. That the Steering Committee has no right to discover irrelevant or privileged documents seems self-evident.

United States District Court Judge Robert Miller, In re Biomet M2a Magnum Hip Implant Prods. Liab. Litig., 2013 U.S. Dist. LEXIS 172570, at *3 (N.D. Ind. 2013).

One word: Good.

Cooperation does not mean attorney work product is eviscerated when discussing predictive coding. Moreover, if ESI is not relevant, why drive up discovery costs in reviewing it? Furthermore, Federal Rule of Civil Procedure Rule 26(b)(1) does not allow a requesting party to find out how the producing party used ESI before its production. Biomet, at *4.

The opinion goes on to discuss Biomet’s position that it had produced all discoverable documents to the Steering Committee. However, this is where Judge Miller issued a judicial warning: Biomet did not need to identify its seed set, but the “unexplained lack of cooperation in discovery can lead a court to question why the uncooperative party is hiding something, and such questions can affect the exercise of discretion.” Biomet, at *5-6.

The Court held it would not order Biomet to disclose its seed set, but did “urge” Biomet to “re-think its refusal.” Biomet, at *6.

Bow Tie Thoughts

There is no good answer to the issue in this case. Technology issues should be worked out by experts in a non-combative way when it comes to production formats, scope of data, date ranges, custodians and other objective factors in conducting a search. Courts really do not want to get sucked into it. However, one issue since Da Silva Moore v. Publicis Groupe & MSL Group is the idea that parties need to have a transparent process that both sides agree to for predictive coding. I do not think the Federal Rules of Civil Procedure require such disclosures at all. Moreover, it intrudes into attorney work product.

What is the answer? I would suggest that a requesting party demonstrate there is a production gap or otherwise show how the production is deficient. Otherwise, the dispute easily escalates into a quagmire over discovery about discovery. Nobody wins when that happens.

As for a producing party, I would not take a position that could incur the wrath of a Court if the requesting party later demonstrates a production was deficient.

Clawback Agreements to Eliminate the Burden of Privilege Review

I am gonna hit the brakes, he’ll fly right by me.

Maverick, Top Gun

You’re gonna do what?

Merlin, Top Gun

Attorneys feel compelled by their duty of competency to their clients to have eyes on every document that might be privileged.


Given cases such as J-M Manufacturing, it is easy to see why attorneys are concerned about privilege review in large document review cases.

If an associate attorney were to tell a partner she could rely on a clawback agreement and not conduct privilege review, it is a safe bet the partner would respond as Merlin did to Maverick in Top Gun.

In re Coventry Healthcare, Inc. v. This Document Relates involved issues of undue burden in producing electronically stored information. The Defendants argued that the Plaintiff’s proposed search terms produced “hits” of nearly 200,000. The Defendants claimed the estimated cost to process, host and review the data for responsiveness and privilege was approximately $388,000. In re Coventry Healthcare, Inc. v. This Document Relates, 2013 U.S. Dist. LEXIS 39050, at *14 (D. Md. Mar. 21, 2013). Accordingly, the Defendants sought relief under the proportionality principles of Rule 26(b)(2)(C)(iii).

The parties, to their credit, had negotiated over search terms to ease the burden on the Defendants (and ultimately themselves in what the Plaintiffs would have to review).

The Court made what some would consider a surprising statement: “[A] clawback order can protect Defendants against a claim of waiver, such that Defendants need no longer bear the cost of reviewing the ESI for responsiveness and privilege.” In re Coventry Healthcare, Inc., at *16. As such, the Defendants failed to show undue burden in producing electronically stored information.

The Court’s ruling is not the first time such an order was given. As stated in footnote 6, former Magistrate Judge (now District Court Judge) Paul Grimm found that “the more practical approach is to avoid the necessity of an expensive and time-consuming privilege review by entry of a court order with a clawback provision that protects against a claim of waiver by production of a privileged document.” In re Coventry Healthcare, Inc., at 14-15, fn 6, citing Hopson v. Mayor & City Council of Baltimore, 232 F.R.D. 228 (D. Md. 2005).

The Court granted the Plaintiff’s motion to compel and invited the Defendants to confer with the Plaintiffs on a clawback order if they wanted one.

Bow Tie Thoughts

There are several thought leaders who have argued that the cost of privilege review can be eliminated with protective orders and clawback agreements. While having both is definitely a good idea, many attorneys feel terrified at not conducting some sort of privilege review.

One alternative to traditional privilege review is to leverage technology-assisted review in identifying potentially privileged ESI. I would still always recommend a clawback agreement and potentially a protective order, depending on the subject matter of the ESI. However, TAR such as predictive coding could be used to identify potentially privileged material. Generally speaking, many attorneys have an idea what sort of communications or files will be protected by a privilege. Leveraging technology to identify attorney-client communications, intellectual property, or personally identifiable information can enable the data to be organized based on both privilege and responsiveness. This allows the attorneys to review data that is potentially privileged, if they are so inclined to have eyes on such ESI.

2012 Case Law Year-In-Review

2012 eDiscovery Case Law included everything from Tweets to Computer-Assisted Review. However, there was also a very basic theme that is hard to ignore: Cases should be about the merits. And for cases to be decided on the merits, attorneys need to educate themselves on electronic discovery so they know what to argue to a Judge.

Many 2012 cases focused on search terms, search term efficiency, demonstrating undue burden, and proportionality, which all highlighted the need for attorneys to understand electronic discovery. Attorneys cannot competently represent their clients without understanding what technology is relevant in a case; the possible sources of electronically stored information; and what technology to use to review electronically stored information.

I discuss many of these cases in my Year in Review, available on the above YouTube link and on my podcast channel.

I wish everyone a very successful 2013.

Judicial Test Pilot

When you think of Tom Wolfe’s “The Right Stuff,” test pilots and astronauts immediately come to mind. Aviators who risked their lives testing new technology that went higher and faster than anything else that could fly.

The pilots who flew the X planes of the 1940s to 1960s built the future we have today. Simply put, without the Space Program, we would not have had the Computer Era of the 1970s. Without the innovations of the 1970s, we would not have social networks and smartphones of today.

Lawyers and Judges do not come to mind when you say “test pilot.”  However, we do have brave attorneys and judges willing to “fly higher” than others for the greater good. Magistrate Judge Andrew Peck is one of them for his Da Silva Moore v Publicis Groupe & MSL Group opinion.

Judge Peck’s “computer-assisted review” opinion is a watershed because of its recognition of using technology to save money and find responsive electronically stored information.  As Michael Arkfeld commented on the Moore opinion being upheld, “Years from now we will look back and refer to this as the Zubulake for Search!”

When discovery is in the terabytes and there are millions of files to review, Federal Rule of Civil Procedure Rule 1 requirements to “secure the just, speedy, and inexpensive determination” of a lawsuit are often the first casualties of litigation. Fed. R. Civ. P. 1.

The Plaintiffs challenged Judge Peck’s Da Silva Moore order on numerous grounds. District Judge Andrew Carter upheld the order, referring to the findings as “well reasoned and they consider the potential advantages and pitfalls of the predictive coding software.”  Moore v. Publicis Groupe & Msl Group, 11 Civ. 1279 (ALC) (AJP).

Let’s explore the significance of Magistrate Judge Peck’s order.

What is Computer Assisted Review?

There are many forms of “computer-assisted review” besides “predictive coding.”  For example, California Rules of Court Rule 3.750(b)(10) allows a Court in complex litigation to order the parties to use an “electronic document depository.”  Such a depository would likely be a “cloud” solution, enabling parties to individually login to access the discovery for search and review.

Magistrate Judge Facciola is also no stranger to cases highlighting “computer-assisted review.”

In El-Amin v. George Wash. Univ., Judge Facciola ordered the parties consider using a hosted review platform.

The order set out as a “primary goal” for the parties to select a review platform with “hyper-linked to fields in a database that will permit the instantaneous retrieval from within the database of the information offered by plaintiffs in support of any factual proposition.”  El-Amin v. George Wash. Univ., 2008 U.S. Dist. LEXIS 85009 (D.D.C. Oct. 22, 2008).

Judge Facciola also warned against using every new litigation support technology to try to find electronically stored information:

[N]ew technologies have the capacity to be outcome determinative but often at significant expense. Thus the courts are required to strike a balance between allowing the requesting party to take full advantage of the technologies available to it and protecting the producing party from having to pay to leave no stone unturned. Resting all of the costs of electronic discovery on the producing party may create a perverse incentive on the part of the requesting party to dispense with reason and restraint and unleash every new technology under the sun to try and find information that supports the requesting party’s claims.

Covad Communs. Co. v. Revonet, Inc., 2009 U.S. Dist. LEXIS 47841, at * 29-30 (D.D.C. May 27, 2009).

Lite This Candle: Data Analytics & Discovery

“This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.”

Judge Andrew Peck

Moore v. Publicis Groupe & Msl Group, 2012 U.S. Dist. LEXIS 23350, at *2 (S.D.N.Y. Feb. 24, 2012)

“Computer-Assisted Review” as discussed in the Da Silva Moore opinion focused on “predictive coding.”

This is significantly different from many of the past cases addressing “computer-assisted review,” because many of those cases focus on either 1) simply using a review application; or 2) search terms and/or the adequacy of a production.

Judge Peck defined “computer-assisted coding” in his article “Search Forward,” as the “use [of] sophisticated algorithms to enable the computer to determine relevance, based on interaction with (i.e., training by) a human reviewer.” Moore, at *5.

“Predictive coding” is probably better defined as data or mechanical analytics. “Data Analytics” is defined on SearchDataManagement.com as follows:

Data analytics (DA) is the science of examining raw data with the purpose of drawing conclusions about that information. Data analytics is used in many industries to allow companies and organization to make better business decisions and in the sciences to verify or disprove existing models or theories. Data analytics is distinguished from data mining by the scope, purpose and focus of the analysis. Data miners sort through huge data sets using sophisticated software to identify undiscovered patterns and establish hidden relationships. Data analytics focuses on inference, the process of deriving a conclusion based solely on what is already known by the researcher.

Taking Flight: Goals of Discovery Review

The practice of law is frequently a casualty when it comes to reviewing large sets of data in discovery.

It is important not to forget the goals of discovery: Document review does not exist for the sake of document review; it exists to find relevant information.

Magistrate Judge Peck outlined the following goals of discovery review in his order, which stand as a strong reminder of the purpose of discovery:

The objective of review in ediscovery is to identify as many relevant documents as possible, while reviewing as few non-relevant documents as possible.

Recall is the fraction of relevant documents identified during a review; precision is the fraction of identified documents that are relevant.

Thus, recall is a measure of completeness, while precision is a measure of accuracy or correctness. 

The goal is for the review method to result in higher recall and higher precision than another review method, at a cost proportionate to the “value” of the case.

Moore v. Publicis Groupe & Msl Group, 2012 U.S. Dist. LEXIS 23350, at *27 (S.D.N.Y. Feb. 24, 2012) (Emphasis Added)
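
For readers who want to see how the two measures in Judge Peck's definition work in practice, here is a minimal sketch in Python. The document IDs and the function name are hypothetical and made up for illustration; they do not come from the opinion or from any review platform. The sketch also assumes you already know which documents are truly relevant, which in a real matter is usually estimated from a sample.

```python
def recall_precision(relevant_ids, retrieved_ids):
    """Return (recall, precision) for a review.

    relevant_ids:  documents that are actually relevant (assumed known here).
    retrieved_ids: documents the review method identified as relevant.
    """
    relevant = set(relevant_ids)
    retrieved = set(retrieved_ids)
    hits = relevant & retrieved  # relevant documents the review found

    # Recall: fraction of relevant documents identified during the review.
    recall = len(hits) / len(relevant) if relevant else 0.0
    # Precision: fraction of identified documents that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical example: four relevant documents exist; the review flagged three,
# two of which are actually relevant.
relevant = {"DOC-001", "DOC-002", "DOC-003", "DOC-004"}
retrieved = {"DOC-001", "DOC-002", "DOC-005"}

recall, precision = recall_precision(relevant, retrieved)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this made-up example the review found half of the relevant documents (recall of 0.50), and two out of every three documents it flagged were relevant (precision of about 0.67). A review method that raises both numbers at a proportionate cost is the goal described in the order.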

The Eyes of the World Are Upon You: Why Court Found Computer Assisted Review Appropriate

Magistrate Judge Peck found “computer-assisted review” to be appropriate for the following reasons:

(1) The parties’ agreement [Something contested by the Plaintiffs]

(2) The vast amount of ESI to be reviewed (over three million documents)

(3) The superiority of computer-assisted review to the available alternatives (i.e., linear manual review or keyword searches)

(4) The need for cost effectiveness and proportionality under Rule 26(b)(2)(C)

(5) The transparent process proposed by [Defendant].

Moore v. Publicis Groupe & Msl Group, 2012 U.S. Dist. LEXIS 23350, at *35-36 (S.D.N.Y. Feb. 24, 2012)

Fairing: Reading Rule 26(g)

“In large-data cases like this, involving over three million emails, no lawyer using any search method could honestly certify that its production is “complete” — but more importantly, Rule 26(g)(1) does not require that. Plaintiffs simply misread Rule 26(g)(1).”

Judge Andrew Peck

Moore v. Publicis Groupe & Msl Group, 2012 U.S. Dist. LEXIS 23350, 20-23 (S.D.N.Y. Feb. 24, 2012).

The Plaintiffs challenged the predictive coding protocol under Rule 26(g)(1)(A), arguing that production had to be certified as “complete,” and that the Defendants were given “unlawful cover” with the predictive coding protocol. Moore v. Publicis Groupe & Msl Group, 2012 U.S. Dist. LEXIS 23350, at *21 (S.D.N.Y. Feb. 24, 2012).

Judge Peck held the Plaintiffs’ reading of Rule 26(g)(1)(A) was erroneous, because the certification requirements applied to initial disclosures under Rule 26(a)(1); discovery responses are covered by Rule 26(b)(2)(C)’s proportionality principle.  Moore v. Publicis Groupe & Msl Group, 2012 U.S. Dist. LEXIS 23350, 20-23 (S.D.N.Y. Feb. 24, 2012).

Go Around: Federal Rule of Evidence Rule 702 & Daubert

The Plaintiffs challenged the predictive coding protocol as violating Federal Rule of Evidence Rule 702 and Daubert, in part, because the Defense experts spoke at the hearing, but were not sworn in at the time. Moore v. Publicis Groupe & Msl Group, 2012 U.S. Dist. LEXIS 23350, at *23 (S.D.N.Y. Feb. 24, 2012).

The Court held that Rule 702 & Daubert apply to presenting evidence at trial; the Rules are not applicable in how ESI is searched for in discovery. Id. As the Court explained:

If MSL sought to have its expert testify at trial and introduce the results of its ESI protocol into evidence, Daubert and Rule 702 would apply. Here, in contrast, the tens of thousands of emails that will be produced in discovery are not being offered into evidence at trial as the result of a scientific process or otherwise. The admissibility of specific emails at trial will depend upon each email itself (for example, whether it is hearsay, or a business record or party admission), not how it was found during discovery.

Moore v. Publicis Groupe & Msl Group, 2012 U.S. Dist. LEXIS 23350, at *23 (S.D.N.Y. Feb. 24, 2012).

Message from the Control Tower

District Judge Andrew Carter upheld Magistrate Judge Peck’s order after the Plaintiffs challenged Judge Peck’s order on numerous grounds. Moore v. Publicis Groupe & Msl Group, 11 Civ. 1279 (ALC) (AJP).

Judge Carter said the following on the protocol (and the confusion over whether the Plaintiffs agreed to the protocol):

Nevertheless, the confusion is immaterial because the ESI protocol contains standards for measuring the reliability of the process and the protocol builds in levels of participation by Plaintiffs. It provides that the search methods will be carefully crafted and tested for quality assurance, with Plaintiffs participating in their implementation. For example, Plaintiffs’ counsel may provide keywords and review the documents and the issue coding before the production is made. If there is a concern with the relevance of the culled documents, the parties may raise the issue before Judge Peck before the final production. Further, upon the receipt of the production, if Plaintiffs determine that they are missing relevant documents, they may revisit the issue of whether the software is the best method. At this stage, there is insufficient evidence to conclude that the use of the predictive coding software will deny Plaintiffs access to liberal discovery. 

Moore v. Publicis Groupe & Msl Group, 11 Civ. 1279, at *3-4 (ALC) (AJP).

Judge Carter found the challenges to the reliability of “predictive coding” were “premature.”  The Court stated, “It is difficult to ascertain that the predictive software is less reliable than the traditional keyword search.” Moore v. Publicis Groupe & Msl Group, 11 Civ. 1279, at *4 (ALC) (AJP).

The Court noted that experts were present at the hearing and that the “lack of a formal evidentiary hearing at the conference is a minor issue because if the method appears unreliable as the litigation continues and the parties continue to dispute its effectiveness, the Magistrate Judge may then conduct an evidentiary hearing.” Id.

Judge Carter stated the following on Plaintiffs’ challenges to the protocol being “speculative”:

Judge Peck is in the best position to determine when and if an evidentiary hearing is required and the exercise of his discretion is not contrary to law. Judge Peck has ruled that if the predictive coding software is flawed or if Plaintiffs are not receiving the types of documents that should be produced, the parties are allowed to reconsider their methods and raise their concerns with the Magistrate Judge. The Court understands that the majority of documentary evidence has to be produced by MSLGroup and that Plaintiffs do not have many documents of their own. If the method provided in the protocol does not work or if the sample size is indeed too small to properly apply the technology, the Court will not preclude Plaintiffs from receiving relevant information, but to call the method unreliable at this stage is speculative. 

Moore v. Publicis Groupe & Msl Group, 11 Civ. 1279, at *4 (ALC) (AJP).

Bow Tie Thoughts

Judge Peck and Judge Carter’s opinions will have a lasting impact in eDiscovery.  This is not a case of machines replacing humans, but the importance of using technology to identify what is relevant to a lawsuit.  Moreover, perfection is not the standard in discovery; proportionality is simply not disappearing ink in the Federal Rules of Civil Procedure.

It is also noteworthy that Judge Carter adopted Judge Peck’s analysis of Rule 26(g) and Federal Rule of Evidence Rule 702 in footnote 3 of his opinion.  I have met many attorneys who viewed productions as needing to be certified as being “complete and correct” under Rule 26(g)(1)(A).  Seeing both a Magistrate Judge and District Judge hold that productions are not held to a standard of perfection, but one of proportionality, should bring more reason to large cases requiring the review of terabytes of ESI.

The goal of discovery is to find relevant information.  Data analytics that assist attorneys in determining relevant ESI will enable lawyers to focus on litigating their clients’ interests, as opposed to slogging through protracted document review.

The adoption of “data analytics” in discovery review will be a significant step to focusing on the practice of law instead of review. However, like any technology, it must be used correctly. This requires attorneys with knowledge, who put the time in to learn how the application operates. Moreover, the system must be tested to determine whether it is working correctly or whether the search terms need to be adjusted to find relevant ESI. There are a host of other technical issues to make sure the process is defensible.

“Data Analytics” will not be limited to helping attorneys with the three million record cases. Solo practitioners and lawyers at mid-sized firms who can leverage this technology (once commercially affordable or offered by service providers) to identify relevant ESI out of 20,000 records in a matter of days, or hours, will provide greater services to their clients than those who are not using such technology.