I’ve been blogging about e-discovery since June 2005, but have removed most older posts. If you are looking for something from 2005-2009 or so, please let me know.
The question comes up very specifically when lawyers advise their clients to “preserve” (through the mechanism of a litigation hold notice), and the first thing clients do in response to such a notice is to go through their email, browsing and searching for potentially relevant messages, and then either print those messages or move those messages into a new folder created specifically for the litigation. The act of opening these emails, of moving them or printing them is itself problematic, because theoretically custodians should be preserving the email messages intact. The problem is that when litigation threatens, individuals have no choice but to go through their records, sometimes frantically, to ensure that they have the records that they need to instruct their lawyers appropriately. In real life this happens even before the actual preservation threshold is reached, as managers usually have a sense that litigation might arise well before counsel are retained.
Indeed, the rules in Ontario require parties to file with their statements of claim or defence any relevant documents on which they intend to rely. Once parties have to gather these documents, the principle of maintaining ESI in its pristine format and environment, as it was kept in the normal course of business, becomes practically (and economically) impossible to follow.
In Siemens Canada Limited v. Sapient Canada Inc, 2014 ONSC 2314 (CanLII), Master DE Short makes a number of important observations covering many aspects of e-discovery, in particular, how parties need to co-operate on the meaning of “proportionality.” I begin with some of the key points:
- Parties must draft a Discovery Plan at the outset of litigation to ensure agreement on the scope of discovery. Failure to do so may lead to the court imposing a plan and costs sanctions. [paras. 39, 151, 158-160]
- Disagreements about the details of a Discovery Plan should be mediated, if necessary with the assistance of a neutral e-discovery expert. The Court is not the best forum for resolving technical e-discovery matters. [paras. 36-40, 89, 145-6]
- Even though the production of documents is subject to the proportionality principle (“proportionality is now the default”), the test for production is relevance to disputed facts in the pleadings. [paras. 57, 161]
- Where relevant data is difficult to collect or may not be recoverable, attempts should be made to preserve it. [para. 101]
- A party cannot unilaterally determine what is proportional, for example, by limiting the number of custodians searched or keywords applied. These decisions should be transparent, co-operative and form part of the Discovery Plan. [paras. 119, 128, 151]
- The fact that a custodian may have only a few relevant emails does not justify excluding that custodian from a search. Large volumes of email and other structured data (such as a project document repository) can be dealt with by using de-duplication and search engines. [paras. 106, 111]
- An employer should not leave individual custodians to determine which emails they consider to be relevant. [para. 149]
- If no response to a litigation hold memo is received from a custodian, follow-up is required. [paras. 95, 139]
- A records retention policy of deleting emails every thirty days “can potentially cause serious problems,” including the necessity to expensively restore back-up tapes. [paras. 136-138, 156]
In 2007 Sapient wins a five-year, $70 million SAP implementation project for Enbridge. Sapient subcontracts with Siemens for conversion and other services. Sapient assigns 120 employees and contractors to the project while Siemens as subcontractor assigns 38. Joint project staff are granted access to Sapient’s project document repository (ResultSpace).
Delays mount up and Sapient terminates the Siemens subcontract. In July 2009 Siemens sues for breach of contract for $20 million. Sapient counterclaims for $10 million for delay.
Siemens carries on discovery the old fashioned way, ignoring the 2010 rule amendments (in particular the requirement of a Discovery Plan). Siemens identifies all 38 project staff as custodians and collects all their data; performs key word searches, reviews and eventually produces some 120,000 documents. (Since it no longer has access to ResultSpace, all Siemens productions come from other sources such as email.)
Sapient also works traditionally (i.e. no meet & confer, no Discovery Plan). But they decide to unilaterally invoke the new principle of proportionality as follows:
- Sapient identifies only 14 custodians (out of 120 project staff) as holders of potentially relevant material, because it is mere speculation that the other custodians possess any additional evidence material to the case
- When identifying potential sources of relevant information, Sapient overlooks the project document repository.
- Sapient asks the 14 custodians to review and select for production only what they believe to be relevant
- When encountering technical problems with collecting from four of the custodians, Sapient makes no effort to preserve or recover the data.
- Sapient refuses to share with Siemens any information about what sources or custodians were identified or what keywords were used.
- Manual relevancy review does not include emails relating to the “status of the project” because Sapient asserts that project status was well documented in minutes, status reports etc.
Sapient initially produces just over 21,000 records, one-sixth of Siemens’ production, even though it has three times as many project staff.
During oral discovery, Sapient questions a Siemens representative on a document that was never produced. When pressed, Sapient admits that it inadvertently neglected the ResultSpace repository. In October 2012, two years after production was supposed to be complete, Sapient produces another 20,000 documents – effectively doubling its production.
Siemens brings a motion for further productions, including more custodians, transparency, and all documents relating to “project status”. Siemens attaches a “state-of-the-art” fifteen-page Discovery Plan (which is not appended to the reasons). While the parties agree on some elements, the main disagreement is about scope.
Is it really necessary (or proportional) for Sapient to add more custodians when so many documents have already been produced, and the existence of any new emails that might be material to the issues is only speculative?
Is it really necessary for Sapient to find all emails relating to “project status” when this is already well documented in project status reports, meeting minutes and elsewhere?
Should the scope of relevance in the Discovery Plan be expressly limited by the words Subject to the principle of proportionality, as argued by Sapient?
Siemens was granted partial relief. Sapient was ordered to:
- Search emails and ResultSpace data for an additional eight custodians. (Assessment of the emails of these custodians would determine whether even more custodians might need to be added.)
- Restore and search more backup tapes for the 10 original custodians.
- Apply the original search terms to the complete email files (not just the self-selected emails).
- Include emails about the “status of the project” when doing manual review of the search results.
Despite the Master’s concerns that what is “proportional” can be interpreted differently by each party, and that otherwise producible documents might be withheld as a result, the Master agreed that it would be useful to preface the scope of relevance with the words Subject to the principle of proportionality in order to show that the parties have paid attention to the provisions of rule 29.1.03(3)(e).
No costs were awarded under rule 29.1.05 because the parties did not comply with the Discovery Plan rule.
 “The discovery plan shall be in writing, and shall include any other information intended to result in the expeditious and cost-effective completion of the discovery process in a manner that is proportionate to the importance and complexity of the action.”
 “On any motion under Rules 30 to 35 relating to discovery, the court may refuse to grant any relief or to award any costs if the parties have failed to agree to or update a discovery plan in accordance with this Rule.”
The scope and method of data collection should be determined after a legal and strategic assessment of all the pertinent facts and issues in the proceedings, and in accordance with a plan, preferably as agreed with other parties, including regulators. Limited scope and improper self-collection (e.g. undocumented procedures) by clients can lead to serious problems down the road including:
- Challenges to the admissibility or weight of the evidence
- Time and expense of collecting data twice
- Sanctions for missing relevant custodians or data sources
- Possible loss of critical data and metadata leading to inability to prove or defend case
- Court or regulatory sanctions for spoliation
- Potential breach of employee privacy rights
On the other hand, full-bore forensic imaging can be costly, intrusive and unnecessary. There is no “right approach” or cheat sheet that covers all cases, because the proper scope and method of collection depend on many factors, including:
- the type of proceeding – litigation, arbitration, investigation?
- the type of matter – family law, commercial, product liability, employment?
- factors of proportionality – what’s at stake here?
- whether allegations of fraud have been made
- whether communications have been or are in danger of being repudiated (“I did not send that email”)
- trust level between the parties
- whether an agreement has been (or can be) negotiated
In addition to the legal and strategic issues around collection are the technical ones. There are many applications that purport to collect data forensically, and many experts out there with differing and sometimes confusing qualifications. Different methods, skills and tools must be used on different sources of data. For example, collecting Facebook pages is a very different exercise from collecting BlackBerry text messages or deleted emails from an Exchange server.
Effective planning is the only way to ensure that collection is done appropriately for the matter at hand and in a cost-effective but defensible way.
NOT READY FOR PRIME TIME?
- data that is not collected properly may need to be collected again
- custodians who are missed the first time around may be gone (with their data) when most needed
- data might be collected from duplicate sources, or even collected twice from the same source
- huge volumes of data may be collected, processed, searched and reviewed unnecessarily
- critical repositories of relevant information are easy to miss
- if the integrity or authenticity of produced data is challenged, no adequate response may be forthcoming due to the lack of process and documentation
- the whole e-discovery process is tinged with panic which puts you at a disadvantage
- legal fees and commercial vendor bills escalate out of control as more work is delegated
If e-discovery is treated as an emergency then you are not prepared.
GETTING STARTED – THE CRITICAL ELEMENTS
- Compliance with applicable records management and admissibility standards (ISO 15489 and many other standards and best practices listed here, CAN/CGSB 72.34). Proper records management is the foundation of the E-discovery Reference Model and the Information Governance Reference Model.
- Document retention policy updated to include electronic records. Many organizations have figured out retention periods and practices for document destruction – but these rules are usually impossible to apply in the case of unstructured data and e-mail. You need a new philosophy, perhaps best expressed today in the Capstone approach of the US government.
- Current data inventory and map. Many organizations store information in a variety of decentralized locations and applications. Users often have the ability to create undocumented data repositories. Getting a handle on where data resides and what format it is in can be crucial for a quick and effective litigation response. Moreover, it is usually these undocumented repositories where damaging records may reside. In one recent case, a party “forgot” to check its project document repository for relevant records. “During oral discovery, Sapient questions a Siemens representative on a document that was never produced. When pressed, Sapient admits that it inadvertently neglected the ResultSpace repository. In October 2012, two years after production was supposed to be complete, Sapient produces another 20,000 documents – effectively doubling its production.” Siemens Canada Limited v. Sapient Canada Inc, 2014 ONSC 2314.
- Policies, procedures and audit for backup and archive. Every organization has procedures for backing up data. But here are a few realities:
- Backup procedures are often put into place by the IT department, and often using default configurations, without regard to records management policies (if those even exist)
- Backup procedures, if documented, are often not followed in practice because IT staff are reluctant to “delete” anything that might conceivably be needed in the future
- Without an effective archive and backup policy, duplicative and legacy documents, along with documents long thought to be deleted, become a very active part of the e-discovery project
- Litigation hold policy, procedure and precedent. You know your organization best. While outside counsel can assist with drafting the legal content of a hold, your legal hold should be based on a precedent that fits the records management procedures used by your custodians.
- Collection process, documentation and oversight. It is often not necessary for an outside computer forensic specialist to image hard drives and collect data. However, unless your staff are trained in the appropriate processes, are provided with the appropriate tools, and have developed the habit of documenting what they do, costly outside services will always be recommended.
- Awareness and training. When custodians (users) receive a legal hold notice – do they know what to do? When the IT department is asked to search for documents or emails in a certain date range, do they know what to do? Or will they wing it? Everyone on the litigation response team must be trained like firefighters, and anyone who might be affected by an e-discovery project should be comfortable with the process.
- Testing and updating. Any response plan must be tested, modified accordingly, and updated regularly.
- Planning, communication and reporting. E-discovery usually involves internal staff from IT, records management, legal and operations. It also involves outside counsel, and may involve commercial suppliers. It is critical that all parties are in touch regularly when a response is triggered.
- Cross-disciplinary team with connections to outside counsel. Appointing a single paralegal or lawyer as the litigation response case manager is a great idea, but the strongest response is planned and executed by a cross-disciplinary team. The larger the organization, the larger and more diverse the team should be.
Litigation Response Plan – series of videos and documentation from US lawyer Tom Howe. http://www.litigationresponseplan.com/
Stephen O’Leary – Litigation Response Planning. http://web.simmons.edu/~wilczek/ediscovery/oleary-armaboston.pdf
Government of Alberta, Litigation Readiness and Information Management, http://www.im.gov.ab.ca/documents/imtopics/Litigation_Readiness_and_IM_Tip_Sht_1.pdf.
Litigation Response Planning and Policies for E-Discovery, http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_036581.hcsp?dDocName=bok1_036581
Richard Medina, How to Develop and Implement Your Discovery Readiness Program – http://www.cmswire.com/cms/information-management/how-to-develop-and-implement-your-discovery-readiness-program-020464.php.
The law (and outside counsel) demands admissibility and defensibility.
ADMISSIBILITY
To defend challenges to admissibility we need to prove the integrity of the information system (Evidence Act). This supports our claim of authenticity (the evidence is what it purports to be) and best evidence.
Proof of anything requires documentation. The only way to prove authenticity of an individual record is with metadata. This requires appropriate (and secure) methods and tools for preservation and handling of potentially relevant data.
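One common way to document that a collected file has not changed since collection is to record a cryptographic hash alongside basic handling metadata. The sketch below is only an illustration of that idea, not a requirement of any statute or standard mentioned here; the function names and the fields in the log entry are assumptions chosen for the example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def log_collected_file(path, collected_by):
    """Build one collection-log entry for a file (hypothetical field names)."""
    data = Path(path).read_bytes()
    return {
        "file": str(path),
        "size_bytes": len(data),
        # SHA-256 digest of the file contents at the time of collection
        "sha256": hashlib.sha256(data).hexdigest(),
        "collected_by": collected_by,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def verify_collected_file(path, entry):
    """Return True if the file still matches the hash recorded in the log."""
    current = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return current == entry["sha256"]
```

If the file is later opened and saved with even a one-character change, `verify_collected_file` returns False, which is the kind of documented answer you want to have ready when integrity is challenged.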
The integrity of the information system itself can be proven with evidence of compliance with a standard such as ISO 15489 (Records Management) or CGSB 72.34 (Admissibility of Electronic Records).
Good RIM practices are the foundation of the EDRM.
To defend challenges to defensibility we need to prove that our collection is complete (subject to objective claim of proportionality). This means our methods and tools for searching, collecting and reviewing must comply with the rules, developing case law and available and emerging technologies. Best practice is Standard Operating Procedure (litigation response plan) not Seat of the Pants.
Self-collection is frowned upon. Use of modern tools including machine learning is encouraged. Defensibility also requires co-operation and documentation.
- Modern lawyers
- Forward-thinking judge
- Technology-ready courtroom
- Trained court services staff
- Hardware, software, network
- Experienced IT and litigation support professionals for network setup, electronic transcripts and exhibits
- Practicable agreed procedures and protocols
- Judicial electronic records management policy
Careful readers of the revised Sedona Canada Principles will note the demise of the “meet and confer” in favour of a more shapeless process called “co-operation.” In my comments on the public draft I didn’t complain about this, as it had been proposed at a Sedona Conference meeting in Toronto and discussed quite thoroughly.
The issue is that lawyers felt that being shoehorned into a specific procedure is unduly restrictive. Being lawyers they start asking questions like – does the meet and confer have to be face to face? What about an exchange of emails – does that qualify as a meet and confer? Concerns about delay and cost relating to organizing in-person meetings I think are legitimate. Why have a face to face when agreement on discovery scope in a routine matter can be reached over the phone?
Of course the original meet and confer concept never required a face to face meeting and I don’t think there’s any reason to interpret it that way – at least not in 2015, when videoconferencing, web conferencing and conference calling are mature technologies.
To me, if eliminating the meet and confer was a way of making the process less restrictive, I was in favour. Let’s eliminate as many excuses and roadblocks as possible. Adding more formal procedures to the already overburdened civil litigation process is a bad idea. On the other hand, asking litigators to “co-operate” with each other at the outset of a lawsuit without any formal process for doing so has led to failure. Co-operation and litigation do not naturally go together.
In Ontario the fallback position is that since a joint discovery plan is mandatory, the parties have no choice but to co-operate. Ultimately this means that with or without a formal meet and confer the parties must devise some method of coming to agreement on scope and format of disclosure, identity and timing of oral discoveries. But in practice this has met with resistance from the bar – so much so that some believe rule 29.1 (the Ontario rule mandating discovery plans) should be revoked.
Without a mandatory joint agreement there would be no apparent incentive whatsoever to co-operate in e-discovery matters. Of course there are real incentives – like saving the client a great deal of money or getting to the truth more quickly. But for many lawyers those are secondary considerations to the prime directive – fight and win.
Unless the parties agree, providing copies alone – either in hard copy or soft copy – does not satisfy the rules. (The leading case in Ontario is Wilson v. Servier Canada Inc., 2002 CanLII 3615 (ON SC), and a more recent example is Gamble v. MGI Securities, 2011 ONSC 2705 (CanLII).) Together with the copies, parties are supposed to serve a detailed list of the documents produced. Traditionally this list was typed or prepared as a word processed document. Each document is described in the list and numbered, like so:
Tab 31. Email from Green to Brown re contract negotiations dated February 12, 2006.
Tab 32. Signed service contract between Green and Brown, March 1, 2006.
Tab 33. Monthly TD bank statements for account #43-0888765, June-December, 2008.
The parties must have some way of using the numbered list to reference the copies. In a paper production, the copies are usually placed in a binder organized with numbered tabs, each tab corresponding to the serial number on the document list. So a complete paper production must have the following elements:
1. Copies of each document identified with a serial document number (usually indicated on a tab)
2. A numbered list of the documents where each number corresponds to the proper binder tab.
Where the productions are made by way of soft copies, the “tab” approach doesn’t work because there is no physical binder. Instead, the image files can be saved with file names that match a serial number on the list, for example:
Production #31. Letter from Green to Brown re contract negotiations dated February 12, 2006. (The corresponding image file could be named “00000031.pdf”)
Manually referring to a list and then browsing through a collection of named image files is slow and clumsy. It is also completely unnecessary: if the documents are provided in digital form and the list is itself in word processing format, simple hypertext linking lets the reader browse the list of documents and view the corresponding image with a click.
An entry in the list of documents would then look like this:
31. Letter from Green to Brown re contract negotiations dated February 12, 2006. Click here: n:\documents\clients\green\productions\00000031.pdf.
Note that the actual file path (i.e. location of the file on a hard drive) is variable depending on where the produced files are saved. Note as well that there may be performance limits to how many files can exist in a single folder, so in many production sets the soft copies are arranged in folders and subfolders.
So a complete production of soft copies must have the following elements:
1. Image files of the documents, each one named with a unique serial number.
2. A soft copy list of the document image files with hypertext links from each document reference to the corresponding path (location) and filename for each image file representing the document.
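For readers who like to see it concretely, here is a rough sketch of how such a linked list could be generated. It is an illustration under assumptions: the output is a small HTML fragment rather than a word processing document, the function name `build_linked_list` is made up for the example, and the eight-digit file names simply follow the "00000031.pdf" pattern used above.

```python
from html import escape
from pathlib import PureWindowsPath
from urllib.parse import quote


def build_linked_list(entries, base_folder):
    """Return HTML lines, one per document, each linking to its image file.

    entries: iterable of (production_number, description) pairs.
    base_folder: Windows folder where the image files were saved.
    """
    lines = []
    for number, description in entries:
        # Image file name is the production number padded to eight digits.
        target = PureWindowsPath(base_folder) / f"{number:08d}.pdf"
        # Build a file: link; quote() percent-encodes spaces and similar characters.
        href = "file:///" + quote(target.as_posix(), safe="/:")
        lines.append(
            f'<p>{number}. <a href="{escape(href)}">{escape(description)}</a></p>'
        )
    return "\n".join(lines)


print(build_linked_list(
    [(31, "Letter from Green to Brown re contract negotiations dated February 12, 2006.")],
    r"n:\documents\clients\green\productions",
))
```

Every link produced this way is tied to the folder passed in as `base_folder`, which matters as soon as the files are copied somewhere else.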
Remember I mentioned that the location (path) of each file is variable? You saved your productions on your C:\ drive, and I copied them to my F:\ drive. Your hypertext links pointing to C:\ won’t work for me.
To make it easy to modify these links, the standard approach is to provide two related files: one list of documents with a document reference number, and another list linking the document reference number with the corresponding file image name(s) and location. These are called LOAD FILES. While preparing two such files sounds like more work than preparing one, it is actually easier because this is the way litigation support software does it by default.
So a complete production between parties using litigation support software rather than a word processor requires:
1. Image files saved in a folder with unique serial number filenames
2. A structured file (like a spreadsheet) listing the documents with reference to corresponding image file references
3. Another structured file linking each image reference to the location and file name of the corresponding file.
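To make the two-file idea less abstract, the following sketch joins a document list to a link file and resolves each image against whatever drive the recipient copied the production to. It is not any vendor's load file format: the CSV layout, the column names and the function `resolve_links` are assumptions for illustration, and the single sample row reuses the Green and Brown example from this post.

```python
import csv
import io
from pathlib import PureWindowsPath

# First structured file: key elements of each document plus its production number.
DOCUMENT_LIST = """docno,date,author,recipient,title
31,2006-02-12,Green,Brown,Re: Contract Negotiations
"""

# Second structured file: production number mapped to a path RELATIVE to the
# production root, so the list survives being copied to another drive.
IMAGE_LINKS = """docno,relative_path
31,productions\\00000031.pdf
"""


def resolve_links(document_csv, link_csv, base_folder):
    """Join the two files and return {docno: (title, full image path)}."""
    paths = {
        row["docno"]: row["relative_path"]
        for row in csv.DictReader(io.StringIO(link_csv))
    }
    resolved = {}
    for row in csv.DictReader(io.StringIO(document_csv)):
        full_path = PureWindowsPath(base_folder) / paths[row["docno"]]
        resolved[row["docno"]] = (row["title"], str(full_path))
    return resolved


# The producing party saved to C:\, the receiving party copied to F:\.
print(resolve_links(DOCUMENT_LIST, IMAGE_LINKS, "C:\\"))
print(resolve_links(DOCUMENT_LIST, IMAGE_LINKS, "F:\\"))
```

Only the base folder changes between the two calls; neither structured file has to be edited, which is the practical advantage of keeping the description and the location in separate files.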
Document Image file:
First structured file:
This file links the key elements of the document (author, title, recipient, date), to the unique document production number.
Re: Contract Negotiations
Second structured file:
This file links the document production number to the path and filename of the document image file.
Native file production, like all things digital, entirely changes the production process. First, native files already have filenames, and those filenames are important, so they should not be renamed with a serial number. Instead of working with a file called “00000031.pdf”, we now have files with such names as “LB_contract_rev03.docx”, and files with the same name may exist in different folders on the custodian’s hard drive.
As the volume of producible documents increases, especially emails, the concept of preparing a traditional list is daunting. Who can afford to type up a list of every email? Manually preparing a list is not necessary because emails have their own “metadata” or header information. While it may not look pretty, it is available for free. So the structured file listing emails might look like this:
FILE TYPE: MSG
FILE SIZE: 24kb
RE: FW:Available for lunch today?
DATE SENT: 2/12/2006 10:05:36 GMT-5
DATE RECEIVED: 2/12/2006 10:23:15 GMT-5
Note that the email header provides more information than is normally entered manually. Note also, that the MSG file format has been extracted from an Outlook container file. An MSG file contains the email message, metadata about the message, and all attachments. (In this post I am not including a discussion of how to handle attachments, to keep things a little simpler.)
The structured file linking the document list to the native files would look like this:
FILE PATH: \productions\green\email\inbox\FW:Available for lunch today?.msg
So a complete production in native format would include these elements:
1. Copies of the native files in their originally mapped folders (green\email\inbox\)
2. A structured file containing the relevant metadata extracted from the emails, with a unique document production number
3. A structured file linking the unique document production number to the filename and folder location (path) of the native file.
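As a rough illustration of element 2, the sketch below pulls header metadata out of an email and adds a production number. Two assumptions to flag: Python's standard library does not read Outlook MSG files, so the example parses a plain RFC 5322 message (the kind of text found in an .eml file) instead, and the addresses, field names and the function `metadata_row` are invented for the example.

```python
import hashlib
from email import message_from_bytes
from email.utils import parsedate_to_datetime

# Hypothetical sample message; the subject and sent date mirror the example above.
SAMPLE = (
    b"From: green@example.com\r\n"
    b"To: brown@example.com\r\n"
    b"Subject: RE: FW:Available for lunch today?\r\n"
    b"Date: Sun, 12 Feb 2006 10:05:36 -0500\r\n"
    b"\r\n"
    b"Sure.\r\n"
)


def metadata_row(docno, raw_bytes):
    """Return one row for the structured metadata file (hypothetical fields)."""
    msg = message_from_bytes(raw_bytes)
    sent = parsedate_to_datetime(msg["Date"])
    return {
        "docno": docno,
        "from": msg["From"],
        "to": msg["To"],
        "subject": msg["Subject"],
        "date_sent": sent.isoformat(),
        "size_bytes": len(raw_bytes),
        # Hash of the native file, useful later for de-duplication and integrity checks
        "sha256": hashlib.sha256(raw_bytes).hexdigest(),
    }


print(metadata_row(31, SAMPLE))
```

Nobody typed a description of this email; every field in the row came from the message itself, which is why a list of thousands of emails costs almost nothing to prepare.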