Mysteries of Metadata: What to Watch Out For in Commercial Litigation Cases

Jennifer L. Parent
Director, Litigation Department & Chair Business Litigation Practice Group
Published: New Hampshire Bar News
December 15, 2017

Co-authored by Dawn Poulson

You just sent the client’s answer to a 35-page complaint and counterclaims to the court for filing when you notice a new email from opposing counsel.  It’s a draft discovery plan.  Upon review, you discover that the other side is demanding that all electronically stored information (ESI) be produced in native format.  In this contract dispute, you know that there will be a significant amount of email and that drafts of the contract will be at issue.  You remember something about “metadata” and the secrets it may hold.  Is this something you need to worry about given that you helped the client with the negotiations?  How will counsel’s demand increase the cost to your client?

The volume of electronic documents and data created on a daily basis is staggering.  With ever-changing technology and the increased use of email on both company-owned and personal devices, it is not surprising that companies are dealing with the complexities of preserving, collecting, and producing ESI in litigation.  ESI must be produced with caution and care to protect client information, and yet in a format agreed upon by the parties.  Documents today can be produced in many formats to accommodate various document review platforms.  Aside from a paper copy or a PDF, the difference in formats mainly relates to the metadata provided.  Managing the risks associated with metadata lurking unseen in documents is a critical component in every commercial litigation case.

Earlier this year, the Business Court considered the issue of the proper format for the production of ESI in a commercial litigation case.  In Mason v. OSR Open Systems, Inc. et al. (No. 218-CR-2016-CV-1294) (https://www.courts.state.nh.us/superior/orders/bcdd/Mason-v-OSR.pdf), the plaintiff demanded discovery in an entirely native format, claiming that production in any other format limited the ability to utilize features of plaintiff’s discovery review software system.  The defendants objected and moved for a protective order, contending that production in an entirely native format was unduly burdensome due, in large part, to the necessity to review the metadata for privilege before any production and that a native production was not the industry standard.  By motion, the defendants requested that they be allowed to produce ESI in the standard TIFF+ format.  Following two evidentiary hearings and extensive briefing, the Court granted the defendants their relief and denied plaintiff’s motion to produce ESI in an entirely native format.

Metadata is information embedded within the digital framework of an electronically stored document.  It is often described as “data about data.”  It can be invisible to the naked eye when a document is printed and can pose unique concerns for litigants in business disputes as it may contain privileged or confidential information.

As the Court in Mason noted, there are two types of metadata: system metadata and application metadata.  System metadata is “information related by the user or by the organization’s information management system” automatically.  For example, this would include things like the file name, the identity of the user who created, modified, or last saved the file, and other user profile type information.  Application metadata is created as part of “the application software used to create the document or file.”  For example, this would include information about fonts, spacing, and size.  It can also include the history of changes made to a document, modifications, edits, or comments.  This metadata is embedded within the file and travels with the file when it is moved or copied.  In other words, even if “tracked changes” or comments no longer appear in the final version, they may still reside within the document, and that metadata can be viewed only when the document is in its native format.
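As a rough illustration of how deleted text can travel with a file, the sketch below builds a stripped-down stand-in for a Word .docx file and then reads it back.  It relies on the fact that a .docx file is a ZIP archive of XML parts, with tracked deletions stored in w:del elements and author details in docProps/core.xml.  The contract text, the names, and the helper functions build_sample_docx and hidden_metadata are hypothetical, and the sample omits parts a real Word file would need, so it shows the idea only and is not a review tool.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by Word (.docx) files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical contract clause with one tracked deletion and one tracked insertion.
DOCUMENT_XML = f"""<w:document xmlns:w="{W}"><w:body><w:p>
<w:r><w:t>Price: </w:t></w:r>
<w:del w:id="1" w:author="Hypothetical Counsel" w:date="2017-01-05T10:00:00Z">
<w:r><w:delText>$900,000</w:delText></w:r></w:del>
<w:ins w:id="2" w:author="Hypothetical Counsel" w:date="2017-01-05T10:00:00Z">
<w:r><w:t>$1,000,000</w:t></w:r></w:ins>
</w:p></w:body></w:document>"""

# Hypothetical author properties stored alongside the document body.
CORE_XML = f"""<cp:coreProperties xmlns:cp="{CP}" xmlns:dc="{DC}">
<dc:creator>Hypothetical Associate</dc:creator>
<cp:lastModifiedBy>Hypothetical Counsel</cp:lastModifiedBy>
</cp:coreProperties>"""


def build_sample_docx() -> bytes:
    """Build an in-memory, simplified stand-in for a .docx file (assumption:
    only the two parts needed for this illustration are included)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", DOCUMENT_XML)
        zf.writestr("docProps/core.xml", CORE_XML)
    return buf.getvalue()


def hidden_metadata(docx_bytes: bytes) -> dict:
    """Return author properties, visible text, and tracked deletions."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        body = ET.fromstring(zf.read("word/document.xml"))
        core = ET.fromstring(zf.read("docProps/core.xml"))
    # Text inside w:del is hidden from the final view but still stored in the file.
    deleted = [
        "".join(t.text or "" for t in d.iter(f"{{{W}}}delText"))
        for d in body.iter(f"{{{W}}}del")
    ]
    # w:t elements hold the text a reader sees in the final version.
    visible = "".join(t.text or "" for t in body.iter(f"{{{W}}}t"))
    return {
        "author": core.findtext(f"{{{DC}}}creator"),
        "last_modified_by": core.findtext(f"{{{CP}}}lastModifiedBy"),
        "visible_text": visible,
        "deleted_text": deleted,
    }


if __name__ == "__main__":
    print(hidden_metadata(build_sample_docx()))
```

In this sample the final version shows only the new price, while the earlier figure and the name of the person who made the change remain stored in the native file.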

A native production includes all metadata associated with a document, i.e., both system and application metadata.  The issue with a production in native format is that the parties do not control which metadata fields are produced, and a native file could encompass hundreds of fields of metadata.  The plaintiff argued that only native documents would allow its discovery review platform to properly function.  The defendants countered that a native production may expose the producing party to the risk of revealing potentially privileged, confidential, or irrelevant metadata associated with the native document and that the defendants would need to take the additional step to review the metadata for privilege before any production.  This can be burdensome in terms of both time and cost and pose a challenge due to the unknown number of metadata fields.

Defendants also contended that a native format is not the industry standard.  Relying on the Second Edition and more pointedly on the Third Edition of The Sedona Principles, recently finalized in October 2017, defendants confirmed the most common way ESI has been produced over the last decade has been to “create a static electronic image in Tagged Image File Format (TIFF) or Adobe Portable Document (PDF) file format, to place the extracted text from the documents into a text file, and to place the selected metadata and other non-apparent data into one or more separate load files.”  This form is frequently referred to as “TIFF, Text and Load Files” or “TIFF+”.  In this format, parties can agree on the selected metadata fields for the load files to be produced to allow for searches based upon this information.  For example, parties often agree to email metadata fields of “to,” “from,” “cc,” “bcc,” “date,” “email subject,” “filename,” etc.   A TIFF+ production can be searched, sorted, or filtered by standard document review platforms.
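To make the idea of agreed-upon load file fields concrete, the following sketch writes a simple load file that carries only the fields the parties selected and drops everything else.  It is an illustration under stated assumptions: the CSV layout, the DEF-prefixed document numbers, the IMAGES and TEXT folder names, and the function build_load_file are all hypothetical, and actual load file formats vary by review platform.

```python
import csv
import io

# Metadata fields the parties agreed to exchange (taken from the examples above).
AGREED_FIELDS = ["to", "from", "cc", "bcc", "date", "email subject", "filename"]


def build_load_file(documents):
    """Return CSV text with one row per document, limited to the agreed fields.

    Each document is a dict with a "metadata" dict; any metadata key that is
    not in AGREED_FIELDS is left out of the load file.
    """
    out = io.StringIO()
    columns = ["doc_id", "image_path", "text_path"] + AGREED_FIELDS
    writer = csv.DictWriter(out, fieldnames=columns)
    writer.writeheader()
    for n, doc in enumerate(documents, start=1):
        doc_id = f"DEF{n:06d}"  # hypothetical production numbering
        row = {
            "doc_id": doc_id,
            "image_path": f"IMAGES/{doc_id}.tif",  # static image of the document
            "text_path": f"TEXT/{doc_id}.txt",     # extracted text for searching
        }
        # Copy only the agreed fields; missing values become empty strings.
        for field in AGREED_FIELDS:
            row[field] = doc["metadata"].get(field, "")
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = [{
        "metadata": {
            "to": "buyer@example.com",
            "from": "seller@example.com",
            "date": "2016-03-01",
            "email subject": "Draft contract",
            "filename": "contract_v3.docx",
            "comments": "note to counsel",  # not an agreed field, so not produced
        }
    }]
    print(build_load_file(sample))
```

Because the field list is fixed in advance, the producing party knows exactly which metadata leaves its hands, which is the control that a wholly native production lacks.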

Under the circumstances in Mason, the Court found it significant that a native format production was not necessary in order for the litigation to proceed efficiently.  The Court spent considerable time addressing the features of plaintiff’s discovery review software and was guided by the Second and Third Editions of the Sedona Principles and the recognition that to be reasonably usable, ESI need not necessarily be produced in native format.  It further recognized that the TIFF+ production format provided the information needed to establish the facts at issue in the case and the data to allow the plaintiff to “functionally access, cull, analyze, search and display the information produced” on the review platform used by plaintiff’s counsel.  Relying on expert testimony offered by the defendants, the Court also found that requiring native format would double the cost of production to defendants because it “may cause difficulties in reviewing the material for responsiveness and for privilege before production,” that there were challenges with making a record of what was actually produced, and that the electronic data could be manipulated in native form.

There are many mysteries about what information is captured by the metadata lurking within a document, and caution should be exercised before providing documents in their native format.  Lawyers should keep this decision in mind whenever considering the format of ESI production in their commercial cases.