Remote Collection: The Apple Pay of eDiscovery in a COVID-19 World

I often continue doing things just because that’s the way I’ve always done them. There is a level of comfort that comes from familiarity and, to be honest, as I age I realize I can get more set in my ways (as my children often tell me), eschewing new ways of doing things – even if they are quicker or more efficient. Sometimes it takes a major disruption to force change, as the eDiscovery market saw with accelerated adoption of Predictive Coding in the wake of the Great Recession. This is true in many industries, including consumer products: witness the accelerated adoption of “contactless payment” like Apple Pay during the COVID-19 pandemic. It has been available for years, but was adopted mainly by younger generations while we old folks clung to credit cards and, in some cases, cash (gasp!). But COVID-19 has changed this dynamic for many, myself included, as the prospect of touching a credit card machine is now unacceptable. Whereas using Apple Pay was a ‘nice-to-have’ before COVID-19, it has become a ‘must-have’ now. This type of resistance to change is arguably even more commonplace in the legal world, where convention and comfort often reign supreme. The way we have conducted eDiscovery collection for years is a perfect example of clinging to outdated methods – but with the advent of COVID-19, this too is about to change for good.

Collection of digital evidence in legal proceedings was an implicit requirement under the Federal Rules of Civil Procedure (FRCP) long before the 2006 amendments codified it explicitly by adding Electronically Stored Information (ESI) as a “new” category under amended Rule 34(a). I distinctly remember conducting discovery in 1998 and 1999 as a 3rd-year law student and then 1st-year associate at a Bay Area law firm: it was the proverbial “banker box” process, with all discovery in paper form. In those days, even email messages and WordPerfect documents were simply printed out to be Bates stamped and reviewed in hard copy by hand. Document review has always been tedious, but at least back then the volumes were significantly lower than they are these days.

During this timeframe, however, email and the ever-greater volumes of electronic information it helped disseminate were exploding. This, of course, meant that evidence (in the forensic context) and relevant information for eDiscovery were increasingly digital in nature. So when discovery practitioners went looking for tools to help them preserve and collect digital information, where did they turn? To the forensic world, of course, as the more stringent requirements and processes of criminal proceedings and evidence had necessitated the development of such tools earlier than had been needed in civil discovery. And if a tool was good enough for criminal proceedings, it should be plenty good enough for those in the civil world. Thus, forensic tools like Guidance’s EnCase® and AccessData’s FTK®, which were built for law enforcement, crossed over into the civil world.

However, the needs of the data collection process for civil discovery were and remain quite different from those of the criminal world:

On average, civil discovery involves far more “custodians” (owners or stewards of information) than criminal proceedings, e.g. 5-15 custodians in civil matters vs. 1, maybe 2, in criminal matters
Whereas a typical criminal proceeding focuses on the communication media of one or occasionally a few alleged perpetrators (e.g. their cell phone, laptop, social media), civil discovery is typically significantly broader given the greater number of corporate applications and data repositories, including corporate email, file shares, ‘loose files’ (e.g. Word or Excel documents stored only locally) and cloud storage repositories like Dropbox or Google Vault
Due to the larger number of custodians and typically broader data types to be searched, the volume of information in civil discovery is usually significantly greater than in a criminal proceeding
In handling criminal evidence there is a presumption that the alleged perpetrator may have tried to hide, alter or destroy evidence; absent very unusual circumstances, no such presumption exists in civil discovery
While confiscation of devices (laptops, desktops, cell phones, records) is the standard in criminal proceedings, the opposite is true in civil discovery. Custodians need their devices so they can do their jobs
Collection of evidence in criminal proceedings is handled by law enforcement (e.g. upon arrest or as part of a ‘dawn raid’ type of event), while the parties themselves conduct civil discovery (as a business process typically handled by legal or outsourced to service providers)

These differences were insignificant when data volumes were small and the data was relatively easy to get to, as was the case for many years. And as the first technology on the market, forensic tools and vendors did a great job of building and defending their incumbency, through certifications, “court-cited workflows” and knowledge bases widely advertising their deep expertise in forensic collection as practiced by a cadre of forensic examiners leveraging their technical abilities into lucrative careers – thereby creating a significant barrier to entry for non-forensic eDiscovery collection tools and practitioners.

In spite of this strong incumbency, almost all corporate legal departments have long wanted a better approach to collection than forensic tools offered; many of their outside counsel have felt similarly. They have long felt that collection using forensic tools and workflows was and remains deeply flawed for eDiscovery in a number of ways:

Chronic overcollection: as forensic tools were built to capture all information, including things like slack space which can be important in criminal proceedings but are almost never even in scope in civil matters, the volume of data collected is far greater than needed. While service providers charging hourly professional services time and monthly per-GB hosting fees may not mind, for clients paying to collect/filter/host/review/produce knowingly unnecessary data this makes no sense and adds significant cost to the entire process, each and every time
Weeks- or months-long process: because forensic tools must process data on a server before searching or culling it, they require physical access to a device (e.g. via a USB port). There is an option to copy entire drives with GBs of data through a VPN connection, but this approach has never worked well, if at all. Given the coordination needed to gain physical access to devices which may be located in myriad different cities or countries, as well as the need to complete collection before any paring down or even searching of the data can begin, what should take hours or days instead takes weeks if not months
Highly disruptive: as each forensic image is being taken of each laptop or desktop, the user of each such machine must stop whatever they are doing and surrender their machine to the forensic staff for a day or more. Even if there is a spare laptop available, it will often have none of their ‘stuff’ on it. Needless to say, this highly intrusive process makes each such worker far less productive and is very disruptive
“Reinventing the wheel” every time: when the next matter arrives, can forensic examiners simply use the data from the last collection? Unfortunately, no, as each custodian has presumably created and received new data, which means the whole process must be repeated from scratch. Forensic collection quite literally reinvents the wheel with every collection

By contrast, remote collection is designed specifically for civil eDiscovery. It is built for a distributed workforce and requires no physical access to any devices. A small software agent is installed on each device which creates its own local index; legal staff can then simply search this index for whatever ESI they want to find. This distributed architecture facilitates ‘Pre-Case Assessment’, where search terms are sampled on data in-place, before any ESI is collected. This turns the forensic collection workflow on its head, as analysis can be done from the very beginning of the preservation/collection process, allowing lawyers to gain insight far earlier in any proceeding and supporting a surgical collection process, leading to far lower data volumes (and therefore much lower eDiscovery costs). And because remote collection can be an entirely cloud-based process, no hardware or specialized staff is required – in fact, collections can be done without IT ever being involved.
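To make the index-in-place idea concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how any particular product works: the function names (`build_local_index`, `sample_terms`, `select_for_collection`), the device names and the sample documents are all hypothetical, and a real agent would also handle file parsing, security and transport. The point it shows is that search terms can be sampled against each device’s own index, returning only hit counts, before a single document is copied.

```python
import re
from collections import defaultdict


def build_local_index(documents):
    """Build an inverted index {term: set of doc ids} for one device.

    `documents` maps a document id to its extracted text. In this sketch the
    index lives on the device itself (a hypothetical agent would maintain it).
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def sample_terms(device_indexes, terms):
    """Pre-case assessment: count hits per term per device, copying nothing."""
    report = {}
    for term in terms:
        key = term.lower()
        report[term] = {
            device: len(index.get(key, ()))
            for device, index in device_indexes.items()
        }
    return report


def select_for_collection(device_indexes, terms):
    """Targeted collection: list only the documents matching any agreed term."""
    selected = {}
    for device, index in device_indexes.items():
        hits = set()
        for term in terms:
            hits |= index.get(term.lower(), set())
        selected[device] = sorted(hits)
    return selected


# Hypothetical custodians and documents, for illustration only.
laptops = {
    "alice-laptop": {
        "a1.docx": "Draft merger agreement with Acme",
        "a2.txt": "Lunch menu",
    },
    "bob-laptop": {
        "b1.msg": "Re: Acme pricing",
        "b2.xlsx": "Quarterly budget",
    },
}

indexes = {device: build_local_index(docs) for device, docs in laptops.items()}
report = sample_terms(indexes, ["acme", "merger"])
to_collect = select_for_collection(indexes, ["acme", "merger"])

print(report)
print(to_collect)
```

In this toy example only the two documents that mention the sampled terms would be collected; the lunch menu and the budget spreadsheet never leave the custodians’ laptops, which is the “surgical” collection described above.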

Why hasn’t the industry adopted remote collection before now? Because everyone involved in the process except the client benefited from the status quo: forensic experts, service providers and forensic technology providers. They had a strong incentive to keep things as they had always been, to the client’s detriment. In a COVID-19 world, however, even these groups must change their workflows, as physical access to devices has not only fallen out of favor – it is now often impossible and perhaps even dangerous. What remote employee would want a stranger to come to their home and take their laptop for hours? That scenario is simply no longer an option. Just as touching a point-of-sale machine went from a minor inconvenience to a wildly irresponsible and even dangerous activity when Apple Pay offers a far better approach, forensic collection in eDiscovery is in the process of giving way to remote collection. Clients will be much better off for it.

X1: Next Gen GRC & eDiscovery Law Blog