Web Collection With Legitimate Public Policy Purpose Deemed Legal and Proper by Federal Courts

By John Patzakis

Social media evidence collection for judicial and compliance purposes can involve web collection techniques that use artificial or “examiner” accounts to access and collect public information. This process involves a degree of automation and results in the collection and preservation of the evidence for analysis, authentication, potential disclosure to other parties, and presentation in court. According to multiple US federal courts, notably the court in Sandvig v. Sessions, 315 F. Supp. 3d 1 (D.D.C. 2018), such activities are not only legally permissible but protected under the First and Fifth Amendments of the US Constitution.


A federal district court in Northern California reached a similar result regarding routine web scraping in hiQ Labs, Inc. v. LinkedIn Corp., 273 F. Supp. 3d 1099 (N.D. Cal. 2017), although the Sandvig court also specifically addressed the use of automated bots in conjunction with artificial user accounts.

In Sandvig, the plaintiffs are academic researchers who assess online housing, finance, and employment transactions to identify any discriminatory effects of algorithms. To accomplish this, they create fake profiles, including profiles for minorities, write automated collection bots, and use those profiles and bots to engage and test the profiles and collect the resulting data. The plaintiffs freely admitted that their activities violated various terms of service provisions of third-party Internet sites, and thus sought a preemptive declaratory judgment from the federal court allowing their activities.

In ruling in the academics’ favor, the court found that web scraping “falls under the ambit of the First Amendment” and that “simply placing contractual conditions on accounts that anyone can create, as social media and many other sites do, does not remove a website from the First Amendment protections of the public Internet.” The Sandvig court noted that the US Supreme Court has made a number of recent statements that give full First Amendment application to the gathering of public information, most notably in Packingham v. North Carolina, 137 S. Ct. 1730 (2017). The Packingham Court struck down a state prohibition on registered sex offenders accessing social media websites, without concern for whether any given website might itself act to bar users who are registered sex offenders. As Packingham recognized, given the nature of the Internet as a locus of communication and expressive activity, state-imposed restrictions on accessing the Internet are subject to First Amendment scrutiny regardless of whether an individual is seeking “access to particular websites, run by private companies.”

Regarding the issue of automated collection techniques, the Sandvig Court determined “that plaintiffs wish to scrape data from websites rather than manually record information does not change the analysis. Scraping is merely a technological advance that makes information collection easier; it is not meaningfully different from using a tape recorder instead of taking written notes, or using the panorama function on a smartphone instead of taking a series of photos from different positions. And, as already discussed, the information plaintiffs seek is located in a public forum.”

And regarding the artificial or “fake” accounts needed to collect data and interact with the public information on the subject sites, the court ruled “plaintiffs have a First Amendment interest in harmlessly misrepresenting their identities to target websites.” The court noted that plaintiffs’ research requires them to create false employer and job-seeker profiles on employment websites and to use automated bots “to make it appear to a number of housing and employment sites that multiple people are accessing the information they have made available.” The court again cited the Supreme Court, this time in United States v. Alvarez, 567 U.S. 709 (2012), which provides that because “some false statements are inevitable if there is to be an open and vigorous expression of views in public and private conversation,” and because “[t]he Government has not demonstrated that false statements generally should constitute a new category of unprotected speech,” false claims that are not “made to effect a fraud or secure moneys or other valuable considerations fall within First Amendment protection.”

The Sandvig ruling is directly relevant to the user base of X1 Social Discovery, who use X1 to gather evidence for judicial purposes, including civil litigation, regulatory and corporate compliance matters, and criminal defense. This is constitutionally protected activity under the free speech and right to petition (access the courts) clauses of the First Amendment, and the Fifth Amendment right to due process. In the international arena, there is similarly a recognized common law right to gather evidence.

In fact, the public policy and Constitutional justifications are arguably even stronger for those who use web collection to gather evidence for judicial and compliance purposes. Social media evidence collection of third-party public data is a standard and widespread practice mandated by the courts, legal ethics, and the rules of evidence. Copious volumes of case law, legal treatises, and state bar ethics opinions provide that the legal duties of competency and diligent representation necessitate that attorneys search, collect, and preserve social media evidence, and that the failure to do so can constitute ineffective assistance of counsel. (See Cannedy v. Adams, 706 F.3d 1148 (9th Cir. 2013).)

It is also critical that the right software be used for social media evidence collection. When lawyers and their representatives resort to print screen or simple flat file images, they fail to collect key metadata and cannot effectively authenticate the evidence, which can render it inadmissible at trial. Simple screen captures are not defensible, with several courts disallowing or otherwise calling into question social media evidence presented in the form of a screenshot image. Additionally, flat file images dramatically increase eDiscovery processing costs, while the right software can readily import data straight into attorney review platforms for efficient search, review, tagging, redaction, and production.
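To make the metadata and authentication point concrete, here is a minimal Python sketch of the general idea of pairing captured content with a cryptographic hash and collection metadata. It is an illustration only and does not describe how X1 Social Discovery or any other product works; the function names, the record fields, and the example URL are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_capture_record(content: bytes, source_url: str, collected_at: datetime) -> dict:
    """Bundle captured content with metadata a flat screenshot would not carry.

    The field names here are illustrative assumptions, not a standard schema.
    """
    return {
        "source_url": source_url,
        "collected_at_utc": collected_at.astimezone(timezone.utc).isoformat(),
        "size_bytes": len(content),
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def verify_capture(content: bytes, record: dict) -> bool:
    """Return True only if the content still matches the hash recorded at collection."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


if __name__ == "__main__":
    # Hypothetical captured page; a real collection would hold the fetched bytes.
    page = b"<html><body>Example public post</body></html>"
    record = build_capture_record(
        page, "https://example.com/post/1", datetime.now(timezone.utc)
    )
    print(json.dumps(record, indent=2))
    print("unaltered copy verifies:", verify_capture(page, record))
    print("altered copy verifies:", verify_capture(page + b" ", record))
```

A hash recorded at collection time lets a later reviewer confirm that the preserved item has not changed, which is one of the things a bare screenshot cannot show. Real collection tools capture far more context than this sketch does.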

There are, of course, limitations to be aware of in this arena. Engaging in mass surveillance, or in large-scale data collection and brokering of that data for marketing or other improper purposes, would arguably fall outside of these protections. However, these recent legal rulings, including relevant Constitutional analysis and precedent from the US Supreme Court, provide important guidance to legal and compliance professionals who necessarily utilize examiner accounts to engage in web collection of social media data for judicial and compliance purposes.