Dark Data is an Unmet Cyber Security Challenge

By John Patzakis
September 28, 2021

Enterprises today are creating and storing massive volumes of unstructured data, distributed across the enterprise at a very fast pace. IT experts refer to this data type as “dark data.” Research advisory firm Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes.” according to Rahul Telang, professor of information systems at Carnegie Mellon University, “[o]ver 90% of the data in business is dark data.”

Dark data exists due to organizational silos and a highly distributed and mobile workforce, a trend that proliferated during the COVID pandemic and has now solidified as the new normal. As a result, there is a proliferation of unmanaged data stored in file shares, laptops, unarchived email accounts, shared cloud drives such as OneDrive and Dropbox and many other repositories. According to Anthony Juliano, CTO of Landmark Ventures, “dark data is exploding rapidly with the dissolution of the perimeter; it’s a largely unaddressed risk vector. A vast majority of the CIOs and CISOs I speak with are now prioritizing solving this problem not only going forward, but also backwards – and it’s not easy.”

Cyber security platforms generally have a good handle on perimeter integrity, encryption, and other key priorities such as zero day network attacks and malware. However, while these measures are clearly important, distributed dark data is largely a blind spot for cybersecurity tech, and as such organizations have very little visibility into the content of such data. GDPR, CCPA and other recent privacy regulatory requirements add increased urgency to this challenge.

CISOs and legal and compliance executives often aspire to implement information governance and security programs like defensible deletion, data migration, and data audits across their unstructured data to detect risks and remediate non-compliance. However, without an actual and scalable technology platform to effectuate these goals, those aspirations remain just that.

One tactic attempted by some CIOs to attempt to address this daunting challenge is to periodically migrate disparate data from around the global enterprise into a central location, such as an archiving platform. But boiling the ocean through data migration and centralization is extremely expensive, highly disruptive, and frankly unworkable for numerous reasons. While such a concept may seem like a good idea when drawn up on the whiteboard, originations quickly learn that you cannot just migrate hundreds of terabytes of distributed dark data to an archive, mainly due to network bandwidth and other logistical constraints, as well as the reality that you are merely copying and duplicating the data being migrated, which actually makes the situation worse.

Another tactic is data loss prevention (DLP). Again, this approach is thwarted by the new normal of a distributed, global workforce. Additionally, DLP tools are traditionally hampered by an inability to have deep content insight to unstructured data, resulting in false positives, inaccurate classification and unacceptable disruption to employee and business workflows.

What has always been needed is gaining immediate visibility into unstructured distributed data across the enterprise in-place, through the ability to search and report across several thousand endpoints, file shares and other unstructured data sources, and return results within minutes instead of days or weeks. None of the other approaches outlined above come close to meeting this requirement and in fact actually perpetuate information security and governance failures.

Born and bred to address global eDiscovery challenges, X1 Enterprise platform (X1E) represents a unique approach to dark data, by enabling enterprises to quickly and easily search across multiple distributed endpoints and data servers in-place through a true distributed, parallelized computing architecture. Legal, security and compliance teams can easily perform unified complex searches across both unstructured content and metadata, obtaining statistical insight into the data in minutes, instead of days or weeks. With X1E, organizations can also automatically migrate, collect, or take other action on the data as a result of the search parameters. Built on our award-winning and patented X1 Search technology, X1E is the first product to offer true and massively scalable distributed searching that is executed in its entirety on the end-node computers for data audits across an organization. This game-changing capability vastly reduces costs while greatly mitigating risk and disruption to operations.