The social media behemoth admits it can’t keep track of the data it collects on users
Facebook has acknowledged, in a recently leaked document published by the Motherboard news outlet on Wednesday, that even its own privacy engineers struggle to understand how users’ personal data is processed and stored.
“We’ve built systems with open borders,” the document, apparently written by privacy engineers working on the Ad and Business Product team, laments, likening the data the platform holds to a “bottle of ink” made up of different kinds of user information. “You pour that ink into a lake of water (our open data systems; our open culture) … and it flows … everywhere,” the team explains.
“How do you put that ink back in the bottle? How do you organize it again, such that it only flows to the allowed places in the lake?” the team asks, describing the conundrum as a “fundamental” problem for Facebook.
“We can’t confidently make controlled policy changes or external commitments such as ‘we will not use X data for Y purpose’. And yet, this is exactly what regulators expect us to do,” the document, written in 2021, continues. The team is supposed to sit “at the center of [Facebook’s] monetization strategy and [act as] the engine that powers Facebook’s growth,” yet its own engineers clearly struggle to trace how the data underpinning that strategy is used.
The issue, referred to as “data lineage,” is at the core of recent legal developments in multiple countries regarding how social media data is used. If Facebook collects a user’s phone number for the stated purpose of securing the individual’s account with two-factor authentication, for example, it is illegal under the EU’s General Data Protection Regulation, which took effect in 2018, to feed that number to the platform’s “people you may know” feature. Tech blog Gizmodo caught Facebook doing just that shortly after the law took effect, and the platform had to stop the practice.
As authorities in the EU, India, the US, and elsewhere pass increasingly stringent regulations aimed at controlling speech on social media, the inability of Facebook employees to manage or even understand how the company’s platforms handle user data is becoming a growing problem.
The leaked document obtained by Motherboard suggests employees may not even be able to limit the use of individuals’ data, due to the sheer volume of information collected on a daily basis.
While a Facebook spokesperson denied that the document constituted evidence the company was not complying with privacy regulations, an employee who spoke to Motherboard on the condition of anonymity argued that, if anything, the paper’s condemnation of Facebook’s cluelessness didn’t go far enough. “Facebook has a general idea of how many bits of data are stored in its data centers. The where [the data] goes part is, broadly speaking, a complete s***show,” the employee told Motherboard in an online chat, suggesting that the disorder gave Facebook “legal cover” because of how much it would cost the company to “fix this mess.”
“It gives them the excuse for keeping that much private data simply because at their scale and with their business model and infrastructure design they can plausibly claim that they don’t know what they have,” the employee explained.
Johnny Ryan, a privacy activist and senior fellow at the Irish Council for Civil Liberties, told Motherboard that the document “admits what we long suspected: that there is a data free-for-all inside Facebook, and that the company has no control whatsoever over the data it holds.”
“It is a black and white recognition of the absence of any data protection. Facebook details how it breaks each principle of data protection law. Everything it does to our data is illegal. You’re not allowed to have an internal data free-for-all,” Ryan insisted.
The document warns that a “multi-year investment in Ads and our infrastructure teams” will be required to “gain control over how our systems ingest, process and egest data” and bring the platform into compliance with the current regulatory climate. It adds that restrictions on the use of individuals’ private data “will continue to expand around the globe as we shift toward consent.”
“We face a tsunami of inbound regulations that all carry massive uncertainty,” the document states.