The Internet Is Turning Into a Data Black Box. An ‘Inspectability API’ Could Crack It Open

In today’s digital world, injustice lurks in the shadows of the Facebook post that’s delivered to certain groups of people to the exclusion of others, the hidden algorithm used to profile candidates during job interviews, and the risk-assessment algorithms used for criminal sentencing and welfare fraud detection. As algorithmic systems are integrated into every aspect of society, regulatory mechanisms struggle to keep up.

Over the past decade, researchers and journalists have found ways to unveil and scrutinize these discriminatory systems, developing their own data collection tools. As the internet has moved from browsers to mobile apps, however, this crucial transparency is quickly disappearing.

Third-party analysis of digital systems has largely been made possible by two seemingly banal tools that are commonly used to inspect what’s happening on a webpage: browser add-ons and browser developer tools.

Browser add-ons are small programs that can be installed directly onto a web browser, allowing users to augment how they interact with a given website. While add-ons are commonly used to operate tools like password managers and ad-blockers, they are also incredibly useful for enabling people to collect their own data within a tech platform’s walled garden.

Similarly, browser developer tools were made to allow web developers to test and debug their websites’ user interfaces. As the internet evolved and websites became more complex, these tools evolved too, adding features like the ability to inspect and change source code, monitor network activity, and even detect when a website is accessing your location or microphone. These are powerful mechanisms for investigating how companies track, profile, and target their users.

I have put these tools to use as a data journalist to show how a marketing company logged users’ personal data even before they clicked “submit” on a form and, more recently, how the Meta Pixel tool (formerly the Facebook Pixel tool) tracks users without their explicit knowledge in sensitive places such as hospital websites, federal student loan applications, and the websites of tax-filing tools.

In addition to exposing surveillance, browser inspection tools provide a powerful way to crowdsource data to study discrimination, the spread of misinformation, and other types of harms tech companies cause or facilitate. But in spite of these tools’ powerful capabilities, their reach is limited. In 2023, Kepios reported that 92 percent of global users accessed the internet through their smartphones, whereas only 65 percent of global users did so using a desktop or laptop computer.

Though the vast majority of internet traffic has moved to smartphones, we don’t have tools for the smartphone ecosystem that afford the same level of “inspectability” as browser add-ons and developer tools. This is because web browsers are implicitly transparent, while mobile phone operating systems are not.

If you want to view a website in your web browser, the server has to send you the source code. Mobile apps, on the other hand, are compiled, executable files that you usually download from places such as Apple’s iOS App Store or Google Play. App developers don’t need to publish the source code for people to use them.

Similarly, monitoring network traffic on web browsers is trivial. This technique is often more useful than inspecting source code to see what data a company is collecting on users. Want to know which companies a website shares your data with? You’ll want to monitor the network traffic, not inspect the source code. On smartphones, network monitoring is possible, but it usually requires the installation of root certificates that make users’ devices less secure and more vulnerable to man-in-the-middle attacks from bad actors. And these are just some of the differences that make collecting data securely from smartphones much harder than from browsers.
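To make the browser side of that comparison concrete, here is a minimal sketch of one way to answer the "which companies does this website share my data with?" question. It assumes the network log has been exported from the browser's developer tools as a HAR file, a JSON format in which each request sits under `log.entries[].request.url`. The function name `third_party_hosts`, the sample domains, and the simple "same domain or subdomain" rule for what counts as first party are illustrative assumptions, not part of any tool described in this article.

```python
from collections import Counter
from urllib.parse import urlsplit


def third_party_hosts(har: dict, first_party: str) -> Counter:
    """Count requests sent to hosts outside the first-party domain."""
    counts = Counter()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        if not host:
            continue  # skip entries without a host, such as data: URLs
        # Assumption: the site's own domain and its subdomains are first party.
        if host == first_party or host.endswith("." + first_party):
            continue
        counts[host] += 1
    return counts


# Made-up sample that follows the HAR layout (log -> entries -> request -> url).
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://clinic.example/appointments"}},
            {"request": {"url": "https://static.clinic.example/app.js"}},
            {"request": {"url": "https://tracker.example/collect?event=PageView"}},
            {"request": {"url": "https://tracker.example/collect?event=Submit"}},
        ]
    }
}

for host, request_count in third_party_hosts(sample_har, "clinic.example").most_common():
    print(host, request_count)
```

With a real export, `sample_har` would be replaced by the result of `json.load` on the saved file. Nothing comparably simple exists for a compiled mobile app, which is the gap the rest of this piece is concerned with.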

The need for independent collection is more pressing than ever. Previously, company-provided tools such as the Twitter API and Facebook’s CrowdTangle, a tool for monitoring what’s trending on Facebook, were the infrastructure that powered a large portion of research and reporting on social media. However, as these tools become less useful and accessible, new methods of independent data collection are needed to understand what these companies are doing and how people are using their platforms.

To meaningfully report on the impact digital systems have on society, we need to be able to observe what’s taking place on our devices without asking a company for permission. As someone who has spent the past decade building tools that crowdsource data to expose algorithmic harms, I believe the public should have the ability to peek under the hood of their mobile apps and smart devices, just as they can on their browsers. And it’s not just me: The Integrity Institute, a nonprofit working to protect the social internet, recently released a report that lays bare the importance of transparency as a lever to achieve public interest goals like accountability, collaboration, understanding, and trust.

To demand transparency from tech platforms, we need a platform-independent transparency framework, something that I like to call an inspectability API. Such a framework would empower even the most vulnerable populations to capture evidence of harm from their devices while minimizing the risk of their data being used in research or reporting without their consent.

An application programming interface (API) is a way for companies to make their services or data available to other developers. For example, if you’re building a mobile app and want to use the phone’s camera for a specific feature, you would use the iOS or Android Camera API. Another common example is an accessibility API, which allows developers to make their applications accessible to people with disabilities by making the user interface legible to screen readers and other accessibility tools commonly found on modern smartphones and computers. An inspectability API would allow individuals to export data from the apps they use every day and share it with researchers, journalists, and advocates in their communities. Companies could be required to implement this API to adhere to transparency best practices, much as they are required to implement accessibility features to make their apps and websites usable for people with disabilities.

In the US, residents of some states can request the data companies collect on them, thanks to state-level privacy laws. While these laws are well-intentioned, the data that companies share to comply with them is usually structured in a way that obfuscates crucial details that would expose harm. For example, Facebook has a fairly granular data export service that allows individuals to see, among other things, their “Off-Facebook activity.” However, as The Markup found during a series of investigations into the use of the Meta Pixel, even though Facebook told users which websites were sharing data, it did not reveal just how invasive the information being shared was. Doctor appointments, tax filing information, and student loan information were just some of the things that were being sent to Facebook. An inspectability API would make it easy for people to monitor their devices and see how the apps they use track them in real time.

Some promising work is already being done: Apple’s introduction of the App Privacy Report in iOS 15 marked the first time iPhone users could see detailed privacy information to understand each app’s data collection practices and even answer questions such as, “Is Instagram listening to my microphone?”

But we cannot rely on companies to do this at their discretion. We need a clear framework to define what sort of data should be inspectable and exportable by users, and we need regulation that penalizes companies for not implementing it. Such a framework would not only empower users to expose harms, but also ensure that their privacy is not violated. Individuals could choose what data to share, when, and with whom.
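No such framework exists yet, so the following is only a hypothetical sketch of the consent idea, not a proposed specification. Every name in it (`Record`, `Consent`, `export_for`, the category labels, the recipient) is invented for illustration. It models "what" and "with whom" and leaves out "when" for brevity: a record leaves the device only if the user has agreed to share that category of data with that recipient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    app: str       # app the observation is about
    category: str  # kind of observation, e.g. "network" or "sensor"
    detail: str    # human-readable description of what was observed


@dataclass(frozen=True)
class Consent:
    recipient: str         # who the user agreed to share with
    categories: frozenset  # which record categories they agreed to share


def export_for(recipient: str, records: list, consents: list) -> list:
    """Return only the records the user agreed to share with this recipient."""
    allowed = set()
    for consent in consents:
        if consent.recipient == recipient:
            allowed |= consent.categories
    return [record for record in records if record.category in allowed]


# Made-up observations and a single consent grant.
records = [
    Record("ExampleApp", "network", "sent page URL to tracker.example"),
    Record("ExampleApp", "sensor", "microphone accessed"),
]
consents = [Consent("local-newsroom", frozenset({"network"}))]

for record in export_for("local-newsroom", records, consents):
    print(record.app, record.category, record.detail)
```

The default in this sketch is to share nothing: a recipient with no matching consent gets an empty list, which reflects the privacy goal described above.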

An inspectability API will empower individuals to fight for their rights by sharing the evidence of harm they have been exposed to with people who can raise public awareness and advocate for change. It would enable organizations such as Princeton’s Digital Witness Lab, which I cofounded and lead, to conduct data-driven investigations by collaborating closely with vulnerable communities, instead of relying on tech companies for access. This framework would allow researchers and others to conduct this work in a way that is safe, precise, and, most importantly, prioritizes the consent of the people being harmed.
