We, the digital proletariat

Data is the lifeblood that allows any machine learning model to perform its task. It’s not magic. It’s not the so-called “AI” being intelligent or sentient. It’s statistics. And, as the tech sector delves into the rapid development of specialised LLMs that it can further commoditise, it will require vast amounts of diverse data to train them, leading to what some have called data hunting.

Just as the industrial revolution turned human labour into a mere cog in the machine, the so-called AI revolution has converted our interactions and our identities into commodities to be bought and sold. We have unwittingly contributed to the very systems that exploit us. Our digital footprints are the raw materials for the new-age capitalists: Big Tech. Every single one of us who has posted even a single piece of content online is part of this, whether we like it or not. In the recent New York Times vs OpenAI legal debacle, the latter claimed that it is impossible to create ‘AI’ tools like GPT without copyrighted materials. If they can bend the law however they want, what makes you think those images you posted on Facebook as part of your December dump were not used to train Meta’s ML models?

We are now part of a universal digital sweatshop that transcends international borders. Our labour is ignored and uncompensated based on the belief that since we shared content freely, companies have the right to monetise it whenever and however they want. Time and time again, as we have seen in recent news, companies have collected our data without explicit consent. When they do ask for ‘consent’, they give us word salad in the user agreements or just ask us to opt our way out of the inferno that they manufactured. The aggressive collection of data paves the way for a future where a few corporations will have disproportionate control over vast datasets, which they can exploit for unwarranted targeted advertising, surveillance and practices that would reinforce biases or unfairly influence individual choices and behaviours.

In keeping with Big Tech’s greed, the exploitation, of course, does not end with the involuntary surrender of what I call our ‘quantified selves’. In fact, it extends to the more tangible exploitation of human labour in the Global South. Services like Amazon Mechanical Turk use a ‘Human API’ to perform tasks such as ‘identifying the red apple in this image of a fruit basket.’ Of course, they would not dare to brand these workers as underpaid digital slaves. Instead, they prefer to label them as freelancers to make it sound ethical. Hurray! Another job for the PR industry! These freelancers forfeit any remaining vestiges of their bargaining power for as little as 2 USD a day so that your LLMs will not spew out rubbish.

This is the reality behind your glamorous “AI” models. While “AI” companies in the developed world reap huge profits, the groundwork is outsourced to workers in Bangladesh, Kenya, the Philippines and India. How disgusting is it that the very countries that were once plundered for their resources are now being exploited for cheap labour? But it is fine, isn’t it? As long as we don’t see them. Out of sight, out of mind.