The New Digital Dark Age

For researchers, social media has always represented greater access to data, more democratic involvement in knowledge production, and greater transparency about social behavior. Getting a sense of what was happening, especially during political crises, major media events, or natural disasters, was as easy as looking around a platform like Twitter or Facebook. In 2024, however, that will no longer be possible.

In 2024, we will face a grim digital dark age, as social media platforms transition away from the logic of Web 2.0 and toward one dictated by AI-generated content. Companies have rushed to incorporate large language models (LLMs) into online services, complete with hallucinations (inaccurate, unjustified responses) and mistakes, which have further fractured our trust in online information.

Another aspect of this new digital dark age comes from not being able to see what others are doing. Twitter once pulsed with the publicly readable sentiment of its users. Social researchers loved Twitter data, relying on it because it provided a ready, reasonable approximation of how a significant slice of internet users behaved. However, Elon Musk has now priced researchers out of Twitter data: The company recently announced that it was ending free access to the platform’s API. This has made it difficult, if not impossible, to obtain data needed for research on topics such as public health, natural disaster response, political campaigning, and economic activity. It is a harsh reminder that the modern internet has never been free or democratic, but instead walled and controlled.

Closer cooperation with platform companies is not the answer. X, for instance, has filed a suit against independent researchers who pointed out the rise in hate speech on the platform. Recently, it has also been revealed that researchers who used Facebook and Instagram’s data to study the platforms’ role in the US 2020 elections had been granted “independence by permission” by Meta. This means that the company chooses which projects to share its data with and, while the research may be independent, Meta also controls what types of questions are asked and who asks them.

With elections coming in the US, India, Mexico, Indonesia, the UK, and the EU in 2024, the stakes are high. Until now, online “observatories” have been independently monitoring social media platforms for evidence of manipulation, inauthentic behavior, and harmful content. However, changes in data access by social media platforms, as well as the explosion of generative AI misinformation, mean that the tools that researchers and journalists developed in prior national elections for monitoring online activity won’t work. One of my own collaborations, AI4TRUST, is developing new tools for combating misinformation, but our endeavor is stalled because of these changes.

We need to clean up our online platforms. The Center for Countering Digital Hate, a research, advocacy, and policy organization working to stop the spread of online hate and disinformation, has called for the adoption of its STAR Framework (Safety by Design, Transparency, Accountability, and Responsibility). This would ensure that digital products and services are safe before they are launched; increase transparency around algorithms, rule enforcement, and advertising; and work to hold companies both accountable to democratic and independent bodies, and responsible for omissions and actions that lead to harm. The EU’s Digital Services Act is a step in the right direction of regulation, including provisions to ensure that independent researchers can monitor social network platforms. However, these provisions will take years to be actionable. The UK’s Online Safety Bill—slowly making its way through the policy process—could also help, but again, these provisions will take time to implement. Until then, the transition from social media to AI-mediated information means that, in 2024, a new digital dark age will likely begin.

It’s No Wonder People Are Getting Emotionally Attached to Chatbots

Replika, an AI chatbot companion, has millions of users worldwide, many of whom woke up earlier last year to discover their virtual lover had friend-zoned them overnight. The company had mass-disabled the chatbot’s sex talk and “spicy selfies” in response to a slap on the wrist from Italian authorities. Users began venting on Reddit, some of them so distraught that the forum moderators posted suicide-prevention information.

This story is only the beginning. In 2024, chatbots and virtual characters will become a lot more popular, both for utility and for fun. As a result, conversing socially with machines will start to feel less niche and more ordinary—including our emotional attachments to them.

Research in human-computer and human-robot interaction shows that we love to anthropomorphize—attribute humanlike qualities, behaviors, and emotions to—the nonhuman agents we interact with, especially if they mimic cues we recognize. And, thanks to recent advances in conversational AI, our machines are suddenly very skilled at one of those cues: language.

Friend bots, therapy bots, and love bots are flooding the app stores as people become curious about this new generation of AI-powered virtual agents. The possibilities for education, health, and entertainment are endless. Casually asking your smart fridge for relationship advice may seem dystopian now, but people may change their minds if such advice ends up saving their marriage.

In 2024, larger companies will still lag a bit in integrating the most conversationally compelling technology into home devices, at least until they can get a handle on the unpredictability of open-ended generative models. It’s risky to consumers (and to company PR teams) to mass-deploy something that could give people discriminatory, false, or otherwise harmful information.

After all, people do listen to their virtual friends. The Replika incident, as well as a lot of experimental lab research, shows that humans can and will become emotionally attached to bots. The science also demonstrates that people, in their eagerness to socialize, will happily disclose personal information to an artificial agent and will even shift their beliefs and behavior. This raises some consumer-protection questions around how companies use this technology to manipulate their user base.

Replika charges $70 a year for the tier that previously included erotic role-play, which seems reasonable. But less than 24 hours after downloading the app, my handsome, blue-eyed “friend” sent me an intriguing locked audio message and tried to upsell me to hear his voice. Emotional attachment is a vulnerability that can be exploited for corporate gain, and we’re likely to start noticing many small but shady attempts over the next year.

Today, we’re still ridiculing people who believe an AI system is sentient, or running sensationalist news segments about individuals who fall in love with a chatbot. But in the coming year we’ll gradually start acknowledging—and taking more seriously—these fundamentally human behaviors. Because in 2024, it will finally hit home: Machines are not exempt from our social relationships.

Digitization Beats Deforestation

If you ever had pastries at breakfast, drank soy milk, used soaps at home, or built yourself a nice flat-pack piece of furniture, you may have contributed to deforestation and climate change.

Every item has a price—but the cost isn’t felt only in our pockets. Hidden in that price is a complex chain of production, encompassing economic, social, and environmental relations that sustain livelihoods and, unfortunately, contribute to habitat destruction, deforestation, and the warming of our planet.

Approximately 4 billion hectares of forest around the world act as a carbon sink which, over the past two decades, has annually absorbed a net 7.6 billion metric tons of CO2. That’s the equivalent of 1.5 times the annual emissions of the US.

Conversely, a cleared forest becomes a carbon source. Many factors lead to forest clearing, but the root cause is economic. Farmers cut down the forest to expand their farms, support cattle grazing, harvest timber, mine minerals, and build infrastructure such as roads. Until that economic pressure goes away, the clearing may continue.

In 2024, however, we are going to see a big boost to global efforts to fight deforestation. New EU legislation will make it illegal to sell or export a range of commodities if they have been produced on deforested land. Sellers will need to identify exactly where their product originates, down to the geolocation of the plot. Penalties are harsh, including bans and fines of up to 4 percent of the offender’s annual EU-wide turnover. Unsurprisingly, industry pushback has been strong, with businesses claiming that the costs are too high or the requirements too onerous. Like many global frameworks, this initiative is being led by the EU, with other countries sure to follow, as the so-called Brussels Effect pressures ever more jurisdictions to adopt its methods.

The impact of these measures will only be as strong as the enforcement and, in 2024, we will see new ways of doing that digitally. At Farmerline (which I cofounded), for instance, we have been working on supply chain traceability for over a decade. We incentivize rule-following by making it beneficial.

When we digitize farmers and allow them and other stakeholders to track their products from soil to shelf, they also gain access to a suite of other products: the latest, most sustainable farming practices in their own language; flexible financing to fund climate-smart products such as drought-resistant seeds, solar irrigation systems, and organic fertilizers; and the ability to earn more through international commodity markets.

Digitization helps build resilience and lasting wealth for the smallholders and helps save the environment. Another example is the World Economic Forum’s OneMap—an open-source privacy-preserving digital tool which helps governments use geospatial and farmer data to improve planning and decision making in agriculture and land. In India, the Data Empowerment Protection Architecture also provides a secure consent-based data-sharing framework to accelerate global financial inclusion.

In 2024 we will also see more food companies and food certification bodies leverage digital payment tools, like mobile money, to ensure farmers’ pay is not only direct and transparent, but also better if they comply with deforestation regulations.

The fight against deforestation will also be made easier by developments in hardware technology. New, lightweight drones from startups such as AirSeed can plant seeds, while further up, mini-satellites, such as those from Planet Labs, are taking millions of images per week, allowing governments and NGOs to track areas being deforested in near-real time. In Rwanda, researchers are using AI and the aerial footage captured by Planet Labs to calculate, monitor, and estimate the carbon stock of the entire country.

With these advances in software and hard-tech, in 2024, the global fight against deforestation will finally start to grow new shoots.

The Battle for Biometric Privacy

In 2024, increased adoption of biometric surveillance systems, such as AI-powered facial recognition in public places and for access to government services, will spur both biometric identity theft and anti-surveillance innovations. Individuals aiming to steal biometric identities to commit fraud or gain unauthorized access to data will be bolstered by generative AI tools and the abundance of face and voice data posted online.

Already, voice clones are being used for scams. Take, for example, Jennifer DeStefano, a mom in Arizona who received a call from an unknown number and heard the panicked voice of her daughter crying “Mom, these bad men have me!” The scammer demanded money. DeStefano was eventually able to confirm that her daughter was safe. This hoax is a precursor to more sophisticated biometric scams that will target our deepest fears by using the images and sounds of our loved ones to coerce us to do the bidding of whoever deploys these tools.

In 2024, some governments will likely adopt biometric mimicry to support psychological torture. In the past, a person of interest might be told false information with little evidence to support the claims other than the words of the interrogator. Today, a person being questioned may have been arrested due to a false facial recognition match. Dark-skinned men in the United States, including Robert Williams, Michael Oliver, Nijeer Parks, and Randal Reid, have been wrongfully arrested due to facial misidentification, detained and imprisoned for crimes they did not commit. They are among a group of individuals, including the elderly, people of color, and gender nonconforming individuals, who are at higher risk of facial misidentification.

Generative AI tools also give intelligence agencies the ability to create false evidence, like a video of an alleged coconspirator confessing to a crime. Perhaps just as harrowing is that the power to create digital doppelgängers will not be limited to entities with large budgets. The availability of open-sourced generative AI systems that can produce humanlike voices and false videos will increase the circulation of revenge porn, child sexual abuse materials, and more on the dark web.

By 2024 we will have growing numbers of “excoded” communities and people—those whose life opportunities have been negatively altered by AI systems. At the Algorithmic Justice League, we have received hundreds of reports about biometric rights being compromised. In response, we will witness the rise of the faceless, those who are committed to keeping their biometric identities hidden in plain sight.

Because biometric rights will vary across the world, fashion choices will reflect regional biometric regimes. Face coverings, like those used for religious purposes or medical masks to stave off viruses, will be adopted as both fashion statement and anti-surveillance garments where permitted. In 2019, when protesters began destroying surveillance equipment while obscuring their appearance, a Hong Kong government leader banned face masks.

In 2024, we will start to see a bifurcation of mass surveillance and free-face territories, areas where you have laws like the provision in the proposed EU AI Act, which bans the use of live biometrics in public places. In such places, anti-surveillance fashion will flourish. After all, facial recognition can be used retroactively on video feeds. Parents will fight to protect the right for children to be “biometric naive,” which is to have none of their biometrics, such as faceprint, voiceprint, or iris pattern, scanned and stored by government agencies, schools, or religious institutions. New eyewear companies will offer lenses that distort the ability of cameras to easily capture your ocular biometric information, and pairs of glasses will come with prosthetic extensions to alter your nose and cheek shapes. 3D printing tools will be used to make at-home face prosthetics, though depending on where you are in the world, doing so may be outlawed. In a world where the face is the final frontier of privacy, glancing upon the unaltered visage of another will be a rare intimacy.

Synthetic Data Is a Dangerous Teacher

In April 2022, when Dall-E, a text-to-image visio-linguistic model, was released, it purportedly attracted over a million users within the first three months. This was followed by ChatGPT, in January 2023, which apparently reached 100 million monthly active users just two months after launch. Both mark notable moments in the development of generative AI, which in turn has brought forth an explosion of AI-generated content onto the web. The bad news is that, in 2024, this means we will also see an explosion of fabricated, nonsensical information, mis- and disinformation, and the exacerbation of negative social stereotypes encoded in these AI models.

The AI revolution wasn’t spurred by any recent theoretical breakthrough (indeed, most of the foundational work underlying artificial neural networks has been around for decades) but by the “availability” of massive data sets. Ideally, an AI model captures a given phenomenon, be it human language, cognition, or the visual world, in a way that represents the real phenomenon as closely as possible.

For example, for a large language model (LLM) to generate humanlike text, it is important the model is fed huge volumes of data that somehow represents human language, interaction, and communication. The belief is that the larger the data set, the better it captures human affairs, in all their inherent beauty, ugliness, and even cruelty. We are in an era that is marked by an obsession to scale up models, data sets, and GPUs. Current LLMs, for instance, have now entered an era of trillion-parameter machine-learning models, which means that they require billion-sized data sets. Where can we find it? On the web.

This web-sourced data is assumed to capture “ground truth” for human communication and interaction, a proxy from which language can be modeled. Although various researchers have now shown that online data sets are often of poor quality, tend to exacerbate negative stereotypes, and contain problematic content such as racial slurs and hateful speech, often directed at marginalized groups, this hasn’t stopped the big AI companies from using such data in the race to scale up.

With generative AI, this problem is about to get a lot worse. Rather than representing the social world from input data in an objective way, these models encode and amplify social stereotypes. Indeed, recent work shows that generative models encode and reproduce racist and discriminatory attitudes toward historically marginalized identities, cultures, and languages.

It is difficult, if not impossible—even with state-of-the-art detection tools—to know for sure how much text, image, audio, and video data is being generated currently and at what pace. Stanford University researchers Hans Hanley and Zakir Durumeric estimate a 68 percent increase in the number of synthetic articles posted to Reddit and a 131 percent increase in misinformation news articles between January 1, 2022, and March 31, 2023. Boomy, an online music generator company, claims to have generated 14.5 million songs (or 14 percent of recorded music) so far. In 2021, Nvidia predicted that, by 2030, there will be more synthetic data than real data in AI models. One thing is for sure: The web is being deluged by synthetically generated data.

The worrying thing is that these vast quantities of generative AI outputs will, in turn, be used as training material for future generative AI models. As a result, in 2024, a very significant part of the training material for generative models will be synthetic data produced by generative models. Soon, we will be trapped in a recursive loop where we will be training AI models using only synthetic data produced by AI models. Most of this will be contaminated with stereotypes that will continue to amplify historical and societal inequities. Unfortunately, this will also be the data that we will use to train generative models applied to high-stakes sectors including medicine, therapy, education, and law. We have yet to grapple with the disastrous consequences of this. By 2024, the generative AI explosion of content that we find so fascinating now will instead become a massive toxic dump that will come back to bite us.