The Real Harm of Crisis Text Line’s Data Sharing

Another week, another privacy horror show: Crisis Text Line, a nonprofit text message service for people experiencing serious mental health crises, has been using “anonymized” conversation data to power a for-profit machine learning tool for customer support teams. (After backlash, CTL announced it would stop.) Crisis Text Line’s response to the backlash focused on the data itself and whether it included personally identifiable information. But that response uses data as a distraction. Imagine this: Say you texted Crisis Text Line and got back a message that said “Hey, just so you know, we’ll use this conversation to help our for-profit subsidiary build a tool for companies who do customer support.” Would you keep texting?

That’s the real travesty—when the price of obtaining mental health help in a crisis is becoming grist for the profit mill. And it’s not just users of CTL who pay; it’s everyone who goes looking for help when they need it most.

Americans need help and can’t get it. The huge unmet demand for critical advice and help has given rise to a new class of organizations and software tools that exist in a regulatory gray area. They help people with bankruptcy or evictions, but they aren’t lawyers; they help people with mental health crises, but they aren’t care providers. They invite ordinary people to rely on them and often do provide real help. But these services can also avoid taking responsibility for their advice, or even abuse the trust people have put in them. They can make mistakes, push predatory advertising and disinformation, or just outright sell data. And the consumer safeguards that would normally protect people from malfeasance or mistakes by lawyers or doctors haven’t caught up.

This regulatory gray area can also constrain organizations that have novel solutions to offer. Take Upsolve, a nonprofit that develops software to guide people through bankruptcy. (The organization takes pains to claim it does not offer legal advice.) Upsolve wants to train New York community leaders to help others navigate the city’s notorious debt courts. One problem: These would-be trainees aren’t lawyers, so under New York (and nearly every other state) law, Upsolve’s initiative would be illegal. Upsolve is now suing to carve out an exception for itself. The organization argues, quite rightly, that a lack of legal help means people effectively lack rights under the law.

The legal profession’s failure to grant Americans access to support is well-documented. But Upsolve’s lawsuit also raises new, important questions. Who is ultimately responsible for the advice given under a program like this, and who is responsible for a mistake—a trainee, a trainer, both? How do we teach people about their rights as a client of this service, and how to seek recourse? These are eminently answerable questions. There are lots of policy tools for creating relationships with elevated responsibilities: We could assign advice-givers a special legal status, establish a duty of loyalty for organizations that handle sensitive data, or create policy sandboxes to test and learn from new models for delivering advice.

But instead of using these tools, most regulators seem content to bury their heads in the sand. Officially, you can’t give legal advice or health advice without a professional credential. Unofficially, people can get such advice in all but name from tools and organizations operating in the margins. And while credentials can be important, regulators are failing to engage with the ways software has fundamentally changed how we give advice and care for one another, and what that means for the responsibilities of advice-givers.

And we need that engagement more than ever. People who seek help from experts or caregivers are vulnerable. They may not be able to distinguish a good service from a bad one. They don’t have time to parse terms of service dense with jargon, caveats, and disclaimers. And they have little to no negotiating power to set better terms, especially when they’re reaching out mid-crisis. That’s why the fiduciary duties that lawyers and doctors have are so necessary in the first place: not just to protect a person seeking help once, but to give people confidence that they can seek help from experts for the most critical, sensitive issues they face. In other words, a lawyer’s duty to their client isn’t just to protect that client from that particular lawyer; it’s to protect society’s trust in lawyers.

And that’s the true harm—when people won’t contact a suicide hotline because they don’t trust that the hotline has their sole interest at heart. That distrust can be contagious: Crisis Text Line’s actions might not just stop people from using Crisis Text Line; they might stop people from using any similar service. What’s worse than not being able to find help? Not being able to trust it.

Simulation Tech Can Help Predict the Biggest Threats

The character of conflict between nations has fundamentally changed. Governments and militaries now fight on our behalf in the “gray zone,” where the boundaries between peace and war are blurred. They must navigate a complex web of ambiguous and deeply interconnected challenges, ranging from political destabilization and disinformation campaigns to cyberattacks, assassinations, proxy operations, election meddling, and perhaps even human-made pandemics. Add to this list the existential threat of climate change (and its geopolitical ramifications) and it is clear that what now constitutes a national security issue has broadened, with each crisis straining or degrading the fabric of national resilience.

Traditional analysis tools are poorly equipped to predict and respond to these blurred and intertwined threats. Instead, in 2022 governments and militaries will use sophisticated and credible real-life simulations, putting software at the heart of their decision-making and operating processes. The UK Ministry of Defence, for example, is developing what it calls a military Digital Backbone. This will incorporate cloud computing, modern networks, and a new transformative capability called a Single Synthetic Environment, or SSE.

This SSE will combine artificial intelligence, machine learning, computational modeling, and modern distributed systems with trusted data sets from multiple sources to support detailed, credible simulations of the real world. This data will be owned by critical institutions, but will also be sourced via an ecosystem of trusted partners, such as the Alan Turing Institute.

An SSE offers a multilayered simulation of a city, region, or country, including high-quality mapping and information about critical national infrastructure, such as power, water, transport networks, and telecommunications. This can then be overlaid with other information, such as smart-city data, information about military deployment, or data gleaned from social listening. From this, models can be constructed that give a rich, detailed picture of how a region or city might react to a given event: a disaster, an epidemic, a cyberattack, or a combination of such events organized by state enemies.

Defense synthetics are not a new concept. However, previous solutions have been built in a standalone way that limits reuse, longevity, choice, and—crucially—the speed of insight needed to effectively counteract gray-zone threats.

National security officials will be able to use SSEs to identify threats early, understand them better, explore their response options, and analyze the likely consequences of different actions. They will even be able to use them to train, rehearse, and implement their plans. By running thousands of simulated futures, senior leaders will be able to grapple with complex questions, refining policies and complex plans in a virtual world before implementing them in the real one.

One key question that will only grow in importance in 2022 is how countries can best secure their populations and supply chains against the dramatic weather events driven by climate change. SSEs will be able to help answer it by combining regional infrastructure, network, road, and population data with meteorological models to see how and when such events might unfold.
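To make “running thousands of simulated futures” concrete, here is a minimal sketch in Python. All numbers, the mitigation scenario, and the `simulate_outcome` model are invented for illustration; a real SSE would draw these inputs from its infrastructure, population, and meteorological data layers.

```python
import random

# Hypothetical sketch: compare two response options to a severe-weather event
# by Monte Carlo simulation. Every figure here is made up for illustration.

def simulate_outcome(pre_positioned_supplies: bool) -> float:
    """Return simulated hours of critical-infrastructure downtime for one future."""
    storm_severity = random.gauss(1.0, 0.3)        # uncertain weather input
    downtime = 48 * max(storm_severity, 0.0)       # downtime scales with severity
    if pre_positioned_supplies:
        downtime *= 0.6                            # assumed mitigation effect
    return downtime

def expected_downtime(option: bool, runs: int = 10_000) -> float:
    """Average outcome over thousands of simulated futures."""
    return sum(simulate_outcome(option) for _ in range(runs)) / runs

if __name__ == "__main__":
    for label, option in [("pre-position supplies", True), ("wait and respond", False)]:
        print(f"{label}: {expected_downtime(option):.1f} expected downtime hours")
```

Even in this toy form, the structure is the point: decisionmakers rehearse a policy across many sampled futures and compare distributions of outcomes before acting in the real world.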

The Turing Test Is Bad For Business

Fears of artificial intelligence fill the news: job losses, inequality, discrimination, misinformation, or even a superintelligence dominating the world. The one group everyone assumes will benefit is business, but the data seems to disagree. Amid all the hype, US businesses have been slow to adopt the most advanced AI technologies, and there is little evidence that such technologies are contributing significantly to productivity growth or job creation.

This disappointing performance is not merely due to the relative immaturity of AI technology. It also comes from a fundamental mismatch between the needs of business and the way AI is currently being conceived by many in the technology sector—a mismatch that has its origins in Alan Turing’s pathbreaking 1950 “imitation game” paper and the so-called Turing test he proposed therein.

The Turing test defines machine intelligence by imagining a computer program that can so successfully imitate a human in an open-ended text conversation that it isn’t possible to tell whether one is conversing with a machine or a person.

At best, this was only one way of articulating machine intelligence. Turing himself, and other technology pioneers such as Douglas Engelbart and Norbert Wiener, understood that computers would be most useful to business and society when they augmented and complemented human capabilities, not when they competed directly with us. Search engines, spreadsheets, and databases are good examples of such complementary forms of information technology. While their impact on business has been immense, they are not usually referred to as “AI,” and in recent years the success story that they embody has been submerged by a yearning for something more “intelligent.” This yearning is poorly defined, however, and with surprisingly little attempt to develop an alternative vision, it has increasingly come to mean surpassing human performance in tasks such as vision and speech, and in parlor games such as chess and Go. This framing has become dominant both in public discussion and in terms of the capital investment surrounding AI.

Economists and other social scientists emphasize that intelligence arises not only, or even primarily, in individual humans, but most of all in collectives such as firms, markets, educational systems, and cultures. Technology can play two key roles in supporting collective forms of intelligence. First, as emphasized in Douglas Engelbart’s pioneering research in the 1960s and the subsequent emergence of the field of human-computer interaction, technology can enhance the ability of individual humans to participate in collectives, by providing them with information, insights, and interactive tools. Second, technology can create new kinds of collectives. This latter possibility offers the greatest transformative potential. It provides an alternative framing for AI, one with major implications for economic productivity and human welfare.

Businesses succeed at scale when they successfully divide labor internally and bring diverse skill sets into teams that work together to create new products and services. Markets succeed when they bring together diverse sets of participants, facilitating specialization in order to enhance overall productivity and social welfare. This is exactly what Adam Smith understood nearly two and a half centuries ago. Translating his message into the current debate, technology should focus on the complementarity game, not the imitation game.

We already have many examples of machines enhancing productivity by performing tasks that are complementary to those performed by humans. These include the massive calculations that underpin the functioning of everything from modern financial markets to logistics, the transmission of high-fidelity images across long distances in the blink of an eye, and the ability to sort through reams of information to pull out relevant items.

What is new in the current era is that computers can now do more than simply execute lines of code written by a human programmer. Computers are able to learn from data and they can now interact, infer, and intervene in real-world problems, side by side with humans. Instead of viewing this breakthrough as an opportunity to turn machines into silicon versions of human beings, we should focus on how computers can use data and machine learning to create new kinds of markets, new services, and new ways of connecting humans to each other in economically rewarding ways.

An early example of such economics-aware machine learning is provided by recommendation systems, an innovative form of data analysis that came to prominence in the 1990s in consumer-facing companies such as Amazon (“You may also like”) and Netflix (“Top picks for you”). Recommendation systems have since become ubiquitous, and have had a significant impact on productivity. They create value by exploiting the collective wisdom of the crowd to connect individuals to products.
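To illustrate the mechanism (a minimal sketch with invented ratings, not any company’s production system), an item-based collaborative filter scores items a user hasn’t seen by their similarity, computed across the whole crowd’s ratings, to items that user already likes:

```python
import math

# Minimal item-based collaborative filtering sketch. The ratings are invented.
ratings = {  # user -> {item: rating}
    "ann":  {"book_a": 5, "book_b": 4, "book_c": 1},
    "ben":  {"book_a": 4, "book_b": 5},
    "cara": {"book_b": 2, "book_c": 5},
}

def item_vector(item):
    """Represent an item as the vector of ratings users gave it."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(v1, v2):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(v1) & set(v2)
    if not shared:
        return 0.0
    dot = sum(v1[u] * v2[u] for u in shared)
    n1 = math.sqrt(sum(x * x for x in v1.values()))
    n2 = math.sqrt(sum(x * x for x in v2.values()))
    return dot / (n1 * n2)

def recommend(user):
    """Rank unseen items by similarity to the items this user rated."""
    seen = ratings[user]
    items = {i for r in ratings.values() for i in r}
    scores = {
        cand: sum(seen[i] * cosine(item_vector(cand), item_vector(i)) for i in seen)
        for cand in items - set(seen)
    }
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("ben"))  # suggests book_c, inferred from other users' tastes
```

The value created is exactly the kind described above: no single rating is worth much, but the links among many participants let the system connect an individual to a product.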

Emerging examples of this new paradigm include the use of machine learning to forge direct connections between musicians and listeners, writers and readers, and game creators and players. Early innovators in this space include Airbnb, Uber, YouTube, and Shopify, and the phrase “creator economy” is being used as the trend gathers steam. A key aspect of such collectives is that they are, in fact, markets—economic value is associated with the links among the participants. Research is needed on how to blend machine learning, economics, and sociology so that these markets are healthy and yield sustainable income for the participants.

Democratic institutions can also be supported and strengthened by this innovative use of machine learning. The digital ministry in Taiwan has harnessed statistical analysis and online participation to scale up the kind of deliberative conversations that lead to effective team decisionmaking in the best-managed companies.

Humans Can’t Be the Sole Keepers of Scientific Knowledge

There’s an old joke that physicists like to tell: Everything has already been discovered and reported in a Russian journal in the 1960s; we just don’t know about it. Though hyperbolic, the joke accurately captures the current state of affairs. The volume of knowledge is vast and growing quickly: The number of scientific articles posted on arXiv (the largest and most popular preprint server) in 2021 is expected to reach 190,000—and that’s just a subset of the scientific literature produced this year.

It’s clear that we do not really know what we know, because nobody can read the entire literature even in their own narrow field (which includes, in addition to journal articles, PhD theses, lab notes, slides, white papers, technical notes, and reports). Indeed, it’s entirely possible that in this mountain of papers, answers to many questions lie hidden, important discoveries have been overlooked or forgotten, and connections remain concealed.

Artificial intelligence is one potential solution. Algorithms can already analyze text without human supervision to find relations between words that help uncover knowledge. But far more can be achieved if we move away from writing traditional scientific articles, whose style and structure have hardly changed in the past hundred years.

Text mining comes with a number of limitations, including access to the full text of papers and legal concerns. But most importantly, AI does not really understand concepts and the relationships between them, and is sensitive to biases in the data set, like the selection of papers it analyzes. It is hard for AI—and, in fact, even for a nonexpert human reader—to understand scientific papers in part because the use of jargon varies from one discipline to another and the same term might be used with completely different meanings in different fields. The increasing interdisciplinarity of research means that it is often difficult to define a topic precisely using a combination of keywords in order to discover all the relevant papers. Making connections and (re)discovering similar concepts is hard even for the brightest minds.

As long as this is the case, AI cannot be trusted, and humans will need to double-check everything an AI outputs after text mining, a tedious task that defeats the very purpose of using AI. To solve this problem we need to make science papers not only machine-readable but machine-understandable, by (re)writing them in a special type of programming language. In other words: Teach science to machines in the language they understand.

Writing scientific knowledge in a programming-like language will be dry, but it will be sustainable, because new concepts will be directly added to the library of science that machines understand. Plus, as machines are taught more scientific facts, they will be able to help scientists streamline their logical arguments; spot errors, inconsistencies, plagiarism, and duplications; and highlight connections. AI with an understanding of physical laws is more powerful than AI trained on data alone, so science-savvy machines will be able to help make future discoveries. Machines with a great knowledge of science could assist rather than replace human scientists.

Mathematicians have already started this process of translation. They are teaching mathematics to computers by writing theorems and proofs in languages like Lean. Lean is a proof assistant and programming language in which one can introduce mathematical concepts in the form of objects. Using the known objects, Lean can reason whether a statement is true or false, hence helping mathematicians verify proofs and identify places where their logic is insufficiently rigorous. The more mathematics Lean knows, the more it can do. The Xena Project at Imperial College London is aiming to input the entire undergraduate mathematics curriculum in Lean. One day, proof assistants may help mathematicians do research by checking their reasoning and searching the vast mathematics knowledge they possess.
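To give a flavor of what that translation looks like, here is a minimal Lean 4 sketch of my own (not code from the Xena Project): once concepts such as the natural numbers and addition exist as objects in the library, statements about them can be checked mechanically.

```lean
-- Minimal Lean 4 sketch. `Nat` and `+` come from Lean's core library.

-- A concrete fact, verified by computation:
example : 2 + 2 = 4 := rfl

-- A general statement, proved by appealing to a known library lemma:
theorem add_is_commutative (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If the proof were wrong or incomplete, Lean would reject it, which is precisely the kind of rigor-checking that makes a machine-understandable library of science plausible.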

What Makes an Artist in the Age of Algorithms?

In 2021, technology’s role in how art is generated remains up for debate and discovery. From the rise of NFTs to the proliferation of techno-artists who use generative adversarial networks to produce visual expressions, to smartphone apps that write new music, creatives and technologists are continually experimenting with how art is produced, consumed, and monetized.

BT, the Grammy-nominated composer of 2010’s These Hopeful Machines, has emerged as a world leader at the intersection of tech and music. Beyond producing and writing for the likes of David Bowie, Death Cab for Cutie, Madonna, and the Roots, and composing scores for The Fast and the Furious, Smallville, and many other shows and movies, he’s helped pioneer production techniques like stutter editing and granular synthesis. This past spring, BT released GENESIS.JSON, a piece of software that contains 24 hours of original music and visual art. It features 15,000 individually sequenced audio and video clips that he created from scratch, which span different rhythmic figures, field recordings of cicadas and crickets, a live orchestra, drum machines, and myriad other sounds that play continuously. And it lives on the blockchain. It is, to my knowledge, the first composition of its kind.

Could ideas like GENESIS.JSON be the future of original music, where composers use AI and the blockchain to create entirely new art forms? What makes an artist in the age of algorithms? I spoke with BT to learn more.

What are your central interests at the interface of artificial intelligence and music?

I am really fascinated with this idea of what an artist is. Speaking in my common tongue—music—it’s a very small array of variables. We have 12 notes. There’s a collection of rhythms that we typically use. There’s a sort of vernacular of instruments, of tones, of timbres, but when you start to add them up, it becomes this really deep data set.

On its surface, it makes you ask, “What is special and unique about an artist?” And that’s something that I’ve been curious about my whole adult life. Seeing the research that was happening in artificial intelligence, my immediate thought was that music is low-hanging fruit.

These days, we can take the sum total of an artist’s output and we can take their artistic works and we can quantify the entire thing into a training set, a massive, multivariable training set. And we don’t even name the variables. RNNs (recurrent neural networks) and CNNs (convolutional neural networks) name them automatically.

So you’re referring to a body of music that can be used to “train” an artificial intelligence algorithm that can then create original music that resembles the music it was trained on. If we reduce the genius of artists like Coltrane or Mozart, say, into a training set and can recreate their sound, how will musicians and music connoisseurs respond?

I think that the closer we get, it becomes this uncanny valley idea. Some would say that things like music are sacrosanct and have to do with very base-level things about our humanity. It’s not hard to get into kind of a spiritual conversation about what music is as a language, and what it means, and how powerful it is, and how it transcends culture, race, and time. So the traditional musician might say, “That’s not possible. There’s so much nuance and feeling, and your life experience, and these kinds of things that go into the musical output.”

And the sort of engineer part of me goes, well, look at what Google has made. It’s a simple kind of MIDI-generation engine, where they’ve taken all of Bach’s works and it’s able to spit out [Bach-like] fugues. Because Bach wrote so many fugues, he’s a great example. Also, he’s the father of modern harmony. Musicologists listen to some of those Google Magenta fugues and can’t distinguish them from Bach’s original works. Again, this makes us question what constitutes an artist.

I’m both excited and have incredible trepidation about this space that we’re expanding into. Maybe the question I want to be asking is less “We can, but should we?” and more “How do we do this responsibly, because it’s happening?”

Right now, there are companies that are using something like Spotify or YouTube to train their models with artists who are alive, whose works are copyrighted and protected. But companies are allowed to take someone’s work and train models with it right now. Should we be doing that? Or should we be speaking to the artists themselves first? I believe that there need to be protective mechanisms put in place for visual artists, for programmers, for musicians.