The AI Mirror

The AI Mirror by Shannon Vallor

This book is "about our evolving and troubled relationship with the machines we have built as mirrors to tell us who we are, when we ourselves don’t know."

Chapter 1 abstract:
"Today’s powerful AI technologies work like mirrors. They reflect back to us endless variations of our most predictable patterns of thought and behavior, extracted from digital oceans of collected data about human speech, movement, decision-making and culture. The companies building these mirrors market them as windows into our future, predicting what we will do, what we will say, what we will consume, and what we will believe. Yet by reflecting only the unsustainable patterns of our past, these projections don’t foretell the future; in fact, they endanger it. Dispelling overhyped fears of our extinction at the hands of malevolent AI overlords, the introduction outlines the real existential threat to our humanity posed by our increasingly distorted relationships to these tools. Yet we can still reclaim the humane potential of AI technologies—and more importantly, our chance to make our own futures—by refusing to be captured in their frame."

Chapter 1:
"The companies building today’s most powerful AI technologies increasingly position these to represent, or even stand in for, humanity’s common voice and collective judgment. We are told that AI knows us better than we know ourselves. Judges get predictive algorithms to tell them who to release from prison, while employers get automated HR software to tell them who to hire or promote. AI tools will tell your government whether you deserve public benefits and your hospital whether you deserve treatment. AI can write a university professor’s course lectures and their students’ essays on them."

"We need to understand this threat in a radically different way. AI does not threaten us as a future successor to humans. It is not an external enemy encroaching upon our territory. It threatens us from within our humanity. In the words of a well-worn horror film trope: “the call is coming from inside the house.” This makes a great difference to how we must respond to the threat that AI poses. For the threat is not a machine that stands apart from us. We can’t fight AI without fighting ourselves. The harms to people and to society that may follow from our relationship to “intelligent” machines are not foreign assaults, but symptoms of an internal disorder: a fully human pathology that requires collective self-examination, critique, and most of all, healing."

"At the very moment when accelerating climate change, biodiversity collapse and global political instability command us to invent new and wiser ways of living together, AI holds us frozen in place, fascinated by endless permutations of a reflected past that only the magic of marketing can disguise as the future."

"The danger is a call to human action and responsibility. We are the source of the danger to ourselves from AI, and this is a good thing—it means we hold the power to resist, and the power to heal. After all, Ovid’s story is not about the evils of reflecting pools! What endangered Narcissus was not some shiny water. It was his own weakness of character—his vanity, narcissism and selfish obsession—that enabled his detachment from others and the fatal loss of his world. The story of Narcissus (and of Echo, who appears in the next chapter) is about virtue, or rather, vice—what we lose when we habitually turn our eyes away from our shared future, away from its noblest possibilities and unmet responsibilities."

"To change that will require something more radical than the solutions sought by a growing number of computer scientists interested in fields like “AI safety,” who are looking for ways to program AI to be more reliably beneficial and aligned with human values.[note] The reason this kind of strategy won’t work, at least not in our present environment, is because human values are at the very root of the problem. AI isn’t developing in harmful ways today because it’s misaligned with our current values. It’s already expressing those values all too well. ... AI is a mirror of ourselves, not as we ought to be or could be, but as we already are and have long been. This is why we can’t let today’s AI tools decide who we will become, why we can’t let them project our futures for us. Yet increasingly, that’s exactly what we are building them to do.
[Note:Value alignment is the term often used by a small but influential community of computing researchers who take AI safety to be an urgent priority due to their belief in the imminent development of human-level AI (AGI or “artificial general intelligence”), or even AI with “superhuman” capabilities that could be destructive or malevolent. See Russell (2019).]"

"AI mirrors are being used to tell us what we will learn, in which career we can succeed, which roads we will travel, who we can love, who we will exclude or abuse, who we will detain or set free, who we will heal or house, what we will buy, and the investments we’ll make. They tell us what we will read, what words we’ll type next, which music we’ll hear, what images we’ll paint, the experiences we’ll seek, the risks we will accept, the strategies we’ll adopt, the policies we will support, and the visions of our future that we will embrace.... Through automation, these algorithmic predictions, which increasingly cover every domain of human experience, quickly become self-fulfilling prophecies. They replicate patterns extracted from data about past human habits, preferences and decisions."

"None of these patterns are new; but today’s AI mirrors will accelerate and solidify them if we do not change course. Changing how we build AI will be part of that, but it won’t be enough. What we are rapidly making of AI today is what we had long ago made of ourselves. This means that we cannot gain lasting protection from the harmful potential of AI merely by changing lines of code. To believe that there is a neat computing solution to our present peril makes as much sense as trying to clean dirt off one’s face by furiously scrubbing the mirror.

We need a deeper and more lasting cultural transformation of our relationship to AI, and to technology more broadly, as our own creative power. In the face of accelerating climate crisis and other existential risks, the future desperately needs us to understand—more fully than ever—who human beings can be, and what we can do together. Yet, because today’s AI mirrors face backward, the more we rely on them to know who we are, the more the fullness of our humane potential recedes from our view. We must reimagine and transform our relationship with these tools if we hope to chart a new path to shared flourishing on an increasingly fragile planet."

Chapter 2:
"The child’s smile tells you something about their own enjoyment of the story. The eye-roll emoji informs you that your co-worker shares your boredom. Even the dog’s whimper expresses his own felt distress, not just a hollow response to yours. But with AI, something else is happening. We are hearing our words, and the words of others like us, bouncing off a complex algorithmic surface and altered just enough by the bounce that we don’t realize it’s our own thoughts coming back to us. ...
When we are engaged with a large language model and find ourselves suddenly struck by its apparent wit or insight, we must realize that we are very much like Narcissus talking to himself, his words dimly mirrored and transformed by Echo’s lips. The only difference is that, unlike Echo, behind a large language model there is no suppressed, silent agent struggling to speak her own thoughts. Unlike Echo, a large language model returns to us not our own recent utterances, but a statistical variation on the collected, digitized words of untold millions. The fact that Amazon’s AI voice assistant device is called Echo should earn from us an ironic smile."

"In a wider sense, of course, all technologies, and thus all forms of AI, are mirrors of human thought and desire, as emphasized in the introduction. Our technologies reflect what we think we want, what we think we need, what we think is important, what we think others will praise or buy. Even AI systems like AlphaFold, which aren’t trained on human speech or behaviors but on molecular structures, reflect to us our priorities (proteins are pretty important stuff for living things like us!) and mirror our ways of representing things like proteins (we represent proteins in discrete, formal and mathematizable descriptions, not in epic poems)."

Chapter 3 (abstract): "By reproducing and amplifying historical patterns of injustice found in our data, AI pushes humanity’s past failures into our future, ensuring that we make the same mistakes, only at ever greater scales."

"There’s a vast chasm between an AI mirror that mathematically analyzes and generates word predictions from the patterns within all the stories we’ve told, and an actual AGI that could tell us its own story."

"It is not easy to eliminate unwanted biases from the data set or from the trained model either, since they are usually intertwined with the information the model needs to perform its task. For example, researchers discovered in 2019 that a risk prediction algorithm used nationwide by hospitals in the United States was replicating the long history of racial bias in American health care by diverting medical care away from high-risk Black patients, even though these patients were in fact sicker than the White patients the tool prioritized for care.[i] Yet race had been carefully excluded from the training data. You might wonder, then, how the algorithm could end up racially biased. It predicted patient care needs by a different variable, namely, cost: how much money has been spent on a person’s care. Unfortunately, Black patients are commonly undertreated by physicians in the United States and denied access to more expensive tests and treatments readily given to White patients with similar symptoms and clinical findings. So, when the algorithm’s designers naively chose healthcare cost as a good proxy for healthcare need, they unwittingly doomed Black patients to being rated as needing less care than White patients who had already received better, costlier care from their doctors. A learning algorithm found and reproduced the pattern of racial discrimination without ever being given a race label.
[i] See Obermeyer et al. (2019).
That means that even if there is a race label in the dataset, you can’t just delete that label, retrain the model, and rest easy. An AI algorithm can reconstruct the discriminatory pattern of racial differences in patient treatment from subtle cues linked to many other variables, such as zip code, prescription histories, or even how physicians’ handwritten clinical notes describe their patients’ symptoms. And if you deleted all those training data, the model wouldn’t have what it needs to do its job. There are often ways to reduce and mitigate the presence of unfair biases in machine learning models, but it’s not easy. More importantly, it doesn’t actually solve the underlying problem.

The fundamentally correct explanation always offered for unfair machine learning bias is that the trained model is simply mirroring the unfair biases we already have. The model does not independently acquire racist or sexist associations. We feed these associations to the model in the form of training data from our own biased human judgments. The AI hospital tool that discriminated against Black patients and denied them needed medical care was trained on data from U.S. doctors and hospital administrators who discriminated against these patients in the first place. The model then learned that pattern and amplified it during the model training phase. It discriminated against Black patients even “better,” and more consistently, than the human doctors and hospital administrators had! This is precisely what machine learning models are built to do—find old patterns, especially those too subtle to be obvious to us, and then regurgitate them in new judgments and predictions. When those patterns are human patterns, the trained model output can not only mirror but amplify our existing flaws."
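The cost-as-proxy mechanism in the passage above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the algorithm examined in Obermeyer et al. (2019): the two groups, the spending rates, and every name in the code are hypothetical. Both groups have identical true needs, but one group historically received half the spending per unit of need. The scoring rule ranks patients by cost alone and never reads the group label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    group: str  # recorded for auditing only; the scoring rule never reads it
    need: int   # true illness burden, in made-up units
    cost: int   # historical spending on this patient


def build_population(spend_per_need_a=100, spend_per_need_b=50):
    """Two groups with identical needs (1..10) but unequal historical spending."""
    patients = []
    for need in range(1, 11):
        patients.append(Patient("A", need, spend_per_need_a * need))
        patients.append(Patient("B", need, spend_per_need_b * need))
    return patients


def select_for_extra_care(patients, k):
    """Rank by historical cost alone, a stand-in for a cost-trained risk score."""
    return sorted(patients, key=lambda p: p.cost, reverse=True)[:k]


patients = build_population()
selected = select_for_extra_care(patients, k=5)

# Audit the outcome using the label the scoring rule never saw.
selected_groups = sorted({p.group for p in selected})
lowest_need_selected = min(p.need for p in selected)
highest_need_passed_over = max(p.need for p in patients if p not in selected)

print(selected_groups, lowest_need_selected, highest_need_passed_over)
```

In this toy setup every selected patient comes from group A, and a group B patient with the highest possible need is passed over in favor of a group A patient with a lower need. Deleting the group field changes nothing, because the disparity enters through the cost column.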

"This is just one example of the kind of runaway feedback loop documented by sociologist Ruha Benjamin, in which the old human biases mirrored by our AI technologies drive new actions in the world, carving those harmful biases even deeper into our world’s bones.[i] We see this in the phenomenon of Instagram digital video “beauty filters” designed with Eurocentric biases that make your skin whiter, your eyes wider, and your nose narrower. These kinds of filters are strongly associated with the negative effects of Instagram on young people’s mental health and self-image, particularly the effects on women whose real-world appearance does not match the standards of White female beauty that their filters allow them to mirror online. In their paper “The Whiteness of AI,” researchers Stephen Cave and Kanta Dihal detailed numerous ways in which AI today mirrors back to us and strengthens the dominant cultural preference for Whiteness, from its depictions in stock imagery to the nearly universal choice of white plastic for “social” robots.
AI mirrors thus don’t just show us the status quo. They are not just regrettable but faithful reflections of social imperfection. They can ensure through runaway feedback loops that we become ever more imperfect. Even still, bias in AI, whether it unjustly punishes us for our race, age, weight, gender, religion, disability status, or economic class, is not a computer problem. It’s a people problem. It is an example of the virtually universal explanation for all undesirable computer outputs not related to mechanical hardware failure: the computer did precisely what we told it to do, just not what we thought we had told it to do. Much of software engineering is simply figuring out how to close the gap between those two things. In this way, the AI mirror metaphor is already profoundly helpful. It allows us to see that the failings of computer systems and their harmful effects are in fact our failings, and our sole responsibility to remedy.
[i] See Benjamin (2019)."

"AI bias has simply made untenable the attractive illusion that computing is, or can be, a morally neutral scientific endeavor. It also undermines comfortable assumptions that these kinds of bias must be “edge cases,” aberrations, or relics of the distant past. AI today makes the scale, ubiquity, and structural acceptance of our racism, sexism, ableism, classism, and other forms of bias against marginalized communities impossible to deny or minimize with a straight face. It is right there in the data, being endlessly spit back in our faces by the very tools we celebrate as the apotheosis of rational achievement. The cognitive dissonance this has produced is powerful and instructive. In the domain of social media algorithms, the AI mirror has revealed other inconvenient truths, such as our penchant for consuming and sharing misinformation as a trade in social capital that is largely immune to fact-checking or corrective information, and our vulnerability through this habit to extreme cultural and political polarization. But while the metaphor of the AI mirror is entirely apt, illuminating, and useful, we have not yet learned enough from it."

"What aspects of ourselves, individually and collectively, do AI mirrors bring forward into view, other than our entrenched biases against our own kind? And more importantly, what aspects of ourselves do they leave unreflected and unseen? To answer this, it helps to think carefully about the relevant properties of today’s AI tools. We need to consider what functions for AI as the equivalent of a polished surface, and what functions in a role comparable to refracted light. Today’s machine learning models receive and reflect discrete quantities of data. Data are their only light. Data can be found in many forms: still or video images, sound files, strings of text, numbers, or other symbols. If the original data are analog, they must be converted from their continuously variable form to a digital form involving discrete binary values."

"Finally, only a subset of the data about humans that could be used to train machine learning models is actually being used today for this purpose. Most training data for AI models heavily overrepresent English language text and speech, as well as data from young, white, male subjects in the Northern Hemisphere, as well as cheap data generated in bulk by online platforms. Google’s, Meta’s, and Microsoft’s mirrors largely reflect the most active users of their tools, and of the Internet more broadly. Unfortunately, access to these resources has never been equitably distributed across the globe. It follows that what AI systems today can learn about us and reflect to us is, just as with glass mirrors, only a very partial and often distorted view. To suggest that they reflect humanity is to write most people out of the human story, as we so often do.
We must also inquire about the mirror’s surface. The “surface” of an AI mirror is the machine learning and optimization algorithm that determines which features of the “incident light”—that is, the data the model is trained on—will be focused and transmitted back to us in the form of the model’s outputs. It is the algorithm embedded in a machine learning model that expresses the chosen objective function (a mathematical description of what an “optimal” solution to the problem will look like). The learning algorithm and model hyperparameters determine how the training data are processed into a result as the model “learns.” The algorithmic “surface” of the model determines which of the innumerable possible patterns that can be found within the data (the model’s “light”) will be selected during model training as salient and then amplified as the relevant “signal” to guide the model’s outputs (the particular “image” of the data it reflects)."
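A small Python sketch may help make the role of the objective function concrete. It is an illustration only, with hypothetical data and function names, and it uses a brute-force search over integer guesses rather than any real training procedure. The same five data points are summarized twice, once under a squared-error objective and once under an absolute-error objective, and the two "reflections" differ.

```python
def best_constant(data, loss):
    """Return the integer guess in [0, 100] with the smallest total loss."""
    candidates = range(0, 101)
    return min(candidates, key=lambda c: sum(loss(c, x) for x in data))


def squared_error(guess, x):
    return (guess - x) ** 2


def absolute_error(guess, x):
    return abs(guess - x)


data = [1, 2, 3, 4, 100]  # hypothetical values with one extreme point

# Same "light" (data), two different "surfaces" (objectives).
reflection_squared = best_constant(data, squared_error)
reflection_absolute = best_constant(data, absolute_error)

print(reflection_squared, reflection_absolute)
```

The squared-error objective settles on the mean of the data, which the extreme point pulls upward, while the absolute-error objective settles on the median and largely ignores that point. Neither answer is in the data by itself; the choice of objective decides which pattern is treated as the signal.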

"What do these properties of the AI mirror’s light and its algorithmic surfaces reveal, reinforce, and perpetuate about us? What do they conceal, diminish, and extinguish? First, they reveal and reinforce our belonging to certain socially constructed categories. Decades ago, in their landmark book Sorting Things Out: Classification and Its Consequences, Geoffrey Bowker and Susan Leigh Star demonstrated the extent to which, in their words, “to classify is human.”[i] We have been classifying and labeling the world, and one another, for millennia. Yet only with the rise of modern data science has it seemed possible to produce a comprehensive regime of human classification, one that would allow every conceivable label for a human to be matched to every individual human, and statistically correlated in such a way that the relationships between these labels can be reliably predicted.
[i] See Introduction in Bowker and Star (1999)."

"How AI systems see us, and how the AI ecosystem represents us through these mirrors, is not how we see each other in these intermittent moments of solidarity. To an AI model, I am a cluster of differently weighted variables that project a mathematical vector through a pre-defined possibility space, terminating in a prediction. To an AI developer, I am an item in the training data, or the test data. To an AI model tester, I am an instance in the normal distribution, or I am an edge case. To a judge looking at an AI pretrial detention algorithm, I am a risk score. To an urban designer of new roads for autonomous vehicles, I am an erratic obstacle to be kept outside the safe and predictable machine envelope. To an Amazon factory AI, I am a very poorly optimized box delivery mechanism.
When we are then asked to accept care from a robot rather than a human, when we are denied a life-changing opportunity by an algorithm, when we read a college application essay written for the candidate by a large language model—we must realize what in that transaction, however efficient it might be and however well it might scale, has fallen into the gap between our lived humanity and the AI mirror. We have to acknowledge what has been lost in that fall."

Chapter 4:
"Recall the earlier chapters’ description of today’s AI tools as mirrors pointing backwards, narrowly reflecting only the dominant statistical patterns in our data history. Such mirrors, when used not as reflections of the past but as windows into our future, serve as straitjackets on our moral, intellectual and cultural imagination. They project the statistical curve of history into the still-open future. And by rebranding that reflection as a prediction, they profoundly restrict our sense of the possible."

Chapter 4 [add to macropower about technology]
"For it has become increasingly challenging to understand exactly when, how, or by whose authority these algorithms produce their profound influences on our lives. This algorithmic opacity or lack of transparency in a “Black-Box Society,” to borrow the title of Frank Pasquale’s excellent 2015 book on this subject, raises profound ethical questions about justice, power, inequality, bias, freedom, and democratic values in an AI-driven world. The problem of algorithmic opacity is especially complex given its multiple and overlapping causes: proprietary technology, poorly labeled and curated data sets, the growing gap between the speed of machine and human cognition, and the inherently uninterpretable and unpredictable behavior of many machine learning processes. The latter can prevent even AI programmers from fully understanding the internal operations of the AI system they designed, learning the cause of any given output, or assessing its reliability in complex interactions with other social and computational systems.

What makes this opacity so hard to remove? We have seen how today’s AI systems mirror and magnify the statistical patterns they extract from the vast data pools used to train each machine learning model. The most powerful AI mirrors today can extract patterns that humans looking at the data could never find. This is partly because of the greater computational speed of their processing, but also because of the sheer size and complexity of models of this type, which belong to a class of machine learning called “deep” learning. Deep learning models are built with a highly complex network structure composed of multiple mathematical layers. The structures of these layers and the connections between them are defined by variables called parameters or weights. An early example of a large language model, Google’s Pathways Language Model (PaLM), had 540 billion of these variables. Others are now trained with more than a trillion.

By feeding computations across its internal layers forwards and backwards, while continually modifying its own weights to improve the result, a machine learning model can eventually converge upon an optimal solution. That is, it can solve the problem we set it to solve. That’s when we can get the model to work, anyway! Building these models can seem more like alchemy than traditional engineering, as it can be a mystifying task to get a deep learning model to converge. But even when we get the model to produce a solution, there is no way for a human to retrace or mentally represent the solution path. Our minds cannot consciously represent billions or trillions of variables! It is a practical impossibility. This is what is sometimes called intrinsic opacity in a “deep” learning model, that is, a model with very many layers and weights. A model like this isn’t opaque because we haven’t studied it hard enough or long enough. It would remain opaque even if the smartest human on the planet studied it from now until the heat death of the universe.

Therefore, when we gaze in our AI mirrors, we often cannot ask the mirror why it sees what it sees. Even if we design a way to ask, it can be impossible to get an answer that is both reliable (a correct and precise explanation) and interpretable (that is, understandable in human terms). And this is a big problem if what we are using this tool to do is to recommend a lifesaving medical treatment, deny someone a loan or a job, accuse them of fraud, or predict a child’s educational achievement—all things that AI mirrors are currently being used to do. How can we trust such decisions if we cannot understand or interrogate them? There are now many agendas in AI research driven by attempts to get around this difficult problem by finding new ways to make model outputs “explainable”."
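The training loop described in the passage above, feeding computations forwards and backwards while adjusting weights, can be sketched with a model that has a single weight. This is a minimal illustration under stated assumptions, not how any production system is built: the data, the learning rate, the step count, and the function name are all hypothetical.

```python
def train(xs, ys, learning_rate=0.05, steps=50):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Forward pass: compute predictions with the current weight.
        predictions = [w * x for x in xs]
        # Backward pass: gradient of the mean squared error with respect to w.
        gradient = sum(2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)) / n
        # Update: nudge the weight against the gradient.
        w -= learning_rate * gradient
    return w


xs = [1.0, 2.0, 3.0]
ys = [3.0, 6.0, 9.0]  # generated from y = 3x
w = train(xs, ys)

print(w)
```

With one weight, every step of the path toward the solution can be followed by hand. The intrinsic opacity the passage describes arises when the same kind of loop adjusts billions or trillions of weights at once.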

"The danger arises from two characteristic elements of automated decisions issued by AI mirrors. The first is their opacity to human inspection, validation, and explanation—even by experts. The second is the growing ability of generative AI models to manufacture plausible facsimiles of moral reasoning that can lend a false veneer of public legitimacy to their decision outputs. What do we stand to lose? We risk losing the space to jointly reflect upon our societies’ actions and their moral status. We risk losing the space to publicly contest the rightness, goodness, or appropriateness of high-stakes decisions and policies that have been automated or opaquely steered by AI tools. We risk losing the space to assign moral responsibility to ourselves and others for such decisions and their consequences. Finally, we risk losing the space for moral imagination in exploring new and better possibilities for collective action and social policy."

"Remember: using AI mirrors to automate an unjust system will not only reflect the existing injustice—it will often magnify the injustice and then project it across every decision."

Vallor argues that people can give reasons for their decisions, while we cannot get reasons from AI. My response: people often do not fully know the reasons for their own decisions.

Chapter 5:
"The point is that our AI mirrors are nothing like neutral reflections of a shared human reality, but they are very potent indicators of how a small subset of humans have seen and valued the world, and the marks they have left on it."

"Today’s AI mirrors, particularly those developed for commercial purposes, are most often carelessly designed and used in ways that actually reinforce and deepen the dominant historical patterns of human valuing and acting that we already know to be unjust, unsustainable, and corrosive to our societies. The magnifying power of AI mirrors means that they have to be used with the expectation that they will amplify harmful patterns unless deliberately made to do otherwise. Because what will you get if you naively train an AI model on an unjust medical or financial or criminal justice system? An AI tool that calculates how to be even more efficient than people at delivering injustice."

"Through the algorithmic feedback loops operating in the digital media spaces that have come to function as the new public square, AI mirrors amplify and normalize our biases, reinforce our most polarizing opinions and most aggressive stances, and boost the visibility of our most uninformed “hot takes.” In doing so they reflect back to us images of human civic agency so distorted in their form that they not only shift the “Overton windows” of acceptable political conduct but make us lose our already fragile faith in the human capacity for political wisdom. The result is that actual political behavior drifts closer and closer to the originally distorted mean, producing as a self-fulfilling prophecy the very reality that our distorted mirrors predicted.

This is why it is so important—for purposes of both safety and transparency—that we know when our mirrors are distorting reality. There’s a very good reason that the distorting side mirror has a little warning engraved on it: “objects in mirror are closer than they appear.” In contrast, AI mirrors today are rarely tested rigorously to find the distortions they produce, and they almost always lack the safety and transparency signposts and guardrails we need. Until we demand these, it will be increasingly difficult to truly know ourselves. And when we can no longer know ourselves, we can no longer govern ourselves. In that moment, we will have surrendered our own agency, our collective human capacity for self-determination. Not because we won’t have it—but because we will not see it in the mirror."

"An AI model has nothing to say, only an instruction to add some statistical noise to bend an existing pattern in a new direction. It has no physical, emotional or intellectual experience of the world or self to express. It has nothing that needs to come out. As long as we recognize and value the difference between mechanical creation and expression, AI poses no threat to our creativity or growth. But there lies the problem, in light of the cultural and economic values that currently shape AI’s affordances. Increasingly, we ourselves are measured in terms of our ability to resemble our own mechanical mirrors. Whether it’s the pressure to produce your next album, or publish enough to get tenure, or film enough videos to keep your subscribers—our dominant values favor those who don’t get writer’s block, who don’t struggle to find the right words, or images, or notes, or movements, who never get caught up in the swirling drag of inexpressible meanings. The system has long rewarded creators who work like machines. Should we really be surprised that we finally just cut out the middleman and built creative machines?"

Chapter 6:
"I’ll say it again: AI is not the problem here. The problem is our unwillingness to step back from our tools to reevaluate the patterns they are reproducing—even the supposedly virtuous ones. Which of the widely celebrated virtues most likely to be reflected in our AI mirrors might in fact be traps? Which of them function as the moral equivalent of cement shoes, dragging us into the depths of old and unsustainable patterns, rather than freeing us to alter our familiar habits and values, or reconfigure them to our current needs?"