It’s a mistake to think that human-like agency is the only dangerous kind. That assumption risks overlooking AIs that cause agent-like harms in inhuman ways.
There are many agency-like phenomena, ones that we might or might not count as real agency, which could be dangerous in AI systems. Several catastrophic scenarios considered in the next chapter involve one such peculiar form of agency, foreshadowed here.
These near-term scenarios of AI agency are overlooked in AI ethics. That community is concerned mainly with human agents misusing AI to cause harms, and does not take seriously the possibility of AI’s own agency causing harms unintended by any human. The same scenarios are overlooked in AI safety. That community is concerned mainly with mind-like AI developing human-like agency, and has not explored the large conceptual space of agencies dissimilar to human minds.1
I’ll discuss a few example types of non-mind-like agency here. I’ll concentrate on distributed agency, in which groups of things act collectively, because that’s one feature of the next chapter’s disaster scenarios.
Quasi-autonomous machines that unambiguously lack subjective experience can take actions with large, unpredictable, harmful effects.2 Nick Bostrom’s book Superintelligence gives the 2010 flash crash as an example. Automated stock-trading systems erased roughly a trillion dollars of market value within minutes, due to unanticipated positive feedback loops. Those trading systems arguably had intentions (to avoid losing money) that they acted on (by selling falling stocks), and were arguably agents acting on behalf of the trading firms that ran them.
Each individual sell-bot would have saved its operator billions of dollars—if it weren’t for all the others. Each saw prices falling and sold in response, which drove prices down a bit further, triggering further bot actions. (“Bot,” originally short for “robot,” means an autonomous software agent—not necessarily a particularly smart one.) The bots accomplished the opposite of what they—or their operators—intended. Their collective action was disastrous.
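To make the feedback loop concrete, here’s a minimal sketch in Python. All the numbers are invented, and it models no real trading system; it only illustrates the cascade dynamic: each bot “intends” to avoid losses by selling once the price falls past its threshold, and each sale nudges the price lower, which can trigger other bots.

```python
# A toy cascade, not a model of any real market: each bot sells once the
# price falls below its personal panic threshold, and each sale pushes the
# price down slightly, which may push other bots past their thresholds.
# All parameters are invented for illustration.

import random

random.seed(0)

N_BOTS = 100
SALE_IMPACT = 0.004    # fractional price drop caused by one bot's sale
INITIAL_SHOCK = 0.02   # small external dip that starts things off

price = 100.0
# Each bot panics at a slightly different loss from the starting price.
thresholds = [100.0 * (1 - random.uniform(0.01, 0.10)) for _ in range(N_BOTS)]
has_sold = [False] * N_BOTS

price *= 1 - INITIAL_SHOCK  # the triggering dip

while True:
    triggered = [i for i in range(N_BOTS)
                 if not has_sold[i] and price < thresholds[i]]
    if not triggered:
        break                       # no threshold crossed; cascade is over
    for i in triggered:
        has_sold[i] = True          # "avoid losing money": sell now
        price *= 1 - SALE_IMPACT    # ...which drives the price lower still

print(f"final price: {price:.2f}, bots that sold: {sum(has_sold)}")
```

Run with a single bot, the sketch yields at most one small dip. Run with a hundred, the same dip cascades into a fall of roughly a third of the price: each bot’s locally sensible action produces a collective outcome none of them intended.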
More generally, agents, actions, and their likely effects cannot be analyzed in isolation. They must be understood in context, including the intentions and actions of other agents.
Biological systems exhibit various non-mind-like forms of agency. Let’s say you are fighting off a staph infection. From the linguistic form, it seems that you are the agent: you are taking an action, namely fighting it off. But in the early stages, you may be entirely unaware that you are doing that, and if you are successful, you may never know it happened. Is it reasonable to say that you fought it off? Maybe it wasn’t you, it was your immune system, which is a separate, independent agent. But the immune system itself is an immensely complex collection of disparate parts—bone marrow, glands, the lymphatic circulatory system, many different specialized types of cells, and a slew of immune-specific molecules.
Is “the immune system” an agent? Is it even a thing? It has no central controller. All the parts cooperate and communicate with each other using diverse signaling mechanisms.
Biologists routinely describe each of the parts, including even small molecules, as “acting on” other parts, and on pathogens, and on the rest of the body. That would make them agents (“things capable of acting”). Are biologists speaking metaphorically or literally when they speak of cells or molecules acting? If there were a single, reasonably crisp definition of “action,” this question would have an answer—but there isn’t one.
The immune system is agent-like in persistently attacking its perceived enemies, adaptively deploying multiple strategies and weapons as it learns how a pathogen behaves and discovers weak points.
Is that description metaphorical or literal? (What implications would either answer to that question have?) The U.S. military report Distributed Kill Chains describes a new strategy, “mosaic warfare.” It’s based on an extended, explicit analogy to the immune system, coordinating webs of partially-autonomous lethal artificial agency.3
The immune system example suggests that agency is not just a matter of degree: it comes in diverse varieties. The prototypical conception treats human minds as coherent, unitary agents with definite beliefs, desires, and intentions, taking actions with describable effects. This is somewhat inaccurate as an explanation of human agency,4 and doesn’t apply to the immune system at all. The immune system isn’t coherent or unitary; ascribing beliefs, desires, or intentions to it may seem a metaphorical stretch too far; and it is so densely causally complex that the consequences of its parts’ actions are often so widely distributed as to be ineffable.
There may be unenumerably many varieties of agency. We should be particularly wary of artificial agents whose agency is highly dissimilar to that of people, because we may easily overlook it, and we cannot easily understand it. We should not assume we will remain safe up until AI develops some particular mental ability we might imagine to be necessary for agency.
I will discuss two more forms of distributed agency: those of institutions and ideologies. Again, the agency of AI may be more similar to these than to that of people.
As with the immune system, it’s sometimes best to consider an institution as a unitary agent, and sometimes to see it as a collection of parts: people, but also policies, procedures, local social norms, records, buildings and equipment, propaganda and reputation. A contemporary institution is also a cyborg: software, data, and computer networks are intimately woven through everything it does. Institutions differ from the immune system in that some of their parts—their human members—have their own beliefs, desires, and intentions.
Institutions also have their own beliefs, desires, and intentions, which may differ from those of their members. Whatever its mission statement claims, an institution commonly comes, gradually, to act mainly to preserve and increase its own safety, reputation, wealth, power, and longevity. Institutions (and ideologies) outlast humans, and may relentlessly pursue objectives over centuries. Evangelical religions are examples.
Sometimes, most members of an institution are genuinely dedicated to its nominal, noble mission. In that case, perhaps no member supports the institution’s actual, covert, self-interested motivations, which are revealed only in its activities. Alternatively, some members may not care about the supposed mission, but they mostly also don’t care about the institution. They seek rewards, advancement, and power for themselves, with no concern for how that may affect the institution. Yet the inexorable logic of professional administration promulgates plans and performance criteria that serve inhuman institutional interests—instead of either the supposed mission, or the personal goals of members.
Ideologies are agents made of memes: viral ideas that parasitize human minds, take partial control over them, and direct us to infect other minds with slightly-altered replicas of themselves.
Ideologies, like the immune system, are agent-like in persistently attacking perceived enemies, adaptively deploying multiple strategies and weapons as they learn how their hosts and competitors behave. Successful ideologies coopt the intelligence of their hosts to innovate new constituent memes and new tactics for spreading them.
Ideologies evolve under selective pressure. Those that most successfully infect people, and most effectively twist their motivations, spread the fastest and displace others. That strips them down to relentless pursuit of a single objective: maximizing the number of infected brains.
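The selection logic can be caricatured in a few lines of code. Here is a toy replicator model in Python; the transmission rates are invented, and nothing here describes any real ideology. It shows only that when variants compete for a finite pool of minds, the most transmissible one eventually takes the whole pool, however small its edge.

```python
# Toy replicator dynamics, illustrating the selection argument above and
# nothing more: variants with higher transmission rates grow faster, and
# because minds are finite the shares must compete, so the most
# transmissible variant displaces the rest. All rates are invented.

transmission_rates = [1.00, 1.02, 1.05, 1.10]  # per-generation growth factors
shares = [0.25, 0.25, 0.25, 0.25]              # initial shares of infected minds

for generation in range(200):
    shares = [s * r for s, r in zip(shares, transmission_rates)]
    total = sum(shares)
    shares = [s / total for s in shares]       # finite pool: shares sum to 1

for rate, share in zip(transmission_rates, shares):
    print(f"rate {rate:.2f}: final share {share:.4f}")
```

After two hundred generations the highest-rate variant holds essentially the entire pool. Selection of this kind amplifies transmissibility and nothing else, which is the sense in which it strips ideologies down to a single objective.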
Ideologies can infect and coopt institutions, not just individual people. Then both the explicit mission, and the visible actions, of an institution may realign to serve the ideology. Conversely, institutions can create and propagate infectious ideologies as means to their own noble or selfish ends.
Institutions and ideologies can be more, or less, aligned with human flourishing. For peace, prosperity, and survival, the world depends absolutely on systematic, more-or-less rational institutions, based on ideological principles and procedures. We rely on government agencies, courts, universities, medicine, and critical manufacturing and service industries. On the other hand, Nazism and communism, and the states they controlled, killed tens of millions of people. The guns were fired by individual people, but most would rather have been home playing cards and drinking beer.
Some people say institutions and ideologies aren’t really agents.5 They’re just groups of people. A collection is not a thing. Individuals have real goals and intentions and can act. Institutions and ideologies don’t and can’t. Some people say they don’t really exist, even. They are just mental representations in human brains.
Metaphysical arguments are unhelpful here. Regarding institutions and ideologies as autonomous agents may be a useful and true-enough model for many purposes.
- 1. Mike Travers’ “Patterns of refactored agency” (Ribbonfarm, November 27, 2012) is a valuable catalog. See also his collected writings on this topic at hyperphor.com/ammdi/agency.
- 2. Some plausible scenarios also show that AI could be catastrophic while definitely not having agency. Gwern Branwen argues, further, that non-agent AIs are likely to lead to agent AIs because agency is useful for most things, in “Why Tool AIs Want to Be Agent AIs,” gwern.net, 2016-09-07–2018-08-28.
- 3. O’Donoughue et al., Distributed Kill Chains: Drawing Insights for Mosaic Warfare from the Immune System and from the Navy, 2021. Commissioned by the Defense Advanced Research Projects Agency and written at the RAND Corporation, a defense research contractor.
- 4. See “Acting on the truth” and “Aspects of reasonableness” in my In the Cells of the Eggplant.
- 5. Some reject the “meme” meme as vague and unscientific. I find it useful as a way of looking and understanding, not as a Truth.