Autonomous AI agents

Most apocalyptic scenarios involve an AI acting as an autonomous agent, pursuing goals that conflict with human ones. Many people reject AI risk, saying that machines can’t have real goals or intentions. However, agency seems nebulous; and subtracting “real” agency from the scenario doesn’t seem to remove the risk.

An agent AI tries to do things in the real world. That may involve gathering information, building mental models, forming intentions based on desires or goals, reasoning and making plans, and taking action. Agency is what makes AI scary, for most people who do find it scary. Many people who don’t find AI scary believe it is impossible for machines to try to do things, or to have desires or intentions.

“Agency” is a nebulous and confused concept.1 It means “the capacity to take action,” but what is an action? How are actions different from other events or effects? Typical analyses define “action” in terms of “intention,” which turns out to be irreducibly metaphysical. Many explanations of agency also explicitly depend on consciousness, free will, or mental representations. We have no coherent theories of any of those.

People who are sure AI is not a threat usually have groundless metaphysical certainty that machines can’t have real intentions. Our own actions apparently flow from intentions, which flow from desires, which flow from subjective experiences which we like or dislike. Lacking subjective awareness, an AI is like an avalanche, not like an enemy: potentially dangerous, but usually comprehensible and controllable to an adequate extent. Contrariwise, people who worry that AI is a short-term threat often have a groundless certainty that making machines with intentions is easy.

What is real agency? How can we tell whether a system has it?

A white blood cell hunting and eating bacteria

Do white blood cells act on intentions? If you watch a video of one hunting down and killing bacteria, it sure looks like it!2 But intentions are usually analyzed as a particular sort of mental representation. Do white blood cells have mental representations?

Maybe agency is a matter of degree, not binary? Many things we intuitively count as actions don’t involve explicit intentions, or maybe any intentions at all. (“When an automobile’s stability control system detects loss of steering, it applies the brakes.”) We could make an ordered list of questionable cases, with agency increasing from thermostats to chimpanzees. Our intuitions about whether members of the list count as “taking actions” vary depending on the circumstances and on how they are described.

As with other mental attributes, “agency” seems to be a way we understand some systems, rather than an objective property of them. Arguments about whether something is really an agent are unhelpful. Rather: is it more useful to reason about it as a thing or as an agent? That may depend on what is at stake for a particular analysis. The best understanding may view the system both ways, and explain how the two views connect. When engineering complex systems, it is usually necessary to tack back and forth between thinking of a control circuit as taking actions and considering it in purely mechanical, causal terms.

Artificial intelligence, as a technical discipline, split off from cybernetics in the mid-1950s.

Cybernetics is a field concerned with purposive systems: the observed outcomes of actions are taken as inputs for further action in ways that support the pursuit and maintenance of particular conditions, or their disruption.3
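The thermostat, the lowest rung on the ladder of questionable cases, makes this loop concrete. What follows is only a minimal sketch, not anything taken from the cybernetics literature: the `Thermostat` class, the `simulate` function, and the toy room model with its constants are all hypothetical, chosen for illustration. Each observed temperature feeds the next on/off decision, which in turn changes the next observation.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """Hypothetical on/off controller with a small deadband around the setpoint."""
    setpoint: float
    deadband: float = 0.5
    heating: bool = False

    def decide(self, observed_temp: float) -> bool:
        # The observed outcome of earlier actions is the input to this one.
        if observed_temp < self.setpoint - self.deadband:
            self.heating = True
        elif observed_temp > self.setpoint + self.deadband:
            self.heating = False
        # Inside the deadband, keep doing whatever we were doing.
        return self.heating


def simulate(controller, start_temp, steps, outside_temp=5.0):
    """Run the loop against a toy room model (assumed constants, not physics)."""
    temp = start_temp
    history = []
    for _ in range(steps):
        heating = controller.decide(temp)
        # The heater adds a fixed amount of heat; the room leaks toward outside.
        temp += (1.0 if heating else 0.0) - 0.05 * (temp - outside_temp)
        history.append(temp)
    return history


if __name__ == "__main__":
    temps = simulate(Thermostat(setpoint=20.0), start_temp=10.0, steps=200)
    print(f"last few temperatures: {[round(t, 2) for t in temps[-5:]]}")
```

You can read this as “the thermostat tries to keep the room near 20 degrees,” or as a conditional and an arithmetic update with no trying in it at all. The code is the same under either description.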

The first thing cybernetic machines did was kill people. The field began during WWII with automatic weapon targeting systems. The most advanced were superhumanly-accurate radar-guided anti-aircraft guns and the control system for the V-2 long-range autonomous missile. These early successes were influential for both general cybernetics and its military applications after the war.

By the start of the Vietnam War, [a] new bomb computer was revolutionary in that the release command for the bomb was given by the computer, not the pilot; the pilot designated the target using the radar or other targeting system, then “consented” to release the weapon, and the computer then did so at a calculated “release point” some seconds later.4

Were those weapon systems really agents? Many people would say no—although cybernetic control theory, the field of autonomous purposive systems, created them. They aren’t agentish enough to count.

The US National Security Commission on Artificial Intelligence’s 2021 Report5 recommends spending $32bn per year on AI research to dramatically increase the agency of weapon systems:

Although U.S. weapons platforms have utilized autonomous functionalities for more than eight decades, AI technologies have the potential to enable novel, sophisticated offensive and defensive autonomous capabilities….

Existing DoD procedures are capable of ensuring that the United States will field safe and reliable AI-enabled and autonomous weapon systems…

There is little evidence that U.S. competitors have equivalent rigorous procedures to ensure their AI-enabled and autonomous weapon systems will be responsibly designed and lawfully used.

The Commission does not support a global prohibition of AI-enabled and autonomous weapon systems.

One essential difference between things and minds is that minds may be for or against you. If they are against you, they are endlessly persistent out of spite. If you survive an avalanche, you are done with it; it’s not going to come after you again when you’re not looking. And you can be sure about mere things; lightning does not strike from a clear sky. Psychological reasoning, on the other hand, never works very well. The person you thought was an ally may have been a deceitful enemy all along.

If a Terminator robot appears to purposefully chase you down, following a plan, reasoning about different means of transportation and fight strategies, with a goal of permanently disrupting your condition—do you care whether it’s really an agent, or if its seeming intentions are merely simulated?

Philosophers might care, but those of us concerned with AI risks shouldn’t. Our question is a pragmatic one: will this system harm people? We should address that with safety engineering, software quality methodology, and analysis of individual, social, and cultural effects—not philosophy.

  1. You can read the Stanford Encyclopedia of Philosophy article on “Agency” if you’d like to get good and confused.
  2. One such video is at
  3. Paraphrasing the opening few sentences of the cybernetics Wikipedia entry.
  4. Wikipedia, “Fire-control system.”
  5. Available at