Scary AI when?

For much of the AI safety community, the central question has been “when will it happen?!” Answering that is futile: we don’t have a coherent description of what “it” is, much less how “it” would come about. Fortunately, a prediction wouldn’t be useful anyway. An AI apocalypse is possible, so we should try to avert it.

We have no meaningful way of estimating the probability or timeline for catastrophes. We don’t adequately understand the operation even of existing systems, nor their effects, and cannot make predictions about technologies that don’t exist yet. We cannot address the range of possibilities with a uniform first-principles solution (“alignment”). We cannot rule out extreme scenarios of sudden domination or destruction by incomprehensible omnipotent superintelligence, but we also can’t do anything about that.

Scary AI is imagined as the output of a research process that has been highly unpredictable. (In both directions: despite confident predictions, there was hardly any progress in AI for two decades starting around 1990; there’s been dramatic, unexpected progress for the past few years.) The research community has often had a paradoxical anti-scientific attitude, actively resisting creating the sorts of understanding that would make prediction more reliable.1

Some in the AI safety field believe that research already under way will soon produce Scary AI: either as an emergent property of existing technologies, or as an added engineered feature that is a small extension of current algorithms. That might happen in as little as five years (and then we will all die horribly). “Are language models Scary?” explains reasons to think this is unlikely (although it can’t be ruled out). Other AI safety researchers believe that this qualitatively different sort of AI will require qualitatively different algorithms, which haven’t been invented yet, so it will be at least a decade (possibly several) before we all die horribly. I have no opinion about this, since it’s a speculation about unknown unknowns.

Since there’s no explanation of what would count as Scary AI, predicting “when it will happen” is foolish.

If “it” is an AI “waking up” into mindness, we have no clue what that would even mean, much less how it might happen, so trying to guess when seems pointless.

If “it” is a machine better than humans at all tasks, we’d need to know which are most difficult for machines. Ideas about that have been consistently wrong. Usually people expect the tasks that are most difficult for people to be the most difficult for machines, despite centuries of experience that show the opposite. Humans are terrible at systematic rationality, so problems which demand it are frequently posed as challenges for AI. Computers are extremely good at rationality—they are mathematical logic made flesh!—so solving puzzles and playing board games are trivial for AI. Household chores (making breakfast and cleaning up after it) are far out of reach for AI now. Are those the most difficult challenges? We have no idea.

If “it” is a sudden acceleration of science and technology, it’s probably more useful to investigate what that would involve concretely, before asking when or how AI might be involved. (I discuss that later.)

A typical prediction is that “it” will most likely happen in twenty or thirty years, or at any rate probably not for a decade, but almost certainly within a century.

This seems intuitively reasonable to me; and I think my intuition is worthless. Also, no one else’s estimate seems to have any basis in technical specifics;2 we’re all just saying “seems reasonable I guess.” We might think “this is a very hard technical problem, but not inherently impossible; and once a very hard technical problem is identified, it usually gets solved within a century—often only a few decades, or maybe it takes about a decade if we’re lucky.” But there are exceptions. Also, it’s not clear that categorizing it as “a very hard technical problem” is accurate. It might not be very hard, if approached differently. It might not be a technical problem at all: we don’t have a technical definition of what Scary AI is, and some people think there are metaphysical problems prior to any technical ones.

Nick Bostrom is more cynical:

Two decades is a sweet spot for prognosticators of radical change: near enough to be attention-grabbing and relevant, yet far enough to make it possible to suppose that a string of breakthroughs, currently only vaguely imaginable, might by then have occurred… Twenty years may also be close to the typical duration remaining of a forecaster’s career, bounding the reputational risk of a bold prediction.3

AI safety organizations also want to convince everyone that AI safety is important. Stressing that an apocalypse is likely in your personal lifetime is necessary to get the message across. “This year” is perfectly possible, but most people won’t buy that;4 and “by 2100” sounds sufficiently remote that the public will ignore it.

In any case, if the timing of an AI apocalypse is unknowable—as the broad spread of predictions also suggests—then what does it matter when exactly it is most probable? What would or should or could we do differently if we knew the time of maximum danger was 2020 or 2030 or 2040 or 2070? Why not do whatever it is now?5

Wanting to quantify uncertainty doesn’t make the attempt feasible or useful. AI risk is in the domain of Knightian uncertainty, of unknown unknowns. In such domains, probabilistic reasoning is unhelpful.6 The best approaches are to seek understanding through cautious exploration toward possible unknown unknowns; to prepare against known, unquantified but plausible risks; to avoid dramatic actions that could make a risky situation still more turbulent and confusing; and to act tentatively on such concrete understanding of possible improvements as one does have.

We should, therefore, work to better understand possible consequences of existing and near-future technologies. We can also take pragmatic actions to mitigate their predictable risks. Those also may protect against unforeseen, probably more distant AI scenarios, and (as a fortuitous side benefit) against unaligned humans. Further, we can try to slow or block any risky development until we’re reasonably confident it is a good idea.


  1. Discussed in the “Backpropaganda” section of Gradient Dissent.
  2. Trying to guess when computers will get as many flops as the human brain may be an exception, but I think it’s inherently irrelevant, and also attempts don’t narrow down the timing much. I discuss this “biological anchors” approach elsewhere.
  3. Nick Bostrom, Superintelligence.
  4. A deadly global pandemic caused by a bat virus is a tired made-for-TV science fiction movie plot, and therefore can’t happen this year.
  5. In principle, if we expect near-term doom, a Hail Mary pass—looking for a low probability magic bullet—makes sense. If there’s more time, we might prioritize approaches that are unlikely to pay off in the next decade, but which have higher probability of success in the long run. But as Scott Alexander says, “It’s not like there’s some vast set of promising 30-year research programs and some other set of promising 5-year research programs that have to be triaged against each other.” That’s in his “Biological Anchors: A Trick That Might Or Might Not Work.”
  6. See Part One of In the Cells of the Eggplant on ways probabilistic reasoning can be misleading; particularly “The probability of green cheese” on reasoning about unique events and unbounded unknowns.