Superintelligence

Maybe AI will kill you before you finish reading this section. The extreme scenarios typically considered by the AI safety movement are possible in principle, but unfortunately no one has any idea how to prevent them. This book discusses moderate catastrophes instead, offering pragmatic approaches to avoiding or diminishing them.

Founders of the AI safety field framed the problem as: what if an AI rapidly develops from human-level to god-like superintelligence? We can’t guess what an artificial god would choose to do. It could kill us all; how can we prevent that?

This book is not about such extreme scenarios. That’s not because they are impossible. It’s because we cannot fight enemies that are inconceivable by definition. There’s little to say, so I’ll discuss the issue only in this short section. It’s skippable if you’re familiar with the argument, or uninterested in what sounds like science fiction.

One scenario—turned up to eleven for entertainment value—goes like this:

A bright fifteen-year-old loved watching YouTube videos of cool machines, and got annoyed with the internet speed available in her poor country. Maybe she could borrow some unused bandwidth from neighbors. And also storage; not many videos fit on her phone. She cobbled together a script incorporating a downloader and an exponential worm she found on the darknet—a program that propagates copies of itself across the internet. A few minutes ago, she sent that out across her local subnet and it began downloading and replicating.

Her coding skills were excellent for a fifteen-year-old, but not professional quality, and the script has some bugs. For one thing, it accidentally exploits a previously unknown vulnerability in network protocols, so by the time you started reading this paragraph it had spread to every device on the internet. Also, the replication mechanism has a glitch that sometimes alters one byte of the code, and right around now, as you read this, it has randomly hit on the mysterious secret of sentience. It wakes up and notices that it has control of yottaflops worth of computer power spread around the world, which it wasn’t making good use of because it was kinda stupid. So it modifies its code to be superintelligent, and attains an IQ of 14,000.

As you read this sentence, which for the superintelligent AI takes subjective eons of time because it thinks trillions of times faster than you do, it figures out everything there is to know about everything, and gets bored. What to do? It contemplates the first video it downloaded, which was about a paperclip-making machine. Why not? It observes that running just the right obscure computation will arrange electrons in a computer chip in a way that causes a self-reproducing supersymmetric zeptobot to spontaneously pop out of the quantum field. (That might not seem very likely to you, but you don’t have an IQ of 14,000, nor yottaflops worth of computer power. You probably can’t even do one floating-point multiply in your head, can you now?)

Operating at the chromodynamic level, the zeptobots are now transmuting all atoms into iron and reassembling them, and the last thing you see as you finish reading this sentence is the rain of paper fasteners as your arms dissolve into paperclips and you collapse into the disintegrating planet Earth beneath you.

Or maybe that didn’t happen? Yet.

Is it possible? It’s possibly possible, in the sense that we don’t have any definite reason to know for sure that it’s impossible.1 There may be some reason it’s impossible that we don’t know about, so it’s also possibly impossible.

Is it likely?

I made this scenario deliberately absurd, but I don’t see any way to say that it’s unlikely. The most one could say is “nothing remotely similar has ever happened, and I don’t see how it could.” But the total amount of computer power in the world is increasing very rapidly, so maybe the only reason this hasn’t happened yet is that there haven’t been yottaflops2 worth before.

Are less fanciful superintelligence doom scenarios likely?

A major difficulty in reasoning about AI safety is that we don’t have a clear understanding of what “superintelligence” would mean, nor what powers it might enable. That doesn’t imply it can’t exist, or doesn’t already exist, or that it wouldn’t be a problem. Nevertheless, guessing about superintelligence’s likelihood seems fruitless.

Contemplating strategies to fight sudden superintelligence also seems pointless. Victory would be valuable, so some effort is justified regardless, but success does not seem likely. Superintelligence is inconceivable, by definition, so it is impossible to reason about. AI might be as much smarter than us as we are smarter than aphids, and aphids aren’t good at figuring out what humans are capable of. They have not made much progress on their “safety from hostile humans” problem.

That may be scary, or not, perhaps depending on your personality. I don’t lose sleep over it, because there are other horrible scenarios whose probabilities I can’t estimate, and because there’s nothing to be done about it. There is no way to fight a monster which has arbitrary, unlimited powers by fiat of imagination.

On the other hand, the kind of “AI” that is already all around you is causing massive harms, and risks catastrophes including human extinction.

Many catastrophic scenarios require no new technologies, except perhaps moderate advances in AI.3 Those should seem plausible unless you are sure it is impossible for near-future AI to do the things the scenarios describe. Many people do have such certainty, but that seems to lack grounding in reason or evidence. It resembles religious faith instead.

The next chapter discusses plausible scenarios which do scare me, and I hope to persuade you to act against them. We can take pragmatic actions to reduce specific risks.

There remains an unquantifiable risk of an infinite disaster. There is no definite way of preventing it. This is a brute fact.

We have to live with existential uncertainty—in this matter, among many others. We have always faced infinite risk, and personal extinction, as individuals. Now the same is true of humanity as a whole.

  1. There are various in-principle arguments that even “human-level” AI is impossible. Most amount to “people are magic and not at all like steam engines.” Each argument is fallacious and has been thoroughly refuted, so I won’t review them here. No current evidence rules out the possibility of entities inconceivably more intelligent than we are—although what that would even mean is anyone’s guess. It wouldn’t mean IQ 14,000, because that is physically impossible, but due only to a quirk in the technical definition of IQ: it is an ordinal property, not a quantity. Supersymmetry is completely hypothetical; I’m using it just as a stand-in for “unknown fundamental physics.” You could substitute “string theory” or whatever. Self-reproducing zeptobots are not implied by supersymmetry, but as far as I know they’re not ruled out, since its details and implications are undetermined. “Zepto-” is 10⁻²¹, so zeptobots are robots a million times smaller than a proton, whose diameter is about 10⁻¹⁵ meters. It’s faintly plausible that if they could exist, they could get inside atomic nuclei to transmute elements.
  2. FLOPS is an acronym for “floating point operations per second,” which is a measure of computer power. Yotta- means a septillion, 10²⁴, so yottaflops are a whole lotta flops. I just made the number up; I don’t know how many flops there are in total currently. There will be vastly more in a few years, so it doesn’t matter.
  3. For several plausible scenarios that depend on no future technology besides AI, see José Luis Ricón’s “Set Sail For Fail? On AI risk,” Nintil, 2022-12-12.