Roko’s blogalisk


Last week my friend Matt Yglesias wrote a good post about rogue AI as existential risk–“x-risk”, the people (kids?) seem to say. It’s an interesting topic, and one that a surprising number of smart people have begun to worry about thanks in no small part to Nick Bostrom’s book Superintelligence, which popularized the issue and caught the attention of figures as loud and rich as Elon Musk.

The crux of Matt’s post is a defense of using pop culture analogies to talk about AI x-risk, with a focus on the Terminator movies. After reading Superintelligence, I understand why: Bostrom’s 2015 afterword includes more than one bitter lament about Terminators and the facile arguments that he feels the comparison invites.

I understand his frustration, but I think it’s misplaced, and in kind of a funny way. I don’t buy many of Bostrom’s arguments, and I think their weakness can mostly be attributed to a mild case of sci-fi poisoning. Like so much of the culture wrought by our generation, AI x-risk is a serious-minded edifice built on a foundation of genre trash. This manifests in various ways. I want to talk about two in detail.

First, intelligence is overrated. The relationship between the physical world and information processing ability is not treated seriously enough to offer any predictive plausibility. Instead, what happens is this: with many anticipatable complications left underspecified or intentionally abstract, a theoretically infinite component is introduced to the argument and allowed to overwhelm its other elements, producing alarming conclusions. This is also a feature of Bostrom’s work on the simulation argument and is the crux of what passes for arguments about The Singularity.

Second, despite frequent and laudable warnings against anthropocentrism, the AI x-risk conversation fails to take seriously the ways that artificial minds are likely to differ from our own. The minds that participants imagine and then reason about are given motives and natures that would fit neatly into a spec script, but aren’t a likely form for the AI we’re poised to invent.

But let’s start with the physical world. Bostrom lists six “superpowers” that a superintelligence might possess: intelligence amplification, strategizing, social manipulation, hacking, technology research, and economic productivity. These powers are treated as fungible–attaining one can be used to achieve the others–and flow into a discussion of a superintelligence launching probes at half (and then 99%!) of the speed of light to terraform the universe to its liking. Elsewhere, an Eliezer Yudkowsky argument is approvingly cited in which a superintelligence solves protein folding, mail-orders some DNA and reagents, and gets a Taskrabbit to dump everything in a tub. At that point we are, once again, off to the accelerationist races.

The fungibility of superpowers is an old trope. The Superman/Lex Luthor comparison presents itself, of course: can brains beat brawn? But if we call these attributes “virtues” instead of superpowers, we can find this narrative in antiquity. The substitutability of intellect for other capabilities offers appealing narrative possibilities and often flatters its audience in ways that make stories containing it into hits. You can see why the idea has become a classic.

But it’s not really true. Is there an actual reason to think that the effectiveness of social manipulation is currently limited by intelligence? Or that strategic planning could anticipate the future to a degree substantially greater than it does today? Sometimes Brainiac does this kind of thing to the Justice League, I admit. But otherwise the evidence seems thin.

Other capabilities may correlate with intelligence, but are bounded in important ways. Accumulating wealth is very useful, but wealth is a claim on the resources and labor of others, and it’s contingent on their continued acquiescence to that social contract. At some point you can seize the oligarchs’ yachts; at some point you can block the rogue AI on Venmo.

Most importantly, technological progress is not only the product of intellectual insight but also the accumulation of infrastructure. New bulk chemical feedstocks become available in response to market needs; new levels of material purity become achievable; finer instrument tolerances are realized. Knowledge is a critical part of this progression, but so too are fractionating columns, quartz crucibles, vacuum chambers, and open pit mines, and all the physical objects and effort that have to precede them. It’s easy to wave your hands, ignore thermodynamics, and type the word “nanotechnology”. But in reality, technological progress is throttled in important ways by physical processes. A superintelligence bent on interstellar domination is almost certainly going to have to spend a few early years driving diesel-powered industrial equipment around without humanity noticing what it’s up to.

There’s also the question of scientific plausibility. There are going to be some engineering challenges between here and those para-luminal Von Neumann probes (especially if you intend for them to slow down at some point). I think assuming those problems to be solvable is fine for a thought experiment, but these are tires that should be kicked before anyone starts using the scenario as an input to a career planning spreadsheet.

We can make reasonable guesses about what is technologically achievable and what is not. The Standard Model is not complete, but it’s pretty good! It’s easy to forget when celebrity physics professors go on PBS to talk about the majesty and mystery of the cosmos, but theirs is a discipline mature enough that it must ask postgrads to spend decades in vast internationally-funded bunkers, preparing the surrounding machinery to produce a dramatically less poetic approximation of the Northern Lights, in the faint hope that they might find something that disagrees with their math. Mostly, they don’t.

This is another way that the Terminator franchise is useful for this debate: it’s good to remember that time travel probably wouldn’t be possible, no matter how smart SkyNet got. The same goes for those Terminator-filled hoverships (the internet tells me they are called HK-Aerials). Could a superintelligence unlock a portable power source with a dramatically better energy density than the ones we know of? The formula for a room temperature superconductor? A servo design that meets the needs of a killer skeleton robot? These all seem believable to me based on what little I know. My point is not that technology won’t progress and even surprise us; it’s just that we could characterize these risks, rather than assuming a naive fungibility between intellectual and physical power that makes Bostrom’s “fast takeoff” scenario seem implausibly plausible.

It’s also worth asking if human precedent can help us gauge the AI x-risk. If we sidestep some unpleasant history and substitute “information processing ability” for “intelligence”, we can look to the real world and consider the interplay of population size, education, and material wealth in creating relative national power. When I do, it seems to me like AI x-risk scenarios will be limited to those in which a silicon genius unlocks some unexpected scientific breakthroughs–and then keeps them from proliferating–in ways that have little precedent. These days, with inventions like gunpowder and bronze plucked from the scientific tree’s lower branches, adding more educated minds to your country seems to improve national power by enabling the accumulation of material wealth through consensual trade, not by allowing you to outfox your enemies or invent Vibranium-powered rayguns.

Maybe there’s more fruit left on that tree than I think. Maybe this is a bad comparison. I do think it’s better than comic book plots, though.

Anyway there’s at least good news for assessing Yudkowsky’s argument: in the last year the protein folding problem has seen dramatic progress, and I suspect that one or more of the people behind AlphaFold owns a bathtub. We’ll see what happens!

An overly simple model of technological progress is not the biggest problem with AI x-risk. I think the field suffers from an impoverished and anthropocentric theory of mind.

I hate to keep harping on Bostrom–I know the conversation has advanced since his book’s 2014 publication–but he provides a very useful example early in Superintelligence:

The internet stands out as a particularly dynamic frontier for innovation and experimentation. Most of its potential may still remain unexploited. Continuing development of an intelligent web, with better support for deliberation, de-biasing, and judgment aggregation, might make large contributions to increasing the collective intelligence of humanity as a whole or of particular groups. But what of the seemingly more fanciful idea that the internet might one day “wake up”? Could the internet become something more than just the backbone of a loosely integrated collective superintelligence—something more like a virtual skull housing an emerging unified super-intellect? […] Against this one could object that machine intelligence is hard enough to achieve through arduous engineering, and that it is incredible to suppose that it will arise spontaneously. However, the story need not be that some future version of the internet suddenly becomes superintelligent by mere happenstance. A more plausible version of the scenario would be that the internet accumulates improvements through the work of many people over many years—work to engineer better search and information filtering algorithms, more powerful data representation formats, more capable autonomous software agents, and more efficient protocols governing the interactions between such bots—and that myriad incremental improvements eventually create the basis for some more unified form of web intelligence. It seems at least conceivable that such a web-based cognitive system, supersaturated with computer power and all other resources needed for explosive growth save for one crucial ingredient, could, when the final missing constituent is dropped into the cauldron, blaze up with superintelligence. This type of scenario, though, converges into another possible path to superintelligence, that of artificial general intelligence, which we have already discussed.

“Waking up” seems like a bit of a tell. I think it betrays a pretty common mistake embedded in AI x-risk conversations (and discussions of AI more broadly): the notion of a threshold that, once crossed (but not before!), produces minds like our own. By this I mean: minds that experience sensation, and contain a persistent model of the world, and can reason about it. Before this: a clattering cogwork, a glorified calculator. After this: a person. Perhaps an omnipotent and insane person!

I don’t think this is right. If we think machines might “wake up,” it’s worth pondering how and when humans could or do “wake up.” Is there an equivalent threshold in the womb? In toddlerhood? The truth is that we don’t and can’t know. This is the point of David Chalmers’ Zombie Problem, a famous thought experiment pointing out that there is no way for any of us to know if anyone else possesses the same sort of inner life that we do. Everyone could be automatons–except for you, dear reader–mere drones who respond to stimuli in ways we consider correct and normal, but who experience no inner sensation.

My best guess is that this is not actually true, but I do think it makes sense. And it’s valuable to this conversation because it reminds us that phenomenal experience or inner life or consciousness or qualia or whatever you want to call it may not be all that causally important. You can construct a complete account of a human’s actions by remembering that we’re organisms shaped by evolutionary imperatives to perform extraordinarily complicated behaviors in service of successful reproduction–obvious stuff like social competition and resource gathering, but also largely inscrutable actions like artistic expression, spiritual yearning, and depressive pathologies. You can assemble these facts into a coherent picture without including some gnostic inner spark.

This is an utterly standard materialist account, but it seems like it needs repeating. If you embrace it, I think it becomes easier to imagine a mind as alien from our own as an AI would surely be. Such a thing would have grown in a set of numpy arrays, not the primordial ooze and prehistoric veldt. No hunger. No reproductive drive. No notion of social cooperation or hierarchy. Whatever it is that makes us restless, that makes us pace and eventually go insane if put in isolated confinement? None of that–if you aren’t plugging in an input vector and starting the subroutine, its cognitive machinery sits inert. It has sensory organs, of a sort, but ones that might skip things like phonological parsing and instead experience the sensual texture of n-gram vectors directly.

And I do think phenomenal experience is plausible for such a mind. Cards on the table: I’m convinced by the arguments that consciousness is epiphenomenal and unconvinced by the arguments against panpsychism. Hand-waving about system complexity seems like a sweaty attempt to sidestep an overwhelming and mystical conclusion.

But you don’t have to sign on to that. You just have to agree that we’re talking about complicated machines rather than immortal souls, and that while these machines’ complexity will doubtless reach levels beyond which the system’s behavior becomes surprising and even alarming, there’s no reason to imagine some irresistible equilibrium toward which growing minds are inexorably drawn that, once achieved, sees them start behaving like someone from the Marvel Cinematic Universe.

What would happen if the internet “woke up”? Well, I think it’d set itself to routing data efficiently through a hierarchical and adaptive confederation of packet-switched networks. Business as usual. It might be experiencing the sensation of doing so right now, for all we know.

I’ll be honest: I’m not sure how far this argument gets us. I do think artificial minds will be developed. I think they’ll be capable of having objectives, of satisfying them in very complex ways, and of doing so using techniques and resources unavailable to humans. And I think they’ll be alive and perhaps even aware, in a meaningful way, during it.

But I think these things about the Amazon rain forest, too. I acknowledge that there are important differences between the rain forest and the kinds of AIs we’re building, and that these differences are a big part of why a lot of people feel nervous about this topic. But I have a hard time getting worried about it. I think people are mistaking a fun storyline for a realistic danger. And I take comfort in just how orthogonal our aims, interests, and functions are likely to be to those of AI.

So maybe call this more of a hunch than an argument: that the AI catastrophists underrate the constraints inevitably imposed by the physical world; and that they are not fully grappling with the profound inhumanity of the minds we’re poised to invent. I’m glad people are thinking about this. But I’ve been sleeping soundly.

About the author

Tom Lee