Aaron's Blog
Pigeon Hour

Preparing for the Intelligence Explosion (paper readout and commentary)

In which I read and then briefly discuss a paper by Fin Moorhouse & Will MacAskill

Preparing for the Intelligence Explosion is a recent paper by Fin Moorhouse and Will MacAskill.

  • 00:00 - 1:58:04 is me reading the paper.

  • 1:58:05 - 2:26:06 is a string of random thoughts I have related to it.

I am well aware that I am not the world's most eloquent speaker (lol). This is also a bit of an experiment in getting myself to read something by reading it out loud. Maybe I'll do another episode like this (feel free to request papers or other things to read out, ideally a bit shorter than this one lol).


Below are my unfiltered, unedited, quarter-baked thoughts.

My unfiltered, unedited, quarter-baked thoughts

(Transcribed ~verbatim)

Okay, this is Aaron.

I'm in post-prod, as we say in the industry, and I will just spitball some random thoughts, and then I'm not even with my computer right now, so I don't even have the text in front of me.

I feel like my main takeaway is that the vibes debate ranges from "AI is as important as the internet, maybe" on the low end, to "AI is a big deal." But if you actually do the math, or the not-quite-math, approximately all of the variation is actually just between insane and insane to the power of insane. And I don't fully know what to do with that.

I guess, to put a bit more of a point on it, I'm not just talking about point estimates. It seems that even if you make quite conservative assumptions, it's quite overdetermined that there will be some kind of explosive technological progress unless something really changes. And that is just, yeah, that is just a big deal. It's not one that I think I've fully incorporated into my emotional worldview. I mean, I have it, I think, in part, but not to the degree that my intellect has.

So another thing is that one of the headline results, something that Will MacAskill, I think, wants to emphasize and did emphasize in the paper, is the century-in-a-decade meme. But if you actually read the paper, that is kind of a lower bound, unless something crazy happens. And, this is me editorializing right now.

So, I think something crazy could happen first, for example a nuclear war with China that destroys data centers and means that, you know, AI progress is significantly set back, or it's an unknown unknown. But the century in a decade is really, truly a lower bound. You need to be super pessimistic on all the in-model uncertainty. Obviously there's out-of-model uncertainty, but on the actual point estimates, however you combine the variables, whether you take arithmetic means over distributions or geometric means, you actually get something much, much faster than that.

So that is a 10x speedup, and that is, yeah, as I've said ten times, about as pessimistic as you can get. I don't actually have a good enough memory to remember exactly what the point estimate numbers are. I should go back and look.

So, chatting with Claude, it seems that there are actually a lot of different specific numbers and framings. One question you might ask is: okay, over the fastest-growing decade, in terms of technological progress or economic growth, in the next 10 decades, what will the peak average growth rate be? But there are a lot of different ways you can play with that to change it. Is it, oh, what's the average going to be over the next decade? What about this coming decade? What about before 2030? Are we talking about economic progress, or some less well-defined sense of technological and social progress?

But basically it seems the conservative scenario is that the intelligence explosion happens and, over some importantly long series of years, you get a 5x year-over-year increase. So not just a doubling every year: after two years you get a 25x expansion of AI labor, and then 125x after three years. And I need to look back. I think one thing they don't talk about specifically is... oh yeah, sorry.
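(Quick check on that compounding, just to make the numbers concrete; this is my own toy arithmetic, not a figure pulled from the paper's models.)

```python
# 5x year-over-year growth in effective AI labor, compounded.
# (Toy arithmetic of mine, not a number from the paper's models.)
for year in range(1, 4):
    print(f"after year {year}: {5 ** year}x the starting AI labor supply")
# after year 1: 5x, after year 2: 25x, after year 3: 125x
```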

They do talk about one important thing to emphasize (and as you can tell, I'm not the most eloquent person in the world): they talk significantly about pace and about limiting factors. But the third thing, the thing you might solve for if you know those two variables, is the length of time that such an explosion might take place across. And, just thinking out loud, that is something that, whether intentionally or otherwise, or me being dumb and missing it, I don't think they give a ton of attention to. And that's... yeah. I mean, my intuition is that that's approximately fine.

Does the length of the intelligence explosion matter, conditional on knowing the distribution of rates over, say, blocks of years? So we're not talking about seconds, we're not talking about weeks, I guess we could be talking about months, and we're not talking about multiple decades.

So we're talking about something in the realm of single-digit to double-digit numbers of years, maybe a fraction of a year: two-ish, three orders of magnitude of range. And so the question is, conditional on having a distribution of peak average growth rates for some block of time, and, sorry, backtracking, also conditional on having a distribution for the limiting factors, does it matter whether we're talking about two years, or ten years, or what?

So at what point do you stop scaling? Because we know the talking point that you can't have infinite growth in a finite world is true; it's just off by 1,000 orders of magnitude, or maybe 100. So there actually are genuine limiting factors, and they discuss this: at what point you might hit true limits on power consumption or whatever.

But yeah, just to recap this little mini-ramble: one thing the paper doesn't go over much is the length of time specifically, except insofar as that is implied by the distributions you have for peak growth rates and limiting factors.

So another thing, which wasn't in the paper but (I'm just spitballing) was in Will MacAskill's recent interview on the 80,000 Hours podcast with Rob Wiblin, about the world's most pressing problems and how you can use your career to solve them: I think Rob said he wishes the AI x-risk community hadn't been so tame or timid in terms of hedging, emphasizing uncertainty, saying, you know, there's a million ways this could be wrong, which is of course true. But I think the takeaway he was trying to get at was that, even ex ante, they should have been a little more straightforward.

And I actually kind of think there's a reasonable critique of this paper along those lines, which is that the century-in-a-decade meme is not a good approximation of the actual expectation. The expectation is something like a 100x to 1,000x speedup, not a 10x speedup, even as a reasonable conservative baseline; you have to be really within-model pessimistic to get down to the 10x point.

Another big thing to comment on is just the grand challenges. I've been saying for a while that my p(doom), as they say, is something in the 50% range. Maybe now it's 60% or something after reading this paper, up from 35% right after the Biden executive order. And what I mean by that, I think, is actually some sort of loose sense of: no, we don't actually solve all of these challenges.

So, one thing MacAskill and Moorhouse emphasize, in both the podcast I listened to and the paper, is that it's not just about AI control, it's not just about the alignment problem; you really have to get a lot of things right. I think this relates to other work MacAskill is on that I'm not super well acquainted with, but there's the question of how much you have to get right in order for the future to go well, and I actually think there are a lot of strands there. Like, I remember on the podcast with Rob they were talking in terms of the percentage value of the best outcome. I'm just thinking out loud here, but I'm not actually sure that's the right metric to go with.

It's a little bit like this: you can imagine we have the current set of possibilities, and then exogenously we get one extra future strand in the Everettian multiverse, and that single thread points to the future going a billion times better than it otherwise could. I feel like this should change approximately nothing, because you know it's not going to happen. But it does revise those numbers down: your estimate of the expected percentage of the best possible future gets revised down a billion-fold.

And so... no, I'm not actually sure whether this ends up cashing out in terms of what you should do; I'm just not smart enough to intuit that well. But I suspect that it might. That's really just an intuition, so yeah, I'm not sure.
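To make that concrete with some toy numbers (entirely my own illustration, nothing from the paper):

```python
# Toy illustration: adding one vanishingly unlikely branch that is a billion
# times better than anything else barely changes expected value, but it
# collapses "expected fraction of the best possible outcome" ~a billion-fold.
outcomes = [(0.5, 0.6), (0.3, 0.3), (0.2, 0.0)]  # (probability, value), made-up units

ev = sum(p * v for p, v in outcomes)
best = max(v for _, v in outcomes)
print(f"before: EV = {ev:.2f}, fraction of best = {ev / best:.2f}")

# Add a single essentially-unreachable Everett-style branch worth 1e9x the old best.
p_branch, v_branch = 1e-30, best * 1e9
ev_new = (1 - p_branch) * ev + p_branch * v_branch
print(f"after:  EV = {ev_new:.2f}, fraction of best = {ev_new / v_branch:.1e}")
```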

You know something that will never be said about me is that I am an extremely well organized and straightforward thinker. So it might be worth noting these audio messages are just random things that come to mind as I'm walking around basically a park. Also that's why the audio quality might be worse.

Oh yeah, getting back to what I was originally thinking about with the grand challenges and my p(doom): they just enumerate a bunch of things that, in my opinion, really do have to go right in order for some notion of the future to be good. And so there's just a conjunction issue (I forget if that's exactly the term), even if you're relatively optimistic on any one issue, and I kind of don't know if you should be.

Okay, so let me just list some of these off: AI takeover, highly destructive technologies, power-concentrating mechanisms, value lock-in mechanisms, AI agents and digital minds, space governance, new competitive pressures, epistemic disruption, abundance (that is, capturing the upside), and unknown unknowns. Now, it's not as clean a model as each of these being fully independent; it's much more complex than that. But it's also not as simple as saying, oh, if you have a 70% chance of getting each one right, you can just take that to the power of eight, or however many there are, and get the chance that they all go well, not least because they're overlapping and not independent, et cetera.
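Just to spell out the naive version of that arithmetic anyway (with the caveat, as above, that the challenges are neither independent nor equally weighted):

```python
# Naive conjunction arithmetic for the grand challenges (illustration only;
# the challenges overlap and are not independent, so this is just the
# "take 70% to the power of n" intuition made explicit).
p_each = 0.70
for n in (8, 10):
    print(f"{n} challenges at {p_each:.0%} each -> {p_each ** n:.1%} chance all go well")
# 8 challenges  -> ~5.8%
# 10 challenges -> ~2.8%
```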

But I feel like this is sort of a more explicit version of the thing I've been getting at with my relatively high, I think, p(doom) numbers, even though a significant amount of my approximately 50% or 60% probability mass is basically not on a classic Yudkowskian takeover, a single event where everybody drops dead from a pandemic that Claude 3.9 designed. It's basically that, yeah, there's a shit-ton of stuff to get right.

We could actually imagine a thriving sort of civilization on Earth, even one that ends wild animal suffering. And then there's the question of, okay, what else is going on, and also what's happening elsewhere in the universe, and shit does get very weird. You get into weird counterfactuals, alternative evolutionary trajectories, aliens with different values.

So yeah, what else is there? Oh yeah, I'm particularly nervous about getting AI consciousness right. I think the most likely way we do get it right, or do end up treating AIs well, is basically by accident. That's just not a great position to be in.

I haven't really thought specifically about the likelihood that we, as a group, basically treat AIs well or don't.

I don't know. I want to say 50-50 or something, but it's also correlated with other things. Maybe that's too optimistic. I'm really just doing the opposite of what I dislike, which is when people refuse to put numbers on things. But I could change that in five minutes.

Yeah, another thing that is a little bit outside the scope of the paper, but obviously relevant: okay, so what do we do about this? You know, there's the "Pause AI" meme, and I'm actually basically in favor of pausing frontier AI scaling. But quite possibly the more plausible, and frankly more important, thing is that you actually take the growth rate down: from 1,000x year over year, or say approximately 10x year over year, which gives you a 10^10 increase over a decade in terms of technological progress, to approximately doubling every year, which gets you to 2^10, which is 1,000x or something. Or maybe you want to take it down to 30% growth; I don't know what 1.3 to the 10th is, maybe that's roughly 10x or something. No, that can't be right.
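Checking the arithmetic I was fumbling there, with my own round numbers: it turns out 1.3 to the 10th is about 14, so "roughly 10x per decade" was about right after all.

```python
# What different annual growth multipliers compound to over a decade.
# (My own round numbers, just to check the figures above.)
for annual in (1.3, 2.0, 10.0, 1000.0):
    print(f"{annual}x per year -> ~{annual ** 10:.3g}x over 10 years")
# 1.3x/yr   -> ~13.8x  (so ~30% annual growth is a roughly 10x-ish decade)
# 2x/yr     -> ~1,024x (the doubling-every-year case, ~1000x per decade)
# 10x/yr    -> ~1e10x
# 1000x/yr  -> ~1e30x
```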

But anyway, I do think an important question is: okay, what is the fastest rate of change we can get where we still muddle through? And 10x doesn't seem crazy to me. And I feel like this is actually just the default alignment plan: that we muddle through, and by some combination of effort, luck, restraint, and policy (which aren't independent), we merely see 30% annual growth or something. Or things just go really right by accident. Now, that's not exactly a plan, right? But that is kind of where my optimism comes from, or my optimistic probability mass.

Yeah, another important thing in the paper to reflect on is punting stuff to future AI. I don't think I have great original thoughts on this, so I'll just point listeners to Joe Carlsmith, of whom, first of all, I'm a big fan, even though he's wrong about some stuff, not everything. Second of all, I think he is specifically talking about, at least in a recent post he wrote, the more fine-grained dynamics of how we can best use AI to do AI alignment research. And that's only one of the grand challenges in MacAskill and Moorhouse's schema. But I will just point out that there's a lot of good thinking there.

Like, one important takeaway is that you want to pause or slow down, maximally, in the range where AI is not yet maximally dangerous, or ideally not dangerous at all, but can significantly help with AI alignment, space governance, epistemics, et cetera. And that is something I feel like I could have gotten to from first principles, but didn't. So hats on to me instead of hats off.

Yeah, also, in terms of the degree to which this might manifest in my real life: there's a chance I'll back out, but I'm not planning to (I'm just going to sleep on it one more night). I'm planning to basically make a bet that some single calendar year by 2030 will have a growth rate above 5%. I'm planning to do this at something like a $1,000 magnitude and two-to-one odds against me, so I lose $1,000 if I lose and gain $500 if I win. And I'll just throw out there that 5% doesn't directly address the question of whether we get faster economic growth than anything since the 1970s, and it doesn't directly address the question of four percent versus seven percent; it isn't a number taken directly from the paper.
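For what it's worth, the implied break-even probability on that bet (just my own arithmetic, not anything in the paper):

```python
# Bet structure as described: lose $1,000 if I'm wrong, gain $500 if I'm right.
loss_if_wrong = 1000
gain_if_right = 500
breakeven_p = loss_if_wrong / (loss_if_wrong + gain_if_right)
print(f"bet is positive-EV for me only if my credence exceeds {breakeven_p:.0%}")
# i.e. I need to think a >5% growth year by 2030 is more likely than ~67%
```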

But yeah, I guess I'm reiterating here, but at a very broad qualitative level, a lot of things have to go very specifically quote-unquote wrong (or right, depending on your opinion) in order for there not to be a world-historical speedup in technological growth, and presumably also economic growth. I will also note that some of these random pauses are me being self-conscious as I get passed by joggers. I would like to think they think I'm a cool business analyst guy talking in these jargony terms. They probably don't.

Now, totally separate from the substance, there's the question of whether I should do another one of these. This was actually quite a long paper.

The recording is two hours, and I only did one very small section over again, literally 30 seconds. But I basically read it in multiple parts, because I'm quite ADHD, in like 7 or 17 different sections, so that when you stitch them together you get two hours over the course of two days. It was a truly non-trivial amount of time, energy, and focus for those two days, but I actually kind of like the idea of forcing myself to read this way, because God knows that sometimes I won't otherwise; I'm a little bit too Twitter-brained. But if you read something out loud, to some extent you have to absorb at least some of it, at least for me; you can't just skim totally over it. So maybe I'll do this more, you know. I think this is 55 or 56 pages, so pretty long, with some diagrams or whatever.

So call it 50 pages of text, which means you're looking at 25 pages an hour. Yeah, it wouldn't be super difficult for me to do that in, say, three parts, I think. So maybe I should find more important PDFs to read; there really are quite a lot of PDFs, and Google Docs for that matter. I don't know. It's not a very enlightened point, but I don't know, does anybody read this stuff?

Will MacAskill is kind of famous, and this is a quite important, quite well-written paper. But what about all the stuff that's just random PDFs out there, written by some, not a random guy, but a random smart guy, which is maybe one-tenth as important, which is still extraordinarily important or whatever? Does anybody read those PDFs?

In the same park earlier today I saw a wild turkey, and I'm still hoping I see another one; it was kind of cool, though kind of from afar, and I haven't seen one yet. Another thing to spitball about, which I'm just reminded of, is only tangentially related to the paper.

It's Leopold Aschenbrenner's (if that's how you pronounce it), I forget the name, but his big paper that was making the rounds on Twitter about how we need to ensure that the democratic bloc of countries has a commanding lead over authoritarian countries, particularly China, when it comes to developing artificial intelligence. And this is weird because there's a tension: okay, if you have a far enough lead, then, you know, you can take more safety measures, but that also sort of implies speeding up or whatever. I would just go on the record here, for people listening to this: maybe it's optimistic, but I basically agree with Leopold. And maybe "tangential" is putting it too weakly; this is certainly in the realm of MacAskill's grand challenges.

Um, in some sense, yeah, maybe this is misplaced optimism, and in terms of governance and stuff I'm not super confident, I'm just spitballing, but I do have a little bit of a sense of MacAskill maybe working a bit too much from a toy model where you have arbitrary countries and see what pops out of that, whereas what you actually have is the US and China, and the really idiosyncratic dynamics that can pop out of that. And so some of my optimistic probability mass is basically on, either by choice or by accident, getting something approximating Leopold's ideal outcome, where the US just continues to be dominant. That doesn't totally obviate the governance grand-challenge stuff, especially when you get into the long term, but it at least diminishes or partially obviates some of it. Yeah, it's true that if there are arbitrary countries there are these competitive dynamics and this and that.

I don't know. I mean, I do think part of the reason why there's a vibes, not quite a disagreement, but a vibes separation or whatever, is that MacAskill's thinking is not bound into this quite intuitive near-term frame of "this is the world in 2025" and, you know, there's... Yeah, this is truly a peak ADHD ramble, but it's just quite interesting what dynamics pop out of the perhaps neglected intersection between quote-unquote longtermism and actually very near-term timelines. You have the genuine intellectual foundations of longtermism from, what, Toby Ord et al. five years ago or something, but we're not just talking about the long term anymore, and yet Will MacAskill is still actually quote-unquote doing longtermism in this paper, and I think that's good.

Like, I think just on the merits longtermism is more or less correct, but maybe it actually binds you a little too much into a frame of abstract models, because you're so radically uncertain about what the future is going to be like, whereas here we're talking about developments that are potentially within the next two years, quite plausibly the next five. So yeah, maybe abstract models just aren't actually the best way to go about doing intellectual work here.

I guess this is beating a dead horse to some extent, but man, it is striking how far what I think are the merits, or what objectively are the contents, of this paper sit from the current Overton window. The Overton window is quite a bit wider than it was in 2022, for example, but I genuinely don't know what percentage of members of Congress would basically nod along to this, as opposed to objecting or just thinking it's crazy or something. That's quite a bit of uncertainty that I have.

Yeah, maybe this has come across already, but another thing, just in terms of vibes: I think the paper is just quite high quality. This is from a single reading, so I'm not going to be able to justify it in a totally legit way, but I was listening to an audiobook the other day, rated four stars or whatever, and I was like, no, this is slop. It's written by a smart person or whatever, but it's slop. I did not get the sense that this paper is slop. Needless to say, if anybody had to be convinced that Will MacAskill is not writing slop, you heard it here first.

Here's a really unrelated point. I'm not a very good public speaker, as you can tell; I think I'm somewhere between a B, maybe a B-plus at best, but generally a B-minus or C-minus in terms of speaking quality. And yet I've actually been involved with this other, screwworm-related project.

And I think the answer is just: yeah, for a lot of kind of random, niche stuff, no one else is actually going to do it, so even if you kind of suck you can make a contribution anyway, especially insofar as it involves punting some other effort off to the actually competent people. So that's a vibe. I'm pretty sure nobody's listening at this point, but if you are and you want to recommend papers for me to read, you can do that. No promises, but just FYI.

Yeah, another thing, again kind of related in the same sense that Aschenbrenner's piece is related: I really don't think Donald Trump should be president during the intelligence explosion. This is not quite a hot take in my circles, but I think it needs to be said, because there's an ask you might have, which is an indefinite AI pause, and there's an ask you might have that's delaying timelines by two to three years, and these are substantially different. In two to three years, you can do a lot of non-frontier work. The products that Anthropic and OpenAI and Google DeepMind would make if they paused frontier AI scaling and put off an intelligence explosion would be quite significant for society; maybe not transformative, but quite impressive. It wouldn't feel like an AI pause.

And also, I think most of the smart people, yes, even the tech bros, are basically not fans of Donald Trump. So yeah, here's my ask, as a rando on Pigeon Hour, the most famous podcast in the world: we should not have recursively self-improving AI or explosive AI-driven technological growth before the 2028 presidential election. Maybe I'll leave it there.

This is my last hot take. I don't know if this is tractable; maybe it's not. But, except from me on Twitter, I haven't seen it said so explicitly. Usually it's a general notion of a quote-unquote AI pause. And, you know, a lot of things are just dynamics or randomness, and there are competitive pressures, but the federal government of the United States is quite important, and the president has quite a bit of power. I would be surprised if less than one percentage point worth of probability mass comes down to the quality of the US federal government; maybe it's five percentage points or something. If you take EV seriously, that's really fucking important. So yeah, I don't know: Sam, Dario, Ilya, friends, if you're listening to this, please wait until 2028. [Cue the outro music for Pigeon Hour: doo-doo-doo-doo-doo.]
