EPISODE 2026-06-23

AI:AM LIVE — June 23, 2026 — Self-Improving GPU Kernels and Europe's AI Sovereignty: Bing Xu and Michiel Bakker

The open tracked an unusually quiet news day through a markets lens — rumors that GPT-5.6 was pulled back amid the model-release freeze and that Gemini 3.5 Pro is lagging, a 6% semiconductor selloff as SK Hynix overtook Samsung for the first time in 27 years, Anthropic's first memory-chip deal with Micron, and an extended debate on whether AI's leverage dynamics make the boom a 'too big to fail' bubble. Bing Xu, founder & CEO of INT21 (co-creator of MXNet and AITemplate, co-author of the original GAN paper, founder of NVIDIA-acquired HippoML), then made the case that self-improvement should target the infrastructure, not the model: his PTX Kernel Factory points agent swarms at the GPU ISA below CUDA, matching expert libraries like QuACK on mature kernels and posting up to 59% speedups on newer ones — and, he argued, deepening NVIDIA's moat rather than eroding it, because the evolutionary loop depends on NVIDIA's profiling ecosystem. MIT/DeepMind researcher Michiel Bakker followed on Europe 2031, his viral month-by-month scenario of Europe sleepwalking into AI dependence — a fictional 2028 export-control beat that materialized within a day of publishing when the US restricted Anthropic's models for foreign nationals — laying out why regulation requires capability first, why the nuclear-umbrella analogy fails for an economic technology, and where Europe's real leverage (the ASML/IMEC semiconductor ecosystem, a middle-power coalition) still lies. The hosts closed on the geopolitics of AI data and a tease of David Duvenaud and 'gradual disempowerment' the next morning.


The June 23 show opened on a rare quiet news day and turned to the markets for signal. Nathan relayed leaker-circle rumors that GPT-5.6 had been pulled back from a planned launch amid the ongoing model-release freeze, and that Gemini 3.5 Pro was falling short of the bar set by recent frontier models. Prakash's markets update flagged a 6% semiconductor selloff and a symbolic milestone — SK Hynix overtaking Samsung as Korea's most valuable company for the first time in 27 years — alongside Anthropic's first memory-chip partnership with Micron and an extended debate on whether AI's leverage dynamics, especially among leveraged Korean retail investors, make the current boom a 'too big to fail' bubble or a fundamentally backstopped expansion.

The two interviews shared a spine: where self-improving AI actually bites, and who controls the compute underneath it. Bing Xu (INT21) argued the highest-leverage self-improvement loop targets the GPU infrastructure rather than the model — his PTX Kernel Factory evolves NVIDIA PTX kernels that match or beat expert-written libraries, and he made the counterintuitive case that this raises NVIDIA's moat. Michiel Bakker (MIT / Google DeepMind) brought Europe 2031, his viral scenario of European AI dependence, and argued Europe still holds real leverage in the semiconductor ecosystem and a middle-power coalition — if it acts while the window is open. The hosts closed on the geopolitics of AI data and previewed David Duvenaud on 'gradual disempowerment.'

The rundown

3:40 · Opening · 30 min
Opening: GPT-5.6 Rumors, the Chip Selloff, and the 'Too Big to Fail' Bubble Question
On a quiet news day, Nathan and Prakash read the market: rumors that GPT-5.6 was pulled back amid the release freeze and that Gemini 3.5 Pro is lagging, a 6% semiconductor selloff as SK Hynix overtook Samsung for the first time in 27 years, Anthropic's first memory-chip deal with Micron, and an extended debate on whether AI's leverage dynamics make the boom 'too big to fail.'
As aired
Nathan Labenz and Prakash opened the June 23 show by noting an unusually quiet 24-hour news cycle — a brief lull after recent dramatic AI policy events. Nathan relayed rumors from leaker Discord circles suggesting that GPT-5.6 had been pulled back from a planned launch this week, likely because of uncertainty over the ongoing export-control and model-release freeze stemming from the administration's directives. The same rumor mill indicated that Gemini 3.5 Pro was falling short of the new benchmark set by models like Abel, meaning Google would need more time before that release.
Prakash then delivered his regular markets update, flagging notable turbulence in the semiconductor sector: the semiconductor index was down 6%, driven in part by a significant reshuffling in South Korea where SK Hynix had overtaken Samsung Electronics as the country's largest company by market cap for the first time in 27 years — a move interpreted by some as a bubble signal given heavy retail leverage. Micron was set to report earnings after hours, with whisper numbers around $27 billion for the quarter (~$110B annualized run rate). The previous day, Anthropic had announced a long-term design partnership with Micron — notable as Anthropic's first memory-chip deal, with Micron investing in Anthropic. NVIDIA, by contrast, was pulling back from circular investment deals as its stock had been essentially flat for six months. SpaceX was also discussed: down ~20% from a peak of ~$200/share to ~$160, with a major unlock of previously illiquid shares coming within 30 days.
The hosts then had an extended exchange on financial fragility and whether the AI investment boom echoes historical bubbles. Nathan raised the "too big to fail" framing — referencing a conversation with Dean Ball — and asked how much shock-absorption capacity exists if leveraged retail investors (particularly in Korea) trigger contagion. Prakash argued that near-term cash flows at major AI companies are solid through 2028, but the 3-to-5-year window carries real uncertainty, especially around algorithmic efficiency gains that could reduce the energy and hardware moats underlying current valuations. He introduced his concept of the "capital cycle" — where companies must deploy buybacks fast enough to outrun the leverage in the system — and observed that AI safety advocates have conspicuously avoided pairing "pause" proposals with acknowledgment of the market disruption that would follow. Nathan closed by noting that persistently high GPU rental prices (e.g., A100 hourly rates rising despite newer generations) might be the real fundamental backstop, while Prakash urged viewers with risk capital to lean into AI and tech stocks.
Key moments
I haven't seen a single AI safety person say, let's do a pause and somehow a market downturn won't happen.
Prakash · 21:25
I asked Dean Ball this on the long podcast with him last week — are we already in a spot where AI is too big to fail? And he basically said, yeah, I think that's a very legitimate worry.
Nathan Labenz · 12:26
Being a technologist and being on the AGI-build, it looks completely different. It looks like one of the turns in history.
Prakash · 29:23
What we covered
GPT-5.6 reportedly pulled back; Gemini 3.5 Pro said to be lagging. Nathan relayed leaker-circle rumors that GPT-5.6 had been pulled from a planned launch this week — likely tangled in the uncertainty around the administration's export-control and model-release freeze — and that Gemini 3.5 Pro was falling short of the new benchmark bar, pushing Google's release timeline back. Both were framed as unconfirmed rumor, not reporting.
A 6% chip selloff and a 27-year first: SK Hynix overtakes Samsung. Prakash flagged a 6% drop in the semiconductor index and a symbolic milestone — SK Hynix passing Samsung Electronics as Korea's most valuable company for the first time in 27 years — which some read as a bubble signal given heavy retail leverage. Micron was set to report after the bell with whisper numbers around $27B for the quarter (~$110B annualized).
Anthropic's first memory-chip deal — a long-term partnership with Micron. The prior day Anthropic announced a long-term design partnership with Micron, notable as its first memory-chip deal, with Micron investing in Anthropic. By contrast, NVIDIA's stock had been flat for six months as it pulled back from circular investment deals, and SpaceX was down ~20% ahead of a large unlock of previously illiquid shares.
Is the AI boom 'too big to fail' — or fundamentally backstopped? Nathan raised the 'too big to fail' framing from his Dean Ball conversation and asked how much shock-absorption exists if leveraged retail investors trigger contagion. Prakash argued near-term cash flows are solid through 2028 but the 3–5-year window is uncertain, introduced his 'capital cycle' idea (buybacks must outrun system leverage), and noted AI-safety 'pause' advocates rarely acknowledge the market disruption a pause would cause. Nathan countered that persistently high GPU rental prices may be the real fundamental backstop.
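The leverage point is easier to weigh with the arithmetic written out. The following is a minimal sketch, not a model of any real product: it assumes a single-period move with no financing costs, rebalancing, or forced liquidation, and all function names are illustrative. The inputs are the figures quoted in the transcript (a 12% overnight drop, 3x leverage, a $27B quarter).

```python
def leveraged_equity_loss(price_drop: float, leverage: float) -> float:
    """Fraction of the investor's own equity lost after a one-period price drop.

    Simplifying assumption: loss scales linearly with leverage and is capped
    at 100% of equity (no financing costs, no forced selling).
    """
    if not 0.0 <= price_drop <= 1.0:
        raise ValueError("price_drop must be a fraction between 0 and 1")
    if leverage < 1.0:
        raise ValueError("leverage must be at least 1x")
    return min(1.0, price_drop * leverage)


def wipeout_drop(leverage: float) -> float:
    """Price drop that erases all equity under the same linear assumption."""
    if leverage < 1.0:
        raise ValueError("leverage must be at least 1x")
    return 1.0 / leverage


def annualized_run_rate(quarterly: float) -> float:
    """Naive annualization: four times the quarterly figure."""
    return 4.0 * quarterly


if __name__ == "__main__":
    print(f"Equity lost at 3x on a 12% drop: {leveraged_equity_loss(0.12, 3.0):.0%}")
    print(f"Drop that wipes out a 3x position: {wipeout_drop(3.0):.1%}")
    print(f"Run rate from a $27B quarter: ${annualized_run_rate(27.0):.0f}B")
```

Under these assumptions a 12% drop costs a 3x-leveraged holder 36% of their equity, a drop of about 33% erases the position entirely, and a $27B quarter annualizes to $108B, which is the roughly $110B run rate cited on air. Real leveraged positions can fare worse once margin calls force selling into the decline.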
Full transcript
Lightly edited · timestamps jump to YouTube
3:40
Prakash: Good morning. It is Tuesday, June 23, 9 AM. Nathan, good morning.
3:47
Nathan Labenz: Good morning, Prakash. How are you?
3:49
Prakash: I am very good. We were just chatting about this, but it's actually been a rather quiet 24 hours. One of the few periods of time when we're like, you know, overnight, nothing dramatic happened, and it was just a normal business day.
4:10
Nathan Labenz: Yeah. We're hanging out in the shadow of recent dramatic events, but nothing super dramatic in the last day or two. It does seem — I mean, how much can you believe these rumors? But the sort of latest from the leaks Discord is that everybody just kind of feels like they're in a bit of a holding pattern right now. And supposedly GPT-5.6 was to be launched this week, but now will be held back potentially because they don't know what the status is gonna be or if they'll be allowed to. I mean, obviously they have lines into the administration, but I could easily imagine a situation where it's like, yeah, hold off until we get this figured out before you launch the next thing, because that's just gonna complicate everything. And we don't wanna have to yank you, but we also don't feel like we can play favorites, or let you go when we're not letting them go.
And from the same source, there is the idea that Gemini 3.5 Pro — the much-awaited — is just not up to standard. Not up to the new standard set by the likes of Abel. So again, in the rumor mill, we're gonna have to wait for that a little bit longer too. So we'll see how long this kind of summer doldrums period goes. I would bet it's not gonna go too long before something notable will happen. But for the moment, we have a little bit of a chance to breathe.
5:54
Prakash: Indeed. I'm gonna do a quick markets exercise where I will just quickly go over what happened in the markets overnight. So what has essentially happened is that we are seeing a small minor meltdown. The semiconductor index is down 6%, S&P 500 down about 1%. Nasdaq was down a couple of percent, now it's down 1.5%. For people who are not actually trading, what actually happened is very interesting. Basically, in Korea, you have Samsung Electronics, which has been for the last 27 years the largest company by market cap in Korea. And then you have SK Hynix, which over the last three years has, as of yesterday, overtaken Samsung. And this was seen as a top-of-the-bubble moment. Korea and Koreans and Korean retail investors are heavily leveraged. People are taking out mortgages on their homes to put money into the market.
And today is a big day in the memory space because Micron — the only American firm of the oligopoly in memory — is going to announce its earnings after hours today. The expectations are off the charts. There have been rumors that they're gonna come in at about $27 billion for the quarter. That would put them at a run rate of about $110 billion for the year, which makes them, I think, the fourth or fifth most profitable company in the United States. And they're only valued at $1.2 trillion. They could easily go to $2, $2.2, $2.5 trillion. And so we're at this moment where you may be seeing these memory firms start to overtake market champions like NVIDIA or Google.
Yesterday, Anthropic announced a deal with Micron — a long-term design partnership for memory. This is significant because Anthropic, to this point, was signing deals with hyperscalers. This is the first, I think, memory chip deal, and Micron is investing in Anthropic. Anthropic is buying from Micron — the usual circular trade that we've seen. It's interesting because NVIDIA has backed out of these deals now. NVIDIA no longer wants to do circular deals because the market has not been rewarding NVIDIA. The stock has basically been flat for about six months. Jensen is very, very frustrated. He's starting to change tactics because he wants the stock to move. Fifty percent of the cash flow is devoted to buybacks.
So he's really pushing the stock. Jensen needs to keep the whole story going. The last time the Koreans went down even like 5%, he immediately announced a strategic partnership with SK Hynix. So I would expect Jensen to come in as the plunge protection team, saving all the retail investors in South Korea somewhere in the next few days. SK Hynix and Samsung were down 12% overnight. These are trillion-dollar stocks at this point. Going down 12% is $100–$120 billion evaporating into thin air. People get margin called. There are people who are doing 3x leveraged on this. So if a stock goes down 12%, these people are losing like 50%. So it's significant.
SpaceX has come down as well. SpaceX peaked at about $200. It's now down to $160. Down 20% — that's a bear market in the stock. Only 5% of the stock is floating right now. In the next 25 to 30 days, 25% becomes open to trading. Did Elon price the stock correctly? Are there enough buyers to maintain this $1.5 trillion valuation? Remains to be seen.
And that's basically the roundup of what happened on the markets. We have a lot of speculators now. There's a lot of people who are putting all of their money on this thing. There are temporary moves up and down, but overall it's been very positive. It's the greatest bull market in history. And perhaps this is the way that everyone else outside Silicon Valley and outside of the AI bubble gets to participate. It becomes this kind of overall great endeavor of humanity expressed by capital providers and investors putting all of their money and effort and their accumulated wealth into this thing. And every time you put a dollar into investment, it pulls forward the future by a dollar. So we are seeing this massive investment, and that massive investment itself is a sign of and a cause of acceleration.
11:40
Nathan Labenz: Yeah. There's obviously been a ton of talk over time about the possibility of an AI bubble, and of course that can have multiple meanings. I don't know a lot about this — I don't know just how broad this process of taking mortgages out on South Korean family homes and throwing it all in on a leveraged basis really is. But it does strike me that if you think Jensen has to come in and play market rescue on a 5% dip, that's a pretty fragile situation.
I guess if I have one expectation it's high consumer surplus. If I have two expectations, it might be high volatility. And so I do wonder if this is going to lead to a sort of unwinding on the financial side sooner rather than later, or a government backstop. I mean, I actually just asked Dean Ball this on the long podcast with him last week — are we already in a spot where AI is too big to fail? And he basically said, yeah, I think that's a very legitimate worry. He was kind of like, I don't wanna say that AI companies should be positioning themselves for this at all — on the contrary, I don't wanna see it happen. But, yeah, it's gonna be really tough if all of a sudden there's some sort of cascading contagion happening and this great final flourishing project of humanity is in jeopardy on a financial level. Then, yeah, the government probably just comes in and monetizes it some way or another.
But this kind of makes it feel like that could be happening sooner rather than later. I mean, a $100 billion — we've become so desensitized. I remember when those kinds of numbers were first tossed around back in the financial crisis in 2008, and it was like, holy moly, we're talking tens, hundreds of billions of dollars, this is total insanity. And now that's like a round for a foundation model company. So on one hand it's like, sure, $100 billion isn't what it used to be. On the other hand, there hasn't been that much inflation at the retail level, so it's still a lot.
And I just wonder how much can Jensen backstop? How much can other private actors absorb in terms of a dip? And if everybody's just super leveraged, maybe the smart money strategy — and again, I don't trade, so I'm not smart money — but it feels like there might be a winner out there somewhere in kind of betting on one of these tail events to happen sooner rather than later. That says nothing about the trajectory of models or the overall impact of AI in the zoomed-out big picture. But it just strikes me that this is not that big of a blip in the overall stock chart for it to be the kind of thing where the biggest company in the world feels like they need to come in and do something about it. And it would be what — at $100 billion, that is one-fiftieth, or 2%, of NVIDIA's market cap. So it's a backstop they could provide, but they can't do it that many times either. They only have so many financial bullets for this sort of maneuver. How fragile do you think this whole thing is financially?
16:07
Prakash: So I think there are varying degrees of fragility. And you start off from cash flows and near term. I think until 2028, everyone has the cash flows covered. You won't see a company going bust because it ordered too many things and can't pay for it. That's not gonna happen in the two-year mark. So what we are really kind of trying to figure out right now is the three, four, or five year mark. And that is also the place where you have this AGI thing, which could happen. 2028 is kind of when a number of the companies have said they are aiming for a recursively self-improving AI. It has struck me that beyond that RSI point, it is very difficult for anyone to make predictions.
I was talking to a friend of mine, and he was very bullish on SpaceX. He's AGI-build, and he expects SpaceX to be able to sell launch capacity to the AGI because SpaceX is gonna be the only entity with the physical capacity at that point. He had this vision that all the software gets solved and software doesn't have that much value, so the SpaceX stuff is gonna have value. But my question is: if you think RSI is gonna happen, if you think AGI is gonna happen, why wouldn't you think that we will get a transformers-level innovation every month or every week? Why wouldn't you think that as those innovations pile up, your need for energy use is gonna decline on a near-term basis? You're gonna get much more efficient models just based on algorithms.
And you know, in the limit you should be able to approach at least human levels of compute efficiency, which is running an Einstein-level brain on 40 watts. We are so far away from that right now. So at the very least, if you think all the software innovation and idea stuff gets figured out, you should be able to approach that. And if you approach that, what value does SpaceX have in the near term? The AGI is not gonna be capped on launch capacity. It's gonna be able to generate human-level brains at 40 watts.
The net of this is that the fragility is people panicking because they're leveraged — and they should be panicking, because there is leverage in the system. But the balance on the other side is these companies willing to put their cash flow into buying back shares. Jensen is putting 50% of cash flow into buybacks. Micron is gonna have to do that. All of these other firms are gonna have to do that. And as they do that, they reduce the number of shares in the market, and the stock price keeps going up. So they have to keep the stock price going up faster than what the leveraged speculators are expecting. It's an acceleration curve on the capital side too. And this is what I call the capital cycle — it is another cycle which is driving the acceleration process, and I think it's actually the cycle that matters most to policymakers right now. It's also the cycle that AI safety people are the least concerned with. I haven't seen a single AI safety person say, let's do a pause and somehow a market downturn won't happen.
21:52
Nathan Labenz: Yeah. Well, I can definitely tell you in sort of historical AI safety discourse — and I think this ship has sailed — but there was a time when people were like, we better get this under control before the whole economy depends on it working. And now we kind of are in that spot. So that opportunity's been missed. But I do think a lot of AI safety people would be willing to bite the bullet on a market downturn to get a pause. They maybe don't wanna advertise that too much as part of the package they're advocating for, for obvious reasons. Just like, if you wanna make AI even less popular, go out and tell everybody that all you need to do is adopt a vegetarian diet, and then you can have all the AI you want — that'll get the pitchforks out faster than they were already coming.
I do think the willingness to bite the bullet is there among many people, but it's a tough message to lead with. And it does seem like, more than anything else, that will be the thing that will force the administration's hand on letting Fable come back online and letting the other companies launch their next models.
Another thing I don't understand super well — I had a really interesting front row seat to the 2008 financial crisis. In that scenario, there was basically a sleight of hand, like fraud at the center of everything. Fraud is maybe a little strong — I don't know that the people doing it by and large understood what they were doing in the macro sense. But there was an alchemy to it that really didn't pass scrutiny in the end. We're gonna make all these kind of low-quality loans, package them up, and somehow that'll make them good. It just never really fundamentally made sense, and past a certain point, it was probably always going to end up in some painful unwinding.
In this case, it feels like a decent number of similar dynamics are there — certain asset classes being bid up, increasing amounts of leverage in the system, there might be some fragility. But is there something about this time around that kind of auto-stabilizes the whole thing in a way that didn't happen in previous bubbles? I think when you look at the cost per hour of an A100 in today's world, you can at least tell yourself some story that the fundamentals must be there. If old chips cost more to rent now than they used to, despite there being multiple new generations available, then somehow that's the ultimate backstop — the willingness to pay hourly for the chips to do tasks. And as long as that is there, and it doesn't seem like that is gonna unwind anytime soon, then maybe we can actually surf this wave all the way through the Singularity.
What do you think? Is there some — I'm unclear on whether the phenomenon like these leveraged Korean retail investors create enough of a shock potential that it could tip the whole market into a bad equilibrium? Or does the fact that the A100 price is still high kind of anchor everything, and yeah, we could sort of absorb whatever frothiness may pop up on top?
26:10
Prakash: So before I answer that question, this is the Micron stock chart. And the last time Micron was this high —
26:19
Nathan Labenz: I can point out a few other graphs that look kind of similar to that.
26:23
Prakash: Yeah. There's this $99.09. That was where Micron was at its last high. And it took all the way until the AI bubble started in 2023, and things started picking up, where they finally recovered their stock price. And now they're finally beyond that. So it's taken them 26, 27 years to get back to where they were.
I think in any upmarket, you always have companies which are forced to lie and defraud because they're pulling forward the future. Some of it becomes fake it till you make it. Some of it is managing the news story, the narrative. They're making some deals which may not actually deliver, but it's positive for the news flow. They need to raise capital, so they need to tell a good story.
That's Oracle deciding to invest in Abilene, Texas with OpenAI. You start off small, you announce a big thing. They announced with Donald Trump that they're gonna put a trillion dollars into this — they didn't have a trillion dollars. That was a fraud — an out-and-out narrative fraud. But then after that, they went and put in like $20 billion, started building out Abilene, Texas, and then watched for demand to come in. And as the demand came in, they increased. So there's this whole process of narrative to reality that takes place. And sometimes things collapse during the narrative phase, and then you're like, oh, you lied. Yeah, Larry Ellison lied with Trump there — we're gonna put in a trillion dollars this year. No, it's taken them three or four years.
But they do eventually, when things are good and they see positive reactions, put the money in. So I think sometimes when things collapse, we have this feeling that you guys lied, but that's just marketing at the end of the day. And if things had worked out a little bit differently, they would have gone ahead with it.
As a financial investor, I would say this absolutely echoes every single bubble in history. If I did not know the technology, and if I was not AGI-build — which most of the East Coast guys are not — I would say this is a bubble and these guys are defrauding you. And that is so clear as an East Coast financial investor because this is what every single bubble in history has looked like.
Being a technologist and being on the AGI-build, it looks completely different. It looks like one of the turns in history. And in finance, some of the great fortunes have been made by spotting these turns. This is like George Soros and breaking the back of the Bank of England — you can kind of start to spot the turn. And what has happened in the last 15 years is that a lot of people on the East Coast did not pick up the tech upturn. There are all these people who kept the 2008 collapse in mind and did not invest in Apple, did not invest in Google. And they've been in index funds — these are the guys providing you with alpha. Like, you get alpha as a tech investor because all these other people are stuck in index funds and they're just buying all the crap on the market. There are only maybe 10 companies worth investing in, and then they equally invest in every company. As a tech investor, you could just drop out the 490 companies and invest in the top 10, and you'll be fine.
This is what Chase Coleman at Tiger Global has done to great success. A lot of these guys have made money in the last 10 years. They just invested in like five or six big tech stocks and sat on it for ages. I know a hedge fund investor who only invested in Nvidia five years ago, just hangs on to Nvidia, makes all his returns and fund fees just from hanging on to Nvidia. I can do that as a retail investor and not pay the 2 and 20. Someone is paying this guy 2 and 20 to have bought Nvidia five years ago and just sit on it — and you can't say it's a bad thing because he's outperformed the S&P 500 by an order of magnitude.
So I think people are still deeply underinvested in technology, deeply underinvested in the AGI story. And it's impossible for most people who are not part of this online Twitter circle or part of the SF group to actually understand what's going on or what's gonna happen. This gives enormous advantage to just being part of the culture around AI, because you have the sense that things are gonna work, and then you have the ability to buy the dip with confidence. That's really driving it. Across the Twitter space there are a ton of people who have like 10x or 20x returns in the last year and a half. And once you clear 100% returns, your capital is repaid — all the excess is risk capital. You can do whatever you like with it.
So I think people should take more risks with their money. If they have the risk capital, yeah, put it in tech, put it in AI stocks, go for it.
33:20
Nathan Labenz: YOLO investment advice here on AI in the AM from Prakash. I'll maybe provide the more risk-averse side later. But —
33:28 · Interview · 30 min
Self-Improving Compute: Bing Xu and the PTX Kernel Factory
Bing Xu
Bing Xu, founder & CEO of INT21, makes the case that self-improvement should target the GPU infrastructure, not the model: his PTX Kernel Factory points agent swarms at the NVIDIA ISA below CUDA, matching expert libraries like QuACK on mature kernels and posting up to 59% speedups on newer ones. He also argues that this deepens NVIDIA's moat, because the evolutionary loop depends on NVIDIA's profiling ecosystem.
As aired
Nathan and Prakash welcomed Bing Xu, founder and CEO of INT21, to discuss self-improving AI infrastructure. Prakash opened with a full bio intro — tracing Bing's path from co-authoring the original GAN paper and co-creating Apache MXNet and Meta's AITemplate, through founding HippoML (acquired by NVIDIA), to his current work building autonomous agent swarms that optimize GPU kernels at the PTX layer. The central premise of INT21, as Prakash framed it: an AI company's engineering capacity should scale with compute budget, not headcount.
Bing walked through the PTX Kernel Factory, INT21's first product. Unlike approaches that build domain-specific languages or layers above the GPU (like Triton or other DSLs), the factory generates inline PTX directly — the NVIDIA ISA sitting below CUDA — allowing the system to precisely model hardware performance and extract the maximum from each GPU. The optimization runs in two phases: a background self-improvement phase where the system accumulates knowledge across generations before any user request, and an interactive phase where a user specifies a kernel problem and success criteria (e.g. "beat CUTLASS on this shape"), and the agent swarm evolves toward that goal drawing on prior experience.
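The two-phase workflow can be pictured with a small sketch. This is an illustration only, not INT21's implementation: `KnowledgeBase`, `KernelRequest`, `background_phase`, and `interactive_phase` are hypothetical names, the "lessons" are placeholder strings, and the candidate runtimes are supplied by hand where a real system would measure them on a GPU.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class KnowledgeBase:
    """Hypothetical store of experience accumulated across generations."""
    lessons: List[str] = field(default_factory=list)
    generation: int = 0


@dataclass
class KernelRequest:
    """Hypothetical user request: a kernel problem plus a success criterion."""
    problem: str                          # e.g. a GEMM with a particular shape
    is_success: Callable[[float], bool]   # e.g. "faster than the baseline library"


def background_phase(kb: KnowledgeBase, generations: int) -> KnowledgeBase:
    """Phase 1: accumulate knowledge before any user request arrives."""
    for _ in range(generations):
        kb.generation += 1
        kb.lessons.append(f"lesson from generation {kb.generation}")
    return kb


def interactive_phase(
    kb: KnowledgeBase,
    request: KernelRequest,
    measured_runtimes_ms: Sequence[float],
) -> Dict[str, object]:
    """Phase 2: try candidates in order until the success criterion is met."""
    for attempt, runtime_ms in enumerate(measured_runtimes_ms, start=1):
        if request.is_success(runtime_ms):
            return {"solved": True, "attempt": attempt,
                    "runtime_ms": runtime_ms, "lessons_used": len(kb.lessons)}
    return {"solved": False, "attempt": len(measured_runtimes_ms),
            "runtime_ms": None, "lessons_used": len(kb.lessons)}


if __name__ == "__main__":
    kb = background_phase(KnowledgeBase(), generations=3)
    baseline_ms = 1.0  # assumed runtime of the library to beat
    request = KernelRequest("GEMM, one fixed shape", lambda t: t < baseline_ms)
    print(interactive_phase(kb, request, [1.20, 1.05, 0.93]))
```

The design point the sketch tries to capture is that the success criterion is data supplied by the user, so the same loop can target "beat CUTLASS on this shape" or any other measurable goal.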
The conversation explored the CUDA moat question in depth. Nathan asked whether agent-based kernel generation might erode NVIDIA's software advantage by making any hardware equally optimizable. Bing argued the opposite: the same automation raises the moat, because the evolutionary loop depends critically on NVIDIA's ecosystem of profiling tools (NCU, accurate instruction counters, reliable drivers) that give agents clear directional feedback. Without that rich signal layer — which NVIDIA has invested in heavily and competitors have not matched — the search can't close the loop efficiently. On performance, Bing cited two benchmark categories: mature workloads like RMSNorm, where the factory systematically matches or slightly outpaces expert-written libraries like QuACK across more than 100 configurations, and newer workloads like Kimi Delta Attention where human expert optimization is still nascent, yielding up to 59% speedups with correctness validated across 580 test cases.
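A claim like "up to 59% faster, validated across 580 test cases" combines two separate checks: numerical agreement with a reference, and a timing comparison. The sketch below shows one plausible way to express both. It is an assumption-laden illustration: the show did not say how the speedup figure is defined, so it is taken here as a runtime ratio, and the RMSNorm functions are plain-Python stand-ins (without the learned gain) for what would really be GPU kernels.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def rms_norm_reference(x: Sequence[float], eps: float = 1e-6) -> Vector:
    """Reference RMSNorm without a learned gain: x / sqrt(mean(x^2) + eps)."""
    mean_square = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_square + eps)
    return [v * scale for v in x]


def rms_norm_candidate(x: Sequence[float], eps: float = 1e-6) -> Vector:
    """Stand-in for an optimized kernel: same math, different operation order."""
    inv_rms = (sum(v * v for v in x) / len(x) + eps) ** -0.5
    return [v * inv_rms for v in x]


def all_cases_match(
    reference: Callable[[Sequence[float]], Vector],
    candidate: Callable[[Sequence[float]], Vector],
    cases: Sequence[Sequence[float]],
    rel_tol: float = 1e-5,
) -> bool:
    """True only if the candidate agrees with the reference on every case."""
    for case in cases:
        expected, actual = reference(case), candidate(case)
        if len(expected) != len(actual):
            return False
        if not all(math.isclose(e, a, rel_tol=rel_tol, abs_tol=1e-8)
                   for e, a in zip(expected, actual)):
            return False
    return True


def speedup_percent(baseline_ms: float, candidate_ms: float) -> float:
    """Speedup as a runtime ratio: 100 * (baseline / candidate - 1)."""
    if baseline_ms <= 0 or candidate_ms <= 0:
        raise ValueError("runtimes must be positive")
    return 100.0 * (baseline_ms / candidate_ms - 1.0)


if __name__ == "__main__":
    cases = [[1.0, 2.0, 3.0], [0.5, -0.5, 0.25, 4.0], [0.0, 0.0, 1.0]]
    print(all_cases_match(rms_norm_reference, rms_norm_candidate, cases))
    print(round(speedup_percent(baseline_ms=1.59, candidate_ms=1.0), 1))
```

Under this definition, a kernel that finishes in 1.00 ms where the baseline takes 1.59 ms counts as about 59% faster. Checking correctness before timing matters, because a kernel that is fast but wrong should never be scored.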
Later, Bing described the SwarmOS — INT21's cloud-native evolutionary substrate that can coordinate up to 10,000 agents simultaneously. The architecture mirrors an AlphaGo-style search: agents generate variations (proposals), each runs in its own isolated sandbox on real NVIDIA hardware for genuine feedback, and selective retention promotes the best outcomes while the evolution tree is preserved for backtracking. On model choice, Bing said GPT-5.5 has proven uniquely capable at escaping local minima during the search — other models tend to enter sycophantic agreement loops where agents reinforce each other rather than generating genuinely novel solutions. He also noted that Fable 5 refused a basic PTX-related query as potentially dangerous, which ruled it out for this use case. On coordination at scale, Bing said the SwarmOS mimics human organizational discipline: agents are given well-defined, specialized roles rather than peer-to-peer autonomy, which prevents the chaotic collapse that results from unstructured multi-agent communication.
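The propose, evaluate, retain loop described here can be sketched in a few lines. This is a toy under stated assumptions, not the SwarmOS: candidates are small integer parameter lists instead of PTX kernels, `evaluate` is an arbitrary cost function standing in for a sandboxed run on real hardware, and the names `Node`, `evolve`, and `lineage` are invented for illustration. Lower scores are better.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Node:
    """One candidate in the evolution tree, linked to its parent."""
    params: List[int]
    score: float
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def evolve(
    seed_params: Sequence[int],
    evaluate: Callable[[Sequence[int]], float],
    rounds: int = 30,
    proposals_per_parent: int = 8,
    keep: int = 2,
    seed: int = 0,
) -> Tuple[Node, Node]:
    """Return (root, best) after a simple evolutionary search."""
    rng = random.Random(seed)
    root = Node(list(seed_params), evaluate(seed_params))
    frontier, best = [root], root
    for _ in range(rounds):
        candidates: List[Node] = []
        for parent in frontier:
            for _ in range(proposals_per_parent):
                # Proposal: a small random variation of the parent.
                params = [p + rng.choice((-1, 0, 1)) for p in parent.params]
                child = Node(params, evaluate(params), parent)
                parent.children.append(child)  # the whole tree is preserved
                candidates.append(child)
        # Selective retention: only the best few proposals survive.
        candidates.sort(key=lambda node: node.score)
        frontier = candidates[:keep]
        if frontier[0].score < best.score:
            best = frontier[0]
        else:
            # Backtrack: no progress this round, so revisit the best node so far.
            frontier = [best] + frontier[: keep - 1]
    return root, best


def lineage(node: Optional[Node]) -> List[Node]:
    """Path from the root to the given node, recovered from parent links."""
    path: List[Node] = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path[::-1]


if __name__ == "__main__":
    target = [5, -3, 2]  # hypothetical optimum, unknown to the search
    cost = lambda p: float(sum((a - b) ** 2 for a, b in zip(p, target)))
    root, best = evolve([0, 0, 0], cost)
    print(root.score, best.score, len(lineage(best)) - 1)
```

Keeping parent links on every node is what makes backtracking cheap: when a round stalls, the search can return to any earlier candidate without recomputing it.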
Key moments
With this kind of ultimate automatic generation technology, the CUDA moat is even higher. This kind of evolution requires a lot of tools and ecosystem — an accurate profiler, a reliable driver — everything has to be there. NVIDIA has those, and because this is a closed loop, the ecosystem gets stronger.
Bing Xu · 38:31
GPT-5.5 is so smart on these very hard problems. Many times, when we use other models, the search gets stuck at a plateau. GPT-5.5 is able to break out and create innovative solutions to move the needle and get the entire process going forward.
Bing Xu · 52:45
If you give agents peer-to-peer communication ability, you'll create a mess. We want agents to be well-defined, doing well-defined work — just as human society has specialized into different functions. That way, the swarm operates toward the goal rather than collapsing in the middle.
Bing Xu · 55:23
Questions asked
35:50 Can you give us a quick breakdown of what the PTX Kernel Factory is?
The PTX Kernel Factory is an agent swarm application that generates inline PTX kernels directly — the NVIDIA ISA sitting below CUDA. Unlike prior approaches that build domain-specific languages like Triton above the GPU, it generates PTX directly alongside CUDA, allowing the system to accurately model hardware performance and extract the best possible performance.
36:34 How does your approach start? Do you give the agents a benchmark to beat?
The process has two phases. First, a self-improving phase where the system runs continuously before any user request, accumulating knowledge generation by generation. When a user engages, they specify a problem — for example, a GEMM kernel with a particular shape — and define success criteria, such as "beat CUTLASS." The agent swarm then draws on past experience and evolves toward that goal.
37:44 Is the CUDA moat getting deeper or shallower as agents become able to auto-generate kernels?
Deeper, not shallower. The evolutionary optimization loop depends on rich feedback from the hardware ecosystem: accurate profilers like NCU, reliable instruction counters, solid drivers. NVIDIA has invested heavily in those tools. In Bing's experience, competitors lack equivalent infrastructure, which means agents can close the loop efficiently on NVIDIA hardware but not elsewhere.
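To illustrate why counter-level feedback matters to a search loop, here is a minimal sketch of turning profiler-style measurements into a directional hint with a roofline-style comparison. The `KernelProfile` fields, the `next_direction` function, the 0.5 utilization threshold, and the peak figures are all assumptions made for illustration; they are not NCU's actual metric names or INT21's method.

```python
from dataclasses import dataclass


@dataclass
class KernelProfile:
    """Hypothetical per-kernel counters, as a profiler might report them."""
    bytes_moved: float  # bytes read from and written to device memory
    flops: float        # floating-point operations executed
    elapsed_s: float    # measured kernel time in seconds


def next_direction(p: KernelProfile, peak_bw: float, peak_flops: float) -> str:
    """Return a hint naming the hardware limit the kernel is closest to."""
    bw_util = (p.bytes_moved / p.elapsed_s) / peak_bw
    flop_util = (p.flops / p.elapsed_s) / peak_flops
    # Assumed threshold: below 50% of both peaks, neither limit is binding.
    if max(bw_util, flop_util) < 0.5:
        return "latency-bound: improve occupancy or overlap"
    if bw_util >= flop_util:
        return "memory-bound: reduce or coalesce memory traffic"
    return "compute-bound: reduce instruction count"


# Illustrative peaks (1 TB/s, 100 TFLOP/s), not the specs of any real GPU.
PEAK_BW, PEAK_FLOPS = 1e12, 1e14
profile = KernelProfile(bytes_moved=8e8, flops=1e9, elapsed_s=1e-3)
hint = next_direction(profile, PEAK_BW, PEAK_FLOPS)
```

In this made-up case the kernel reaches 80% of peak bandwidth and 1% of peak compute, so the hint points at memory traffic. Without trustworthy byte and instruction counts, an agent would only see "faster" or "slower" and would have to guess which kind of change to try next.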
41:12 What efficiency gains can you actually see? What numbers are you reporting, and how do you benchmark fairly?
Two categories. On mature, highly optimized workloads like RMSNorm, which is used in every transformer today, the PTX Kernel Factory systematically matches or slightly outperforms expert-written libraries like QuACK across more than 100 configurations. On newer, less optimized workloads like Kimi Delta Attention, where even frontier labs are still working, the factory shows up to a 59% speedup, validated across 580 test cases.
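One way to keep a comparison like this fair is to gate every timing on a correctness check and to summarize across all configurations rather than quoting the single best case. The sketch below assumes a hypothetical result record (`config`, `baseline_ms`, `candidate_ms`, `max_rel_err`) and an illustrative tolerance; the timings in the usage example are made up and are not INT21's measurements.

```python
from statistics import geometric_mean
from typing import Dict, List


def speedup_report(results: List[Dict], rel_tol: float = 1e-3) -> Dict[str, float]:
    """Summarize speedups, refusing to report if any config is incorrect."""
    failed = [r["config"] for r in results if r["max_rel_err"] > rel_tol]
    if failed:
        raise ValueError(f"correctness failures: {failed}")
    # Speedup here means baseline time divided by candidate time.
    ratios = [r["baseline_ms"] / r["candidate_ms"] for r in results]
    return {
        "geomean": geometric_mean(ratios),
        "best": max(ratios),
        "worst": min(ratios),
    }


# Made-up timings, for illustration only.
results = [
    {"config": "n=1024", "baseline_ms": 2.0, "candidate_ms": 1.0, "max_rel_err": 1e-5},
    {"config": "n=4096", "baseline_ms": 1.0, "candidate_ms": 1.0, "max_rel_err": 2e-5},
]
report = speedup_report(results)
```

Under this definition a ratio of 1.59 would correspond to the "59% speedup" phrasing, though the episode does not say which definition was used. Reporting the worst case next to the best makes it harder for one favorable configuration to stand in for the whole suite.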
48:10 Is the system evolutionary, like the AlphaEvolve-style approach that probes different parts of the possibility space rather than just sampling the LLM?
Yes. SwarmOS is a cloud-native evolutionary system that supports up to 10,000 agents in parallel. It generates variations as proposals, evaluates each agent's output on real NVIDIA hardware with genuine performance feedback, maintains an evolution tree that can be traversed forward and backward, and applies selective retention — promoting the best outcome and discarding the rest. It's explicitly an AlphaGo-style search applied to compute infrastructure.
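The loop described above (variation, evaluation, selective retention, and a tree that supports backtracking) can be sketched on a toy problem. This is a single-process illustration under stated assumptions, not SwarmOS: the `Node` and `evolve` names, the one-level backtracking rule, and the quadratic "latency" function standing in for a real hardware measurement are all hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)
class Node:
    """One candidate in the evolution tree (lower score is better)."""
    candidate: float
    score: float
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def evolve(start: float,
           propose: Callable[[float, random.Random], float],
           evaluate: Callable[[float], float],
           generations: int, width: int, seed: int = 0):
    rng = random.Random(seed)
    root = Node(start, evaluate(start))
    best = frontier = root
    for _ in range(generations):
        # Variation and evaluation: score `width` children of the frontier.
        batch = []
        for _ in range(width):
            cand = propose(frontier.candidate, rng)
            batch.append(Node(cand, evaluate(cand), parent=frontier))
        frontier.children.extend(batch)  # every proposal stays in the tree
        # Selective retention: only an improving child becomes the frontier.
        top = min(batch, key=lambda n: n.score)
        if top.score < best.score:
            best = frontier = top
        elif frontier.parent is not None:
            frontier = frontier.parent  # no progress: back up one level
    return root, best


def lineage(node: Optional[Node]) -> List[Node]:
    """Walk parent links to recover the path from the root to `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path[::-1]


# Toy stand-in for "measured latency": minimized at x = 3.
root, best = evolve(
    start=0.0,
    propose=lambda x, rng: x + rng.uniform(-1.0, 1.0),
    evaluate=lambda x: (x - 3.0) ** 2 + 1.0,
    generations=20, width=8,
)
path = lineage(best)
```

Keeping the parent links is what makes the search inspectable: `lineage(best)` recovers the chain of accepted proposals, and a stalled branch can be abandoned without losing the earlier candidates.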
52:19 Have you hit local minima? How do you escape them?
Yes, often — when proposals can't break out of a plateau. The game changer has been GPT-5.5. Other models tend to enter sycophantic agreement loops where agents confirm each other rather than generating genuinely novel solutions. GPT-5.5 can actually identify what's wrong and push to out-of-the-box approaches that move the optimization forward.
55:10 How do you coordinate hundreds or thousands of agents working simultaneously?
By mimicking human organizational discipline. The philosophy behind SwarmOS is based on how human organizations specialize and structure work. Agents are given well-defined, specialized roles rather than peer-to-peer communication ability — unconstrained P2P creates chaos. Keeping agents disciplined and task-scoped is what allows the swarm to converge on the goal instead of collapsing midway.
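As a rough picture of role-scoped coordination, the sketch below routes every hand-off through a single coordinator, so roles never message each other directly. The role names (`proposer`, `verifier`, `selector`), the `Coordinator` class, and the dictionary-based work item are hypothetical; the episode does not describe how SwarmOS actually assigns or sequences roles.

```python
from typing import Callable, Dict, List, Tuple

Work = Dict[str, object]
Role = Tuple[str, Callable[[Work], Work]]


def propose(work: Work) -> Work:
    # Hypothetical proposer: attach one candidate variant to the work item.
    return {**work, "variant": f"{work['kernel']}_v1"}


def verify(work: Work) -> Work:
    # Hypothetical verifier: a stand-in for a real correctness check.
    return {**work, "correct": "variant" in work}


def select(work: Work) -> Work:
    # Hypothetical selector: keep the variant only if it was verified.
    return {**work, "kept": bool(work.get("correct"))}


class Coordinator:
    """Hub that owns all routing; roles never call each other directly."""

    def __init__(self, roles: List[Role]) -> None:
        self.roles = roles
        self.trace: List[str] = []

    def run(self, work: Work) -> Work:
        # Each role sees only the work item handed to it, in a fixed order.
        for name, handler in self.roles:
            work = handler(work)
            self.trace.append(name)
        return work


coordinator = Coordinator([
    ("proposer", propose),
    ("verifier", verify),
    ("selector", select),
])
result = coordinator.run({"kernel": "rmsnorm"})
```

Because each handler is a plain function of its input, there is no channel through which two agents could fall into the back-and-forth agreement loop described earlier; whether that trade-off suits other workloads is a separate design question.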
56:23 How accessible is this cost-wise? Can research projects use it, or does it only make sense for large production workloads?
Self-improvement makes it more affordable than it would otherwise be, because the system compounds past experience rather than starting from scratch each run. Most AI usage today wastes compute re-solving the same problems repeatedly. When a system retains and applies prior discoveries, costs drop substantially as the knowledge base grows. The goal is to make the PTX Kernel Factory a product that works for everyone, not just large organizations running massive inference workloads.
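The cost argument rests on reusing what earlier runs learned. A minimal sketch of that idea, assuming a hypothetical `ExperienceStore` keyed by a coarse workload signature (operation, dtype, and a power-of-two size bucket), is shown below. The signature scheme, the config contents, and the latencies are invented for illustration and say nothing about how INT21 stores or retrieves experience.

```python
from typing import Dict, Optional, Tuple

Signature = Tuple[str, str, int]  # (operation, dtype, size bucket)


def signature(op: str, dtype: str, n: int) -> Signature:
    """Bucket sizes by power of two so similar shapes share experience."""
    return (op, dtype, max(n, 1).bit_length())


class ExperienceStore:
    """Remembers the fastest known config per workload signature."""

    def __init__(self) -> None:
        self._best: Dict[Signature, Tuple[dict, float]] = {}

    def record(self, sig: Signature, config: dict, latency_ms: float) -> None:
        current = self._best.get(sig)
        # Keep a new entry only if it beats what is already stored.
        if current is None or latency_ms < current[1]:
            self._best[sig] = (config, latency_ms)

    def warm_start(self, sig: Signature) -> Optional[dict]:
        """Return a prior best config to seed a new search, if one exists."""
        hit = self._best.get(sig)
        return hit[0] if hit else None


store = ExperienceStore()
store.record(signature("rmsnorm", "bf16", 4096), {"threads": 256}, 0.042)
store.record(signature("rmsnorm", "bf16", 4096), {"threads": 512}, 0.050)  # slower
hint = store.warm_start(signature("rmsnorm", "bf16", 5000))  # same bucket as 4096
miss = store.warm_start(signature("rmsnorm", "bf16", 8192))  # next bucket up
```

A search seeded with `hint` starts from a known-good configuration instead of from scratch, which is the compounding effect described above; a `miss` simply falls back to a cold start.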
Related
INT21 · Introducing INT21 and the PTX Kernel Factory · Bing Xu on X · Bing Xu on LinkedIn
Full transcript
Lightly edited · timestamps jump to YouTube
33:28
Nathan Labenz: For now, let's welcome our first guest.
33:32
Prakash: Hi, Bing.
33:33
Bing Xu: Hi. How are you?
33:36
Prakash: Very good. So I am super interested to hear about what you've been up to. I'm going to do a quick intro. Bing Xu is the founder and CEO of INT21, an artificial intelligence company that is fundamentally changing how the industry builds the foundational infrastructure powering modern compute. For over a decade, Bing has operated at the absolute forefront of machine learning engineering. He co-authored the original generative adversarial networks paper, created the Python package for XGBoost, and co-created widely used deep learning tools like MXNet and Meta's AITemplate. After NVIDIA acquired his GPU inference startup, HippoML, he served as a Distinguished Engineer at NVIDIA where he led breakthrough research on using AI agents to automatically generate highly optimized GPU code. Now he has stepped out of stealth with INT21 to solve the hardest bottleneck in scaling artificial intelligence — the software that tells the silicon exactly what to do. His company's first product, the PTX Kernel Factory, deploys an autonomous swarm of AI agents to write, test, and optimize low-level NVIDIA GPU code, achieving performance that openly surpasses expert human engineers. Bing operates on a simple but radical premise: an AI company's engineering capacity should scale with its compute budget, not its headcount. At a time when the entire industry is obsessing over the high-level capabilities of language models, Bing is focused strictly on the bare metal underneath — building self-improving systems where the compute literally improves the compute. Bing, I have to tell you I'm a huge fan because I have used XGBoost forever. Like, that Python package has saved my life so many times, so thank you.
35:38
Bing Xu: Yeah, you're welcome. We are now moving to the next stage — maybe we'll talk about that a little bit later.
35:50
Prakash: Absolutely. Bing, can you give us a quick breakdown on what the PTX Kernel Factory is and what you've been working on?
36:00
Bing Xu: The PTX Kernel Factory is an agent swarm application, and it generates PTX kernels — inline PTX kernels — directly. There are different prior approaches, like building a DSL — whether Triton or other layers above the GPU. The PTX Kernel Factory directly generates inline PTX with CUDA, so we can accurately model the performance and get the best possible performance from the GPU.
36:34
Prakash: And how does your approach start off? Do you start with something like, "here is a benchmark I want you to beat," and let the agents run? Or are you improving against a particular metric?
36:52
Bing Xu: This comes from two phases. The first phase is a self-improving phase. Before the user ever types a prompt, the system has actually been self-improving for some time, and it has gained a lot of knowledge generation by generation. So when the user engages with it, they set a problem — for example, "I want a GEMM kernel, I want the shape to be a specific shape, and I want the success criteria to be, say, faster than CUTLASS." And then the agent will explore past experience and do evolution over and over again toward that goal.
37:44
Nathan Labenz: I have a few big-picture strategic questions that I think you have a really unique take on. One that's very often debated is: is the CUDA moat getting deeper, or is it getting shallower? The argument that it would be getting shallower seems more intuitive in light of the kind of technology you're building — if I can spend a bunch of compute to write new kernels, can't I apply that same technique across any GPU provider? Doesn't that lead to a time when everything is super-optimized and we don't have to worry as much about which chip company's platform we're building on top of? Or am I off base somehow there?
38:31
Bing Xu: I think the CUDA moat is definitely there. And with this kind of ultimate automatic generation technology, I think it's actually even higher. The reason is that this kind of evolution requires a lot of tools and ecosystem to make it happen. We need an accurate profiler, a reliable driver — the whole ecosystem needs to be there. For the CUDA ecosystem, with these kinds of tools, everyone using CUDA can achieve better performance faster than ever. And because this is a closed loop, the ecosystem is better, and that will make it grow even stronger.
39:20
Nathan Labenz: So in other words, competitors to NVIDIA don't have the necessary primitives and abstractions in place to allow agents to make progress in an automated research way today?
39:42
Bing Xu: I think so. The hard part is how you can get real feedback from the hardware. This is a really high moat. For example, on NVIDIA GPUs, we have NCU, which can accurately tell the agent what the right direction is. For others — I'm not tracking exactly where they are now — but from my past experience, the software ecosystem, especially for agent-driven software development, has a large gap between them and NVIDIA.
40:25
Prakash: So it sounds as though NVIDIA has just thought more about developer experience and providing enough feedback from the chips themselves so that developers can improve how the chips are used. Is that a fair way to put it?
40:47
Bing Xu: Yes. NVIDIA's past investment in these tools is now helping agents move faster. This brings NVIDIA a unique advantage in the agent era — we can build more powerful agents on the NVIDIA platform more easily and deliver a better experience for all NVIDIA users.
41:12
Prakash: One question everyone has is: if you have this kind of PTX agent swarm that improves itself, how much efficiency gain can you actually see on these chips? What are the numbers on, say, a B200 or H100? And what metric do you use to measure where you're at, and where the asymptote might be?
41:44
Bing Xu: We use two categories of benchmarks, because benchmarking is the trickiest part. A lot of times, benchmarks are not fair — for example, some recent releases claim 300x faster than a GPU kernel, but that's not generally a fair comparison. So I use two kinds of fair benchmarks. The first is a mature workload — like RMSNorm. RMSNorm is used in every transformer today, so it is highly optimized. For this category, with more than 100 different workloads, we can say the PTX Kernel Factory has systematically reached the level of human expert libraries like QuACK — similar performance, or slightly faster by a few percentage points. That means even on mature workloads, the PTX Kernel Factory reaches expert level. The other category is newer workloads that human experts haven't yet optimized well. One example is Kimi Delta Attention, or full KDA. Not many people are optimizing that yet, and we can see up to 50 to 59% speedup on those. This is not a single data point — it passed 580 tests and use cases. So the evidence supports that the PTX Kernel Factory is achieving expert-level performance at scale, through self-improvement.
43:41
Prakash: Would that mean this is very helpful for people working with new attention mechanisms and new architectures, because they're no longer dependent on a small group of engineers to optimize their PTX code?
43:57
Bing Xu: Yes. That is our goal — we want to scale rare expertise to everyone. In the past, the bottleneck was that when someone had an interesting research idea, they didn't have a fast kernel to try it with. And even after getting a kernel, the road from research to production was very long. Now, with the PTX Kernel Factory, it's straightforward — everyone can just generate optimal kernels.
44:27
Prakash: So everyone gets an expert PTX-optimizing agent for themselves. How much do you think this speeds up the typical research or production process — maybe a couple of months?
44:48
Bing Xu: I think it depends on how quickly people adopt this approach. Currently, one challenge is that most people are still skeptical that AI can generate production-quality PTX. We need to build trust that AI can self-improve to generate expert-level PTX reliably. Even industry leaders — I read a tweet from Chris Manning a few days ago mentioning he was able to generate PTX with a model — but most of the industry expects this to be possible only a few years from now, not today. So we need some time to give people confidence that the PTX Kernel Factory is a reliable approach for generating expert-level kernels. Then we can talk about how this speeds up the entire process.
45:51
Nathan Labenz: Can you talk about how broad the optimization process is? I understand there can be a lot of different kinds of bottlenecks across the whole system. A lot of optimization work over time has been about stepping back from one very narrow problem and reworking things at a slightly higher level to route around a bottleneck, only to reveal the next binding constraint. How much of that kind of system-level exploration are you able to do versus staying very low-level at the most granular compute optimization?
46:53
Bing Xu: Do you mean how much the system can involve the broader compute system, versus only doing narrow PTX optimization?
47:03
Nathan Labenz: Yeah, exactly.
47:04
Bing Xu: I think this methodology is general, and PTX is just our first step toward a self-improving compute infrastructure. We chose PTX because it's the hardest problem — it's in the middle of the entire stack, as close as you can get to the hardware. And the same methodology can apply to the upper layers: frameworks, distributed settings. For example, on the GPU you have distributed shared memory, asynchronous programming, and all of these concepts map to real distributed settings. We can extend this with a similar methodology. PTX Kernel Factory is just the start.
48:10
Nathan Labenz: Is it an evolutionary algorithm? I'm recalling some kernel optimization work — I think it came out of Google — where they found a trick to do a matrix multiply with one fewer operation. And my understanding was it wasn't just having a language model generate new ideas, but there was also a scaffolded evolutionary system that made sure the model got out of distribution to avoid the repetitive, median nature of LLM outputs. Are you doing something similar — using an evolutionary layer to systematically probe different parts of the possibility space?
49:13
Bing Xu: Yes. We're building a cloud-native evolution system for agents — we call it SwarmOS. SwarmOS is able to support up to 10,000 agents doing evolution simultaneously, with a specialized sandbox and cloud infrastructure to support that. The process is evolutionary: the system first generates a set of variations — proposals. The next step is evaluation, using the agent system we have today. The swarm maintains an evolution tree, so we can trace back and explore different branches going forward. The second step is that each agent has its own compute environment and verifies results from the real hardware environment, getting real feedback. The last step is selective retention — promote the best outcome and discard the rest. In short, we are bringing an AlphaGo-style search into the compute infrastructure. PTX is the first domain, and we do this AlphaGo-style search — but it can apply to any compute infrastructure problem.
50:43
Prakash: You've called this agent optimization loop "merciless" — that the strict constraints are actually beneficial for evolving the agents forward. Can you elaborate on why you call it merciless, and why that strictness is a good thing?
51:38
Bing Xu: For GPU problems — and computer problems in general — the evolutionary loop works well because the feedback signal is clear. The GPU will tell you clearly whether something is faster or slower. If you put that in a distributed setting, you know clearly whether resource utilization is higher or lower. This gives the system a clear feedback signal to move forward. In other fields, the signal is not so clear.
52:19
Prakash: When you're optimizing against a single metric, a common issue is getting stuck in a local minimum or maximum and being unable to escape. Has that happened to you as you've run this evolved optimization process?
52:45
Bing Xu: This is a great question. A lot of the time, evolution will fail because the proposal is not able to break out of a local minimum. And we found one game changer: GPT-5.5. GPT is so smart at these very hard problems. Many times, when we use other models, the search gets stuck at a plateau. GPT-5.5 is able to break out and create innovative solutions to move the needle and get the entire process going forward.
53:28
Prakash: Are you actually using GPT-5.5 with your own harness and sandboxing system?
53:35
Bing Xu: Yes. The entire SwarmOS is backed by GPT, and we find that GPT is really strong on these hardest problems. No other model is able to match GPT today for this.
53:51
Prakash: I have to ask — did you try Fable when it came out? Was there any difference?
53:58
Bing Xu: Well, first, Fable rejected my request when I asked it what PTX is. Fable seemed to think asking "what is PTX" was a dangerous question.
54:15
Prakash: We know who's going to be the first one to be banned.
54:22
Bing Xu: In general, one characteristic of GPT that is particularly beneficial for the evolution system is that GPT has the ability to understand what is actually wrong. It's not just blindly agreeing. Otherwise, the swarm collapses. I think one reason multi-agent swarms haven't been widely adopted is that people have been using models that just agree with each other. Two agents say, "you're wrong" — "no, you're absolutely right" — and they fall into an infinite loop. GPT can push past that and get to genuinely out-of-the-box solutions.
55:10
Prakash: I have to ask — you've obviously deployed hundreds or thousands of agents working together. How do you coordinate their work? This is a problem we're all dealing with.
55:23
Bing Xu: We mimic human organization. A lot of the philosophy behind building SwarmOS and INT21 is based on human history — we're trying to reorganize human organizational structures for agents. This means we want the agents to be disciplined. One observation is that if you give agents peer-to-peer communication ability, you'll create a mess. We want agents to be well-defined, doing well-defined work — just as human society has specialized over time into different functions. That way, the swarm is highly likely to operate toward the goal rather than collapsing in the middle.
56:23
Nathan Labenz: Can I ask how much this costs? I'm wondering specifically whether it's so resource-intensive that you have to already have a well-established production workload to justify running it, hoping to save a small amount that pays off over a large volume of inference over time. Or could it become affordable enough that research projects could use it? I'm thinking back to Mamba 2 where a lot of the work was done at the kernel level to show that a new paradigm could be competitive on the available hardware. Where do you sit right now in terms of accessibility to small-scale or research projects?
57:27
Bing Xu: This comes back to why we're doing self-improvement. Because we compound past experience, it's actually affordable compared to starting from scratch each time. This is a big problem with AI usage today — organizations are not compounding across generations, not doing self-improvement. Every session is limited: you use the model, you may solve the same problem again and again. That's a big waste. But when you build a self-improving system that compounds the past, you learn from prior experience. If I encounter a new workload with a similar structure to something I've optimized before, I won't repeat the same mistakes. This dramatically reduces cost. That's how we're confident the PTX Kernel Factory can be a product that fits everyone — rather than a system where you push a button and ten thousand dollars disappears.
58:36
Nathan Labenz: I was doing some research this week into Liquid AI, and they have a somewhat rhyming approach for searching architecture space. One thing they report is that it's very important to have actual target hardware in the loop. Going back to your comment about the CUDA moat: am I understanding correctly that part of what makes CUDA's moat real is that you don't have to run the full workload on real hardware just to get a meaningful benchmark signal — that NVIDIA's tooling gives you a much cheaper path to a real performance estimate? Whereas other platforms would require you to truly run the workload on hardware to know where you stand?
59:34
Bing Xu: We still need real hardware to get fair feedback — all of the PTX Kernel Factory's evolution runs on real NVIDIA GPUs. The moat comes from the ecosystem around the GPU. When we run a kernel, how can we accurately know the cache utilization? How can we get the instruction count? NVIDIA provides the ecosystem and tools that help agents understand how the program is actually running on the hardware. That is key for both agents and human engineers to develop efficiently on NVIDIA hardware.
1:00:24
Nathan Labenz: Gotcha. So you are still actually running test kernels within the evolutionary optimization process.
1:00:33
Bing Xu: Yes.
1:00:34
Nathan Labenz: But it's the meta diagnostic layer — the rich signal NVIDIA provides on top of the raw execution result — that gives you what you need in a timely fashion, and that's what's missing from other platforms.
1:00:49
Bing Xu: Yeah. I think that is the advantage of the CUDA ecosystem.
1:00:55
Nathan Labenz: Same kind of question but for architectures. A big question over time has been whether the transformer — really the attention mechanism — is a truly unique special snowflake that we're unlikely to improve on, or whether it just won a hardware lottery and we're in a path-dependent state. Do you think significantly different architectures are likely to emerge as your technology matures and can optimize them much faster than they otherwise could have been? Will we see greater architectural diversity as a result?
1:01:53
Bing Xu: I think this comes from two angles. In my view, for attention mechanisms, they are good enough for a lot of workloads today. But if we want to expand AI technology to everyone, we have to make the cost lower. From an intelligence perspective, attention is doing quite well. But from a cost-per-person perspective, people may push for more innovation to reduce cost. They may change the first model — perhaps mixing full attention with lower-precision linear attention models — to make AI accessible to everyone. But overall, I feel that attention is really powerful and could take on a lot of the intelligence challenges we face today.
1:02:58
Nathan Labenz: Well, thank you, Bing, for being here. This has been a really interesting conversation, and it's good to get out of my comfort zone and try to get a little closer to the metal from time to time. I need an expert guide to help me do that effectively, and I thank you for being that guide today. We'll definitely be watching your progress.
1:03:18
Bing Xu: Thank you so much.
1:03:19
Prakash: Thank you, Bing. Huge fan.
1:03:20
Bing Xu: Yeah, thank you. Bye bye.
1:04:08 · Interview · 42 min
Europe 2031: Michiel Bakker on AI Dependence and Sovereignty
Michiel Bakker
MIT Sloan professor and Google DeepMind researcher Michiel Bakker on Europe 2031, his viral month-by-month scenario of Europe sleepwalking into AI dependence (including a fictional 2028 export-control beat that materialized within a day of publishing), and why Europe's real leverage lies in the ASML/IMEC semiconductor ecosystem and a middle-power coalition, if it acts while the window is open.
As aired
Michiel Bakker, assistant professor at MIT Sloan, senior research scientist at Google DeepMind, and co-author of the viral scenario Europe 2031, joined the show the week after the piece was published, by which point the world had already caught up with it. Within a day of the scenario going live, the US government invoked emergency export controls restricting Anthropic's most powerful models for foreign nationals, turning a fictional 2028 forecast into a 2026 news headline. Michiel described the mix of vindication and alarm that came with watching the warning materialize faster than any of the authors had imagined.
He walked through the scenario's architecture: two fictional characters — Caroline, a Brussels policy official who believes in the institutions around her, and Christian, a European founder in Silicon Valley who is annoyingly, reliably right about timelines — whose friendship dramatizes the widening gap between the frontier and European governance. The scenario's premise is not that Europe takes a turn for the worse, but simply that it keeps doing things half-heartedly: big announcements that don't come to fruition, while the US consolidates control of the most powerful AI systems and China dominates robotics and manufacturing. Europe gets squeezed in between.
The hosts pressed Michiel on Europe's strategic options. Is regulating from the outside viable? Only if you have a seat at the table, he said, which requires capability first. Could Europe simply shelter under the US umbrella as it does with nuclear deterrence? No: AI is primarily an economic technology, and the nuclear analogy breaks down because protecting Europe militarily costs the US nothing economically, while sharing the best AI models does. Could cultural capital (wine, leather, LVMH) substitute for technological leverage? A firm no, because Americans can run their economy without leather handbags, but no economy will be able to run without AGI access.
On the path forward, Michiel outlined two directions: get the basics right across the full AI stack — compute, data centers, energy, a credible model company — while also strengthening the choke-point leverage Europe already holds, particularly the semiconductor ecosystem around ASML in Eindhoven, IMEC in Belgium, and fabs in Germany. He also described the middle-power coalition opportunity, pointing to Taiwan (TSMC), Japan (materials), Korea (high-bandwidth memory), and the Netherlands (EUV lithography) as collectively holding meaningful cards. He closed with an update on the scenario's reception: cabinet members in The Hague, Brussels meetings, and high-school friends texting him to ask people to stop forwarding it — exactly the general-public reach the authors were after.
Key moments
I would be lying if I didn't say there was something inside us that made us feel a little vindicated. But obviously we're also very worried about this happening so much sooner than we thought it would. The fact that the US can pull this lever and is happy to pull this lever if they feel it makes sense is, of course, a scary fact for Europe.
Michiel Bakker · 1:06:15
Yes, LVMH is an important company. But ultimately, Americans can live and run their economy without leather handbags, and we won't be able to live and protect our citizens and run our economy without AGI access.
Michiel Bakker · 1:23:03
I may be even more excited about the random people who reached out than the politicians — they ultimately have a lot of power through being able to vote and advocate.
Michiel Bakker · 1:41:28
Questions asked
1:06:03 What was your reaction when the US government announced export controls on Anthropic's models just one day after Europe 2031 published?
A mix of vindication and alarm. The scenario had imagined this kind of export-control move happening in 2028, and it started materializing within a day of the piece going live. The team felt somewhat vindicated, but also genuinely worried that it was happening so much sooner than expected. Whether the controls are truly about safety or something else is still unclear, but the fact that the US is willing and able to pull that lever is a scary reality for Europe.
1:07:07 Can you sketch out the broad contours of Europe 2031: the characters, the premise, what happens?
Europe 2031 is a warning story inspired by AI 2027, aimed at helping Europeans viscerally feel what's at stake. It follows two fictional characters: Caroline, a Brussels EU policy official who is thoughtful and loyal to Europe but frustrated by slow institutions, and Christian, a European founder in Silicon Valley who is plugged into the frontier, annoyingly smug, but reliably right about timelines. Their friendship dramatizes the gap between where AI is going and where European governance is. The scenario doesn't put Europe on a worse path than it's currently on — it just models Europe continuing to do things half-heartedly: big announcements, little follow-through. The US ends up controlling the most powerful AI systems, China dominates robotics and manufacturing, and Europe gets squeezed in an uncomfortable position in between.
1:10:52 Why isn't Europe's approach of regulating from the outside (GDPR, the AI Act) a sufficient strategy?
Regulation only works if you have a seat at the table, and a seat at the table requires capability. If Anthropic were in Paris and OpenAI in Berlin, Europe could meaningfully dictate terms. But right now, the lab ecosystem is in the US, and Europe doesn't have the leverage. Michiel cares deeply about AI safety and governance, but argues that in Europe, you need to build first to regulate effectively — because at some point, continued over-regulation could mean Europe simply loses access to the technology entirely.
1:16:08 Why shouldn't Europe just shelter under the US umbrella the way it does with nuclear deterrence?
The nuclear analogy breaks down because AI is primarily an economic technology, not just a military one. Protecting Europe militarily costs the US nothing economically. But sharing the best AI models does. In a world where AI becomes the dominant source of scientific discovery and economic output, the US might decide to keep the best models for itself while still providing military protection under NATO. That's the scenario Europe needs to be able to prevent — and it can only do so by building enough leverage to make such a move less attractive.
1:25:23 What are the highest-order moves European leadership could make right now to strengthen their position quickly?
Two directions. First, get the basics right across the full AI stack: data centers, a credible model company, and energy supply that isn't dependent on US LNG. You don't have to be frontier everywhere, but you should have some frontier position somewhere — and Europe already has that with ASML. Second, double down on where you already have leverage. The semiconductor ecosystem around ASML in Eindhoven, IMEC in Belgium, and fabs in Germany is genuinely world-class. Fund more chip companies, make sure ASML has the visas and land it needs, and build the kind of dense ecosystem around that hub that San Francisco has built around AI.
1:31:44 Does it matter who owns the compute infrastructure in Europe? Is American hyperscaler capacity on European soil better than nothing?
Yes, it's better than nothing, but with real trade-offs. Hyperscalers are extremely good at large-scale distributed training — something it would be hard for European providers to replicate quickly. So there may be a period where European governments need to make it attractive for US hyperscalers to build in Europe just to get the capacity. European-controlled compute would be best; American hyperscalers on European soil is a meaningful improvement over the status quo; and the status quo — almost all AI compute built in the US — is clearly not preferable.
1:36:25 Is there a distinctively European vision for what AI should look like (values, governance, relationship to citizens) beyond just 'don't be left behind'?
Michiel is skeptical of the 'European values' framing. The differences between Dutch and American values are probably smaller than people think, and post-training has gotten good enough that a corrigible model can adapt to whoever is using it. The more important reason Europe should want its own frontier model is that AI models are ecosystems — they generate surrounding technology development, science, and education. A European model at the frontier would create that whole ecosystem domestically. Privacy mechanisms might be designed somewhat differently, but the AI itself wouldn't look very different from what's built in the US.
1:41:49 Would lowering GDPR barriers to feed more European data into models help Europe exert more influence on AI, giving it more 'European values'?
No. Michiel thinks the European-vs-American values framing is a distraction. The real problem is not that models have non-European values baked in from pretraining — post-training has gotten sophisticated enough to shape model behavior regardless of pretraining provenance. Creating a fair-use regime so that European AI companies can access more data would be worthwhile, but for competitiveness reasons, not as a values-diffusion strategy.
Related
Europe 2031
Europe 2031: summary & recommendations
The Habermas Machine (Science, 2024)
AI Assistance Reduces Persistence and Hurts Independent Performance (arXiv)
Michiel Bakker on X
Full transcript
Lightly edited · timestamps jump to YouTube
1:04:09
Prakash: Ah, there we go. Hello. Hello.
1:04:11
Michiel Bakker: Hey. Thanks for having me.
1:04:14
Prakash: Hi. So let me just do a quick intro. We have with us today Michiel Bakker. He's an assistant professor at MIT's Sloan School of Management and a senior research scientist at Google DeepMind. His career is anything but typical. He studied quantum physics in the Netherlands, launched a real estate startup in Myanmar, and sold flowers online in London before diving into the absolute bleeding edge of artificial intelligence. Today, Michiel's research sits at the intersection of large language models, AI safety, and human society. He builds systems that explore how AI can help polarized groups make collective decisions and find common ground on deeply divisive issues. But he is here today because he recently co-authored Europe 2031, a highly detailed viral scenario that maps out what happens if the European continent fails to build its own AI infrastructure and becomes entirely dependent on the United States. The timing of this piece was staggering: exactly one day after it was published, the US government used emergency export controls to shut down Anthropic's most powerful models for all foreign nationals, turning Michiel's fictional warning into geopolitical reality overnight. He brings a rare dual perspective: the hands-on technical expertise of a DeepMind researcher, and the clear-eyed policy analysis of a global institutional strategist. Michiel, welcome to the show.
1:05:51
Michiel Bakker: Thanks. That's a very big intro. I hope I can live up to the expectations, and excited to be here. Thanks.
1:06:03
Prakash: What was your feeling after publishing the report when the US government announced the export policy controls on Anthropic?
1:06:15
Michiel Bakker: So we released the scenario not last Thursday, but the Thursday before — Thursday morning — and it was going somewhat viral. Some big accounts were picking it up. And then, obviously, a day later, Friday evening, the Anthropic news broke. That definitely helped us. I would be lying if I didn't say there was something inside us that made us feel a little vindicated. But obviously we're also very worried about this happening so much sooner than we thought it would. Right now it's a bit unclear what the exact cause is — whether it's actually safety concerns or whether it isn't. But the fact that the US can pull this lever and is happy to pull this lever if they feel it makes sense is, of course, a scary fact for Europe.
1:07:07
Nathan Labenz: So I've read the full scenario, but I'm not sure all of our listeners have. Maybe a good baseline for you to set would be to sketch out the broad contours of the scenario. You can introduce the characters if you want; the use of a woman in Brussels and a friend of hers in the United States was a pretty interesting way to tell the story. But on the big themes, I just want to make sure you have a chance to articulate them so everybody's on the same page.
1:07:40
Michiel Bakker: So Europe 2031 is basically a warning story. I think it's very obvious who we were inspired by, which is AI 2027, which had a huge impact — to everyone's surprise, within the AI ecosystem, but also outside of it. Loads of policymakers, but also, for example, my dad read it, who had never read anything about AI before. So we were inspired by that and thinking: in Europe, people aren't really feeling viscerally what's at stake, how big these AI developments are, and how big the power imbalance currently is between Europe and the US. So we asked: what happens if AI really does become the main source of economic power, military power, cyber power, and Europe keeps moving slowly? It's not that we put Europe on a worse path than it's currently on — it's more like: what if Europe keeps doing things somewhat half-heartedly? They announce big things, but often these things don't come to fruition.
In this scenario, the US ends up controlling the most powerful AI systems — that's not really a surprise to anyone. China dominates most of robotics and the manufacturing stack, and Europe gets squeezed in between in a very uncomfortable position. We tell it through two fictional characters. Caroline is a young EU policy official in Brussels — very thoughtful, very loyal to Europe, and very worried that the institutions around her are not moving fast enough. Christian is a European founder in Silicon Valley — very plugged into the frontier, a bit smug, but usually right much earlier than Caroline wants to admit. Their friendship shows the gap between these two worlds.
Christian sees AI progress up close and keeps warning her. Caroline tries to translate that urgency into policy-making, but feels the system is very slow. And I've heard this from many people who read it: Christian is just very, very annoying, this smug AI guy. But annoyingly, he's often right about timelines. He's often the guy who, a year later, says, 'You see, I told you agents would be coming.' I think this is very much how Europeans experience the AI ecosystem from afar: this thing in the south where you have these AI lab leaders who are all a bit smug or a little too confident, but are often right. The scenario plays out over the next five years, and Europe sleepwalks into even stronger dependence than it's currently in.
1:10:52
Prakash: One of the ways European leaders have historically depended on — and seem to be depending on going forward — is the ability to regulate: to tell technology leaders what's allowed, what privacy rules they have to follow. Rather than thinking about being self-reliant, they're focusing on controlling the power of tech giants within Europe. Why is that not a strategy that looks tenable going forward?
1:11:41
Michiel Bakker: It would look tenable if Europe were more powerful. Imagine Europe was what the US is now: Anthropic was in Paris, OpenAI was in Berlin, the whole AI ecosystem was in Europe. Then, of course, Europe could say: the way you're handling your pretraining data isn't fair to the people who created it, so we now have new laws around that. The problem is that if you don't really have a seat at the table, it's very hard to regulate this technology.
I care a lot about safety and governance of AI — I think this is one of the biggest problems of our time. How do we actually have effective governance? How do we make sure these AI systems are safe, especially if they start improving themselves? I get a lot of questions in Europe: in the US you seem very focused on being stricter on safety and governance, but in Europe you seem like an accelerationist. The way I'm able to be both at the same time is that in Europe, we first need a seat at the table to have any regulatory power. Right now we just don't. And I think we can't just keep regulating, because at some point we will no longer even have access to this technology.
1:13:12
Nathan Labenz: We were talking yesterday about how in the social media scenario, there's no substitute for 500 million Europeans — when the business model depends on putting something of interest in front of human eyeballs, that market is too valuable to ignore. But it seems like a possibility that if European regulators make it too difficult to serve the European market, AI companies might simply not bother at all — or make a token effort. And that assumes, tell me if I'm getting the strategic analysis wrong, that there will be enough demand from other markets to keep all the GPUs running hot, so the power is inverted: the companies can say, we have an alternative to your market, so if you make this difficult, we'll let the US and other buyers bid up the GPU prices.
1:14:28
Michiel Bakker: Yeah, and currently we are in a massive compute crunch. Labs operate in a way where they often care more about future models than current revenue. Some numbers going around suggest roughly a third of compute goes to big training runs, a third to experiments, and a third to serving customers. It's probably gotten much more skewed toward revenue with AI agents — maybe revenue now takes a bigger part of the compute pile. But if revenue is only a third or half of your compute, and European revenue is a percentage of that, then if you can go faster on developing future models and moving toward recursive self-improvement by giving up the European market, that seems like a rational trade.
The other side is: as long as the playing field is leveled between the American labs with no European labs having unfair advantages, they can maybe all serve slightly weaker models to the European ecosystem and still get revenue there without worrying too much about regulation of their best models. As you say, Nathan, they could do a token effort — some smaller, compliant model served specifically for European customers.
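The arithmetic behind the "rational trade" above can be made concrete with a rough sketch. The one-third serving share is the figure quoted in the conversation; the European share of revenue is a hypothetical placeholder, and the sketch assumes serving compute scales in proportion to revenue.

```python
def european_serving_share(serving_fraction: float,
                           europe_revenue_fraction: float) -> float:
    """Fraction of a lab's total compute that serves European customers.

    Assumes serving compute is split across regions in proportion to revenue.
    """
    return serving_fraction * europe_revenue_fraction


# Figure quoted in the conversation: roughly a third of compute serves customers.
serving_fraction = 1 / 3
# Hypothetical placeholder, NOT a figure from the conversation.
europe_revenue_fraction = 0.20

share = european_serving_share(serving_fraction, europe_revenue_fraction)
print(f"Compute serving Europe: {share:.1%} of the total")
```

Under these assumptions, leaving the European market would free only a single-digit percentage of total compute, which is why the trade depends on how much a lab values faster progress on future models relative to that revenue.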
1:16:08
Prakash: I have two interrelated questions. First: why should European leaders even care? Why not just sit under the US umbrella — similar to how they have for nuclear deterrence? Only two European powers, the UK and France, have actual nuclear weapons. The others are clearly technically competent — Enrico Fermi was from Italy — but have chosen not to acquire those capabilities. So why not sit under someone else's umbrella? And second: what can middle powers effectively do without breaking their long-term alliances with the United States?
1:17:13
Michiel Bakker: On the first — why not sit under the US umbrella — that's obviously the strategy we've been taking for decades. But nuclear weapons and AI are slightly different, because yes, AI is a very important military technology, but above all it gives you enormous competitive economic power. You can imagine that if AI becomes the dominant source of new scientific discovery and new goods and services competing globally, there might be a scenario where the US says: we're still happy to protect Europe militarily, we're still part of NATO, but economically, we are going to keep the best models for ourselves. The nuclear analogy breaks down there, because using nuclear power to protect Europe doesn't really come at an economic loss. So I think we need to at least balance things out such that it's less in US interest to take those kinds of measures.
I'll just charge my laptop here quickly. On what middle powers can effectively do: the Netherlands, for example, has ASML, which plays a critical role in the supply chain. Taiwan has TSMC. Japan has important materials for the semiconductor supply chain. Korea has high-bandwidth memory. So collectively, I think we're actually quite well positioned to play an important role in this AI ecosystem. The middle-power coalition is going to be important. There's also the scenario Dario highlighted in his recent essay — a coalition of democratic countries where the principle is that we give each other access to frontier technology. So there are scenarios where this middle-power coalition could work well, and collectively they do have some important assets.
1:20:09
Prakash: Let me give you a counterproposal. The largest French company right now is LVMH — Louis Vuitton Moët Hennessy — and it's not really selling technology; it's selling history and culture. A lot of European firms selling wine or cheese are really selling consumer products drawn from history and culture, not technology. You can't make more hectares of land in Bordeaux — the number of hectares is fixed. So European economic power doesn't necessarily go hand in hand with technological development. It's already at the apex of that human pyramid. Once you make enough money, you want to spend it on a vineyard in France.
1:21:38
Michiel Bakker: Living in the post-AGI world.
1:21:39
Prakash: Exactly. Maybe you're already living at the apex of the post-AGI world. And if you're already there, your influence on human affairs is through cultural supremacy, not technological development. Europeans have arguably already achieved that — you have whiskey makers in Japan, winemakers in China and India. That supremacy has already been achieved, and Europe is just waiting for everyone else to catch up to the European lifestyle.
1:22:17
Michiel Bakker: It's also funny because in Europe, the opposite perspective is often mentioned — that we live in American culture. The artists I grew up with are mostly American. The movies I loved growing up are mostly American. So I don't know if European culture is as dominant as you make it seem. But we definitely have an important strategic role to play there. I just feel that we have 500 million well-educated Europeans — we should be able to do more. We should be able to play a much more important role than just providing wine and leather handbags — that aren't even manufactured in Europe for the most part, well, maybe mainly in Italy and France. We have the universities, the companies, the entrepreneurial spirit that's still very much alive in some parts of Europe, and if we have the right ingredients we can do a lot more.
Yes, LVMH is an important company. But ultimately, Americans can live and run their economy without leather handbags, and we won't be able to live and protect our citizens and run our economy without AGI access.
1:23:52
Nathan Labenz: It sounds like the scenario I've heard from China over the last couple of decades — they've achieved an economies-of-scale advantage and ultimately a cost-basis advantage for comparable products, and they can now outcompete manufacturing sectors all around the world because a BYD is $10,000 and it's pretty good. If countries allow BYDs into their domestic market, it becomes really tough for their own automakers — which is presumably why I can't buy a BYD here in the US.
The biggest worry would be a similar move by the United States, but at the white-collar services level. You could even imagine a Bernie Sanders proposal passes and American households have stock in these AI companies. Then the new American imperialist move is to run the China play — except it's not manufacturing, it's engineering services that come in on an AI-powered basis, just way cheaper and potentially better than what German engineers can do. How do you think about the highest-order bits that European leadership could act on now? Not the full buffet, but the top couple of moves that would really strengthen the negotiating position of Europe as quickly as possible.
1:26:00
Michiel Bakker: I think there are two directions. One: make sure we have some role to play in the full stack. Have a good model company, have your own data centers — we barely have any AI compute. Have your own energy supply that's relatively independent from other countries. Right now we're mostly importing LNG from the US, which replaced Russian gas. Getting the basics right across the whole AI stack is important. You don't have to be absolutely frontier everywhere, but it's nice if in some parts of that stack you have frontier technology, and we do with ASML.
The other thing is: can you actually build out the positions where you have leverage? Around Eindhoven in the Netherlands you have ASML. Then in Belgium, close by, you have IMEC. Then in Germany there are a few fabs. We have quite a good semiconductor ecosystem. The US has now built a fab with TSMC and is getting ahead, but we're still ahead in our semiconductor ecosystem. Can we build that out: funding programs for more chip companies, making sure ASML has everything it needs, visas to hire more people, land to build new factories? Can we really strengthen the things we're really good at, and build ecosystems around this, similarly to how San Francisco has built an insane ecosystem around AI?
1:28:08
Prakash: This is interesting — I see a lot of European technologists working in the US. Leopold Aschenbrenner, for instance. I'd estimate roughly 10–20% of top AI researchers are probably European. Many went to some of the best schools in Europe and then came to grad school in the US — often told by their advisers in Europe to go. I've come across quite a few grad students who were told, 'You're too smart. You have to go to the US. There's nothing for you here anymore.' Do you see a way that can be changed?
1:29:41
Michiel Bakker: I had a PhD offer from TU Delft, where I did my undergrad and master's, and then I got offers from US schools. I got the MIT offer, and my master's adviser also said, please go to MIT — you're absolutely insane if you take the Delft offer. This happens a lot.
I don't know if we can change that on the timelines relevant for AI. The UK is one big exception — it has schools that are globally excellent and on par with the best American schools. There are other exceptions, like the Max Planck system in Germany and the University of Amsterdam, which has Max Welling and a whole group that was built around him. But largely, I think this is a problem we can't solve on the timelines that matter here.
That said, I can imagine that if there's a European frontier AI company that credibly says, 'We're going to build recursive self-improvement, we're going to get to the frontier,' and puts out a big banner in San Francisco saying, 'Guys, you've vested your OpenAI and Anthropic stock — it's time to go home' — that could rally quite a few people around that vision. If I speak with European researchers in the US, they often say, 'I'm here for a few years, but ultimately I want my kids to grow up in beautiful Europe.' So maybe there's something we can do there.
1:31:44
Nathan Labenz: How much do you think it matters who builds out the compute? American hyperscalers are going all over the world building data centers, and they'd surely be willing to build a lot in Europe if European governments said go for it. Would it be enough in some fundamental sense for this infrastructure to be on European soil — or is it still a problem if it's owned and operated by American companies?
1:32:34
Michiel Bakker: There's a clear trade-off. The hyperscalers are incredibly good at large-scale distributed training. It's going to be hard to do that with Navitas or other European providers that can maybe handle inference, but for multi-data-center large-scale training it's a different matter. So maybe we need US hyperscalers to help with that.
Additionally, having US hyperscalers on European soil is maybe not as good for independence and sovereignty as having European compute — or at least European data centers with NVIDIA, so still American compute but on European soil. A lot of NVIDIA compute has been sold for the next two or three years, and hyperscalers are going to build that somewhere. If we can make it more interesting for them to build in Europe, that's going to be capacity we couldn't otherwise get with European data centers alone.
Clearly, European-controlled compute would be best. American hyperscalers in Europe would still be better than the status quo. And the status quo — where almost all AI compute is built in the US — is obviously not preferable for Europe.
1:34:13
Prakash: What would be the major differences if Europeans had their own hyperscalers and their own compute? What policies — besides the Anthropic ban — would be different?
1:34:51
Michiel Bakker: The hyperscalers have ultimately also been the ones that funded a lot of the AI ecosystem in the US. Anthropic's funding comes largely from Amazon and Google. So they're important both as a compute layer and as funders and talent-trainers. They also have enormous know-how on how to build large distributed systems, which is very hard to replicate and critical to frontier model development.
The other thing is: if you look at the Dutch government, for example, everything runs on Azure. It's crazy how deep the dependencies on American software go in Europe. So if you had a European hyperscaler that, in addition to data centers, also had a really good Google- or Microsoft-like enterprise layer on top, that would be even better and would change things even more.
Generally, I'm not against interdependencies — ideally we have as much interdependence as possible between Europe and the US, such that we all live happily ever after.
1:36:25
Nathan Labenz: I wonder if you could comment on what the special twist is that Europe could ultimately bring to the AI future. If there's a positive vision for the AI future with European characteristics — beyond just retaining bargaining power and economic parity — what are those uniquely European characteristics? How are we going to live? How will AI serve us versus dominate us? Is there a European angle on those questions that might inspire people to support this mission beyond the 'let's make sure Europe isn't left behind' framing?
1:37:23
Michiel Bakker: It's an interesting question, and I get it a lot from European policymakers. 'We want AI with our European values; we don't need American or Chinese values.' I always feel we exaggerate how different our values are from American values. Are Italian and Dutch values more similar to each other than Dutch and American values? I think the main reason to want a European model is not that we'd get an AI that behaves very differently or has some better constitution because Europeans thought about it. If anything, there's now a Scottish person (Amanda Askell, I believe, is Scottish) who is thinking about Anthropic's values. So maybe we already have European values in those American models.
I think the main reason is that these AI models are ecosystems. Claude will ultimately also do biology, chemistry, chip design. Having a European model at the frontier would create a whole ecosystem of technology development, science, and education around those models and companies. That's the main reason we should want them.
There are subtleties around privacy — maybe we'd design privacy mechanisms differently in Europe, and maybe that would actually be better for people. But largely, I think the AIs wouldn't look so different if they were developed in Europe.
1:39:36
Nathan Labenz: Thanks for staying a little bit long with us. I think I only have one more. What has the response been from policymakers over the last week? Is the checkbook open? Are we about to see hundreds of billions of euros flooding in?
1:39:58
Michiel Bakker: Not yet, I think, but the response has been incredible. I won't mention names, but we've had politicians from very high up, from many different European countries, and even from the UK, Australia, and South Africa reach out wanting to chat. I'm going to the Netherlands tomorrow to talk with some cabinet members. I'm going to Brussels next week. There's been a lot of excitement, though 'excitement' is maybe a bad word because the scenario is somewhat bleak. But at least excitement around taking action.
The other thing I really liked: a friend from high school reached out and said, 'Can you please ask people to stop sending me your scenario? They don't even know I went to high school with you, and still three people sent it to me in the last two days.' That's ultimately what we were after — not just a policy document for people in Brussels or The Hague, but a scenario that helps people outside of AI understand what's at stake, what Europe's role currently is, how we might be able to change it, and to really start caring. I may be even more excited about the random people who reached out than the politicians — they ultimately have a lot of power through being able to vote and advocate.
1:41:49
Prakash: Maybe one last question: how much influence can you have with non-compute methods — for example, by preparing more data? Europe has strict privacy laws, GDPR, which have limited the data that goes into models. To a large extent, your influence on what models can do is determined by how much data you put into them. So to what extent is the solution for Europe to lower its barriers on data collection, allow more data from European citizens to be fed into the models — and if that is a solution, how do you put that across to European politicians who are on the other end of data protection?
1:43:11
Michiel Bakker: I personally think that's not the solution, for the reason I mentioned earlier: the values thing is not the main problem. It's not that we get American models with somehow non-European values. We're working very hard to make these systems corrigible. If you tell them, 'I'm from the Netherlands and these are my values,' the system should reflect those values. Early systems were very Reddit-like, but we've gotten quite good at post-training. Pretraining is more about getting general understanding of the world; that's not where most of the values are created or shaped.
For that reason — both because the American-vs-European values discussion is a bit of a distraction, and because adding to pretraining data won't have that much impact — I don't think that's the right lever. Though it would be great if Europe said, 'We're going to create a fair-use regime so that European AI companies can use a lot of data' — but that's a different reason from spreading our values.
1:44:51
Prakash: I actually just open up these questions without knowing what my own answer will be — I try not to be too attached to any answer because the answers change over time. Michiel, thank you so much for spending time with us. Europe 2031 definitely made an impact, and I hope you can continue to put across opinions and projects that make a difference. Thank you so much.
1:45:32
Michiel Bakker: Thank you so much for having me. Great to meet you. Have a great day. Cheers.
1:45:41 · Closing · 16 min
Close: The Geopolitics of AI Data, and a Tease of Gradual Disempowerment
The hosts extend the AI-geopolitics thread (middle powers gaining leverage by opening their data, the late-forward-pass 'cashing out' of language, and China's more-structured-than-assumed IP ecosystem) before teasing the next morning's guest, David Duvenaud, on 'gradual disempowerment.'
As aired
Nathan and Prakash wrapped up the June 23rd show by riffing on a pair of ideas sparked by their earlier guest segment. Nathan opened with a show-ops suggestion (adding a name-pronunciation audio field to the guest sign-up form) before both hosts leaned back into the geopolitics-of-AI-data thread. Nathan argued that middle powers like Brazil could benefit disproportionately simply by placing their data on a 'silver platter' for frontier companies, citing mechanistic studies of model internals which suggest that language 'cashes out' late in a model's forward pass, leaving significant room for improvement in underrepresented languages. Prakash pushed the argument further, sharing a Roon tweet framing the coming explosion of passive data capture as 'a form of worship of human life,' and contending that middle powers willing to lower privacy barriers early could shape foundational model behavior in ways larger powers politically cannot.
The conversation pivoted to China's IP ecosystem—both hosts noted it is more commercially structured than commonly assumed, with an oligopoly of government-connected licensees enforcing Hollywood deals for new theatrical releases. Nathan wrapped up by teasing the next day's guest: David Duvenaud, an AI professor (University of Toronto) who spent time at Anthropic and came away deeply impressed by its culture, but troubled by a question that surfaced at its lunch tables—'we don't really have a plan for our success.' His concept of gradual disempowerment holds that even well-aligned AIs could systematically erode human macro-control, one sensible delegation at a time. Nathan described a recent workshop Duvenaud convened to stress-test visions for a good post-AGI society, and the pair signed off promising to hear the front-line report the following morning.
Key moments
We don't really have a plan for our success. Let's say we make these AIs and let's say they're really great—even quite aligned. How does that end up in a place where we are all actually still happy with the final state of things? That is where he says we don't have a good plan.
Nathan Labenz, 1:58:15
I would expect our whole lives to be recorded and the sheer signal density of your physical and digital presence to go up ten thousand times—agents and computers to sift through all of it. Really, it's a form of worship of human life.
Prakash, 1:50:02
Full transcript
Lightly edited · timestamps jump to YouTube
1:45:42
Nathan Labenz: Note for our recursive self-improvement—we should maybe add an audio field to the guest sign-up form where they can pronounce their names, so we can make sure we're going to pronounce them in just the way they do. I always, for the podcast, ask the last thing before I welcome people and say their name: I ask them to say it to me. Doing it live we can't really do that, but we could potentially get a little audio capture on the sign-up form, and then we'll have no doubt how to say names. I didn't envy you for having to try that one—yeah, I did, hearing it first.
1:46:21
Prakash: Especially Dutch—because sometimes the spelling can be exactly the same as English but the pronunciation is different. So if I messed up, I hope you'll forgive me.
1:46:31
Nathan Labenz: I'm sure they will. But we can always take a note for our personal RSI. I'm with you a little bit. I've thought for a long time that if I'm a middle power of probably all kinds of different stripes, it's going to be really hard to compete on compute, obviously, and it's going to be really hard to compete on talent. I think Europe has a chance; the idea he mentioned of 'you've vested and it's time to come home,' I think that's at least somewhat viable. But those strategies obviously aren't available to countries in Africa or South America or most of the world. That said, the data side does seem like a place where they could compete. I've proposed this and nobody takes the bait, but I've kind of said: if you're Brazil, for example, why don't you just put all your data on a silver platter, give it to the frontier companies, and say, here, with this you can serve us really well? And clearly they'll take it. They'll take at least some of it, maybe filter it, but they'll use it. And then you would seem to stand to benefit because you are more representative. Maybe it's not even a values thing, but just a pure utility thing: the familiarity with local culture, the ability to make better situationally aware recommendations, understanding how business works in different places. All these things are so key to getting practical day-to-day value out of models.
And from what I hear, especially from anyone outside the top few languages, they say, yeah, it doesn't really speak—I mean, it understands our language, it can translate back and forth, but it kind of speaks Polish like an American who learned Polish. The final word on this is not written, but we have seen a bunch of mechanistic and internal studies that suggest cashing out to a particular language is kind of late in the forward pass. The highest-abstraction representations are somewhat language-neutral—you put Polish in, it works its way up to a more language-neutral representation and then comes back down. But that cashing-out process definitely seems to be a place with a lot of room for improvement across many languages. So I find that more exciting for Brazil or Poland than our guest seemed to feel—maybe because I think of it more as a utility thing than a values-projection thing.
1:50:02
Prakash: Let me show you this tweet from Roon, in which he's responding to Amanda Askell. I commented 'yes, but people don't ignore things on our scans; not ignoring things on scans is our norm.' They were responding to the idea of the Midjourney scanner, an ultrasonic body scanner you could use every single day. Doctors immediately pointed out that there are a lot of benign tumors in your body, and when you spot them you might do something drastic like a double mastectomy for a benign finding. But Askell's counter was: not ignoring things on scans is our norm because until recently we only did scans when there was a clear need. If we move to a scan-more-often paradigm, the norms of what we do with that information will surely adjust.
Roon responded to that and said: I would expect our whole lives to be recorded and the sheer signal density of your physical and digital presence to go up ten thousand times—agents and computers to sift through all of it. Really, it's a form of worship of human life. And if you look at that as the endpoint—right now, maybe the top one percent of posters on the internet are being heard. That is the distance we're at. You should have passive observation of every single word, action, mouse click, screen movement; every single thing you've seen in your life should be passively recorded and the signal extracted. We're just not there yet, so we still have this long trajectory of data collection to go.
I think we worry too much about who controls the processing equipment and not enough about how we could be generating a lot more data and feeding it in. So much more signal gets extracted from the data than from the architecture—a similar transformer is doing all of the languages; the signal is really being extracted from which languages and which words are being fed in. So I think you could actually allow your country as a middle power to get recorded earlier, to be earlier on this path of providing training data, and then you have the impact of shaping the models way earlier. You have this huge advantage because the greater powers—the US and Europe—are probably not going to lower their privacy standards that much. You could have enormous impact by just providing, say, spoken-language data for your entire country and putting it on the internet. Right now there are so many barriers: even if you want to train on data in the US or Europe you have to scrub personally identifying information—an AGI-complete problem in itself. So I think there's a huge, huge opportunity for middle powers to just provide a lot of data and help these companies train the models.
1:53:58
Nathan Labenz: Yeah, I buy it. I think the data cleaning probably has to happen to a significant degree anyway—the companies don't really want the models to have too much PII memorized, and they don't want their corpus poisoned, so they're going to do a careful pass on whatever they're handed, if only for defensive purposes. But I don't think that negates your broader point that there does seem to be a significant opportunity there, and it's odd that we really haven't seen almost anyone try it. From what I understand, Japan might be the closest—they've kind of created a right-to-train type of default.
1:54:52
Prakash: That's right.
1:54:56
Nathan Labenz: Obviously we have a pretty strong right-to-train default in the US too—people asking for forgiveness rather than permission, as is often the American way. China I think has a lot of data, but it's mixed: they care a lot about individual privacy from companies—they don't promise to protect you from the government, but the government tries to protect you from companies. It's a kind of inverted paradigm. I spoke to a woman in China who does licensing of video data, going around to studios and TV stations and buying up old catalogs, then licensing to both Chinese AI companies and American ones. Strikingly, she has to actually license it—she cannot just go scrape it up. The IP rights are respected within China. Now, will they respect all the Hollywood studios' IP? Maybe a different answer. But from her telling it is not a total open wild west commercially.
1:56:34
Prakash: Surprisingly, for new movies, Hollywood rights are respected because China allows only a limited number of foreign movies into the country. There's a Communist Party-affiliated oligopoly with licenses to bring in roughly ten or twenty movies a year each, and they are heavily incentivized to prevent piracy of those movies. Hollywood studios sign deals with this oligopoly, they bring it in, make the money, and remit the proceeds—you're basically just paying a tax to these government-connected parties. It never falls under the US's FCPA rules, so whatever—that's the ballgame in China for new movies. For older movies it's a little different; they just make the price so cheap that it makes sense. So that's the case.
1:57:57
Nathan Labenz: Well, anything else we should cover today? I think we have a very interesting session lined up for tomorrow. We're going to have David Duvenaud, who is a professor of AI in Canada, at the University of Toronto.
1:58:10
Prakash: Right? Toronto or Quebec? He's at Mila, in Quebec.
1:58:15
Nathan Labenz: I'm going to have to fact-check myself on that. University of Toronto, and he had previously taken some time off and gone to work at Anthropic, about which he had very good things to say for the most part. He said it was, in fact, not 'one of the' but the highest-functioning company he's ever seen, with one of the healthiest work cultures. It's no surprise, from his telling, that they've been so successful in training great AIs and commercializing them. But the big thing he found in the lunch-table discussions there was: we don't really have a plan for our success. Let's say we make these AIs and let's say they're really great, even quite aligned, doing what we tell them and not taking over, pushing back in the right ways. How does that end up in a place where we are all actually still happy with the final state of things? That is where he says we don't have a good plan.
This has become packaged up—in one memorable paper—as 'gradual disempowerment': the idea that even if the AIs are all good, we will find ourselves run by them rather than humans retaining macro control, simply because at every decision point it will make sense to hand over more autonomous decision-making authority to AI systems. And if there's no stopping that, does it take us to a good place? His view, as I understand it, is that it's really hard to sketch out a scenario where we are happy with the result even if we solve the most important problems.
So he just put together, not too long ago, a weekend workshop—I wasn't able to attend but was quite sad to miss it. We'll get the front-line report on what the best ideas are for what life is supposed to look like in the event all the open research questions are answered in favorable ways, and we're left to structure society with a mix of humans more or less as we are today and powerful AIs that will naturally be best suited to sit in a lot of key decision-making positions. I think that will be really interesting. I'm hoping to hear some inspiring new ideas for what that life might look like, though the likely scenario is that there aren't too many credible visions that really stand up to scrutiny. But we'll have to hear his telling of it, and we can certainly offer our own and see how he'll shoot them down.
2:01:23
Prakash: Awesome—dropped off for a moment there, but let's circle back tomorrow and see what David has to say.
2:01:34
Nathan Labenz: We'll see you right back here tomorrow for another exciting edition of AI in the AM. Thanks, Prakash.
2:01:40
Prakash: Thank you, Nathan. Bye-bye.

Opening: A Quiet Day Through the Markets — GPT-5.6 Rumors, the Chip Selloff, and the 'Too Big to Fail' Bubble Question

With an unusually quiet 24-hour news cycle, the hosts opened on rumors and markets. Nathan relayed leaker-Discord chatter that GPT-5.6 had been pulled back from a planned launch — likely tangled in the uncertainty around the administration's export-control and model-release freeze — and that Gemini 3.5 Pro was underperforming the new benchmark bar, pushing Google's timeline back. Prakash then ran his markets segment: the semiconductor index was off 6%, and SK Hynix had overtaken Samsung Electronics as Korea's largest company by market cap for the first time in 27 years, a move some read as a bubble signal given heavy retail leverage. Micron was set to report after the bell (~$27B whisper for the quarter), one day after announcing a long-term design partnership with Anthropic — Anthropic's first memory-chip deal, with Micron investing in Anthropic. NVIDIA's stock had been flat for six months as it pulled back from circular investment deals, and SpaceX was down ~20% ahead of a large share unlock.

That set up an extended exchange on financial fragility. Nathan raised the 'too big to fail' framing from his Dean Ball conversation and asked how much shock-absorption capacity exists if leveraged retail investors trigger contagion. Prakash argued near-term AI cash flows are solid through 2028 but the 3–5-year window carries real uncertainty — especially if algorithmic efficiency gains erode the energy and hardware moats underpinning current valuations — and introduced his 'capital cycle' idea: companies must deploy buybacks fast enough to outrun the leverage in the system. Nathan noted that persistently high GPU rental prices (A100 hourly rates rising even as newer generations ship) might be the real fundamental backstop, while Prakash urged viewers with risk capital to lean into AI and tech.

Interview: Self-Improving Compute — Bing Xu and the PTX Kernel Factory

Bing Xu — founder & CEO of INT21, co-author of the original 2014 GAN paper, co-creator of Apache MXNet and Meta's AITemplate, and founder of HippoML (acquired by NVIDIA) — joined to argue that self-improvement should target the infrastructure, not the model. INT21's first product, the PTX Kernel Factory, points autonomous agent swarms at NVIDIA PTX, the ISA sitting below CUDA. Rather than building a DSL or a layer above the GPU (like Triton), the factory generates inline PTX directly, letting the system model hardware behavior precisely and extract maximum performance. Optimization runs in two phases: a background self-improvement phase that accumulates knowledge across generations before any request, and an interactive phase where a user names a kernel and a success bar (e.g. 'beat CUTLASS on this shape') and the swarm evolves toward it using prior experience. The framing premise, as Prakash put it: an AI company's engineering capacity should scale with its compute budget, not its headcount.

Nathan pressed the CUDA-moat question — wouldn't agent-generated kernels make any hardware equally optimizable and erode NVIDIA's software lead? Bing argued the opposite: the same automation raises the moat, because the evolutionary loop depends on NVIDIA's mature profiling ecosystem (NCU, accurate instruction counters, reliable drivers) to give agents clear directional feedback that competitors' toolchains can't match. On results, he distinguished mature workloads like RMSNorm — where the factory systematically matches or slightly beats expert libraries like QuACK across 100+ configurations — from newer workloads like Kimi Delta Attention, where expert optimization is still nascent and the factory posts up to 59% speedups with correctness validated across 580 tests. He described SwarmOS, a cloud-native, AlphaGo-style evolutionary substrate coordinating up to 10,000 agents, each running in an isolated sandbox on real NVIDIA hardware with the evolution tree preserved for backtracking. On models, Bing said GPT-5.5 has been uniquely able to escape local minima where other models collapse into sycophantic agreement loops, and noted that Fable 5 refused a basic PTX query as potentially dangerous. Coordination at scale, he said, mirrors human organizational discipline: tightly scoped, specialized agent roles rather than chaotic peer-to-peer autonomy.
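
The evolutionary loop described above (propose a variant, measure it, keep the whole tree so the search can backtrack) can be illustrated with a toy sketch. This is not INT21's SwarmOS or the PTX Kernel Factory. Every name here (`Node`, `evolve`, `toy_throughput`, `lineage`) is hypothetical, the "kernel" is reduced to a single integer tile size, and the profiler is replaced by a made-up scoring function that peaks at a tile size of 64.

```python
import random
from typing import Callable, List, Optional


class Node:
    """One candidate in the evolution tree (a toy tile size, not a real kernel)."""

    def __init__(self, params: int, score: float, parent: Optional["Node"] = None):
        self.params = params
        self.score = score
        self.parent = parent
        self.children: List["Node"] = []


def toy_throughput(tile: int) -> float:
    """Stand-in for profiler feedback (assumption): best possible score is 0.0 at tile == 64."""
    return float(-abs(tile - 64))


def evolve(measure: Callable[[int], float], seed_params: int, target: float,
           generations: int = 200, patience: int = 5, seed: int = 0):
    """Mutate the current node; every attempt stays in the tree; backtrack when stuck."""
    rng = random.Random(seed)
    root = Node(seed_params, measure(seed_params))
    best = current = root
    stale = 0
    for _ in range(generations):
        if best.score >= target:  # user-supplied success bar reached
            break
        candidate = max(1, current.params + rng.choice([-8, -4, -1, 1, 4, 8]))
        child = Node(candidate, measure(candidate), parent=current)
        current.children.append(child)  # failed attempts are preserved too
        if child.score > current.score:
            current, stale = child, 0
            if child.score > best.score:
                best = child
        else:
            stale += 1
            if stale >= patience and current.parent is not None:
                current, stale = current.parent, 0  # backtrack one level up the tree
    return root, best


def lineage(node: Node) -> List[int]:
    """Walk parent links to recover the path from the root to this node."""
    path = []
    while node is not None:
        path.append(node.params)
        node = node.parent
    return path[::-1]


root, best = evolve(toy_throughput, seed_params=16, target=0.0)
print("best tile:", best.params, "score:", best.score)
print("path from seed:", lineage(best))
```

In a real system the scoring function would be replaced by measurements from actual hardware, which is the role the summary attributes to NVIDIA's profiling tools. The tree is kept only to show how backtracking and lineage recovery could work.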

Interview: Europe 2031 — Michiel Bakker on AI Dependence and Sovereignty

Michiel Bakker — assistant professor at MIT Sloan, senior research scientist at Google DeepMind, and co-author of the Habermas Machine — joined the week after his viral scenario Europe 2031 published, and the week the world began catching up to it. Within a day of the scenario going live, the US invoked emergency export controls restricting Anthropic's most powerful models for foreign nationals, turning a fictional 2028 forecast into a 2026 headline. Bakker described the mix of vindication and alarm, then walked through the scenario's design: two characters — Caroline, a Brussels official who trusts her institutions, and Christian, a European founder in Silicon Valley who is reliably right about timelines — dramatize the widening gap between the frontier and European governance. The premise isn't that Europe takes a dramatic turn for the worse, but that it keeps doing things half-heartedly while the US consolidates control of the most powerful systems and China dominates robotics and manufacturing, squeezing Europe in between.

The hosts pressed his strategic options. Regulating from the outside works only with a seat at the table, which requires capability first. The nuclear-umbrella analogy fails because AI is primarily an economic technology — protecting Europe militarily costs the US little, but sharing its best models is a direct economic transfer. And cultural capital is no substitute: as Bakker put it, Americans can run their economy without leather handbags, but no economy will run without AGI access. His path forward: get the basics right across the full stack (compute, data centers, energy, a credible model company) while pressing the choke-point leverage Europe already holds — the semiconductor ecosystem around ASML in Eindhoven, IMEC in Belgium, and German fabs — and building a middle-power coalition spanning Taiwan (TSMC), Japan (materials), Korea (high-bandwidth memory), and the Netherlands (EUV lithography). He closed on the scenario's reach: cabinet members in The Hague, meetings in Brussels, and ordinary people forwarding it — exactly the general-public reach the authors wanted.

Close: The Geopolitics of AI Data, and a Tease of Gradual Disempowerment

Nathan and Prakash closed by extending the AI-geopolitics thread. Nathan argued that middle powers like Brazil could gain outsized influence simply by putting their data 'on a silver platter' for frontier labs, citing mechanistic-interpretability research suggesting language 'cashes out' late in a model's forward pass — leaving real room to improve underrepresented languages. Prakash pushed further with a Roon tweet framing the coming explosion of passive data capture as 'a form of worship of human life,' arguing middle powers willing to lower privacy barriers early could shape foundational model behavior in ways larger powers politically cannot. The two also noted China's IP ecosystem is more commercially structured than commonly assumed. Nathan wrapped by teasing the next morning's guest, David Duvenaud — an AI professor and Anthropic alum whose concept of 'gradual disempowerment' holds that even well-aligned AI could erode human macro-control one sensible delegation at a time — and a workshop Duvenaud convened to stress-test visions for a good post-AGI society.