Today, we check in a year after the first Unsupervised Learning x Latent Space Crossover special to discuss everything that has changed (there is a lot) in the world of AI. This episode was recorded just after AIE Europe, but before the Cursor-xAI deal.

Unsupervised Learning is a podcast that interviews the sharpest minds in AI about what's real today, what will be real in the future and what it means for businesses and the world - helping builders, researchers and founders deconstruct and understand the biggest breakthroughs.

Thanks to Jacob and the UL production team for hosting and editing this!

Jacob Effron
* LinkedIn: https://www.linkedin.com/in/jacobeffron/
* X: https://x.com/jacobeffron

Full Episode on Their YouTube

We discuss:
* swyx's view from the center of the AI engineering zeitgeist: OpenClaw, harness engineering, context engineering, evals, observability, GPUs, multimodality, and why conference tracks now reveal what matters most in AI
* Whether AI infrastructure has finally stabilized: why “skills” may be the minimal viable packaging format for agents, why infra companies have had to reinvent themselves every year, and why application companies have had an easier time surviving model volatility
* The vertical vs. 
horizontal AI startup debate: why application companies can act as the outsourced AI team for enterprises, why some horizontal companies still matter, and why sandboxes may be the clearest reinvention of classic cloud infrastructure for the AI era
* The “agent lab” playbook: starting with frontier models, specializing for your domain, then training your own models once you have enough data, workload, and user behavior to justify the cost and latency savings
* Why domain-specific model training is real, not just marketing: how companies like Cursor and Cognition can get users to choose their in-house models, and why search, domain specialization, and distillation are becoming more important
* Open models, custom chips, and alternative inference infrastructure: why swyx has turned more bullish on open source, why non-NVIDIA hardware is suddenly getting real attention, and why every 10x speedup can unlock new product experiences
* What it means to sell to agents instead of humans: why agent experience may mostly just be good developer experience by another name, why APIs and docs matter more than ever, and how pretraining-data incumbents are compounding advantages in an agent-first world
* Why memory and personalization may become the next big wedge: today's models mostly reward frequency of mentions, but in the future, swyx expects product choice to be shaped much more by personalized memory systems
* The state of the AI coding wars: why coding has become one of the largest and fastest-growing categories in AI, how Anthropic, OpenAI, Cursor, and Cognition have all ridden the wave, and why the category may still have more room to run
* Capability exploration vs. efficiency: why the industry is still in a token-maxing, experiment-heavy phase where people are rewarded for spending more rather than less
* Claude Code vs. 
Codex and the strange stickiness of coding products: why first magical product experiences may matter more than expected, and why the bigger mystery may be why only a few names have emerged as real winners so far
* What the end state of the coding market might look like: two major players, a longer tail of niche products, and possible disruption if Microsoft, Mistral, xAI, or the Chinese labs push harder into coding
* Where application companies still have room against the labs: why frontier labs are trying to expand into verticals like finance and healthcare, but still leave space for focused companies that own the workflow and the last mile
* Why coding may be a preview of every other AI market: the first category to truly go parabolic, the clearest example of foundation model companies colliding with application companies, and a template for how future vertical AI markets may develop
* Why AI valuations now feel unbounded: from billion-dollar ARR products built in a year to trillion-dollar market caps, swyx and Jacob unpack how the AI market has broken traditional startup intuitions about scale and durability
* Consumer AI vs. coding AI: why ChatGPT's consumer category may have plateaued on frequency and product design, while coding continues to feel like a daily-use category with real momentum
* The next product frontier beyond coding: consumer agents, computer use, and “coding agents breaking containment,” with swyx's thesis that 2025 was the year of coding agents and 2026 may be the year they begin to do everything else
* Whether foundation models are really killing startup categories: why swyx is less worried for early founders, more worried for mid-size startups and traditional SaaS, and why building something ambitious may now be the best job interview for a frontier lab
* AI vs. 
SaaS and the internal culture war around adoption: the tension between AI-native employees who want to rip out expensive software and skeptics who think quick AI-built replacements create fragile systems
* Why traditional SaaS may be under real pressure: swyx's own experience spending six figures on event and sponsor management software, the temptation to rebuild it cheaply with AI, and the broader question of whether teams will trust custom AI-native replacements
* Biosafety, security, and frontier model access: why swyx raised biosafety at a dinner with Anthropic's Mike Krieger, why Krieger argued security is the bigger issue, and what restricted model releases reveal about Anthropic vs. OpenAI
* The era of giant models: why 10T+ parameter systems may only be a temporary rationing phase before bigger clusters arrive, why labs may increasingly keep their most powerful models private for distillation, and why scale alone no longer feels like a complete answer
* Memory as the slowest scaling factor in AI: why context windows have improved far more slowly than people hoped, why million-token context still has not changed most real workflows, and why memory may be the key bottleneck for the next generation of systems
* What swyx changed his mind on in the past year: becoming more bullish on open models, more convinced that the top tier of agent startups behaves very differently from the median AI company, and more optimistic about fine-tuning and specialized model adaptation
* “Dark factories” and zero-human-review coding: the next frontier after zero human-written code, where models not only write the code but ship it without human review, forcing companies to rethink testing and verification from first principles
* Why RL and post-training may matter more than people assumed: even if the resulting models get thrown out every few months, the data, workflows, and domain-specific improvements persist
* Synthetic rubrics, Dr. GRPO, and multi-turn RL: why reinforcement 
learning is becoming much more domain-specific and multi-step than many people realize, opening the door to much deeper customization
* The next frontier after coding: memory, personalization, and world models, including why swyx thinks world models matter not just for robotics or gaming, but for giving AI something closer to lived understanding
* Fei-Fei Li, spatial intelligence, and the Good Will Hunting analogy: the idea that today's LLMs may know everything by reading it all, but still lack the lived experience that turns knowledge into a deeper kind of intelligence

Timestamps
* 00:00:00 Intro preview: AI coding wars, startup pressure, and market structure
* 00:00:28 Welcome to the Latent Space × Unsupervised Learning crossover
* 00:01:17 What AI builders are focused on now: OpenClaw, harnesses, and infra
* 00:04:33 Why AI infra is harder than apps, and where startups can still win
* 00:06:39 Should companies train their own models?
* 00:09:28 Open models, custom chips, and the new inference race
* 00:11:25 Designing products for agents, not just humans
* 00:16:49 The state of the AI coding wars in 2026
* 00:19:27 Capability exploration, token-maxing, and why coding is going parabolic
* 00:21:41 What the end state of the coding market could look like
* 00:23:50 Where app companies still have room against the labs
* 00:27:02 Why AI valuations and market swings feel unprecedented
* 00:28:56 Consumer AI vs. coding AI, and why sticky products still matter
* 00:32:28 What the next breakthrough product experience might be
* 00:32:53 2026 thesis: coding agents break containment and eat the world
* 00:35:27 Are foundation models wiping out startup categories?
* 00:37:33 AI vs. 
SaaS, vibe coding, and internal team tensions
* 00:40:01 Biosafety, security, and the politics of restricted model releases
* 00:42:19 Giant models, compute constraints, and the limits of scale
* 00:44:30 Memory as the real bottleneck in AI
* 00:44:57 Why swyx changed his mind on open models
* 00:47:44 Dark factories and the future of zero-human-review coding
* 00:49:36 Why post-training and RL may matter more than people think
* 00:51:50 Memory, world models, and the next frontier of intelligence
* 00:53:54 The Good Will Hunting analogy for LLMs
* 00:54:21 Outro

Transcript

[00:00:00] swyx: Isn't that crazy? That number is just mind boggling.

[00:00:03] Jacob Effron: What is the state of the AI coding wars today?

[00:00:05] swyx: We're in a phase of sort of like capability exploration. The general thesis that I have been pursuing now is that the same way that 2025 was the year of coding agents, 2026 is coding agents breaking containment to do everything else.

[00:00:16] Jacob Effron: Do you worry about the foundation models just getting into a bunch of these startup categories?

[00:00:21] swyx: Mid-size startups. Yes.

[00:00:23] Jacob Effron: What do you think the end state of this market is?

[00:00:25] swyx: For the market structure to, to significantly change, there would be

[00:00:28] Jacob Effron: Today on Unsupervised Learning, we had a, a fun episode in what's really become an annual tradition, a crossover episode with our friends at Latent Space. swyx and I sat down and we talked about everything happening in the AI ecosystem today. What we thought of the various changes at the model layer, what's happening in the infra world, the coding wars, and a bunch of other things. It's a ton of fun to do this with someone I really respect and another great podcaster in the game. Without further ado, here's our episode.

Well, swyx. 
This is, uh, super fun to be back with another Unsupervised Learning, uh, Latent Space crossover episode.

[00:01:02] swyx: Yeah.

[00:01:02] Jacob Effron: I feel like a lot of places we could start, but you know, one thing I always find fascinating, uh, about the way you spend your time is you obviously are like at the epicenter of this engineering movement and community, and you run these events and conferences and put on these awesome talks and, and I think just have a great pulse on the zeitgeist of what's going on.

[00:01:16] swyx: Yeah.

[00:01:17] Jacob Effron: Maybe to, to start, just what are the biggest topics people are thinking about right now?

[00:01:21] swyx: Yeah, so I just came back from London, uh, where we did AIE Europe, and we're doing roughly one per quarter now, which... Yeah, you've

[00:01:27] Jacob Effron: really upped

[00:01:27] swyx: the, hopefully

[00:01:28] Jacob Effron: upped the, upped the pace.

[00:01:29] swyx: It's trying. We're trying to match AI speed, you know?

[00:01:30] Jacob Effron: Yeah, exactly. The topics would be completely different, I imagine. Uh,

[00:01:33] swyx: Yeah. You know, I definitely curate the tracks, like you can see what I think when you see the track list and the, the speakers that I invite. Obviously OpenClaw is like the story of the last four or five months, and then just below that, I would consider harness engineering and context engineering to be two related topics in agents and RAG. And then there's a long tail of evergreen stuff like evals, observability, GPUs, uh, and, uh, LLM infra, just in general. We also have other updates on like multimodality and, uh, generative media, let's call it. Um, but I definitely, the, the first three that I mentioned are top of mind for people. Yeah.

[00:02:13] Jacob Effron: I think harness in particular is like, so interesting. 
Um, you know, there was this tweet from Harrison Chase, the, the LangChain CEO, that, that caught my eye recently where he said, you know, it finally feels like we have stability, uh, around the infrastructure for, uh, you know, around AI. And I think what he basically was implying is like, look, over the past two, three years as a company at the epicenter of AI infrastructure, it was a bit like playing whack-a-mole, right? You were constantly moving around with however the building patterns were evolving.

[00:02:36] swyx: For Harrison for sure. Right? Like he's basically had to reinvent the company every year since he started LangChain. Right? It was LangChain, LangGraph, and deep agents, and like, uh, I think he's like one of the most nimble, adept, sharp people about this. Yeah. Yeah.

[00:02:49] Jacob Effron: Saying now, now is finally the time of stability.

[00:02:51] swyx: This. Yeah.

[00:02:52] Jacob Effron: Yeah. Um, do you buy that, or what do you kind of make of that take?

[00:02:56] swyx: I think that it, it's very expensive to say "this time is different" sometimes, but when you're just writing code, like it's actually okay to just like try to make a call, and I think it may not even matter if this call is right or not. Like I just don't even care that much, because you can be right on a thesis, but if you don't, you don't figure out how to monetize the thesis, then who cares if you said something first? That said, um, it does feel like, for example, uh, we went through a lot of different ways of packaging integrations up with, uh, with agents. And it feels like we've landed at skills, which is like the minimal viable format. Yeah. Which is just a markdown file, uh, with some scripts attached to it, and I don't see how it can be more simple than that. And so there is some justification for the stability around harnesses. 
I feel like there may be more adaptation with regards to maybe like the real-time elements or subagents or memory or any of those like agent disciplines, let's call it, in, in agent engineering. Uh, but if, if the thesis is that, okay, agents are LLMs with tools in the loop, with a file system where they can do retrieval, with, with skills and all these like standard tooling that now seems to be relatively consensus, then probably that makes sense. Um, I just think like there's no point trying to stake your reputation on this thesis that we're there, because if it changes again, just change with it. It's fine.

[00:04:33] Jacob Effron: Yeah. It's always, you know, I've always been struck by how that is much more challenging for infrastructure companies than application companies. Like obviously I think, yeah. You know, on the application side you've seen, you know, Bret Taylor from Sierra, Max from Legora. Like, they're like, look, we build, you know, what's ahead of the models and we're willing to throw everything out every three months, you know, as the models get better and better. Exactly. Yeah. But the thing you at least have there is you have, uh, you have an end customer, right? That's like decently sticky. Um, you know, they will mostly stick, you know, they'll, they'll give you a shot at least of, of building these things. What I've always found more challenging, uh, at, at the kind of like, you know, reinvent-yourself-every-three-months of the infrastructure layer, it's like, you know, developers are definitely a, a pickier audience maybe than an accounting firm or, uh, you know, a bank. Yeah. And so it's definitely a, a, a more challenging position to be in to, to have to constantly reinvent yourself.

[00:05:17] swyx: Yeah. Yeah. Yeah. And, and like when they churn, it's like very complete. Like, they'll leave to like the, the hot new thing, uh, because there's like no defensibility, I guess. 
Like e even, even if you are a database, like, uh, people can migrate workloads off databases. Like it's, it's a, it's a known thing. Uh, so I think like basically what we're talking about is the vertical versus horizontal, uh, debate in, in AI startups. And uh, the way I think about it also is just that like when you are, um, Legora, when you are Abridge, like you are the outsourced AI team, right? You, you are, your job is to apply whatever state of the art AI methods.

[00:05:55] Jacob Effron: Yeah. Like this translation layer between model capabilities and your

[00:05:57] swyx: own customers. Yeah. To, to the end customers, and like, well, if they didn't have you, they would have to hire in house, and they're not gonna hire in house, so they have you. And like, I think that's like a reasonable, like very robust to any whatever trends and, and discoveries that people make in, in the engineering layer. I do think like there is, um, like sort of useful horizontal companies being built, but they're all very much like, sort of like the reinventions of classic cloud in the AI era, and the, the primary one being sandboxes. Yeah. Um, which like, it's another form of compute, guys, like, let's not get too excited about it. But I mean, like the, the workloads are enormous.

[00:06:38] Jacob Effron: Right.

[00:06:38] swyx: Yeah.

[00:06:39] Jacob Effron: It's interesting, and I feel like as, as part of this, you know, the questions that folks are asking around infrastructure, there's a lot around, you know, the extent to which companies should have their own AI teams and what they should be doing in-house. And, you know, uh, I think there's questions around should people be training their own models? Should people be doing, you know, RL, uh, in-house based on the data they have? I feel like, you know, one has to evolve their takes on this every, every three months with the pace. But where, where are you at on this today?

[00:07:00] swyx: I think, well, I mean actually all models have gone up. 
Um, and obviously I'm involved in Cognition, and also Cursor's doing, doing, uh, a lot of own model training. And I think that that is some part of the, what I've been calling the agent lab playbook, where you start off with the state of the art models from, uh, from the big labs and you, uh, specialize for your domain. But once you have enough workload and enough high quality data from your users, then you can obviously train your own models and like save a lot on cost and latency and all that, all that good stuff. Um, you also get like a marketing bonus of like calling it some fancy name and putting out some research.

[00:07:38] Jacob Effron: From my seat, I can't tell how much of it is like actual, you know, value that's provided to the end user, and how much of it is that marketing bonus. Right. It seems some combination of the

[00:07:45] swyx: I think it's both.

[00:07:46] Jacob Effron: Yeah.

[00:07:46] swyx: Um, no, no. There, there actually is real value. Um, and you, you know that for a number of reasons. Like one, even when it's not subsidized, people do choose it as like one of the top four or five. This is both Composer 2 and, uh, SWE-1.6, one of the top five models. Like in a, in a fair market? In a free market, yeah. In a, in a, in a model switcher, people do choose it, and like, it's not subsidized. Like, so that's as good as it gets. Uh, but beyond that, like domain-specific models, for example, for search, which both companies have, absolutely makes, makes a ton of sense. Everyone says like, yeah, we should always, always do this. And honestly like, I think the infrastructure for that is becoming easier with, um, like Thinking Machines' Tinker thing as well as Prime Intellect's, like, uh, lab stuff. 
Yeah, I mean like, this is one of those like reversal of the, the bitter lesson where you first bootstrap on the large models and the general purpose models to get big. And as you get very well-defined workloads that are just high quantity but not high variance, um, then you just distill down to a smaller model and run that on your own. Right. Which like totally makes sense.

[00:08:50] Jacob Effron: What I'm less clear on is the kind of DIY RL use case, which I think is really mostly around, you know, improved, uh, quality for, for different things. Obviously there's probably like more efficient ways to, you know, get a smaller model that's, that's faster and cheaper. And it'll be interesting to see whether, you know, obviously you had, you know, uh, two, three years ago this whole case of companies that were, you know, pre-training and claiming better outcomes in, in their domains, then getting kind of cooked as each model iteration improved. You know, I wonder whether that's a, a similar story plays out in the, uh, in, in the RL space. Yeah, for the focus on, on, on pure outcomes and quality, not the cost side, which clearly your own models for cost at scale makes a ton of sense.

[00:09:28] swyx: I think these are two sides of the same coin. Like you basically always want to hold, uh, quality constant or trade off a little bit of quality for a drastic decrease in cost. And that's true for everyone. Uh, one element I wanted to bring out, which is very much in favor of open models, is custom chips. So this would be Cerebras, but also Taalas. And then there's a huge range of stuff in between. This has been a huge story this past year on just like everything non-NVIDIA is getting bid up, including like freaking MatX is working for, which is very, which is very rewarding for me, but I think one of those things where like, oh, like the suddenly, because the number of alternative 
hard, uh, hardware is increasing and the inference that you can get is insanely high. Like, um, we're talking thousands of tokens per second instead of less than a hundred. So the trade-off for quality doesn't hold as much anymore because the speed is so high.

[00:10:24] Jacob Effron: Have you seen a lot of companies go all in on the alternative chip?

[00:10:26] swyx: So Cognition has, yeah, on Cerebras, uh, and, and so has OpenAI. Um, uh, and so no, I don't think so beyond that, uh, and that, do you think that's like a, that's mostly, that's foreshadowing of, that's, yeah. I used to be kind of a skeptic in terms of like, okay, so what if I get my inference at a hundred tokens per second sped up to 200 tokens per second. It's only 2x faster. It's not that big a deal. Um, but when you, uh, I think every 10x does unlock a different usage pattern. Um, and you, we have proof in Taalas and, and some of the others that you can actually, um, drastically improve inference speed, and what happens from there? I don't even really know, like it's, it's so hard to predict when entire applications just appear at once. Yeah. Uh, and it also isn't that expensive, right? So like, um, this is one of those things where like, I, I think the, the investment cycle is gonna be multi-year. Um, and I would caution people to not dismiss it too, too quickly.

[00:11:25] Jacob Effron: Yeah. I mean, one other like infra question I was curious to get your thoughts on is obviously it seems increasingly a lot of the cutting edge infra companies are building for agents as the buyers of their product or users of their product, right?

[00:11:35] swyx: Ooh,

[00:11:36] Jacob Effron: and

[00:11:37] swyx: another huge theme. Yeah. Yeah.

[00:11:38] Jacob Effron: And I'm trying to figure out like what, what, what do you have to do differently about selling into agents? Um, are they just the ultimate rational developers? Uh, or is there, you know,

[00:11:46] swyx: No, absolutely not. 
Um, I think they are easily prompt-injected and, uh, very tuned towards like, basically compounding existing winners.

[00:11:57] Jacob Effron: Yeah,

[00:11:57] swyx: so like if, like, congrats if you won the lottery for getting into the training data right before 2023, because now you're like installed in there for the foreseeable future. But yeah. Uh, you know, one stat that Vercel, uh, CTO Malte dropped at my conference was that there are now, uh, 60% of traffic to Vercel's, um, like app, like admin app architecture for like configuring Vercel applications, uh, is bot. It's not, it's not human. Uh, so like your primary customer is agents now. Um, and it's mostly, like mostly coding agents, mostly people using CLI or MCP or whatever. But yeah, I mean, I think, I, I think step one, if it doesn't exist as an API that agents can use, it doesn't exist. Right, right. Which I think is like, uh, it's a good hygiene thing anyway, to, to make everything API available, but now it's like an extra, um, push on like product people to not only work on the UI, um, you should probably work on the CLI stuff. Beyond that, I think honestly there is like, so I, I come from the sensibility of, I think everything that you are trying to do for agent experience now, which is the term that Matt Biilmann at Netlify is trying to coin, is the same thing that you should have been doing for developer experience. That you should have had good docs, you should have had a consistent API, uh, that is mostly stateless. Um, you should have, I guess, discoverable or progressive disclosure or like search or like whatever. And so now that people have energy in like finding these customers to do that, that's great. Um, do I believe in extending beyond that into something like AEO, um, for gaming the chatbots? Not necessarily, but obviously there's gonna be huge advantages for people who figure out the short term wins. Yeah. 
And short term wins can compound.

[00:13:43] Jacob Effron: Do you think these compounding advantages to like the, the pre-training data cutoff companies, like, you know, obviously over some period of time, I imagine that doesn't persist. And so as you think about like, I dunno, three, four years from now what the, you know, selection criteria end up being. Do you think it still mirrors exactly what you were saying before? Like it's exactly what you should have been doing all along to sell a good product to developers?

[00:14:01] swyx: It could be, except that I think in three, four years we'll probably have much better memory and personalization. So then general AEO or GEO doesn't really matter as much. So I think whatever memory or personalization system we end up with will probably determine what you end up choosing much more than, than what is currently the case, which is just frequency of mentions, let's call it. Yeah,

[00:14:26] Jacob Effron: yeah.

[00:14:26] swyx: Uh, so you just spam quantity, and I think that's, I mean, that's something I'm looking forward to. I do think, like, like, you know, I, I think that the fundamental exercise to work through for yourself is if you start a new, um, sort of, uh, disruptor company now, there's a, there's a big incumbent that everyone knows, like, like Supabase. Supabase is like, kind of like the Postgres, like database, uh, incumbent. If you wanna start like a new Supabase, how would you compete with them? And I don't necessarily have the answer, but I, I, I do think like people, like Resend, like relatively new. I think they were started like 2023, and still there was, there was a recent survey where like, people checked what Claude recommends by default. If you just don't prompt it with anything, just say, gimme an email provider, and it says Resend in like 70, 70% of cases. 
Like the fact that you can get in there with like such a relatively short existence, I think is, is encouraging.

[00:15:14] Jacob Effron: Yeah.

[00:15:14] swyx: I do think like, um, you do want to do whatever it is to, to like, to, to get in that very short mentions list because, um, it's not gonna be 20 of them, it's gonna be like three.

[00:15:26] Jacob Effron: No, definitely. It feels like, uh, you know, probably more, more consolidation than ever. Uh, or, or kind of like, you know, uh, a winner-take-most market than maybe the, the, the physics of go-to-market in the past might have, uh, enabled.

[00:15:38] swyx: The other thing also is like, semantic association is gonna be very important, uh, in the sense that like, you want to do like the combo articles where you're like, use my thing with Vercel, with blah, blah. And like that all gets picked up in a, in a corpus. And so that's probably one thing that you, you wanna do well. I don't know what else. Uh, it's, it's, it's, it's one of those things where like, I think I feel, I feel I'm behind, uh, I don't know how you feel about this, but like,

[00:16:04] Jacob Effron: I think AI is just everyone constantly feeling like they're behind some, uh,

[00:16:08] swyx: yeah. With,

[00:16:09] Jacob Effron: I wanna meet the person that doesn't feel behind,

[00:16:11] swyx: but like with, with AX, right? Like, so, so like, my, my stance was that exactly what I said before, like everything that you, that you should do for agents is something that you should have done for humans anyway. Yeah. And so to the extent that you're just getting more energy to, to do things for agents, great. But like, uh, it's hard to articulate what new thing, apart from just like more spam, um, that you should be doing. Anyway, that would be my take right now. Um, I, I, I do think like there, there will be more turns at this. I think the personalization turn that is coming, um, will be big. 
And I don't know what that looks like because like basically we're kind of, we feel kind of tapped out on the memory side of things.

[00:16:49] Jacob Effron: Yeah. I, I guess since we last chatted, you know, you, you took this role over at Cognition, um, and you obviously have a, have a front row seat to the AI coding space today. You know, I feel like coding in many ways, you know, people view it as this, like, I mean, besides being like the, the mother of all markets and this massive opportunity, I think it's kinda a preview of like, what's to come for many other spaces. Both, yeah, you know, I feel like agents are most advanced in coding. I also feel like the, you know, competition between foundation models and application companies, you know, and, uh, mirrors what we may see in other spaces. And so maybe for our listeners, can you just lay out like what is the state of the AI coding wars today?

[00:17:25] swyx: Um, it is massive, right? Like, uh, and I don't think necessarily, last time we talked about this, we appreciated the size of what

[00:17:32] Jacob Effron: No, I wish we did.

[00:17:33] swyx: State of AI coding wars today, um, both OpenAI and Anthropic have made it their P0s to compete in coding. Um, and Anthropic is like 2.5 billion in ARR just from Claude Code. The way they recognize ARR is up for debate. Uh, OpenAI, I don't think a public number is known, but let's call it 2 billion as well. And then Cursor is like, rumored to be 2 billion, you know? And, and those, those are like the public numbers that are known? Yeah. Um, so like huge markets that have just been created in the past one year. Like, like Anthropic, just like Claude Code just recently celebrated their one year anniversary, which is, yeah, pretty nice. 
Um, so, and then I think, like the other thing that I see is there's, there's some other people who are like, oh, here's like the, the sort of relative penetration of, uh, Claude use cases, right? Like, and it's like coding 50% and then legal, whatever, health, uh, it's like the, the remaining ones. And there was a very popular tweet that was like, okay, I'll look at the, the empty space and all these other use cases. If you are a new founder today, you should be betting on the other stuff because on, on a sort of catch-up theory. Yeah. And my, consider my, my pushback is the same pushback that, uh, I had on app over Google, which is like, well, well why is this time different? Like, why, if it went from let's say 10 to 50% in the past year, why can't it keep going? Uh, and like getting that wrong is actually a very painful one because you could have just did, did the momentum bet instead of the mean reversion bet. So I, I, I think that that is the, the state of things now, that people are very, very much into psychosis. Um, they are getting rewarded for spending more rather than spending less. And I think we're not in that phase of efficiency. We're in a phase of sort of like capability exploration. So I think people who are more crazy, who are more, uh, creative, um, get rewarded comparatively. Yeah.

[00:19:27] Jacob Effron: Well, it's interesting. I mean, it feels like behind these like token-maxing leaderboards and whatnot is this, it's like the first phase of this transition from a workforce perspective is you just gotta show your employer like, Hey, I, I use these tools.

[00:19:37] swyx: Here's my number of tokens I cost, and that's it. They don't care about the quality. Right. It is, uh, maybe distasteful to someone who cares about the craft and, and all that. Um, but directionally everyone just wants you to go up regardless. And so, um, there it is not very discerning. 
It's, and it's probably very sloppy, but I think it's net fine because we're still probably underusing AI just in general.

Yeah. Um, and so I think that's like very interesting. Like we had on the podcast, uh, Ryan Lopopolo from OpenAI, who spends a billion tokens a day. Yeah. Um, and that's, for those counting at home, it's like something like 10,000 worth, $10,000 worth a day of API tokens, if they, they did market rates. Um, and like most of us can't afford that.

Yeah. But like, and, and, and probably a lot of what he does is slop.

[00:20:25] Jacob Effron: Right.

[00:20:25] swyx: But like, he's going to dis, he's like, if there were a new capability, he would discover it first before you because he was, he was trying and you were not trying. Right. And like, you only do things that work, like, well, good for you.

But like the, the people who are going to discover the next hot thing are living at the edge.

[00:20:42] Jacob Effron: Right, and increasingly living at the edge of just having the compute budget to like run these experiments. I mean, kind of similar to what living at the edge on the research side has always been. You know, it was constrained in many ways by the amount of compute you had to run these experiments.

It feels similarly on the, almost on the builder or like actualizing these tools now.

[00:20:56] swyx: Yeah. The other thing that's, I mean, very obvious is Anthropic is kind of like the high-price premium player, um, where, you know, restricting limits or restricting model releases even is like the name of the game.

Whereas Codex is like, come on in guys, use our SDK, use our login and we don't care. We're gonna reset limits. Whatever. You do want to try to exploit the subsidies where you can get it. And definitely Codex is super subsidized right now. Gemini also very subsidized. Um, and.
Comparatively, like, I think you should make, Hey, I guess while, while that's going on, it's not that bad to be a capabilities explorer on just the $200 a month plan from Claude Code or from OpenAI.

Um, and, uh, I I, I, my sense is that people aren't even there yet.

[00:21:41] Jacob Effron: How do you think this, like, market ultimately plays out? I mean, it's obviously such a big market that, you know, any slice of that market is interesting for, for anyone going after it. But I think what, what makes people so interested in the coding market particularly is it feels like it's kind of this foreshadowing of what will happen in other, you know, any other kind of application market that the foundation models eventually turn to and RL their models against and gather data around.

And so how do you think, you know, like does there end up being room for lots of different kinds of players or like, what do you think the end state of this market is and is that, do you think that's applicable to other markets?

[00:22:10] swyx: I feel like there will be, I mean, status quo is probably the most likely outcome, which is there are two big players and there's a small range of longer tail people that, um, fit other use cases that the, the two big players don't. That feels right to me. I think that, um, for it to, for the market structure to, to significantly change there would be, there needs to be significant change in like the economics or like the, the brand building or like the, the, the, the value propositions of the, of the companies involved, and I haven't seen any in the last six months that, that have really changed the stories materially.

So I feel like they would just keep going until something, something else happens. Something else happens, meaning like Microsoft wakes up and like goes like, guys, we have GitHub, we have, uh, you know, we, we, we'll, we'll do something much bigger here than other, other than just Copilot.

Um, and, uh, that would be a big change.
Um, MSL has put out a model now, and I was in a breakfast with, uh, Alex Wang, where they were like, yeah, like, we, we really, really want to go after the coding use case. We haven't done anything yet, but like, don't underestimate them. Right. Um, and, and similarly for the Chinese labs.

Um, I think they're trying to go after it. Like Z.ai is doing stuff. GLM, uh, Z.ai and GLM is the same thing. Um, uh, and, and so it's, so like everyone's trying to get a piece of that pie. I, I feel like the, the status quo has been pretty stable for the past, like almost a year I'll say.

[00:23:39] Jacob Effron: Yeah. And is the room for the, not like, you know, for, for the application companies more on like the enterprise side or like where do the, where do the, like what surface area do the model companies leave for application companies?

[00:23:50] swyx: Yeah, that's a good one. Um, it's very much evolving. Um, it, I, I, I will say because OpenAI did not have this, the, this level of attention on coding, yeah, uh, a year ago, we just don't have that much history. Right. Um, and it seems like, for example, so the big push at OpenAI now is the super app. Um, is that a consumer thing?

Is that like a product, like, portfolio rationalization thing? How much is that gonna take away attention from coding at the time when they actually do want to put more into coding? I think it's, it's very unclear. So I do think like there's, there's all these, like in both big labs, there's, uh, sorry, both of the, and, and Anthropic, and, and DeepMind and xAI are separate cases.

Um, they are trying to see the other TAM expansion areas. So Claude Code for finance. Yeah. Um, uh, Claude Cowork, all those, all those things. Whereas I think Cursor and Cognition are like comparatively just focused on coding and so I, I do think they leave space and I do think for the other verticals that also means the same thing.

Right. That, uh, that they're not gonna be that, um, intensely focused on, on, on that domain.
Except for, I, I think I would mark out finance and healthcare as like the next ones, um, that they're clearly going after. Uh, I, I would say comparatively, healthcare seems more thorny. There, there, there've been some announcements about it, but like, I would respect the, the finance work a lot more just because like the, the path to money is a lot clearer.

[00:25:12] Jacob Effron: Yeah, no, I mean, obviously like, I, I think, you know, maybe similar to, to the space that's being left in these other domains, you know, there's obviously, uh, a lot that's required to actually implement these tools in enterprises, uh, versus, you know, maybe just giving them, uh, giving model access to, to folks outta the box.

[00:25:27] swyx: Yeah, yeah. Yeah. So the, the agent lab thing is like, we'll do the last mile for you. Whereas I think the model labs tend to just trust the model and, and be minimalist about it. Both of them work.

[00:25:38] Jacob Effron: Yeah.

[00:25:38] swyx: I, I don't, I don't necessarily think one, uh, beats the other, uh, for every, for every use case. Um, all I, all I do know is that it does seem like, uh, the large enterprises do want a dedicated partner that isn't just the model labs, which is kind of interesting.

[00:25:55] Jacob Effron: We, we've been in this phase of, of pure capability exploration. And so I think nothing has been, you know, better for the large labs, right? I mean, they're always gonna be, uh, at the frontier of, of capability exploration.

And so I think have a very good relationship with a lot of these enterprises. But ultimately over time, like, the, uh, the incentive structure of these labs is always gonna be maximal, you know, token consumption for, uh, for the end customers they work with. And there's just, I think, so few companies that have actually gotten to massive scale.

Maybe coding again is the most interesting. So it's the first space that really is just completely gone, you know? Yeah. You must love it every day.
Like absolutely insane. And I think it

[00:26:32] swyx: gets even. Okay. I mean, like, I think we, we say good things about Cursor and Cognition, but the sheer liftoff of like both Anthropic and OpenAI, 'cause they, they, they have independent valuations. I mean, let's throw xAI in there because it's now IPOing at 1.2 trillion. That number is just mind boggling. Like I, I feel like in normal investing or normal startups, there's kind of like a ceiling market cap or valuation. Totally. That, that like you, you reach and you go like, all right, let's, it's gonna be chiller from now on.

And these guys are not slowing down. No.

[00:27:02] Jacob Effron: Well, I also think the dynamic that is fascinating about some of these later stage companies is, is, you know, in the past, I feel like in, in venture world, if you got to a certain level of scale, the question around you was really more a valuation question. And this is like why there was different phase, like, you know, types of venture people did and like the late stage growth people were just incredible at like, you know, a little bit of what's the ultimate market opportunity of this company, but also what's the right way to, to value it.

Like we know it's, it's in some band of an outcome that is like, sure there's some variance to it, but it's like relatively understood what that band is and then maybe you get over time surprised to the upside. Whereas any kind of like later, even the labs themselves, any later stage company, the bands of which that company might be worth right now, even in a year or two years are so massive because of how fast the ecosystem changes that it's like, even for later stage companies, every three months could be an existential level event to the upside, to the downside.

Yeah. Um, and I think that, like, you are obviously seeing it in the, in the positive with code, which, you know, if you think about a company like Anthropic, you know, that.
For a while, it was like unclear if they were going to have access to enough capital, um, to really stay in the, in the race, right?

And then coding hit at the exact right time. They had the perfect model for it. They executed brilliantly. Um, and you know, now are, are, you know, uh, you know, one of the most valuable companies in the world.

[00:28:13] swyx: Uh, at the same time, I, I don't find, I, I have zero sympathy for OpenAI because they're crushing it and they're all rich.

You know, this is like a high class champagne problem to have to, uh, to be number two at coding or whatever. Like, who cares? Like, you're, you're doing great.

[00:28:27] Jacob Effron: Yeah. It's funny though. I can't even, I mean, you would be closer to this, uh, you know, given that you're in the AI coding space, but it's like a lot of people I talk to think Codex is just as good, if not better than Claude Code.

Right. I think one thing that I've been really surprised by, and maybe, maybe Claude Code is a better product in some ways, I'm curious your thoughts, is just in consumer AI with ChatGPT, you saw this big first mover advantage, right? Where admittedly today, like, I don't know, Claude, Gemini, great products.

Not sure, not abundantly clear ChatGPT's any better, but like, people stick with ChatGPT, it's the first thing to introduce them.

[00:28:56] swyx: They stay, but they're not growing anymore. I don't know if you've seen

[00:28:59] Jacob Effron: Right. But that to me is more of like a, a, a product problem than it is, they're not like, it's not like they've like lost share to someone else.

My understanding is the overall problem with consumer AI today is much more of a how do you take this tool and, you know, for, for folks like us, like knowledge workers, it's like this incredible magic tool, but it's not necessarily a daily active use tool for a lot of people around the world today. And what are the like products?

It's, it's kind of a category wide problem.
Like in coding, for example, like, the entire space has gone parabolic. There may be some relative growth in, uh, in other consumer AI players, but it's not like consumer AI as a category is like going parabolic and they're not capturing most of that thing. I think it's actually the larger problem is much more, hey, the category has kind of hit a bit of a plateau of people haven't figured out how to bring, you know, tons more users on board.

Yeah, yeah. Or increase the frequency of those users. And so it seems more of a category wide problem than it is, you know, a massive market share change. I was gonna draw the comparison to, to the coding space where Claude Code is the first product, obviously, to introduce people to this magical experience.

You know, by all accounts, Codex is, is pretty damn close to as good, if not better. Um, but like still that first product, you, you would've thought that would not be a super sticky, uh, you know, product surface area. And it actually has, it turns out, I, it feels like the first lab to introduce you to an experience really does, uh, keep a lot of, uh, a lot of the focus.

[00:30:12] swyx: I, I think, m maybe it's like still, still early days. You know, ChatGPT is like three plus years old and, yeah, Claude Code is only one. Just turned a year. Yeah. So give it time, you know? Yeah. Like, yeah. I mean, definitely sometimes a lot of people have switched to Codex. Maybe that will keep going. I, it's like really hard to tell.

Uh, yeah. I, I, I do, I do think that, because we are in this like, high volatility, high temperature phase, um, the loyalty and stickiness to first movers and category creators, I don't think is as high as it might be in some other, uh, areas in our careers that we've looked at.

[00:30:47] Jacob Effron: Yeah. Though, I mean, I've been surprised by the Claude Code thing.

I, I would've thought that, like, in many ways I always worried about the

[00:30:52] swyx: enterprise.
You think you would've been gone by now?

[00:30:53] Jacob Effron: Not gone. But I would've, I I always worried that the, that the consumer business of these companies would be quite sticky. And then the enterprise API business, uh, was actually like, you know, in some ways like your least loyal buyers, like they would, they would move to,

[00:31:05] swyx: right, right.

But, but they worked out that it wasn't the enterprise API, it was enterprise product.

[00:31:09] Jacob Effron: Totally. And maybe that was the, that was the secret that like, but the amount of lock-in or just default behavior that has happened in that space, uh, is, is more than I might've imagined with two products that by all accounts are pretty damn similar.

Yeah.

[00:31:22] swyx: No fight there. Uh, I will say I do think that Codex is still in like a catch up. Like in terms of personal experience, um, the only thing I like out of, out of Codex is the, is like Spark and like, yeah, uh, the, I, I feel like the skills integration is a little bit better. I feel like, uh, the, the speed is a bit better.

Maybe 'cause it's in, is written in Rust or whatever. Um, very minor things that you like, almost like telling yourself rather than like objectively assessing between two, two of them. I, I, I do think, like vibes wise, I think that's going on. Um, the, the, you know, I, I feel like the, the missing question, uh, in, in this whole debate is like, why is this so concentrated in only two names, right?

Yeah. Like, um, how, where, like, where is the Gemini, you know, presence? Where's the xAI presence?
Um, and like they are trying, it's just they haven't made that much progress yet.

[00:32:12] Jacob Effron: But what the, what the Claude Code moment does show, and it actually in some ways makes you a little more bullish on the potential for someone else to catch up because it does feel like if you're the first person to introduce some magical net new product experience, that that actually might be stickier than one might have imagined.

[00:32:27] swyx: Right, right, right. Okay. Yeah.

[00:32:28] Jacob Effron: And so it's, everyone can believe they have a shot

[00:32:29] swyx: that. What do you think that new product experience might be like? I, I, it's, it's like, and this is a failure of imagination on my part. Like, I always wonder, like, people always say this like, well, the, the thing that will save us is like being first to the next new thing.

Like what is it?

[00:32:41] Jacob Effron: Yeah.

[00:32:42] swyx: It's like,

[00:32:45] Jacob Effron: I dunno, something around like, uh, consumer agent, computer use, like hybrid. I think, obviously, I think we're like scratching the surface on the consumer side.

[00:32:53] swyx: So my, my current theory is like the, OpenClaw is like a vision of things to come.

[00:32:58] Jacob Effron: Totally.

[00:32:58] swyx: Um, and uh, it's good that OpenAI has like the association with OpenClaw, but by no means do they have the right to win it.

The general thesis that I have been pursuing now is that the same way that 2025 was the year of coding agents, 2026 is coding agents breaking containment to do everything else. Um, and so coding agents continue to still win, but because they generate software and software eats the world, so like, it's kind of like the transitive property of like software eats the world, coding agents eat software, therefore coding agents eat the world.
Um, which is like an interesting,

[00:33:30] Jacob Effron: yeah, and breaking containment always an easier phrase in the consumer context than the enterprise one. You've seen people run these really cool, uh, experiments in their own personal lives.

I think like,

[00:33:37] swyx: yes.

[00:33:38] Jacob Effron: Figuring out, you know, how you, obviously everyone's focused, you know, on the enterprise side now around how you create these experiences. I feel like the vibes, you know, people love to have these narratives of like, everything is completely shifted. It's like I actually, you know, OpenAI, organizationally, uh, you know, volatility aside, is, you know, great products, great team, great models. Like everyone else in the world is incentivized for there to be two, three more. Everyone would love more like great model companies. And so I feel like the, the natural forces of the world revolt when any one company, you know, is too much the star of the show, right?

There's so many people in the ecosystem that are incentivized for that not to happen. And so I think I'd be shocked if we don't have, uh, a reversion of vibes, not maybe completely the other way, but at least a little bit more equal at some point over the next six, 12 months.

[00:34:24] swyx: I, I think there's just a kind of different stages when, when you talk about the world wanting more model companies, I think about like the neo labs.

[00:34:30] Jacob Effron: Yeah.

[00:34:31] swyx: And I mean, I don't know, is it fair to say none of them have really broken through in the past year?

[00:34:35] Jacob Effron: I think that's totally fair,

[00:34:37] swyx: which is rough. Um, and well, how are we gonna, how are we gonna grow that diversity in, in, in choice, like, um, that's, this is it.

[00:34:46] Jacob Effron: Yeah.
It'll be really interesting to see what, what, what ends up happening with that.

And you've seen, you know, folks like Nvidia, you know, very incentivized to make sure there's, there's a broader platform of, of other model providers.

[00:34:57] swyx: I think, uh, I don't know, people say this, but I, I, I don't think they try that hard. Nvidia tries harder to build neo clouds

[00:35:05] Jacob Effron: Yeah.

[00:35:06] swyx: Than neo labs.

[00:35:07] Jacob Effron: Well, they try pretty damn hard to build neo clouds, so

[00:35:09] swyx: that's,

[00:35:09] Jacob Effron: yeah.

[00:35:10] swyx: But like, you know, let's call it like the, the CoreWeaves of the world, much happier place in the, you know, than any neo lab built on top of them.

[00:35:18] Jacob Effron: Yeah. That one might argue it's, it's easier to, to enable a neo cloud to be successful than it is, uh, you can't will a neo lab into existence the same way you, so Nvidia

[00:35:25] swyx: has more direct control over it.

Uh, for sure.

[00:35:27] Jacob Effron: What else is kind of catching your eye today on the startup side? I mean, you worry, there's obviously this whole narrative of like, you know, the foundation models, you know, they announce a product and every stock goes down 15%. Like

[00:35:36] swyx: Yeah.

[00:35:37] Jacob Effron: Do you, do you worry about the foundation models just kind of eating into, to a bunch of these startup categories?

[00:35:43] swyx: Not really. I, I think actually like, as, uh, there's, there's, okay, there's, there's, there's the, there's the point of view of like being an investor in startups, and there's a point of view of like, do you wanna start something? And I think honestly, like the, the downside for all these is so.
Minimal in, in a sense of like, the worst you do is you just get hired into one of these labs anyway.

So I, I think the, the market for people who just do things and try things and try to execute in like a competent way, even if like it doesn't work out commercially, even if it just wasn't that great anyway. Like, but like that's your job interview to go into, into one of these things anyway, so, um, I don't feel that.

From a, from a very, very small startup perspective. Mid-size startups, yes. Uh, I will say there's been a lot of dead, um, LLM infra, a lot of LLM infra consolidation like the, the, uh, Langfuses of the world getting absorbed into, into ClickHouse. And I, I think, like people have maybe worked out the domain specific playbook, uh, and like, I think that's okay.

Um, and, and yeah, I'm not that, not that worried about, uh, okay. So, um, I, I would say I'd be more worried about traditional SaaS, like low-NPS SaaS. This is the whole AI versus SaaS debate that has, that's been going on. Uh, and, and like literally I'm going through that exact thing in my company where, so I like kind of, thinking through this on a very visceral, visceral level, right?

On one hand you have the people who say you vibe coders don't appreciate the amount of work that goes into a, a CRM and like, yeah, you think you can rip out Salesforce? So did the 30 entrepreneurs before you, right? Like, like, you know, you classically underestimate the things that you don't deeply know. And, and, and target audience is not you. Uh, at the same time, like we have never been able to build software so easily and customize software so easily and like, yeah, you're not gonna use 90% of the things in Salesforce. So like, yeah. What's the typical, so what have you, what

[00:37:33] Jacob Effron: have you done internally?

[00:37:34] swyx: So we have there the main SaaS that we do for event management and sponsor management. That's, and we pay 200K a year for that.
Not, not huge, but like chunky for, for, for my, my scale. Um, and like, yeah, I could probably spend 2000 and, and build like a custom version of that. Um, the, the, the trick has been dealing with my, the rest of my team and getting them on board.

Yeah. 'Cause I'm the most technical person on my team, but like, I can't make that decision myself. And I think in the same way I've been talking with other CEOs, team leaders as well, it's like, well you can be super Claude-pilled. You can be super LLM psychosis and you think that's okay, but you like, you have to bring your team with you.

And I think like there, the sort of widening disparity in LLM psychosis in companies is causing real, real rifts because, on one hand, on one hand, the people who are less AI native are not getting with the picture. They're not, they're actually like behind, they're actually not waking up to the fact that like you, everything you think is necessary is not actually that necessary.

And in fact, it would be better for you if you just like held your nose and went in and came out the other side. Yeah, only talking to agents in natural language and like your life would actually be better and you just, you're just like close-minded. There's that perspective. The other perspective is, oh, you vibe coder.

You, you did this in a weekend and you got the 80% solution and now the rest of your employees have to pick up the rest of your s**t, right, that you, that you thought you were, you were such hot, amazing, uh, uh, at, but like, actually you didn't figure it out. And like, actually LLMs are still useless at this and blah, blah, blah.

So like, I think there's this huge debate going on in every company right now. Um, and like, um, you know, I have a small microcosm of it, but like, yeah, it, it's making me hesitate to, to pull the trigger. But like I will at some point, it's like maybe I've put it off for one year, but not like five.
Yeah, but like, so, so like SaaS is definitely getting squeezed.

Um, it does make me wonder, like, I, I do think that there's an opportunity for a more AI native, um, system of record thing that is not just Postgres. Um, or not just MongoDB, although both are very good. Maybe it's like a Convex or like people, yeah, bring up Convex a lot. I don't know, like, like, I, I just feel like the sort of quote unquote Firebase of, of AI apps isn't really a thing yet.

Um, beyond what we have. Uh, which, which is fine. It's, it's, it's just, we could probably start in a more sort of rapid iteration cycle first before scaling up to like a Postgres or MongoDB, which are more sort of old tech. I was at a dinner with, uh, Mike Krieger, the CPO of Anthropic, and, and he, we were just kind of going around the room going like, what are people most worried about?

Yeah. And, uh, for me, uh, I, instead of security, I brought up biosafety. Yeah,

[00:40:21] Jacob Effron: classic.

[00:40:22] swyx: Um, actually, like I said, it was cliche and classic, and the rest of the table were, were like, what do you mean? Someone sitting at home can manufacture a virus that wipes out half of humanity,

[00:40:32] Jacob Effron: almost like the OG Geoffrey Hinton.

Like, this is why you should be scared.

[00:40:35] swyx: I'm like, yeah, like the read the, you know, risk reports. Like this is like the thing. Um, I think, and Mike was just sitting there knowing he was sitting on Mythos and going like, actually it's security. Um, and I think like, um, I think the, there's, there's, part of it is a very good marketing. Like too good. Yeah, like I would actually advise Anthropic to tone down the marketing because also it's, it is just a very good model and you don't have to make so many marketing claims around it. At the same time, it is not really a private model if you give it to 40 companies, each of whom have like 10,000 employees or whatever. Right.
It's not, it's not private, it's, it's like there's bad actors in there.

[00:41:18] Jacob Effron: Yeah. Hopefully, hopefully not as, uh, as bad as releasing it widely, but, uh, no, I mean, it's an interesting, you know, it's an interesting case study for how all, I mean, many model releases might, I mean, you know, this might be the first model release that looks like the rest of 'em from, from now on, right?

[00:41:31] swyx: It, it, so it's, it's the, there's an overall product strategy, uh, for Anthropic of like bundle, uh, you know, restrict access, bundle, uh, product with model maybe.

Whereas, uh, OpenAI has definitely been a lot more sort of philosophically aligned on like, we will just enable access everywhere and we don't know what you, what will come out of it. Right.

[00:41:51] Jacob Effron: Right. Though, I mean, this current moment, uh, obviously the cynical take is also just ties to the amount of compute that both companies

[00:41:56] swyx: Yeah.

Right, right, right. Yeah, I think, I think that's true. I I do think like the, the, this is the, the, the scale, the dawn of like larger than 10 trillion parameter models is very interesting. I don't think it, I think it's a temporary phenomenon because we have much larger compute clusters coming online for everyone over the next like three, five years.

It's, and this is like already written in, in the cards.

[00:42:18] Jacob Effron: Yeah.

[00:42:19] swyx: So to the extent that like, you know, will we have rationing of models, uh, above 10 trillion, uh, in like two years? I don't think so. I think everyone will have, no, we'll just

[00:42:29] Jacob Effron: have rationing of the next phase.

[00:42:30] swyx: Right. Right. But like, that's as it should be almost like, um, my, my classic example, which I, this is just me theorizing, not anything confirmed by Google. When Google announced Gemini, they actually announced three sizes, which was Flash, Pro, Ultra. They never released Ultra. They only have Pro and Flash.
Um, so my theory is they have Ultra sitting in a basement and they just keep distilling from it for, for Flash and Pro.

Um, which like, yeah, I mean, I, I actually think that's, as it should be for any lab that they, that they do that.

[00:43:02] Jacob Effron: Yeah. Just because those are the models that people actually wanna end up using. And it's just like cost prohibit.

[00:43:06] swyx: It is more, yeah, it's cost. Yeah. It's, it's not the want, it's just, just, just the cost.

Um, I do think, like, uh, it is interesting that, uh, for a while I was, I was considering the theory that models capped out at two, 2 trillion, and I think that's proving to be wrong. And well then if I'm wrong, how wrong? How wrong am I? Do we do 200 trillion? Do we do 2 quadrillion, whatever? Um, and I don't think we have the straight answer to that, but like, uh, it's interesting that we are continuing to scale number of params when everyone kind of assu, like can see that we're not going to get like the next thousand or 1 million x from this paradigm.

So like the others, like the Ilyas of the world are working on other, um, model architecture improvements. We need a different scaling law, I guess, because like, we're, I, I feel like people already, already feel like we're tapped out on this. Like the, the end, the end state of this is we turn most of the world into data centers and like, I don't know.

I don't know if we want that.

[00:44:08] Jacob Effron: Yeah, I mean, uh, if the, if, if, if the return of intelligence are there, maybe, uh, maybe not so bad.

[00:44:13] swyx: I, I, I think there, there's just a sheer amount of like, like unscalability that like is rankling people's sensibilities right now. Um, especially in terms of like context lengths.

Um, my classic quote is that context length is like the slowest scaling factor in, in LLMs.

[00:44:30] Jacob Effron: Yeah.

[00:44:30] swyx: Um, we, like, we took maybe.
Three years to go from like 4,000 context length to a million and that's about it. Yeah. Like Gemini has had a million token context length for two years now. Um, and no one's using it.

Like, so like yeah, it's memory. Memory is probably gonna be the, the biggest limiting constraint on all these things.

[00:44:50] Jacob Effron: Yeah. Certainly seems that way. I guess I'm curious over the last year since we recorded last, like what's one thing you've changed your mind on?

[00:44:57] swyx: I feel like I was kind of bearish on open models like last year.

Um, in a sense of, like, I, I had just done the podcast with Ankur

[00:45:07] Jacob Effron: Yeah.

[00:45:08] swyx: Of Braintrust where he, and he, I mean, you know, he has a good cross section of all the top AI companies and he says market share of open source is 5% and going down. Um, I think that's changed. I think it's going up. Um, and even if,

[00:45:22] Jacob Effron: even though the capability gap does seem to be increasing.

Depending on the

[00:45:26] swyx: time. It's hard to tell. Yeah, it's, it's really hard to tell. 'Cause like, okay, for, for listeners, capability gap increasing is like on public benchmarks. And let's say you're comparing Mythos versus like, I don't know, GPT-OSS or like GLM 5.1. And, um, it's, it is really hard to tell. 'Cause even if they were closing, you will also not believe that they were closing that much because it's very easy to game the benchmarks.

Yeah. So you just don't really, really know. Um, all you know is like, uh, there's somewhat objective OpenRouter stats on like what people choose in a free market. And people do choose some of these open models in significant volume, except that a lot of them are heavily discounted. So you need to kind of like price adjust, uh, these things.

So even if, even if that were true, which I, I'm not sure, like I, I, I feel like the number's just up now instead of down. Uh, I think the separation between what the top tier agent labs
In the latest installment of our oral history project, we meet the Godmother of AI, Dr. Fei-Fei Li. She's a Chinese-American computer scientist and the creator of ImageNet, the dataset that made rapid advances possible in computer vision, the field of AI that helps computers take meaningful information from things like photos and videos.
We Meet: Stanford University's Fei-Fei Li, author of "The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI" and the founder of World Labs
Credits: This episode of SHIFT was produced by Jennifer Strong with help from Emma Cillekens. It was mixed by Garret Lang, with original music from him and Jacob Gorski. Art by Meg Marco.
We've been on a bit of a mini World Models series over the last quarter: from introducing the topic with Yi Tay, to exploring Marble with World Labs' Fei-Fei Li and Justin Johnson, to previewing World Models learned from massive gaming datasets with General Intuition's Pim de Witte (who has now written down their approach to World Models with Not Boring), to discussing the Cosmos World Model with Andrew White of Edison Scientific on our new Science pod, to writing up our own theses on Adversarial World Models. Meanwhile Nvidia, Waymo and Tesla have published their own approaches, Google has released Genie 3, and Yann LeCun has raised $1B for AMI and published LeWorldModel.
Today's guests have a radically different approach to World Modeling from every player we just mentioned. While Genie 3 is impressive, its many flaws demonstrate the issues with that approach: terrain clipping, noninteractivity (single player, no physics, no objects other than the player move), and a maximum of 60 seconds of immersion. Moonlake AI (inspired by the Dreamworks logo) is the diametric opposite: immediately multiplayer, incredibly interactive, indefinite lifetime, capable of MANY different kinds of world models by simulating environments, predicting outcomes, and planning over long horizons. This is enabled by bootstrapping from game engines and training custom agents. In Towards Efficient World Models, Chris Manning and Ian Goodfellow join Fan-Yun in explaining why their approach to efficiency, with structure and causality instead of just blind scaling, is sorely needed:
SOTA models still show physical or spatial understanding glitches, such as solid objects floating in mid-air or moving “inside” other solid objects.
If the goal is to plan for the next action, how often is a high-resolution pixel view necessary for modeling the world? Our bet is that there is a disproportionately large share of economically valuable tasks where such detail is not required.
After all, humans with a wide variety of sensory limitations have little difficulty doing almost everything in the world. Furthermore, for a large number of purposes, describing a scene or a situation in a few words of language (“the car's tires squealed as it cornered sharply”) is sufficient for understanding and planning.
Experiments also show that humans only partially process visual input in a top-down, task-directed way, often making use of abstracted object-level modeling. In almost all cases, partial representations combined with semantic understanding are sufficient.
…If the goal is to facilitate the understanding of causality in multimodal environments, then the world model—whether it is used in the virtual world or the physical world—must prioritize properties such as spatial and physical state consistency maintained over long time periods, and an ability to evolve the world that accurately reflects the consequences of actions. That's what Moonlake is building.
Game engines are the right starting abstraction for efficiently extracting causal relationships, and Moonlake is building the interfaces and community (including their new $30,000 Creator Cup) to kickstart the flywheel of actions-to-observations.
We were fortunate enough to attend their sessions at GDC 2026 (the Mecca of Game Devs), and were impressed by the huge variety and flexibility of the worlds people were building with Moonlake's tools already!
Live videos on the pod.
Full Video Pod on YouTube!
Timestamps
00:00 Benchmarking Gets Hard
00:47 Meet Moonlake Founders
01:26 Why Build World Models
03:12 Structure Not Just Scale
05:37 Defining Action Conditioned Worlds
07:32 Abstraction Versus Bitter Lesson
14:39 Language Versus JEPA Debate
20:27 Reasoning Traces And Rendering Layer
37:00 Gameplay Over Graphics
38:02 Fiction Rules And World Tweaks
39:15 Code Engines Beat Learned Priors
41:10 Diffusion Scaling Limits
43:23 Symbolic Versus Diffusion Boundary
46:14 Platform Vision Beyond Games
50:24 Spatial Audio And Multimodal Latents
54:23 NLP Roots Hiring And Moonlake Name
Transcript
[00:00:00] Cold Open
[00:00:00] Chris Manning: I think this whole space is extremely difficult as things are emerging now. And I mean, it's not only for world models, I think it's for everything including text-based models, right? 'Cause in the early days it seemed very easy to have good benchmarks 'cause we could do things like question answering benchmarks.
[00:00:20] But these days so much of what people are wanting to do is nothing like that, right? You're wanting to get some recommendations about which backpack would be best for you for your trip in Europe next month. It's not so easy to come up with a benchmark, and it's the same problem with these world models.
[00:00:41] Meet the Founders
[00:00:41] swyx: Okay. We're back in the studio with Moonlake's two leads. I guess there's other founders as well, but Fan-yun Sun and Chris Manning, welcome to the studio.
[00:00:54] Fan-yun Sun: Thanks. Thanks, Chris. Thanks for having us.
[00:00:56] swyx: You guys have burst onto the scene with a really refreshing [00:01:00] new take on world models.
[00:01:01] I just want to, I guess, ask how the two of you came together. Chris, you're a legend in NLP and just AI in general.
You're his grad student, I guess?
[00:01:10] Fan-yun Sun: Actually my co-founder.
[00:01:11] swyx: Oh, yeah.
[00:01:12] Fan-yun Sun: I should give a lot of credit to my co-founder, Sharon. She was actually working with Professor Fe Androgyn and then she ended up working with Ron and Chris Manning here.
[00:01:22] And then, so I got connected to Chris initially, actually, through my co-founder.
[00:01:26] What is Moonlake?
[00:01:26] swyx: What is Moonlake? Actually, I'm also very curious about the name, but, like, why go into world models?
[00:01:33] Fan-yun Sun: So I was working a lot with Nvidia Research during my PhD years on essentially generating interactive worlds to train reinforcement learning agents or embodied AI agents.
[00:01:44] And then there's two observations, one in academia and one in industry. In industry, folks at Nvidia are actually paying a lot of dollars to purchase these types of interactive worlds, whether it's for the sake of evaluation or training the robots, or policies, or models. And [00:02:00] then, in academia, the same thing is happening.
[00:02:02] And more specifically, when I was working with Nvidia on the synthetic data foundation model training project, we were generating a lot of this synthetic data and showing that, hey, this synthetic data is actually as useful as real-world data when it comes to multimodal pre-training.
[00:02:16] But then, like I said, there's a lot of dollars being paid out to external vendors or other folks to manually curate these types of data.
It was very clear to us that, okay, on our way to, let's call it, embodied general intelligence, models need to learn the consequences behind their actions, which means that they need interactive data, and the demand for those types of data is growing exponentially.
[00:02:38] But everybody's sort of thinking about it from a pure, say, video generation perspective or something else. But we feel like the true opportunity is actually building reasoning models that can do these things like how humans do these things today. So that's a little bit on the genesis of Moonlake, and I think the reason I got into world models was partly
[00:02:59] a philosophical [00:03:00] take on the world, where I, like, believe the simulation theory and stuff like that. But on the other hand, it's really just like, oh, there's an opportunity there that I feel like nobody's doing the way I think it should be done.
[00:03:10] Structure, Not Scale: The Vision
[00:03:10] Chris Manning: I can say a little bit about that.
[00:03:12] Yeah. So the overall goal is the pursuit of artificial intelligence, and most of my career has been doing that in the language space, and that's been just extremely productive. As we all know, the story of the last few years, I don't have to tell about how much we've achieved with large language models, but
[00:03:31] although they have been extremely effective for ramping up language and general intelligence, it's clearly not the whole world. There's this multimodal world of vision, sound, taste that you'd like to be dealing with, more than just language. And then the question is how to do it. And despite a huge investment in the computer vision space, right, as a research field computer [00:04:00] vision has been, for decades, far, far larger than the language space, actually.
[00:04:05] I think it's fair to say that vision understanding sort of stalled out, right?
You got to object recognition and then progress just wasn't being made, right? If you look at any of these vision language models, it's the language that's doing 90% of the work and the vision barely works. And so there's really an interesting research question as to why that is, and at heart, the ideas behind Moonlake are an attempt to answer that, believing that there can be a really rich connection to a more symbolic layer of abstracted understanding of visual domains, which isn't in the mainstream vision models, which are still trying to operate on the surface level of pixels.
[00:04:50] swyx: I think in one of your blog posts, you put it as structure, not scale. Is that a general thesis?
[00:04:57] Chris Manning: Yeah. Well, scale is good too.
[00:04:58] swyx: Yeah. Scale is good too.
[00:04:59] Chris Manning: [00:05:00] Lots of data is good as well, and scale, but nevertheless, you want the structure to be able to learn much more efficiently.
[00:05:07] swyx: Yeah. The other thing I really liked also is you put out an example of what your kind of reasoning traces look like.
[00:05:12] Right. Which you would, distill is the word that comes to mind. I don't even think that's a good description, but it would involve, for example, geometry, physics, affordances, symbolic logic, perceptual mappings, and what have you. But that is the kind of example that involves, let's call it, spatial reasoning, world model reasoning, as compared to normal LLM reasoning.
[00:05:35] Yeah.
[00:05:36] Defining World Models vs Video Generation
[00:05:36] Vibhu: But also, like, taking a step back. So how do you guys define world models? A lot of people see, okay, you can do diffusion, you can do video generation. But you guys put out quite a few blog posts. You put out an essay recently, we can even pull it up, about efficient world models.
You have a pretty structural definition here, but for the general audience that doesn't super follow the space, right,
[00:05:55] what's the difference in what we see from, like, a video generation model to [00:06:00] a world gen, a simulator? How do you kind of paint that?
[00:06:02] Chris Manning: Yeah, so I think this is actually a little bit subtle, because people look at these amazing generative AI video models, Sora, Veo 3, Genie, one of these things, and they think, oh, this is amazing.
[00:06:17] We've solved understanding the world because you can produce these generative AI videos. But the reality is that although the visuals do look fantastic, those visuals actually aren't accompanied by an understanding of the 3D world, understanding how objects can move, what the consequences of different actions are, and that's what's really needed for spatial intelligence.
[00:06:49] So, I mean, a term we sometimes use is that you need action-conditioned world models. That you only actually have a world model if you can predict, [00:07:00] given some action is taken, what is going to change in the world because of it. And in particular, that becomes hard over longer time scales. So if you're simply trying to
[00:07:12] predict the next video frame, that's not so difficult. But what you actually want to do is understand the likely consequences of actions minutes into the future. And to do that, you actually need much more of an abstracted semantic model of the world.
[00:07:32] The Bitter Lesson & Data Abstraction
[00:07:32] swyx: Yeah, the question comes where you want to have more structure than is available in just predicting the next token.
[00:07:41] And typically, well, let's call it the experience of the last five years has been that that is just washed away by scale, right? So what is the right middle ground here, where you don't ignore the bitter lesson, but also you
can be more efficient than what we're doing today?
[00:07:57] Chris Manning: One possibility [00:08:00] is, look, if we just collect masses and masses and masses and masses of video data, this problem will be solved.
[00:08:11] Under certain assumptions that could be true, but there are sort of multiple avenues in which it could not be true. The first is, what's really essential is understanding the consequences of actions, producing an action-conditioned world model. And if you are simply collecting observational video data, which is the easy stuff to collect, when you're sort of mining online videos, you don't actually
[00:08:41] know the actions that are being taken to see how the video is changing. And so if you are never collecting actions directly and you are having to try and infer them from what happened in the observed video, that's not impossible, but it's very [00:09:00] hard, and it's not really established that you can get that to work at any scale yet.
[00:09:05] And so there's a lot of premium on collecting action-conditioned video data, which is part of why there's been a lot of interest in using simulation, so that you can be collecting data where you do know the actions, which isn't in quite such limited supply. But there's also the limit of as much data as you could possibly have.
[00:09:28] Maybe the problem is eventually solvable, but even though we collect huge amounts of text data, it is always at a great level of abstraction, right? Language is a human-designed, abstracted representation where there's meaning in each token and it's representing an abstraction of the world, right?
[00:09:51] As soon as you are describing someone as a professor, and as soon as you are saying that they're condescending, right? These are very [00:10:00] abstracted descriptions of the world.
It's not what you're observing at the pixel level, and to get to that kind of degree of abstraction starting from pixels takes orders of magnitude of extra data and processing.
[00:10:14] And so, although we absolutely want to get as much data as possible, use the bitter lesson, nevertheless, if there are ways in which you can work with five orders of magnitude less data than people working purely from pixels, you're gonna be able to make a lot more progress, a lot more quickly.
[00:10:34] And that's the bet here. And so you could just say that's only wanting to be able to do it more efficiently, do it more quickly, do it more cheaply. But I think it's actually more than that. I think one should be making the analogy to how human beings work. At one level, yes, we have these high- [00:11:00] resolution eyes and we can look and see a scene like a video, but all of the evidence from neuroscience and psychology is that most of what comes into people's eyes is never processed.
[00:11:13] Right. You are doing fairly foveated processing of exactly what you're focusing on. But as soon as it's away from that, of, yeah, there's another guy over there, you're sort of only processing, top down, this very abstracted semantic description of the world around you. And so, that's what human beings are doing.
[00:11:33] They're working with semantic abstractions. And so I think it is just the right representation. 'Cause we also have other goals: we want to be able to do real-time worlds, so that means there's a limit to how much processing you can do, and we want to do long-term planning and consistency. And again, that favors abstraction.
[00:11:55] I mean, I guess there was actually a recent blog post that [00:12:00] came out from our friends at Physical Intelligence, and they were sort of heading in the same direction. They were saying, oh, for the pi
[00:12:06] swyx: Pi model.
[00:12:07] Chris Manning: Yeah. Yeah.
To maintain a long-term memory of what's happening in the world, so we can go longer term, we're actually storing text of what has been happening in the world.
[00:12:19] Right. It is not such a successful strategy to try to keep it all at a pixel level.
[00:12:24] Vibhu: And yeah, I mean, you can see it in video models, like, that temporal consistency. We're at a scale of train on all the video data we have, and we have it for maybe 30 seconds, a few minutes. That's not the same as a game state played for half an hour.
[00:12:37] Right. I thought you guys break it down pretty well. You have a blog post about building multimodal worlds with an agent. I dunno if you guys wanna talk about this. This is one of the things I read, I thought.
[00:12:48] swyx: Yeah, it's the thing I talked about with the reasoning chain. Yeah.
[00:12:51] Vibhu: So there's, like, different phases to this.
[00:12:53] It seems like it's more of an agent, a scaffold, a very different approach than just type in a prompt, where you don't have the same consistency. [00:13:00] Also, for people that are listening, I would highly recommend reading it. It breaks down the problem in a different light, right?
[00:13:06] So, like, what do you need to consider when you're talking about video, like, world game models, right? What do you need to consider? What are the factors? What are the elements? What's the state? So I don't know if you guys have stuff to talk about for this one.
[00:13:19] Fan-yun Sun: Yeah. Actually, I wanted to add on a little bit, yeah,
[00:13:22] to our previous point, which is just like, we changed topics so quickly. I do feel like sometimes people get confused, like, oh, they're taking a method with abstraction, that means they don't believe in the bitter lesson. Like, that's just false, right? We are believers in the bitter lesson.
But then I feel like the question that we always discuss is, like, what is the right abstraction level today?
[00:13:42] The analogy I like to make is, let's just say we can encode and decode, represent all of images, videos, and audio in bytes. Then the most bitter-lesson approach is to train a next-byte prediction model as opposed to the next-token prediction model, where it's just like, okay, it's natively multimodal. But it's like, yeah, [00:14:00] to Chris's point, it's the scale and compute you need to achieve that.
[00:14:03] So that's why we always come back to, like, okay, what is the most efficient way to do it? And reasoning models, to the point of this blog post, are a showcase of, like, hey, we're actually just reasoning about the world and reasoning about the aspects of the world that matter for me to learn what I want to learn from this world model.
[00:14:21] swyx: Yeah, it's like you're improving the encoder of whatever you're trying to model. And a better representation would just represent the important things in less space. Yeah. Which would just be more efficient.
[00:14:33] Fan-yun Sun: Yeah.
[00:14:34] swyx: So yeah, I fully agree that it is not antagonistic to the bitter lesson.
[00:14:38] I do wanna mention one more thing. Are there any philosophical differences with the JEPA stuff that Yann is working on? I gotta go there. You're imagining some latent abstraction. I'm like, okay, fine. Let's talk about it, right? Like, it's an elephant in the room.
[00:14:52] Chris Manning: Yeah.
[00:14:53] JEPA & Philosophical Differences with LeCun
[00:14:53] Chris Manning: There are philosophical differences. Yann LeCun is a dear friend of mine, but [00:15:00] he has never appreciated the power of language in particular, or symbolic representations in general. Yann is a very visual thinker.
He always wants to claim that he thinks visually and there are no words, symbols, or math in his head.
[00:15:21] Maybe that's true of Yann. It's certainly not the way I think. But at any rate, the world according to Yann is that the basic stuff of the world and of intelligence is visual, and language is just this low-bit-rate communication mechanism between humans, and it doesn't have much other utility, and it's far inferior to the high-bit-rate video that comes into your eyes.
[00:15:53] And I think he's fundamentally missing a number of important things [00:16:00] there. Think of this evolutionary argument looking at animals, right? The closest analogy is with chimps, right? So chimpanzees have fairly similar brains to human beings. They have great vision systems, they have great memory systems.
[00:16:18] They've got better short-term memory than we do. They can plan, they can build primitive tools. But humans are massively ahead in what we understand about the world, what we can plan, what we can build. And essentially what took off for us was that humans managed to develop language, and that gave a symbolic knowledge representation and reasoning level, which just gave this sort of vaulting of what could be done with the intelligence in brains.
[00:16:59] So the [00:17:00] philosopher Dan Dennett refers to language as a cognitive tool and argues that humans, unique among the creatures in the world, have managed to build their own cognitive tools, and language is the famous first example. But other things like mathematics and programming languages are also cognitive tools.
[00:17:21] They give you an ability to think in abstractions, in extended causal reasoning chains. And that allows you to do much more. And we use that for spatial representation and intelligence and planning and gameplay as well.
So we believe, and this is underlying the specific technologies that Moonlake is making, that symbolic representations are powerful.
[00:17:50] And you want to use that in your understanding of the visual world when you want a causal understanding, when you want to maintain long-term [00:18:00] consistency and prediction. And as I understand it, that's just not in Yann LeCun's worldview. So I think that's the fundamental philosophical difference. Then there's the specific model
[00:18:11] he's been advancing, JEPA. That's a reasonable research bet as a direction to head in for building out a model of the visual world. To my mind, it's sort of one reasonable research bet. It's not really established that it's the best one that everyone should be following.
[00:18:32] swyx: At least developed at scale, at Meta.
[00:18:34] But it's not just vision, right? I mean, JEPA, just joint embedding prediction, can be applied to anything really. And people have done it. The argument is that if there is a latent representation that is probably more suited to the task, then why not let machines do it for us instead of predefining it at all?
[00:18:50] And isn't something like a JEPA-shaped thing the right answer? And if not, why not?
[00:18:55] Chris Manning: So I think there's a part of JEPA that's right, which is [00:19:00] you do want to have a joint embedding that gives you a consistent model of the world. And Yann's argument is you can never get that from autoregressive language models 'cause they're sort of left to right, churning out one token at a time.
[00:19:22] I guess this is where we're in the research arguments of the field. I'm not actually convinced that's right. 'Cause although the token production is this autoregressive process that's heading left to right, I guess it doesn't have to be left to right.
But anyway, in a sequence of tokens, we could have right-to-left Arabic.
[00:19:40] But although that's true, all of the weights of the model that are internal to the transformer, they are a joint model of the model's understanding of the world. And so I think you can think of the weights of the model as a form of joint representation, [00:20:00] and therefore it is plausible to think that could be the basis of a world model, which avoids Yann's objections.
[00:20:10] swyx: I think I follow, and obviously that would touch on what Moonlake eventually ends up doing as well. Right. Which, it's hard to tell because you put out the end results, but we don't know the inputs that go into it. So that's something that we have to figure out over time.
[00:20:25] Vibhu: Yeah. I mean, I guess this kind of breaks down some of the outputs. Do you wanna walk us through it?
[00:20:31] Reasoning Traces & Interactive Worlds
[00:20:31] Fan-yun Sun: Yeah. So this really just walks us through the reasoning traces of, like, okay, let's just say we wanna build a world. In this context, it's really just a game demo that shows the variety of interactions that this world model can build.
[00:20:45] And yeah, it's really just the reasoning traces of, like, okay, it was prompted to create a bowling game. How did it achieve what you saw, that level of causality, interaction and consistency, right? So yeah, this is almost just like an example of [00:21:00] a reasoning trace.
[00:21:01] swyx: Very detailed.
[00:21:01] Fan-yun Sun: Yeah.
[00:21:02] Vibhu: Very, very detailed. You don't even realize it, right? Like, when a video is generated, what happens when a ball strikes a pin, right? So first, there's audio in that, like, audio triggers happen, the score increments, the world changes. Like, pins have to start dropping. There's a timer that goes on. It's just very similar to how we're now used to reasoning for language models.
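The ball-strikes-pin chain described here (audio trigger, score increment, pins dropping, timer) can be read as an action-conditioned transition over a small symbolic state. The sketch below is only an illustration under stated assumptions: `BowlingState`, `step`, the event names, and the `pins_hit` argument (which stands in for whatever a physics engine would compute) are all hypothetical and are not Moonlake's actual representation.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class BowlingState:
    """Toy symbolic world state (hypothetical fields, not Moonlake's schema)."""
    pins_standing: int = 10
    score: int = 0
    ball_in_hand: bool = True


def step(state: BowlingState, action: str, pins_hit: int = 0) -> Tuple[BowlingState, List[str]]:
    """Action-conditioned transition: (state, action) -> (next state, triggered events).

    `pins_hit` is an assumed stand-in for the outcome a physics engine would produce.
    """
    if action == "throw" and state.ball_in_hand:
        # Clamp so we never knock down a negative number or more pins than are standing.
        knocked = max(0, min(pins_hit, state.pins_standing))
        events = ["start_reset_timer"]
        if knocked:
            events.insert(0, "play_impact_audio")
        next_state = replace(
            state,
            pins_standing=state.pins_standing - knocked,
            score=state.score + knocked,
            ball_in_hand=False,
        )
        return next_state, events
    if action == "reset":
        # A reset starts a fresh rack but keeps the running score.
        return replace(state, pins_standing=10, ball_in_hand=True), ["respawn_pins"]
    # Any other action leaves the world unchanged and triggers nothing.
    return state, []


state, events = step(BowlingState(), "throw", pins_hit=7)
print(state, events)
```

The point of the sketch is that the consequence of an action is a function of the current state and the action taken, so the same throw can be replayed, scored, and reset consistently, which is harder to guarantee when only pixels are predicted.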
It's just like very similar to how now we're used to reasoning for language models.[00:21:20] There's a whole state of what happens. So geometry, physics, all this stuff. And then yeah, there's kind of that single prompt. So asset, ation all this stuff. It's like a, it's a nice view to see what's going on.[00:21:32] swyx: I think Sun is also too polite to point out that, both like Google's genie, demos as well as world Labs is marble, do not have interactive worlds.[00:21:41] Fan-yun Sun: That's the benefit of having a reasoning model, right? Like, because you can, you can say, oh, like maybe in this particular context, I want to learn how to bowl. And then you can say, okay, then what is it important when it comes to learning how to bowl? Okay, maybe it's like I need to understand the, the basic of like, physics and I want to throw it over [00:22:00] them.[00:22:00] I wanna know that when I, when it resets it's a new game. So I know that yeah, basically, you know to pick up the ball, you know that ball's gonna cause the pins to fall down. You know that what's important to this particular bowling game is to score and you know that the score corresponds to the number of pins that fell down.[00:22:19] So it's just like, if it's a model that sort of knows what it. Looks like, knows what a bowling game looks like, but doesn't actually allows you to practice over and over again and to understand that, oh, like what it takes to actually get a high score. Then it sort of doesn't actually allow you to learn what you set out to learn within the world model.[00:22:38] And I think this is really just one example of showing like the advantages of the approach that we're taking over most the, let's call it the zeitgeist, is today, when people talk about clinical role models,[00:22:51] Chris Manning: right? 
So it sort of seems like the question to ask when there's a world model is:
[00:22:58] Can I not [00:23:00] only just wander around the world and look at the beautiful graphics, can I interact with the objects in the world and see the right consequences of actions?
[00:23:11] Vibhu: And you also understand what the consequences would be if you do something, right? So it's not just like, okay, there's one thing, if I pick it up, something will happen.
[00:23:19] But there's 50 options and I know I can infer what would happen if I do any of them. Right. So it's very different when you can actually see it, play around with it.
[00:23:28] Beyond Unity: Cognitive Tools for World Building
[00:23:31] swyx: There's two cheeky elements of that. I mean, I guess the less ambitious one is, let's really establish for listeners, why is this fundamentally different than writing Unity code, right?
[00:23:40] Like just creating a model to translate a prompt into Unity code.
[00:23:44] Fan-yun Sun: So there is an underlying physics engine. Yeah. In that sense, there's some overlap with Unity, but the way we think about it is, physics engines, tools, or code are cognitive tools, borrowing Chris's term, right? Like tools [00:24:00] that the model can employ as means to an end.
[00:24:04] So today maybe you say, okay, in this particular context we care about physics, we care about the long-term causal consequences. Then yes, we employ a physics engine. And then maybe tomorrow we say, okay, we're training, let's just say, drones, where we only really care about fluid dynamics and the visual aspect of the world.
[00:24:25] Then yeah, maybe the model actually doesn't have to use a physics engine. Or maybe it employs other types of representation or physics engine to achieve the task.
So yes, writing code for Unity is sort of similar to a tool that our model can employ, but our goal is for a model to take a representation-conditioned reasoning
[00:24:46] approach or process
[00:24:47] swyx: Yeah.
[00:24:47] Fan-yun Sun: internally.
[00:24:48] swyx: Yeah. Using these things as just, like, general tool calls. Right. Which I think is very interesting. The other, more ambitious one is some kind of recursive element where it becomes multiplayer, right? Like here, there's a single-player element, you're not [00:25:00] modeling any other people involved.
[00:25:01] And that is a whole other thing.
[00:25:04] Fan-yun Sun: But in fact, we can really do multiplayer. Oh yeah, okay. I haven't seen any double situations. So actually, just prompt our model to say, hey, configure to multiplayer. Then it'll do it like this. You'll be able to configure multiplayer.
[00:25:16] swyx: Great.
[00:25:17] Fan-yun Sun: Persistency, database for you.
[00:25:18] Easy. Yeah.
[00:25:19] Vibhu: So what are, like, some of the current limitations in where we're at? So there's one approach of, like, okay, scale up video predictors. Obviously there's data issues. With approaches like this, is it data constraints? What are the next steps? Is it real time? Like, so there's one side of, write an agent to write Unity code, but okay, I want to be streaming a game in real time.
[00:25:38] I want to have characters also being agents. Where do we kinda see this scaling up? Right?
[00:25:44] Fan-yun Sun: Yeah, there's definitely a data constraint. Like, the more data, the better. This reasoning model can almost basically act as humans do to operate a variety of tools and software to build whatever's necessary.
[00:25:57] And then there's a sort [00:26:00] of fidelity constraint, which we're actually solving with another model, which we can talk about later. But it's like, it's not as easy to get to photorealism with the approach that we're taking.
But we think there are better solutions to that, which we can dive into later.[00:26:15] Vibhu: The one thing you note here is it's a diffusion model, right? So there's, there's a few approaches, diffusion, Gaussian splatting, yeah, so Rie is a diffusion model, you guys wanna[00:26:25] Fan-yun Sun: Yeah.[00:26:25] Vibhu: Introduce,[00:26:26] Fan-yun Sun: yeah, totally.[00:26:26] Rie: Neural Rendering & Skins for Worlds[00:26:26] Fan-yun Sun: So within our world modeling framework, we think there are two models that we train, right?[00:26:31] Like, there's the multimodal reasoning model that we just talked about that essentially handles mainly the causality, the persistency and logic determinism of the world. And then Rie is our bet on saying, okay, like while that model can take care of all these things that we just talked about, its limitation compared to existing, say, video models, is that it doesn't have as high of a pixel [00:27:00] quality right out of the gate, right?[00:27:02] And Rie is to say, Hey, we can actually take whatever persistent representation that we generate with our multimodal reasoning model and learn to restyle it into photorealistic styles or arbitrary styles you want. So this model is almost to say, Hey, I'm going to respect the persistency and interactivity of the world that you created, but my only job is to make sure that its pixel distribution is close to what we want.[00:27:29] Vibhu: Yeah.[00:27:30] swyx: Great example right there. You kept the KL divergence.[00:27:33] Fan-yun Sun: Oh. Where,[00:27:34] swyx: no, no. I mean this, this is a, a classic like, how you don't stray too far from the source material as you, you kept the KL, which is Oh yeah. Kind of cool. 
Yeah.[00:27:43] Fan-yun Sun: Yeah.[00:27:44] swyx: I mean, and the[00:27:44] Chris Manning: difference is, and I mean Sun was pointing at this, we're sort of saying it's in one way a more difficult path, but a better path that, typically the diffusion models are producing the whole scene and it looks lovely, [00:28:00] but there isn't spatial understanding behind it, which is allowing for the real time graphics gameplay, the spatial intelligence, understanding the consequences of worlds, whereas this is taking a path where it is assuming an abstracted semantic model of the world's state.[00:28:20] And then the diffusion model is then being used on top of that to produce the high quality graphics.[00:28:27] swyx: Is there an intended practical, or business use for this, or is it like a, like a demonstration of capabilities?[00:28:34] Fan-yun Sun: We actually believe that this is gonna be the next paradigm of rendering. So it's gonna replace rasterizers, it's gonna replace DLSS today because it not only has these pixel priors that are learned from the world such that you can literally play any game in photorealistic styles, which is a lot of people's desire when they do GTA, right?[00:28:51] Like,[00:28:51] Vibhu: all the mods, all the people adding perfect lighting and all this.[00:28:54] swyx: So[00:28:54] Fan-yun Sun: skins[00:28:55] swyx: for worlds, let's call it[00:28:56] Fan-yun Sun: skins, let's call it skin for worlds. I,[00:28:58] Vibhu: it's also like, you can call it skin, you can call it [00:29:00] customization. You can play it how you want, right?[00:29:01] Fan-yun Sun: Yeah, exactly. And I think another thing that we really pointed out specifically in this blog is the programmability of it, right?[00:29:09] So what this means is that historically, the render is always a derivative of the game state, right? You're saying, oh, here's the game state, I'm rendering out a frame. 
But here I'm saying actually this render can be part of the gameplay loop. I can say something along the lines of, upon getting 10 [00:29:26] apples, my weapon of choice, my bullets are gonna turn into apples. And that's, that's possible because we can say, we can basically dynamically have certain game state trigger the, the preconditions to the render such that the rendering is now part of the game loop too. One thing is to just say, okay, it's, it's, it's the appearance.[00:29:47] But the second thing is also to say there's these novel interactions that are possible because this render now has actually priors of the world.[00:29:57] swyx: It is up to the artist to figure out what to do with it.[00:29:59] Fan-yun Sun: It [00:30:00] is up to the creators. Yes.[00:30:01] swyx: Yeah.[00:30:01] Fan-yun Sun: And I also think that's actually another big argument that we're making and the reason that we're taking the bet we're making is that a lot of the times, whether it's for embodied AI or gaming, like you want a layer where humans can inject their intentions.[00:30:15] So, for example, let's just say in the context of gaming, it's obviously like my creative intent, but maybe in the context of embodied AI, it's like, oh, like I take this foundational policy and I want to actually fine tune it to deploy in my house. So you want to almost say, inject, have a layer where humans can say, oh, here's the distribution of things I want to create to achieve my goal.[00:30:35] And I think 3D graphics as it is today is basically the layer for people to say, Hey, what do I care about in this world? And it allows, basically human intent to be expressed in these worlds much more explicitly and distributionally as opposed to just saying, Hey, I'm gonna generate like, arbitrary.[00:30:54] And it's like just prompts,[00:30:55] swyx: it's one of those things where like, I think you, you're going to build up a series of models, right? 
[00:31:00] This is just one of, this is probably like the highest utility or heaviest, frequency one, I don't dunno what to call this. Where like you Yeah. You can immediately drop this in on any game and you don't need anything else that.[00:31:10] That you guys do. But, I, I could see, I could see that I think the, the human intent is something that people are not even used to because we're so used to static worlds or, worlds that just don't react, or, I don't know. It's, it, you're kind of blowing my mind right now with like, I'm, I wonder if you've talked to people at GDC Hmm.[00:31:27] And what are they gonna do with it?[00:31:30] Fan-yun Sun: Yeah. Now the stance that we take on this front is like, we're not gonna be more creative than our users to ship[00:31:35] swyx: it out.[00:31:35] Fan-yun Sun: Yeah. But we wanna make sure that we're building things in a way that really allows them to express their intent.[00:31:41] swyx: The thing that you said about, here's the distribution that I want.[00:31:45] I think text may be too low of a bandwidth to. To really demonstrate, because I, I, there, I'm, I'm probably just gonna want to drop in a bunch of, reference assets and then you can figure it out from[00:31:58] Vibhu: there. But you probably wanna do a, a mixture of [00:32:00] both, right? Like you throw in a few images. I wanted this style.[00:32:02] Yeah. I want it to look like this. So it, it's, it's a mixture, right?[00:32:05] Chris Manning: I, I think it's a mixture. I mean, yeah, I mean there's clearly a visual component of this, and it's not that, everything can be text. ‘cause of course you want to give a visual look, but there's also a massive amount of giving the overall picture of the look of the world and the behavior of things that you can express in a few words of text.[00:32:32] And it be very time consuming and difficult to do via visual means. 
So I think, yeah, you want a combination of both.[00:32:40] Evaluating World Models[00:32:40] Vibhu: So one question I kind of have is, how do we go about evaluating world models? So like, there's many axes, right? One is like, okay. I have preferences. How well do we adhere to prompts? One is the simulation.[00:32:50] One is like do things, is there core logic that's broken? So coming from we know how to evaluate diffusion, there's fidelity, there's [00:33:00] stuff like that. But what are some of the challenges that most people probably aren't thinking about?[00:33:04] Fan-yun Sun: Yeah, I think this is like a great question and probably one of the hardest questions in world models because like, I think it always comes back to what are you building this world model for?[00:33:13] And depending on your end goal and purpose, the evaluation should differ. So in the context of games, then the most direct way of measuring is how much time are people actually spending in this world that you create? And if your goal is to say, for example, in the context that we just talked about, like, hey, deploying an embodied AI agent, then your end [00:33:33] metric is then, okay, after training in these worlds that you generate, how robust is it when you actually deploy to the target environment. But then, it's, it's hard to measure these end metrics. So today people have like these proxy metrics that I call that basically try to measure what we really care about, which is the end metrics, but then frankly it's different for every use case.[00:33:57] Yeah,[00:33:57] Vibhu: which seems like quite a challenge, right? Like in [00:34:00] language models or video models, image models, your benchmarks are proxies, right? People aren't actually asking instruction-following, tool-use questions. They're proxies of how well it will do downstream. 
But for this, so like, should teams, should companies have their own individual benchmarks outside of games?[00:34:16] If you think of stuff like, okay, video production, movies, stuff like that, that also want to use world models. Should, should they sort of internalize like. Their own proxy. Is this something you guys do? Where, where does that connect[00:34:28] Chris Manning: go? Yeah, I think this whole space is extremely difficult as things are emerging now.[00:34:35] And I mean, it's not only for world models, I think it's for everything including text-based models, right? ‘cause in the early days it seemed very easy to have good benchmarks ‘cause we could do things like question answering benchmarks and could you answer the question based on these documents and the various other kinds of, do pieces of logical reasoning or math.[00:34:58] But again, these are sort of. [00:35:00] And there were sort of visual equivalents of things like object recognition, right? For these small component tasks. These days so much of what people are wanting to do also with language models is nothing like that, right? You're wanting to, have an interaction with the language model and get some recommendations about which backpack would be best for you for your trip in Europe next month.[00:35:25] And it's not the same kind of thing, right? And it's not so easy to come up with a benchmark as to does this large language model give you an effective interaction for guiding you in a good way for shopping, right? So, and it's the same problem with these world models. So if we take the game design case, well success is that a game designer can.[00:35:57] Produce what they are [00:36:00] imagining in a reasonable amount of time. And that's really the kind of macro task. That's a very hard thing to turn into a benchmark and I think a lot of this is actually going to turn into people walking, walking with their feet. Right? 
I mean, I guess that's what's happening, at the large language model level, right?[00:36:23] When people are choosing to use, GPT-5 or Gemini or Claude, individuals are trying out these different models and deciding, oh, I like the kind of answers that GPT-5 gives me, or no, I feel like I get more accurate detail from Claude, right?[00:36:43] Vibhu: It's a lot of[00:36:43] Chris Manning: vibe check, a lot of people just using it.[00:36:45] It's vibe checking. I realize that, but it's actually whether people feel it's giving them utility in what they want. Right.[00:36:52] Vibhu: And the interesting thing there is like a lot of people prefer the visual, right? This looks pretty, which is not the objective of what this is [00:37:00] for, right? It's if a, if a game designer is working on something, they care about the game engine, right?[00:37:04] The state, it's, it can look whatever. You can fix that up later. Or you can have a really good game state and you can quickly edit it to 20 different versions, like keep state,[00:37:14] Chris Manning: right?[00:37:14] Vibhu: So[00:37:14] Chris Manning: that's a really important distinction, and for speaking to Moon Lake's strength, right? So, yeah, great visuals are lovely to look at for a few seconds, but games are really all about the concept, the game play.[00:37:33] And a lot of the time that doesn't actually even require great visuals. I mean, there are just lots of very successful games which have relatively primitive visuals, and there are other games where people have spent millions producing photorealistic visuals, and the game sucks, right? So, keeping those two axes apart is really important in thinking about what's important in a [00:38:00] world model for different uses.[00:38:02] swyx: This conversation is reminding me of some game review and fiction discussions I've had in my sort of non-AI related life. 
Some people might know Brandon Sanderson, who's a very famous fiction author, is a big game reviewer. And he, he's a big fan of video games where you change one thing about what you might normally assume about the world.[00:38:22] For example, Baba Is You, I don't know if you might have come across that, where like the rules change as you play the game. And also like where, you can do things like reverse time selectively or like change gravity selectively. And I think this also reminds me of other kinds of world models that are created by authors.[00:38:38] Where Ted Chiang is my typical example where he'll take the world that, you know today, but change one thing about it and, but then create a consistent world based on that. Which is a long-winded way for me to ask: is it easy to create alternative worlds that don't exist, but you change one thing and then let's, let's run a whole bunch of people through it to see if it works.[00:38:58] Chris Manning: My first answer will [00:39:00] be, that seems a lot easier and more conceivable to do using technology like Moon Lake's than with some of the other world models out there. Whether Sun can actually make it happen, I'll let him give a second answer.[00:39:15] swyx: If I guess for you, you're constrained by the game engine tool, right?[00:39:18] Like at the end of the day, that's the, that's the thought partner that you have. If I ask for something where like, if it never is allowed to reverse time or if gravity only ever works one way, then well that's it. 
But sometimes gravity might change,[00:39:33] Fan-yun Sun: but it's a lot easier to change with code as opposed to a model that is learned primarily on data of [00:39:42] real world and virtual worlds that are, I guess, like for example, Genie, like that's actually trained on a lot of real world data and a lot of virtual gaming data, and it's hard to say maybe it's easier to say, okay, I wanna change the visuals in like the time period of, of the world. Like, you can't change gravity, for [00:40:00] example.[00:40:00] Vibhu: I feel like you can to light bounds, right? Everything comes down to like, code is a better way to execute it, but the models aren't that diverse and creative, right? You can say, okay, make gravity slower. It can do that, but it's limited to your representation of how you text it out, right? Like they're, they're only gonna do a few iterations, whereas programmatically, if there's a game engine under the hood, you can kind of go wild, right?[00:40:22] So one of the, I dunno, one of the limitations of most models is that they're very overtrained to one style. Right. And extracting diversity is pretty difficult. At least that's something we've seen.[00:40:35] Fan-yun Sun: I mean, are there examples you have in mind where, with existing models? Yeah. Like it would be easier to do that's not using code.[00:40:43] Certain types of creative intent or like state transitions,[00:40:47] swyx: Clipping, other models, other world models are very good at clipping through things. Clipping my, my, my legs clipping through a rock because it's, it's just, it's just bad. [00:41:00] Like, you would have to struggle very hard with your stuff to actually make that happen.[00:41:04] Which I think is maybe a topic that you actually prepared on, Gaussian splatting versus, the other stuff.[00:41:09] Vibhu: Yeah. Yeah. It's just for those not super familiar, right? There's a, there's Gaussian splatting, there is diffusion. Like what works, what scales up. 
I feel like in February when Sora 1 came out the blog post was literally titled like,[00:41:21] swyx: you bring it up.[00:41:22] You never know.[00:41:23] Vibhu: Video generation models are world simulators. It's super bitter lesson pilled. Yeah, a lot of it is emergence, right? So, not to go through their blog post, basically their whole thing was as you scale up all this consistency, all this stuff just kind of solves, it's a very simple premise, right?[00:41:41] They just scaled up, diffusion, and from there, this is, this is Feb 2024, how much can we, it's already been two years, which is basically five years. How much more in AI time do we need to just scale up or, or do we hit a data cap? But I think we already talked about this a lot, right? Like this is back to the beginning discussion of what's [00:42:00] appropriate for the time.[00:42:01] And that seems like your approach, right?[00:42:03] Fan-yun Sun: Yeah. The point I'm trying to make is that there are very many different types of world simulators and like having a world simulator that can produce pixel coherency is very, very useful for games and, marketing and all these things, but it's not as useful as people think when it comes to causal reasoning,[00:42:25] when it comes to embodied AI. Yeah, like this title is true. We're not saying that it's, it's like, not a great world simulator, but actually in the blog that we wrote, the bet is more so that there is gonna be a disproportionately large share of value of real world tasks and virtual tasks where high resolution pixel fidelity is not needed.[00:42:47] Yes. Video models have their values.[00:42:50] swyx: Yeah. This is at the absolute limit of my physics understanding, but one example that comes to mind is basically having to solve like the equivalent of a three [00:43:00] body problem in a deterministic world, whereas the video models just approximate it well enough. Yeah.[00:43:08] Right. 
Like there's, there's some point at which your approach kind of runs into like the you now have to simulate the world. Please, thank you very much. And like you're trying to do that, but only to the extent that the game engine lets you and like game engines cannot do some things.[00:43:23] Fan-yun Sun: Yeah, no, I mean, I think the interesting or more technical question here actually is where do you draw the boundary between [00:43:32] what's handled with, let's say, diffusion priors and what's handled with symbolic priors?[00:43:38] swyx: Yes.[00:43:38] Fan-yun Sun: Okay.[00:43:38] swyx: Okay.[00:43:39] Fan-yun Sun: Right. Let's go there. Because this, this boundary can actually be fluid. Like I think like maybe what you're trying to get at is like, okay, people are saying pixel prior, everything. But what we're saying is, okay, there's a boundary that we draw where we think it provides the most economic value for the domains and things that we care about today.[00:43:59] [00:44:00] And I actually do think, and it's something that we do internally all the time, which is like, okay, given new equations that we learn or new elements of the world that we learn, or maybe some other knowledge that we acquire in the process of developing the models, should we still be maintaining this line exactly as it is today?[00:44:22] Or should we move it a little bit left or a little bit right? Right. Like sometimes we realize that, oh, like maybe customers or, or folks like want certain things that are better handled with a pixel prior as opposed to, symbolic prior than,[00:44:34] swyx: yeah. Your, your skin thing is a, is an example of moving it, right.[00:44:37] Yeah.[00:44:37] Or left. Yeah,[00:44:37] Fan-yun Sun: exactly.[00:44:38] swyx: I dunno what the, the left right is.[00:44:39] Fan-yun Sun: Yeah, yeah, yeah. No the, the model.[00:44:42] swyx: Yes.[00:44:42] Fan-yun Sun: Actually we have a few iterations of them. 
They're actually at slightly different[00:44:45] swyx: I know boundaries. You should, you should do that. That's a cool dimension to show.[00:44:49] Fan-yun Sun: Yeah.[00:44:50] swyx: Is quantum mechanics the diffusion prior of our world?[00:44:55] Right. It's like that's the boundary of classical mechanics versus quantum. Right? Like, that's it. At one [00:45:00] point God plays dice and the other point doesn't.[00:45:02] Fan-yun Sun: I dunno if Chris, you wanna say it, but I think, I think generally I feel like physics is better with symbolic priors.[00:45:08] Chris Manning: Even quantum physics?[00:45:09] Fan-yun Sun: Even quantum physics.[00:45:11] swyx: Yeah. This starts to get into MLST territory, is what I call it, where, he, he likes to get philosophical. We, we we're quite friendly.[00:45:18] Vibhu: I mean, we need to get, we need to get singularity. I heard some of that.[00:45:23] swyx: No, no, I think that is actually really helpful and man, I just want you to productize this like, as a product guy, I'm just like, oh, also[00:45:32] Vibhu: a gamer, I[00:45:33] swyx: wanna, it's like a researcher, like, it's cool.[00:45:35] Like this is a, the theoretical, like you have a very good, I don't know, like the way of thinking about these things, but I just wanna see you like, express it. I do think like your fundamentally things when, when you leave open new tools, like, okay, use, use human intent to incorporate it into how you render.[00:45:52] Artists are gonna have to take like two to three years to figure out what to do with this. And you just don't know.[00:45:57] Chris Manning: Right. But I think this [00:46:00] gives a much more approachable and controllable world for the society, which is the beauty, the beauty of, NLP, that that will enable it to be adopted and used.[00:46:10] And we are very hopeful about that. Yeah,[00:46:13] Fan-yun Sun: yeah. Yeah. 
I mean, we are, we are very focused actually on commercialization in the sense that like we do, we do really believe in the data flywheel approach. Yeah. Where, we put this in the hands of the creators and the users and then they will teach us what capabilities our model should improve.[00:46:27] And that's why we are, we are actually, like products and beta[00:46:31] swyx: Yeah. Focusing on gaming. What, what's like the adjacent thing to gaming[00:46:34] Fan-yun Sun: Embodied AI, basically. So maybe we can, we can I'll maybe start with where we see the platform in three years. Yeah. Which is like, okay. The users would tell us what they want to achieve.[00:46:45] The end goal could be, Hey, I just, I wanna make something to teach my kids the value of humility. Or it could be, Hey, I wanna fine tune my, drones to be really good at rescue situations. It could be vacuum robots. I want to like train [00:47:00] my manipulation or like vacuum robot to be very robust to my office, right?[00:47:04] But it's like, whatever it is, scenario robust to[00:47:06] swyx: my office[00:47:07] Fan-yun Sun: or like navigate very robustly in my office. But then it's like, whatever end goal that you want, our world model will say, okay, given what you want to achieve, let me generate a distribution of environments such that I can train and evaluate whatever it is you want.[00:47:24] Yeah. Right. Maybe for the purpose of games, it's just the end simulation and that's the end product. For certain policies, it's like I can train it within these environments and then help you see where your policy is failing or not. Yeah. And then, so I think,[00:47:37] swyx: so in that case, much more of a training tool.[00:47:40] Than in other training[00:47:41] Vibhu: evaluation? Both. Right?[00:47:43] swyx: Sure. Same. Same thing.[00:47:43] Fan-yun Sun: Yeah, same thing. 
I think it's just this world model that allows people to train any policy that can act in any multimodal environment.[00:47:51] swyx: Would it be harder to reward hack? Is there an angle here where it is harder to reward hack? Like it's just, I'll just put it generally because I think that's a, that's obviously a key [00:48:00] problem that a lot of people face when training agents in these environments, and I don't know, can you solve it?[00:48:07] Chris Manning: I think not necessarily. To the extent that there's a misspecified reward, it seems like it could be hacked in a more symbolic world or in a more pixel based world. I dunno if Sun's got any thoughts, but I don't think that's really being solved.[00:48:26] swyx: The other thing that comes to mind is just you could just build a better Sora as a video generator model, right?[00:48:31] Because then you, you would move the diffusion, side a bit further to the right. I think if I got the directionality correct. And that's it.[00:48:40] Vibhu: It's better on domains, right? Like on consistency over now, or for sure it exists versus something doesn't, right.[00:48:46] Chris Manning: So[00:48:46] swyx: yeah. Yeah. Is[00:48:49] Vibhu: is a question more like, like[00:48:51] swyx: I'm just riffing on like, how do you, what can you build, you know?[00:48:54] Oh, with the stuff that you have. I do think that the mind of the academic does go immediately to training [00:49:00] and evaluation, but like art tends to take unusual directions. Like you might end up,[00:49:06] Chris Manning: okay. Yeah. But the question is, can you use this piece of software to develop compelling gameplay? And I don't think you can take Sora and produce compelling gameplay, right?[00:49:19] If you want to have a world that you can wander around in a bit, you are good. 
But what are your abilities to have gameplay mechanics implemented the way you'd like them to be and to have things stay, with the long-term history of your gameplay that influences future actions. I think there's just nothing there for that.[00:49:39] swyx: Yeah, I do tend to agree. I, I'm just trying to sort of test the boundaries. I would also make the observation that as the AAA games industry has developed the line between what is a movie and what is a game has blurred. And you, you, you do end up basically producing a two hour movie as part of your game.[00:49:57] Fan-yun Sun: No, honestly, there, there's so many actually [00:50:00] applications in adjacent markets that our world model can go into. Yeah. But yeah, it, it's sort of fun to riff on. Although on the execution side, we need to stay focused with like, okay, what are the capabilities we want to unlock over time?[00:50:11] And there's a roadmap for that. But yeah, if we're just riffing on sort of like the possibilities, I feel like, whether it's endless Yeah, it's like classic[00:50:18] swyx: and the embedding for a possibility and endless in my mind, it's very close. Yeah. I do wanna, focus on one, like weird choice. I, I don't know if it's weird.[00:50:28] Maybe I'm, I got something here. Audio, right? You could have just said no audio. And audio in my mind has a lot of recursion, whereas in video you can just do ray casting and that's computationally much simpler. Audio just seems way harder. I don't know if you wanna just comment on just the spatial 3D audio [00:50:46] problem. Did you really have to do it? I guess you do to be immersive, but like a lot of people do treat it as like, well, you just stick a, a TTS model on top of[00:50:57] Vibhu: Well, there's a lot more to game audio than [00:51:00] just speech. Right. It's not just[00:51:01] swyx: TTS. Yeah. TTS, 
SFX, BGM, spatial, in my mind, echoes[00:51:06] Chris Manning: Yeah.[00:51:06] swyx: And reflections.[00:51:07] And I, I don't even know what's, what else? I don't know what, what other problems in this space.[00:51:13] Fan-yun Sun: Yeah, I think this point like the, it's sort of a more, more pointing to the benefits of using a game engine as a tool that's available to the model, right? Because like part of the spatial audio is from the code that is underlying the simulation.[00:51:32] And while we do give our model access to other types of audio models as tools.[00:51:39] swyx: None of them would be spatial, I think.[00:51:41] Fan-yun Sun: But that's exactly sort of my point too. We're giving our model an abstraction or a suite of tools such that it's able to achieve that. And you can argue that sort of spatial is like an emergence out of the tools and abstractions that we provide to the agents.[00:51:59] And I think that's the beauty of [00:52:00] this approach is like there's a lot of things kind of like how humans built technology and they're like Lego blocks that build on top of each other. And it's the same thing here. There's gonna be things that sort of just emerge from being able to put these things together in like combinatorially interesting ways,[00:52:14] Chris Manning: right?[00:52:15] So this integrated audio model exploits the understanding and semantics of the Moon Lake world, right? And whereas in general for the gen AI video models, there's no actual integration across to audio at all, right? That someone might stick some music or stick a soundscape or whatever else on top of their video.[00:52:44] So it's not a silent video, but they're in no way connected into a consistent world model. And there's nothing that says, okay, an action is happening in the video. 
Therefore there should be a sound that's [00:53:00] coming from this part of the visual field.[00:53:03] swyx: Yeah.[00:53:03] Vibhu: Is that different than Sora 2? Does it not have audio?[00:53:06] Not to say it's not like[00:53:08] swyx: amazing[00:53:08] Vibhu: isn't a spatial[00:53:09] swyx: audio.[00:53:09] Vibhu: It doesn't,[00:53:10] swyx: no. I've played around with it enough. It just sounds like someone put an ElevenLabs voice on top of it and just tried to do the lip sync.[00:53:18] Vibhu: Oh, yeah. I've seen, okay. Generate a dog at the beach and reactions to big wave and move[00:53:23] swyx: around.[00:53:23] It's definitely like, so have the dog, have the dog move away from camera and see if the, the sound goes down. It doesn't. 'Cause they don't have spatial audio.[00:53:32] Fan-yun Sun: We do want to basically like we, our world model, like the one we're training is basically towards the goal of having a combined latent representation across all these different modalities.[00:53:42] Right? Such that it can like reason across these different modalities. So for example, if I close my eyes and like you play a video, you play a sound of like a car skidding away from me. I almost can like, visually extrapolate that trajectory in my mind. And I think that type of capability, we want our model to be able to reason, right?[00:53:59] And that's the reason that [00:54:00] we're sort of taking this multimodal reasoning approach. It's like we want this combined latent space that can[00:54:05] swyx: Yeah. Oh, you said latent space. We like that. Here we have to play the, the bell every time that someone says latent space. No, you gotta train a Daredevil one. Where you, you, you, it's only audio, but you have to work out [00:54:15] where everything is.[00:54:19] Cool. I I think that that was, that was about it for our Moon Lake coverage. 
I do think that we have a couple of Chris Manning questions on IR and any other attention topics or NLP topics.
[00:54:31] Vibhu: Okay.
[00:54:31] swyx: Go ahead.
[00:54:32] Chris Manning's Journey: From NLP to World Models
[00:54:32] Vibhu: Well, no, I mean, yeah, it's just fun. We talked a bit about how you guys met, but you were basically the godfather of NLP, per se, right? [00:54:39] You spent the whole career on it, from early embeddings to early attention. You did 2015 attention for machine translation, everything. You had information retrieval, so RAG before RAG. We just wanna shout that out and admire a lot of that, right? So what prompted the switch over to world models? [00:54:56] How'd all that come about?
[00:54:58] Chris Manning: To some extent the answer [00:55:00] is the enthusiasms and creativity of students, but there's a bit of a history there, right? So, yeah, clearly most of my career has been doing stuff with language, and how I got into research was thinking, ah, this is just so amazing how humans can produce speech and understand each other in real time. [00:55:21] And somehow they manage to learn languages as kids. How could this possibly happen? And so, yeah, starting off I was very focused on language, but as it got into the 2010s, I'd been working on question answering, and then I started to get interested in visual question answering. [00:55:42] And that was an area where it was very noticeable that the visual understanding was bad, right? These were the days when it seemed like there was almost no visual [00:56:00] understanding. You were just getting answers that came from priors. So if you asked how many people are sitting at the table, it'd always answer two, regardless of how many people you could see in the picture. [00:56:11] And so it seemed like, oh, these models actually aren't able to get semantic information outta
Stewart Alsop sits down with Karol, a 3D generalist and digital artist with 25 years of experience, to talk about the evolving landscape of 3D art, from sculpting in ZBrush to the deep technical rabbit hole of Houdini, and how AI tools like Claude are quietly reshaping creative workflows. The conversation wanders into bigger territory: the singularity, accelerationism, the philosophical roots of Silicon Valley's techno-anxiety (including the Roko's Basilisk thought experiment and the writings of Nick Land), the slow unraveling of Hollywood's cultural monopoly, and what decentralized creative tools mean for independent artists. Stewart also points Karol toward the work of Fei-Fei Li and World Labs as a window into where 3D world modeling is heading next.

Timestamps
00:00 - Karol's 25-year journey from Photoshop and 2D art into Cinema 4D and the world of 3D.
05:00 - Why Houdini blew the ceiling off every other 3D program, and how node-based coding changed Karol's creative process entirely.
10:00 - The tension between visual thinking and technical thinking, and how constant digital stimuli has degraded Karol's internal imagination.
15:00 - Stewart reflects on Claude Code and how AI is about to dissolve the technical barriers in Houdini the same way it did for programming.
20:00 - The Sphere in Las Vegas, projection mapping, drone polo, and Stewart's vision for intimate tech-integrated experiences.
25:00 - Roko's Basilisk, fear-driven accelerationism, and why Latin America never caught the Silicon Valley doomsday bug.
30:00 - Hollywood's cultural machine, shared Western boogeymen, and how decentralized 3D art is replacing the $100M production monopoly.
35:00 - Karol's eclectic client roster: Utah Jazz, Apple, League of Legends, and a Buddhist temple in Los Angeles.
40:00 - Gaussian splatting, photogrammetry, point clouds, and where world models are taking 3D next.
45:00 - The freelance vs. studio dilemma, brutal VFX industry crunch culture, and Stewart's plan to own his entire podcast stack.
50:00 - Poland's economic rise, the hollowing out of the Netherlands, and capitalism as an endless infection with no clear cure.

Key Insights
Houdini as creative rebirth. After nearly burning out on conventional 3D software, Karol discovered that Houdini's node-based, code-driven architecture gave him something the other tools never could: a blank canvas with no ceiling. Rather than navigating a boat someone else built, he now builds the boat from scratch every time, which keeps the work perpetually challenging and alive.
Visual thinking is under attack. Karol noticed his once-vivid internal imagination quietly degrading over the years, and traces it directly to the overwhelming volume of digital stimuli in modern life. His response has been aggressive minimalism: stripping back inputs, physical and digital, to try to recover the creative mental space he once had naturally.
AI as a technical collaborator, not a replacement. Karol uses Claude daily, not to generate imagery, but to work through coding problems inside Houdini. He's clear that image generation is his job; what AI earns its place doing is explaining unfamiliar code and helping him push past technical blockers faster.
The freelance paradox. Twenty-five years of independence has meant total creative freedom alongside real financial instability: months of silence followed by weeks of 16-hour days. Karol has never resolved this tension, but holds onto the freedom anyway, and sees it as increasingly important as surveillance and corporate control tighten.
Roko's Basilisk explains Silicon Valley. Both Stewart and Karol land on the idea that the feverish, fear-driven energy behind tech accelerationism may trace back to this single thought experiment: the notion that if you don't help build the AI, it will punish you retroactively. Latin America, blissfully unaware of it, seems measurably calmer.
Decentralization is ending Hollywood's monopoly. The same forces making software cheaper and AI more powerful are quietly dismantling the $100M barrier to cultural creation. Karol's career, spanning album covers, Apple, the Utah Jazz, and a Buddhist temple, is a living proof of concept for what independent 3D generalism can look like outside the studio machine.
Owning your tools is a political act. Whether it's Karol resisting the pigeonhole of VFX studios or Stewart rebuilding his podcast infrastructure from scratch, both see the ability to own and control your own software and hardware as essential preparation for whatever comes next.
The first episode is titled What AI Can Do for Us/to Us, and the duality of the title is indicative of where we stand with artificial intelligence in 2024. Gates, along with the show's producers Morgan Neville and Caitrin Rogers, who are its lead executive producers, talks with several people involved in developing AI software, such as OpenAI co-founder Greg Brockman. They also talk with experts who study the technology, such as Dr. Fei-Fei Li of Stanford University. Kevin Roose, a technology reporter for The New York Times, is interviewed about the story he wrote when Bing's AI chatbot told him that it wanted to be alive and that
Jeetu Patel is the president and chief product officer at Cisco, where he leads a team of 30,000 people and is playing a central role in the massive AI infrastructure buildout happening right now. Previously, he spent five years as CPO at Box and 17 years running his own startup. Recently Jeetu organized an AI summit featuring industry leaders like Jensen Huang, Sam Altman, Marc Andreessen, and Fei-Fei Li.

We discuss:
1. How Cisco went AI-first across 90,000 employees
2. His six-part framework for building great companies: timing, market, team, product, brand, distribution
3. Why he says he couldn't have done this job without AI
4. His “right to win” strategic framework
5. His communication framework for preventing “packet loss” across an organization
6. Why he flips “praise in public, criticize in private” and does the exact opposite
7. The important communication lesson his mother taught him

Brought to you by:
Sentry - Code breaks, fix it faster: https://sentry.io/lenny
Framer - Build better websites faster: https://framer.com/lenny
Samsara - Saving lives with AI built for physical operations: https://samsara.com/lenny

Episode transcript: https://www.lennysnewsletter.com/p/ai-is-critical-for-humanitys-survival

Archive of all Lenny's Podcast transcripts: https://www.dropbox.com/scl/fo/yxi4s2w998p1gvtpu4193/AMdNPR8AOw0lMklwtnC0TrQ?rlkey=j06x0nipoti519e0xgm23zsn9&st=ahz0fj11&dl=0

Where to find Jeetu Patel:
• X: https://x.com/jpatel41
• LinkedIn: https://www.linkedin.com/in/jeetupatel
• Website: https://blogs.cisco.com/author/jeetupatel

Where to find Lenny:
• Newsletter: https://www.lennysnewsletter.com
• X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/

In this episode, we cover:
(00:00) Introduction and welcome
(04:15) Insights from Cisco's AI summit
(08:45) Transforming Cisco into an AI-first company
(15:33) What Cisco actually does in the AI infrastructure stack
(19:09) The future of AI
(24:36) Raising kids in the AI era
(29:46) “Permission to play” framework
(36:50) Lessons from great CEOs
(42:02) Leading at scale
(50:54) Why Jeetu inverts the ‘praise in public, criticize in private’ rule
(57:45) Surrounding yourself with good human beings
(58:35) Lessons from loss
(01:03:21) Career advice: platforms, hunger, and preparation
(01:10:21) The six-part framework for building great companies
(01:19:05) Lightning round and final thoughts

Resources and episode mentions: https://www.lennysnewsletter.com/p/ai-is-critical-for-humanitys-survival

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com.

Lenny may be an investor in the companies discussed. To hear more, visit www.lennysnewsletter.com
AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store
Listen to Full Audio at https://podcasts.apple.com/us/podcast/gemini-3-1-pros-reasoning-leap-microsofts-10-000-year/id1684415169?i=1000750546704
AI has mastered language, sort of. But the real world is way messier.

In this episode of TechFirst, John Koetsier sits down with Kirin Sinha, founder and CEO of Illumix, to explore what comes after large language models: world models, spatial intelligence, and physical AI.

They unpack why LLMs alone won't get us to human-level intelligence, what it actually takes for machines to understand physical space, and how technologies born in augmented reality are now powering robotics, wearables, and real-world AI systems.

This conversation goes deep on:
• What “world models” really are, and why everyone from Fei-Fei Li to Jeff Bezos is betting on them
• Why continuous video and outward-facing cameras are so hard for AI
• The perception stack behind robots and smart glasses
• Edge vs cloud compute, and why latency and privacy matter more than ever
• How AR laid the groundwork for the next generation of physical intelligence

If you're building or betting on robotics, smart wearables, AR, or physical AI, this episode explains the infrastructure shift that's already underway.

Guest
Kirin Sinha
Founder & CEO, Illumix
https://www.illumix.com
What if the next leap in artificial intelligence isn't about better language, but better understanding of space?

In this episode, a16z General Partner Erik Torenberg moderates a conversation with Fei-Fei Li, cofounder and CEO of World Labs, and a16z General Partner Martin Casado, an early investor in the company. Together, they dive into the concept of world models: AI systems that can understand and reason about the 3D, physical world, not just generate text.

Often called the “godmother of AI,” Fei-Fei explains why spatial intelligence is a fundamental and still-missing piece of today's AI, and why she's building an entire company to solve it. Martin shares how he and Fei-Fei aligned on this vision long before it became fashionable, and why it could reshape the future of robotics, creativity, and computational interfaces.

From the limits of LLMs to the promise of embodied intelligence, this conversation blends personal stories with deep technical insights, exploring what it really means to build AI that understands the real (and virtual) world.

Follow Fei-Fei Li: https://x.com/drfeifei
Follow Martin Casado: https://x.com/martin_casado

Stay Updated:
If you enjoyed this episode, be sure to like, subscribe, and share with your friends!
Find a16z on X: https://x.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Listen to the a16z Podcast on Spotify: https://open.spotify.com/show/5bC65RDvs3oxnLyqqvkUYX
Listen to the a16z Podcast on Apple Podcasts: https://podcasts.apple.com/us/podcast/a16z-podcast/id842818711
Follow our host: https://x.com/eriktorenberg

Check out everything a16z is doing with artificial intelligence here, including articles, projects, and more podcasts.

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Are we witnessing an AI-fueled gold rush or the early signs of an epic crash? Listen to these hard-hitting discussions on bubbles, breakthroughs, and the real impact behind Silicon Valley's AI obsession.

Time Magazine's 'Person of the Year': the Architects of AI
The AI Wildfire Is Coming. It's Going to Be Very Painful and Incredibly Healthy.
'ChatGPT for Doctors' Startup Doubles Valuation to $12 Billion as Revenue Surges
Trump Pretends To Block State AI Laws; Media Pretends That's Legal
It's beginning to look a lot like (AI) Christmas
Amazon Prime Video Pulls AI-Powered Recaps After Fallout Flub
Could America win the AI race but lose the war?
Google Says First AI Glasses With Gemini Will Arrive in 2026
Border Patrol Agent Recorded Raid with Meta's Ray-Ban Smart Glasses
The countdown to the world's first social media ban for children
US could demand five-year social media history from tourists before allowing entry
Reddit making global changes to protect kids after social media ban - 9to5Mac
There are no good outcomes for the Warner Bros. sale
Paramount CEO Made Trump a Secret Promise on CNN in Warner Bros. Convo
Whatnot's Schlock Empire Shows Digital Live Shopping Can Thrive in America
The Military Almost Got the Right to Repair. Lawmakers Just Took It Away
Apple loses its appeal of a scathing contempt ruling in iOS payments case
Japan law opening phone app stores to go into effect
Microsoft Excel Turns 40, Remains Stubbornly Unkillable - Slashdot
Clair Obscur: Expedition 33 sweeps The Game Awards - analysis and full winners list
Microsoft promises more bug payouts, with or without a bounty program
An ex-Twitter lawyer is trying to bring Twitter back

Host: Leo Laporte
Guests: Iain Thomson, Owen Thomas, and Jason Hiner

Download or subscribe to This Week in Tech at https://twit.tv/shows/this-week-in-tech

Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free audio and video feeds, a members-only Discord, and exclusive content. Join today: https://twit.tv/clubtwit

Sponsors:
shopify.com/twit
NetSuite.com/TWIT
ventionteams.com/twit
zscaler.com/security
helixsleep.com/twit
Dr. Fei-Fei Li (@drfeifei) is the inaugural Sequoia Professor in the Computer Science Department at Stanford University, a founding co-director of Stanford's Human-Centered AI Institute, and the co-founder and CEO of World Labs, a generative AI company focusing on Spatial Intelligence. She is the author of The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI, her memoir and one of Barack Obama's recommended books on AI and a Financial Times best book of 2023.

This episode is brought to you by:
Seed's DS-01® Daily Synbiotic broad spectrum 24-strain probiotic + prebiotic: https://seed.com/tim
Helix Sleep premium mattresses: https://helixsleep.com/tim
Coyote the card game, which I co-created with Exploding Kittens: https://coyotegame.com/
Wealthfront high-yield cash account: https://wealthfront.com/tim

New clients get 3.50% base APY from program banks + additional 0.65% boost for 3 months on your uninvested cash (max $150k balance). Terms apply. The Cash Account offered by Wealthfront Brokerage LLC (“WFB”) member FINRA/SIPC, not a bank. The base APY as of 11/07/2025 is representative, can change, and requires no minimum. Tim Ferriss, a non-client, receives compensation from WFB for advertising and holds a non-controlling equity interest in the corporate parent of WFB. Experiences will vary. Outcomes not guaranteed. Instant withdrawals may be limited by your receiving firm and other factors. Investment advisory services provided by Wealthfront Advisers LLC, an SEC-registered investment adviser. Securities investments: not bank deposits, bank-guaranteed or FDIC-insured, and may lose value.

*

For show notes and past guests on The Tim Ferriss Show, please visit tim.blog/podcast.
For deals from sponsors of The Tim Ferriss Show, please visit tim.blog/podcast-sponsors
Sign up for Tim's email newsletter (5-Bullet Friday) at tim.blog/friday.
For transcripts of episodes, go to tim.blog/transcripts.
Discover Tim's books: tim.blog/books.

Follow Tim:
Twitter: twitter.com/tferriss
Instagram: instagram.com/timferriss
YouTube: youtube.com/timferriss
Facebook: facebook.com/timferriss
LinkedIn: linkedin.com/in/timferriss

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
Fei-Fei Li is a Stanford professor, co-director of Stanford Institute for Human-Centered Artificial Intelligence, and co-founder of World Labs. She created ImageNet, the dataset that sparked the deep learning revolution. Justin Johnson is her former PhD student, ex-professor at Michigan, ex-Meta researcher, and now co-founder of World Labs.

Together, they just launched Marble, the first model that generates explorable 3D worlds from text or images.

In this episode Fei-Fei and Justin explore why spatial intelligence is fundamentally different from language, what's missing from current world models (hint: physics), and the architectural insight that transformers are actually set models, not sequence models.

Resources:
Follow Fei-Fei on X: https://x.com/drfeifei
Follow Justin on X: https://x.com/jcjohnss
Follow Shawn on X: https://x.com/swyx
Follow Alessio on X: https://x.com/fanahova

Stay Updated:
If you enjoyed this episode, please be sure to like, subscribe, and share with your friends.
Follow a16z on X: https://x.com/a16z
Follow a16z on LinkedIn: https://www.linkedin.com/company/a16z
Follow the a16z Podcast on Spotify: https://open.spotify.com/show/5bC65RDvs3oxnLyqqvkUYX
Follow the a16z Podcast on Apple Podcasts: https://podcasts.apple.com/us/podcast/a16z-podcast/id842818711

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details, please see http://a16z.com/disclosures.
Fei-Fei Li and Justin Johnson are cofounders of World Labs, who have recently launched Marble (https://marble.worldlabs.ai/), a new kind of generative “world model” that can create editable 3D environments from text, images, and other spatial inputs. Marble lets creators generate persistent 3D worlds, precisely control cameras, and interactively edit scenes, making it a powerful tool for games, film, VR, robotics simulation, and more. In this episode, Fei-Fei and Justin share how their journey from ImageNet and Stanford research led to World Labs, why spatial intelligence is the next frontier after LLMs, and how world models could change how machines see, understand, and build in 3D.

We discuss:
* The massive compute scaling from AlexNet to today and why world models and spatial data are the most compelling way to “soak up” modern GPU clusters compared to language alone.
* What Marble actually is: a generative model of 3D worlds that turns text and images into editable scenes using Gaussian splats, supports precise camera control and recording, and runs interactively on phones, laptops, and VR headsets.
* Fei-Fei's essay on spatial intelligence as a distinct form of intelligence from language: from picking up a mug to inferring the 3D structure of DNA, and why language is a lossy, low-bandwidth channel for describing the rich 3D/4D world we live in.
* Whether current models “understand” physics or just fit patterns: the gap between predicting orbits and discovering F=ma, and how attaching physical properties to splats and distilling physics engines into neural networks could lead to genuine causal reasoning.
* The changing role of academia in AI, why Fei-Fei worries more about under-resourced universities than “open vs closed,” and how initiatives like national AI compute clouds and open benchmarks can rebalance the ecosystem.
* Why transformers are fundamentally set models, not sequence models, and how that perspective opens up new architectures for world models, especially as hardware shifts from single GPUs to massive distributed clusters.
* Real use cases for Marble today: previsualization and VFX, game environments, virtual production, interior and architectural design (including kitchen remodels), and generating synthetic simulation worlds for training embodied agents and robots.
* How spatial intelligence and language intelligence will work together in multimodal systems, and why the goal isn't to throw away LLMs but to complement them with rich, embodied models of the world.
* Fei-Fei and Justin's long-term vision for spatial intelligence: from creative tools for artists and game devs to broader applications in science, medicine, and real-world decision-making.

Fei-Fei Li
* X: https://x.com/drfeifei
* LinkedIn: https://www.linkedin.com/in/fei-fei-li-4541247

Justin Johnson
* X: https://x.com/jcjohnss
* LinkedIn: https://www.linkedin.com/in/justin-johnson-41b43664

Where to find Latent Space
* X: https://x.com/latentspacepod

Full Video Episode

Timestamps
00:00:00 Introduction and the Fei-Fei Li & Justin Johnson Partnership
00:02:00 From ImageNet to World Models: The Evolution of Computer Vision
00:12:42 Dense Captioning and Early Vision-Language Work
00:19:57 Spatial Intelligence: Beyond Language Models
00:28:46 Introducing Marble: World Labs' First Spatial Intelligence Model
00:33:21 Gaussian Splats and the Technical Architecture of Marble
00:22:10 Physics, Dynamics, and the Future of World Models
00:41:09 Multimodality and the Interplay of Language and Space
00:37:37 Use Cases: From Creative Industries to Robotics and Embodied AI
00:56:58 Hiring, Research Directions, and the Future of World Labs

Get full access to Latent.Space at www.latent.space/subscribe
Fei-Fei Li and Justin Johnson are cofounders of World Labs, which recently launched Marble (https://marble.worldlabs.ai/), a new kind of generative “world model” that can create editable 3D environments from text, images, and other spatial inputs. Marble lets creators generate persistent 3D worlds, precisely control cameras, and interactively edit scenes, making it a powerful tool for games, film, VR, robotics simulation, and more. In this episode, Fei-Fei and Justin share how their journey from ImageNet and Stanford research led to World Labs, why spatial intelligence is the next frontier after LLMs, and how world models could change how machines see, understand, and build in 3D.

We discuss:
* The massive compute scaling from AlexNet to today and why world models and spatial data are the most compelling way to “soak up” modern GPU clusters compared to language alone.
* What Marble actually is: a generative model of 3D worlds that turns text and images into editable scenes using Gaussian splats, supports precise camera control and recording, and runs interactively on phones, laptops, and VR headsets.
* Fei-Fei's essay (https://drfeifei.substack.com/p/from-words-to-worlds-spatial-intelligence) on spatial intelligence as a distinct form of intelligence from language: from picking up a mug to inferring the 3D structure of DNA, and why language is a lossy, low-bandwidth channel for describing the rich 3D/4D world we live in.
* Whether current models “understand” physics or just fit patterns: the gap between predicting orbits and discovering F=ma, and how attaching physical properties to splats and distilling physics engines into neural networks could lead to genuine causal reasoning.
* The changing role of academia in AI, why Fei-Fei worries more about under-resourced universities than “open vs closed,” and how initiatives like national AI compute clouds and open benchmarks can rebalance the ecosystem.
* Why transformers are fundamentally set models, not sequence models, and how that perspective opens up new architectures for world models, especially as hardware shifts from single GPUs to massive distributed clusters.
* Real use cases for Marble today: previsualization and VFX, game environments, virtual production, interior and architectural design (including kitchen remodels), and generating synthetic simulation worlds for training embodied agents and robots.
* How spatial intelligence and language intelligence will work together in multimodal systems, and why the goal isn't to throw away LLMs but to complement them with rich, embodied models of the world.
* Fei-Fei and Justin's long-term vision for spatial intelligence: from creative tools for artists and game devs to broader applications in science, medicine, and real-world decision-making.

Fei-Fei Li
* X: https://x.com/drfeifei
* LinkedIn: https://www.linkedin.com/in/fei-fei-li-4541247

Justin Johnson
* X: https://x.com/jcjohnss
* LinkedIn: https://www.linkedin.com/in/justin-johnson-41b43664

Where to find Latent Space
* X: https://x.com/latentspacepod
* Substack: https://www.latent.space/

Chapters
00:00:00 Introduction and the Fei-Fei Li & Justin Johnson Partnership
00:02:00 From ImageNet to World Models: The Evolution of Computer Vision
00:12:42 Dense Captioning and Early Vision-Language Work
00:19:57 Spatial Intelligence: Beyond Language Models
00:28:46 Introducing Marble: World Labs' First Spatial Intelligence Model
00:33:21 Gaussian Splats and the Technical Architecture of Marble
00:22:10 Physics, Dynamics, and the Future of World Models
00:41:09 Multimodality and the Interplay of Language and Space
00:37:37 Use Cases: From Creative Industries to Robotics and Embodied AI
00:56:58 Hiring, Research Directions, and the Future of World Labs
Microsoft Partnership, Academy Updates, and Spatial Intelligence | Human-Centered AI Ep. 008

Three signals from the AI frontier this week. Each one reshapes how we think about AI readiness.

SIGNAL 1: Microsoft Partnership
DTJ is now Microsoft's official training partner for AI education in Japan, working with government officials and policymakers. When governments invest in AI literacy (not just tools), it confirms: this skill is baseline now.

SIGNAL 2: Academy Confidence
Our graduates walk into interviews ready when asked "How do you think about AI trade-offs?" They've built conviction, not memorized answers. December and January cohorts now open.

SIGNAL 3: Spatial Intelligence Is Live
Dr. Fei-Fei Li's work on AI that understands 3D space just dropped. Take one photo, AI generates a navigable 3D environment. For manufacturing, logistics, and robotics, the next wave isn't coming. It's here.

THREE CRITICAL INSIGHTS:
1. AI Literacy Moved From Vertical to Horizontal - This isn't specialized anymore. It's baseline. Every role. Every level.
2. Confidence Is the Competitive Advantage - Technical knowledge is optional. Strategic conviction about AI is not.
3. The Frontier Keeps Revealing Itself - While most organizations are figuring out ChatGPT, AI just moved from digital to physical.

We watch the frontier so you don't get blindsided. 37 minutes that translate what's coming into what it means for your work.
In this episode, we dive into NASA's first test flight of the ultra-quiet X-59 supersonic jet, explore the futuristic Phantom transparent 4K monitor, and break down World Labs' breakthrough 3D world-modeling AI. We also cover TypeScript's unexpected rise in the AI era, the world's first mass delivery of humanoid factory workers, and how you can now run powerful open-source AI models locally. It's a packed show full of aviation, robotics, and cutting-edge tech that's reshaping the future. Want to be a Guest on a Podcast or YouTube Channel? Sign up for GuestMatch.Pro Thinking of buying a Starlink? Use my link to support the show. Don’t tell me you’ve been using the same password for every site? You’ll thank me later, Get 1Password. Subscribe to the Newsletter. Email Ray if you want to get in touch! Like and Follow Geek News Central’s Facebook Page. Support my Show Sponsor: Best Godaddy Promo Codes $11.99 – For a New Domain Name cjcfs3geek $6.99 a month Economy Hosting (Free domain, professional email, and SSL certificate for the 1st year.) Promo Code: cjcgeek1h $12.99 a month Managed WordPress Hosting (Free domain, professional email, and SSL certificate for the 1st year.) Promo Code: cjcgeek1w Support the show by becoming a Geek News Central Insider Full Summary In episode 1852 of the Geek News Central podcast, host Ray Cochrane welcomes listeners back after a brief hiatus, explaining the delay due to personal and professional commitments. He kicks off the show by discussing an exciting breakthrough from NASA: the successful test flight of the X-59, an experimental aircraft designed to quiet the sonic boom, potentially paving the way for commercial supersonic flight over land. Ray notes that the X-59, which resembles a swordfish, recently completed its first test flight in California, focusing on functionality rather than speed. 
It is intended to gather data on the aircraft’s noise impact on communities, indicating a significant step towards improving commercial travel times. After this, Ray thanks the podcast’s sponsor, GoDaddy, highlighting their hosting services and mentioning various promotional offers. He encourages listeners to support the show directly through the GoDaddy links, emphasizing their reliability in supporting the podcast. Following the sponsor message, Ray transitions into another topic, discussing a new prototype transparent 4K monitor, named the Phantom, developed by Virtual Instruments. The monitor is designed to allow users to see their environment through the screen while achieving remarkable brightness levels. Next, he introduces an innovative AI model called Marble, developed by Fei-Fei Li's startup, World Labs. Ray explains that this platform enables users to generate 3D worlds from simple prompts, marking a shift towards spatial intelligence in AI, which is essential for gaming, robotics, and visual effects. Ray then moves on to discuss the rise of TypeScript, which has overtaken JavaScript and Python as the most used language on GitHub due to its compatibility with AI-assisted coding. He continues with news about UBTECH’s Walker S2 humanoid robots, which have begun mass delivery to factories, signifying a major milestone in manufacturing automation and the potential implications for the labor market. Ray finishes with information on the growing trend of running local open-source AI models on personal computers. He emphasizes the privacy advantages of using models like Llama and Mistral locally without relying on cloud providers. In closing, Ray reflects on the episode’s diverse topics and invites listener feedback regarding the content. He expresses gratitude for their support and encourages them to send comments or suggestions for future episodes. Ray ends by wishing everyone a good night and promising to return with more episodes soon. 
Show Links NASA X-59 Quiet Supersonic Test Flight Phantom Transparent 4K Monitor Fei-Fei Li's World Labs Launches Marble TypeScript's Rise in the AI Era (Hejlsberg Interview) UBTECH's First Large Delivery of Humanoid Workers How to Run Your Own Local Open-Source AI Model The post From NASA's X-59 to Humanoid Workers: The Future Is Getting Weird # 1852 appeared first on Geek News Central.
This episode is sponsored by AGNTCY. Unlock agents at scale with an open Internet of Agents. Visit https://agntcy.org/ and add your support. How will AI evolve once it can understand and reason about the 3D world, not just text on a screen? In this episode of Eye on AI, host Craig Smith speaks with Fei-Fei Li about the rise of spatial intelligence and the world models that could transform how machines perceive, imagine, and interact with reality. We explore how spatial intelligence goes beyond language to connect perception, action, and reasoning in physical environments. You will hear how models like Marble build consistent and persistent 3D spaces, why multimodal inputs matter, and what it takes to create digital worlds that are useful for robotics, simulation, design, and creative workflows. Fei-Fei also explains the challenges of long-term memory, continuous learning, and the search for training objectives that mirror the role next-token prediction plays in language models. Learn how spatial reasoning unlocks new possibilities in robotics and telepresence, why classical physics engines still matter, and how future AI systems may merge perception, planning, and imagination. You will also hear Fei-Fei's perspective on the limits of current architectures, why true understanding is different from human understanding, and how world models could shape the next generation of intelligent systems. Stay Updated: Craig Smith on X: https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI
Our 225th episode with a summary and discussion of last week's big AI news!

Recorded on 11/16/2025

Hosted by Andrey Kurenkov and co-hosted by Michelle Lee

Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
* New AI model releases include GPT-5.1 from OpenAI and Ernie 5.0 from Baidu, each with updated features and capabilities.
* Self-driving technology advancements from Baidu's Apollo Go and Pony AI's IPO highlight significant progress in the automotive sector.
* Startup funding updates include Inception raising $50M for diffusion models, while Cursor and Gamma secure significant valuations for coding and presentation tools respectively.
* AI-generated content is gaining traction with songs topping charts and new marketplaces for AI-generated voices, indicating evolving trends in synthetic media.

Timestamps:
(00:01:19) News Preview

Tools & Apps
(00:02:13) OpenAI says the brand-new GPT-5.1 is ‘warmer' and has more ‘personality' options | The Verge
(00:04:51) Baidu Unveils ERNIE 5.0 and a Series of AI Applications at Baidu World 2025, Ramps Up Global Push
(00:07:00) ByteDance's Volcano Engine debuts coding agent at $1.3 promo price
(00:08:04) Google will let users call stores, browse products, and check out using AI | The Verge
(00:10:41) Fei-Fei Li's World Labs speeds up the world model race with Marble, its first commercial product | TechCrunch
(00:13:30) OpenAI says it's fixed ChatGPT's em dash problem | TechCrunch

Applications & Business
(00:16:01) Anthropic announces $50 billion data center plan | TechCrunch
(00:18:06) Baidu teases next-gen AI training, inference accelerators • The Register
(00:20:50) Meta chief AI scientist Yann LeCun plans to exit and launch own start-up
(00:24:41) Amazon Demands Perplexity Stop AI Tool From Making Purchases - Bloomberg
(00:27:32) AI PowerPoint-killer Gamma hits $2.1B valuation, $100M ARR, founder says | TechCrunch
(00:29:33) Inception raises $50 million to build diffusion models for code and text | TechCrunch
(00:31:14) Coding assistant Cursor raises $2.3B 5 months after its previous round | TechCrunch
(00:33:56) China's Baidu says it's running 250,000 robotaxi rides a week — same as Alphabet's Waymo
(00:35:26) Driverless Tech Firm Pony AI Raises $863 Million in HK Listing

Projects & Open Source
(00:36:30) Moonshot's Kimi K2 Thinking emerges as leading open source AI

Research & Advancements
(00:39:22) [2510.26787] Remote Labor Index: Measuring AI Automation of Remote Work
(00:45:21) OpenAI Researchers Train Weight Sparse Transformers to Expose Interpretable Circuits - MarkTechPost
(00:49:34) Kimi Linear: An Expressive, Efficient Attention Architecture
(00:53:33) Watch Google DeepMind's new AI agent learn to play video games | The Verge
(00:57:34) arXiv Changes Rules After Getting Spammed With AI-Generated 'Research' Papers

Policy & Safety
(00:59:35) Stability AI largely wins UK court battle against Getty Images over copyright and trademark | AP News
(01:01:48) Court rules that OpenAI violated German copyright law; orders it to pay damages | TechCrunch
(01:03:48) Microsoft's $15.2B UAE investment turns Gulf State into test case for US AI diplomacy | TechCrunch

Synthetic Media & Art
(01:06:39) An AI-Generated Country Song Is Topping A Billboard Chart, And That Should Infuriate Us All | Whiskey Riff
(01:10:59) Xania Monet is the first AI-powered artist to debut on a Billboard airplay chart, but she likely won't be the last | CNN
(01:13:34) ElevenLabs' new AI marketplace lets brands use famous voices for ads | The Verge

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
The brilliant computer scientist Fei-Fei Li is often called the Godmother of AI. She talks with host Reid Hoffman about why scientists and entrepreneurs need to be fearless in the face of an uncertain future. Li was a founding director of the Human-Centered AI Institute at Stanford and is now an innovator in the area of spatial intelligence as co-founder and CEO of World Labs. This conversation was recorded live at the Presidio Theatre as part of the 2025 Masters of Scale Summit. Subscribe to the Masters of Scale weekly newsletter: https://mastersofscale.com/subscribe
Dr. Fei-Fei Li is known as the “godmother of AI.” She's been at the center of AI's biggest breakthroughs for over two decades. She spearheaded ImageNet, the dataset that sparked the deep-learning revolution we're living right now, served as Google Cloud's Chief AI Scientist, directed Stanford's Artificial Intelligence Lab, and co-founded Stanford's Institute for Human-Centered AI. In this conversation, Fei-Fei shares the rarely told history of how we got here, including the wild fact that just nine years ago, calling yourself an AI company was basically a death sentence.

We discuss:
1. How ImageNet helped spark the AI explosion we're living through
2. Why world models and spatial intelligence represent the next frontier in AI, beyond large language models
3. Why Fei-Fei believes AI won't replace humans but will require us to take responsibility for ourselves
4. The surprising applications of Marble, from movie production to psychological research
5. Why robotics faces unique challenges compared with language models and what's needed to overcome them
6. How to participate in AI regardless of your role

Brought to you by:
• Figma Make: A prompt-to-code tool for making ideas real
• Justworks: The all-in-one HR solution for managing your small business with confidence
• Sinch: Build messaging, email, and calling into your product

Transcript: https://www.lennysnewsletter.com/p/the-godmother-of-ai

My biggest takeaways (for paid newsletter subscribers): https://www.lennysnewsletter.com/i/178223233/my-biggest-takeaways-from-this-conversation

Where to find Dr. Fei-Fei Li
• X: https://x.com/drfeifei
• LinkedIn: https://www.linkedin.com/in/fei-fei-li-4541247
• World Labs: https://www.worldlabs.ai

Where to find Lenny:
• Newsletter: https://www.lennysnewsletter.com
• X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/

In this episode, we cover:
(00:00) Introduction to Dr. Fei-Fei Li
(05:31) The evolution of AI
(09:37) The birth of ImageNet
(17:25) The rise of deep learning
(23:53) The future of AI and AGI
(29:51) Introduction to world models
(40:45) The bitter lesson in AI and robotics
(48:02) Introducing Marble, a revolutionary product
(51:00) Applications and use cases of Marble
(01:01:01) The founder's journey and insights
(01:10:05) Human-centered AI at Stanford
(01:14:24) The role of AI in various professions
(01:18:16) Conclusion and final thoughts

References: https://www.lennysnewsletter.com/p/the-godmother-of-ai

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com.

Lenny may be an investor in the companies discussed. To hear more, visit www.lennysnewsletter.com
Are current AI models smart enough to rule the world, or just house cats with fancy vocabulary? This week, a tectonic shift is happening in AI: Meta's chief scientist Yann LeCun quits to chase world models, Fei-Fei Li launches Marble, a spatial intelligence engine, and DeepMind drops SIMA 2, a self-taught gamer bot that might be the blueprint for AGI. Meanwhile, OpenAI releases GPT-5.1, and China's Kimi K2 and Ernie 5.0 roll out shockingly powerful, ultra-low-cost models. The AI race isn't just about intelligence anymore; it's about who can afford to scale. If you lead a business, this episode explains why spatial intelligence, not language, may soon be your competitive edge. The next wave of AI isn't just about better answers; it's about deeper understanding, real-world interaction, and models that scale affordably. If you're not watching spatial intelligence, you're already behind.

About Leveraging AI
* The Ultimate AI Course for Business People: https://multiplai.ai/ai-course/
* YouTube Full Episodes: https://www.youtube.com/@Multiplai_AI/
* Connect with Isar Meitis: https://www.linkedin.com/in/isarmeitis/
* Join our Live Sessions, AI Hangouts and newsletter: https://services.multiplai.ai/events

If you've enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!
AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store
AI Daily News Rundown November 14 2025: Welcome to AI Unraveled, your daily briefing on the real-world business impact of AI. In today's daily AI News Rundown:
Fei-Fei Li and Justin Johnson are pioneers in AI. While the world has only recently witnessed a surge in consumer AI, they have long been laying the groundwork for the innovations transforming industries today.

With the recent launch of Marble, the first product from their company World Labs, we are revisiting this conversation to explore the ideas that started it all. World Labs is focused on spatial intelligence, building Large World Models that can perceive, generate, and interact with the 3D world. Marble brings that vision to life, allowing anyone, from individual creators to major platforms, to generate 3D scenes directly from text or image prompts and turn complex 3D creation into a simple, creative process.

In this episode, a16z general partner Martin Casado talks with Fei-Fei and Justin about the journey from early AI winters to the rise of deep learning and multimodal AI. From foundational breakthroughs like ImageNet to the cutting-edge realm of spatial intelligence, they discuss the evolution of the field and what is next for innovation at World Labs.

Timecode:
0:00 – The Next Decade of AI
2:45 – Origins: Backgrounds of the Founders
6:50 – The Rise of Deep Learning & ImageNet
8:00 – Algorithmic Unlocks: Compute, Data, and Supervised Learning
12:00 – From Predictive to Generative AI
16:20 – The Journey to Spatial Intelligence
18:35 – Defining Spatial Intelligence
21:15 – 3D Data, Computer Vision, and Breakthroughs
23:15 – Reconstruction vs. Generation in Computer Vision
24:45 – Spatial Intelligence vs. Language Models
29:00 – Applications: Virtual, Augmented, and Physical Worlds
39:55 – Building World Labs: Team and Vision
41:55 – The North Star: Measuring Success in Spatial Intelligence

Resources:
Learn more about World Labs: https://www.worldlabs.ai
Learn more about Marble: https://Marble.WorldLabs.ai
Find Fei-Fei on Twitter: https://x.com/drfeifei
Find Justin on Twitter: https://x.com/jcjohnss
Find Martin on Twitter: https://x.com/martin_casado

Stay Updated:
If you enjoyed this episode, be sure to like, subscribe, and share with your friends!
Find a16z on X: https://x.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Listen to the a16z Podcast on Spotify: https://open.spotify.com/show/5bC65RDvs3oxnLyqqvkUYX
Listen to the a16z Podcast on Apple Podcasts: https://podcasts.apple.com/us/podcast/a16z-podcast/id842818711
Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Forget text-to-video AI; we're rapidly moving into the era of text-to-world AI models. What if you showed up to your Airbnb and the fridge was already fully stocked? It seems like there is NO uncanny valley when it comes to AI-generated music. And does the Big Short guy have a point when he concern-trolls about the AI CAPEX buildout? Fei-Fei Li's World Labs speeds up the world model race with Marble, its first commercial product (TechCrunch) Airbnb Will Test Adding Instacart Grocery Delivery to Its App in Services Push (Bloomberg) Are you listening to bots? Survey shows AI music is virtually undetectable (Reuters) 50,000 AI tracks flood Deezer daily – as study shows 97% of listeners can't tell the difference between human-made vs. fully AI-generated music (MusicBusinessWorldwide) The AI Bubble Is Ignoring Michael Burry's Fears (Bloomberg) Learn more about your ad choices. Visit megaphone.fm/adchoices
The AI Breakdown: Daily Artificial Intelligence News and Discussions
Today on the AI Daily Brief, NLW covers two stories that may signal a major shift in the AI landscape: Yann LeCun's departure from Meta and Dr. Fei-Fei Li's new argument that spatial intelligence and world models, not just LLMs, will define the next era of AI. He explores what world models actually are, why some researchers think they're essential for robotics, science, and creativity, and how this connects to Meta's internal reorg. Plus in the headlines: ElevenLabs' celebrity voice marketplace, SoftBank's Nvidia liquidation, AMD's push to challenge Nvidia, Blue Owl's $3B Stargate investment, and the surprising surge in Meta AI traffic.

Brought to you by:
KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Rovo - Unleash the potential of your team with AI-powered Search, Chat and Agents - https://rovo.com/
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.

The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? sponsors@aidailybrief.ai
Jason Howell and Jeff Jarvis discuss Yann LeCun's possible Meta exit, SoftBank unloading its Nvidia stake for an OpenAI investment, an AI-generated artist topping Billboard's country chart, OpenAI's talks with Washington over federal loan guarantees, Perplexity's stance on companion chat bots, Apple reportedly licensing Google's Gemini, Amazon launching Kindle Translate, and Google Photos expanding with Nano Banana features. CHAPTERS: 0:03:33: Meta chief AI scientist Yann LeCun plans to exit to launch startup, FT reports 0:09:03: Cambrian-S: Towards Spatial Supersensing in Video: by Li, LeCun, et al 0:10:11: Fei-Fei Li's World Labs speeds up the world model race with Marble, its first commercial product 0:16:45: SoftBank Sells Its Nvidia Stake for $5.8 Billion to Fund OpenAI Bet 0:20:10: Anthropic Is on Track to Turn a Profit Much Faster Than OpenAI 0:27:40: OpenAI discussed government loan guarantees for chip plants, not data centers, Altman says 0:29:19: @sama: I would like to clarify a few things. 0:38:51: Country's No. 1 Digital Song Is an AI Smash, But Who Is Breaking Rust? Jeff's Arxiv Showdown 0:47:13: How Similar Are Grokipedia and Wikipedia? 0:49:24: Brain Organoid Computing 0:49:45: What We Can Learn From Brain Organoids 0:51:18: LLM-Based Multi-Agent System for Simulating and Analyzing Marketing and Consumer Behavior 0:52:00: Shareholder Democracy with AI Representatives 0:53:00: No. 10's synthetic voters 0:55:03: Perplexity's CEO says he's worried about AI companionship apps: 'Your mind is manipulable very easily' 0:56:20: tangentially related; might not mention: Tim Wu and Cory Doctorow's NPCs: Non-Player Consumers 0:57:44: Apple Plans to Use 1.2 Trillion Parameter Google Gemini Model to Power New Siri - Bloomberg 0:59:05: Amazon launches an AI-powered Kindle Translate service for e-book authors 1:02:41: 6 new things you can do with AI in Google Photos 1:04:00: Remix makes sending photos to friends even more fun on Google Messages. 
1:05:08: MotionStream AI Learn more about your ad choices. Visit megaphone.fm/adchoices
Now on Spotify Video! As a Stanford AI scientist, Dr. Fei-Fei Li realized that artificial intelligence had advanced to a point where it was transforming society faster than most people could understand. Confronted with the ethical, social, and economic risks of this rapid growth, she felt a deep responsibility to guide AI toward serving humanity. This inspired her to co-found the Stanford Institute for Human-Centered AI, developing a framework that prioritizes humankind. In this episode, Dr. Fei-Fei shares how we can harness AI responsibly and design technology that enhances, not replaces, human potential.

In this episode, Hala and Dr. Fei-Fei will discuss:
(00:00) Introduction
(02:33) The Evolution and Limits of Artificial Intelligence
(09:56) How AI Models Like ChatGPT Are Trained
(14:12) Dr. Fei-Fei's Journey and Responsibility in AI
(19:15) How Computer Vision Brings AI to Life
(25:59) Ethical AI, Human Dignity, and the Future of Work
(32:57) The Three Pillars of Human-Centered AI
(35:10) Confronting Fears of AI in Action
(39:59) AI in Business: How Entrepreneurs Can Thrive

Dr. Fei-Fei Li is a professor of computer science at Stanford University and co-director of the Stanford Institute for Human-Centered AI. Her groundbreaking work in computer vision AI has shaped how machines see and understand the world. Dr. Fei-Fei is the author of The Worlds I See, a memoir that weaves together her personal journey with the history and development of artificial intelligence.

Sponsored By:
Indeed - Get a $75 sponsored job credit to boost your job's visibility at Indeed.com/PROFITING
Shopify - Start your $1/month trial at Shopify.com/profiting.
Quo - Get 20% off your first 6 months at Quo.com/PROFITING
Revolve - Head to REVOLVE.com/PROFITING and take 15% off your first order with code PROFITING
Merit Beauty - Go to meritbeauty.com to get your free signature makeup bag with your first order.
DeleteMe - Remove your personal data online. Get 20% off DeleteMe consumer plans at joindeleteme.com/profiting
Spectrum Business - Visit Spectrum.com/FreeForLife to learn how you can get Business Internet Free Forever.
Airbnb - Find yourself a cohost at airbnb.com/host

Resources Mentioned:
Dr. Fei-Fei's Book, The Worlds I See: bit.ly/WorldsISee
Stanford Human-Centered AI Institute Website: hai.stanford.edu/
Active Deals - youngandprofiting.com/deals

Key YAP Links
Reviews - ratethispodcast.com/yap
YouTube - youtube.com/c/YoungandProfiting
Newsletter - youngandprofiting.co/newsletter
LinkedIn - linkedin.com/in/htaha/
Instagram - instagram.com/yapwithhala/
Social + Podcast Services: yapmedia.com
Transcripts - youngandprofiting.com/episodes-new
This episode was recorded at https://www.imaginationinaction.co/ Get access to metatrends 10+ years before anyone else - https://qr.diamandis.com/metatrends Eric Schmidt is the former CEO of Google; Chair and CEO of Relativity Space. Fei-Fei Li is an AI researcher & professor at Stanford University; Co-director at Stanford Human-Centered AI Institute. Connect with Peter: X, Instagram. Connect with Eric: X, LinkedIn, his latest book. Connect with Fei-Fei Li: X, LinkedIn, her latest book. Listen to MOONSHOTS: Apple, YouTube. *Recorded on October 27th, 2025 *The views expressed by me and all guests are personal opinions and do not constitute Financial, Medical, or Legal advice. Learn more about your ad choices. Visit megaphone.fm/adchoices
No Priors: Artificial Intelligence | Machine Learning | Technology | Startups
2025 has thus far been a year of great leaps and advances in AI technology. And Sarah and Elad have spoken with some of the most enterprising founders and scientific minds in the field of AI today. So we're revisiting a few of our favorite conversations on No Priors so far in 2025 – Winston Weinberg (Harvey), Dr. Fei-Fei Li (World Labs), Brendan Foody (Mercor), Dan Hendrycks (Center for AI Safety), Noubar Afeyan (Flagship Pioneering), Brandon McKinzie and Eric Mitchell (OpenAI o3), Isa Fulford (OpenAI), Arvind Jain (Glean), and Dr. Shiv Rao (Abridge). Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil Chapters: 00:00 – Episode Introduction 00:21 – Winston Weinberg on Leaning into New Capabilities 02:01 – Dr. Fei-Fei Li on Spatial Intelligence 04:13 – Brendan Foody on AI Disruption in the Workforce 06:10 – Dan Hendrycks on the Geopolitics of Superintelligence 08:06 – Noubar Afeyan on Entrepreneurship 10:38 – Brandon McKinzie and Eric Mitchell on Reasoning Models 12:41 – Isa Fulford on Training Deep Research 13:49 – Arvind Jain on Innovating Enterprise Search 16:21 – Dr. Shiv Rao on AI's Human Impact 18:58 – Conclusion
If/Then: Research findings to help us navigate complex issues in business, leadership, and society
This week on If/Then, we're sharing an episode of What's Your Problem?, a show from Pushkin Industries where entrepreneurs, engineers, and scientists talk about the future they're trying to build—and the problems they must solve to get there. Hosted by former Planet Money co-host Jacob Goldstein, each conversation explores the challenges and breakthroughs shaping the next wave of innovation. In this episode, Goldstein speaks with Fei-Fei Li, Stanford computer scientist, former Chief Scientist of AI and Machine Learning at Google, and one of the most influential figures in the field of computer vision. Li reflects on her pioneering work developing ImageNet, the massive dataset that helped spark the modern AI revolution, and the “north star” questions that have guided her research from neuroscience to machine learning. Together, they trace how a single insight about how humans see the world led to a paradigm shift in artificial intelligence—and how Li's vision continues to shape the way we teach machines to see, learn, and collaborate with us.
More Resources: Fei-Fei Li • Stanford Institute for Human-Centered Artificial Intelligence (HAI) • ImageNet • What's Your Problem?
If/Then is a podcast from Stanford Graduate School of Business that examines research findings that can help us navigate the complex issues we face in business, leadership, and society.
Chapters:
(00:00:00) Introducing “What's Your Problem?” Kevin Cool introduces the Pushkin Industries podcast hosted by Jacob Goldstein.
(00:00:45) What Is Computer Vision? Jacob Goldstein and Fei-Fei Li explain how machines learn to see and interpret images.
(00:03:18) Real-World Uses of AI Vision: Li shares examples from healthcare, robotics, and environmental science.
(00:05:06) Discovering the Science of Seeing: How human vision research inspired Li's lifelong “north star” in AI.
(00:09:56) Creating ImageNet: Li builds a massive image database that transforms computer vision research.
(00:13:29) Defining 30,000 Visual Concepts: How cognitive science helped shape ImageNet's massive scale.
(00:16:41) Building the Dataset by Hand: Li's team uses global crowdsourcing to label millions of images.
(00:19:38) The 2012 Breakthrough: Geoffrey Hinton's neural network shatters records and sparks the deep learning era.
(00:22:19) Data Meets Hardware: Li reflects on how big data and GPUs converged to power modern AI.
(00:24:55) Lightning Round with Fei-Fei Li: Quick insights on resilience, mentorship, and the future of human-AI collaboration.
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
Author and experience designer Caitlin Krause joins Coffey & Code to unpack digital wellbeing beyond buzzwords: agency over algorithms, the “presence pyramid,” culture cornerstones (dignity, freedom, invention, agency), and practical ways to design for authentic connection in an age of AI, agents, and XR. The conversation spans data ownership, interoperable “internet of agents” ideas like Project NANDA, the loneliness epidemic, and responsible product choices that reduce harm and increase belonging.
Episode At A Glance:
* Define “digital wellbeing” without the hype: aligning intention and attention; context over one-size-fits-all rules
* Agency over algorithms: opting into platforms and practices that honor user choice, not just engagement metrics
* Presence Pyramid & somatic awareness: embodied practices that translate across 2D, XR, and spatial environments
* Culture Cornerstones: dignity → freedom → invention → agency as a repeatable loop for teams and communities
* From silos to interoperability: why open protocols for AI agents matter (e.g., Project NANDA)
* Designing for belonging: move beyond performative social to ambient, low-stakes co-presence that reduces loneliness
* Safety first: name harms clearly; pair AI with human support paths and mood check-ins after use
Resources Mentioned:
* Digital Well-being (book) by Caitlin Krause; also: Designing Wonder, Mindful by Design
* Presence Pyramid (framework)
* Project NANDA: Networked AI Agents and Decentralized Architecture
* Stanford HAI; MIT Media Lab; AR in Action
* Research/voices referenced: Esther Perel, Sherry Turkle, Brené Brown, Fei-Fei Li, Ramesh Raskar, David Eagleman
* On anthropomorphism of AI (NPR segment)
* 988 Suicide & Crisis Lifeline (US)
EPISODE CREDITS: Produced and edited by Ashley Coffey. Cover art designed by Ashley Coffey. Headshot by Brandlink Media. Introduction music composed and produced by Ashley Coffey.
LINKS: Follow Coffey & Code on Instagram, Facebook, LinkedIn, and YouTube for the latest emerging tech updates! 
Subscribe to the Coffey & Code Podcast wherever you get your podcasts to be notified when new episodes go live. © 2025 Coffey & Code Podcast. All rights reserved. The content of this podcast, including but not limited to text, graphics, audio, and images, is the property of Ashley Coffey and may not be reproduced, redistributed, or used in any manner without the express written consent of the owner. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Every year, the second Tuesday in October is designated as Ada Lovelace Day as a tribute to its namesake, Ada Lovelace, the 19th century mathematician and pioneering computer programmer who collaborated with Charles Babbage on the design of his remarkable mechanical computer, the Analytical Engine. To celebrate Ada Lovelace Day 2025, Alice and Paola are dedicating this special episode of Design Emergency to celebrating her achievements and those of other remarkable women who have honoured Ada's legacy in different ways, making crucial contributions to the digital age. Some of them have designed and delivered transformational advances in technology, such as Britain's ingenious female code-breakers at Bletchley Park during World War II, Ida Holz, the Uruguayan computer scientist and engineer who pioneered the internet in Latin America, and Stacy Horn, who designed one of the first online communities in ECHO. Others have developed inspiring ways of improving existing systems: both by alerting us to new possibilities, and by identifying or defusing unexpected dangers, as the Chinese-born, US-based computer scientist Fei-Fei Li has done, and the Kenyan tech designer and activist, Juliana Rotich. Meanwhile, Jay-Ann Lopez, founder of the global network of Black Girl Gamers, and new media pioneer Lynn Hershman Leeson are at the forefront of challenging stereotypes and championing diversity, inclusivity and equity within tech design, thereby helping to make it fitter for purpose and to realise its true potential. We hope you'll enjoy this episode. You can find images of the projects Alice and Paola describe on our Instagram @design.emergency. Please join us for future episodes of Design Emergency when we will hear from inspiring global design leaders who are at the forefront of forging positive change. Design Emergency is supported by a grant from the Graham Foundation for Advanced Studies in the Fine Arts. Hosted on Acast. See acast.com/privacy for more information.
In a special Future of Everything podcast episode recorded live before a studio audience in New York, host Russ Altman talks to three authorities on the innovation economy. His guests – Fei-Fei Li, professor of computer science and co-director of the Stanford Institute for Human-Centered AI (HAI); Susan Athey, professor and authority on the economics of technology; and Neale Mahoney, Trione Director of the Stanford Institute for Economic Policy Research – bring their distinct-but-complementary perspectives to a discussion on how artificial intelligence is reshaping our economy. Athey emphasizes that both AI broadly and AI-based coding tools specifically are general-purpose technologies, like electricity or the personal computer, whose impact may be felt quickly in certain sectors but much more slowly in aggregate. She tells how solving one bottleneck to implementation often reveals others – whether in digitization, adoption costs, or the need to restructure work and organizations. Mahoney draws on economic history to say we are in a “veil of ignorance” moment with regard to societal impacts. We cannot know whose jobs will be disrupted, he says, but we can invest in safety nets now to ease the transition. Li cautions against assuming AI will replace people. Instead, she speaks of AI as a “horizontal technology” that could supercharge human creativity – but only if it is properly rooted in science, not science fiction. Collectively, the panel calls on policymakers, educators, researchers, and entrepreneurs to steer AI toward what they call “human-centered goals” – protecting workers, growing opportunities, and supercharging education and medicine – to deliver broad and shared prosperity. It's the future of the innovation economy on this episode of Stanford Engineering's The Future of Everything podcast.
Have a question for Russ? Send it our way in writing or via voice memo, and it might be featured on an upcoming episode. Please introduce yourself, let us know where you're listening from, and share your question. You can send questions to thefutureofeverything@stanford.edu.
Episode Reference Links:
* Stanford Profile: Fei-Fei Li
* Stanford Profile: Susan Athey
* Stanford Profile: Neale Mahoney
Connect With Us:
* Episode Transcripts >>> The Future of Everything Website
* Connect with Russ >>> Threads / Bluesky / Mastodon
* Connect with School of Engineering >>> Twitter/X / Instagram / LinkedIn / Facebook
Chapters:
(00:00:00) Introduction: Russ Altman introduces live guests Fei-Fei Li, Susan Athey, and Neale Mahoney, professors from Stanford University.
(00:02:37) Lessons from Past Technology: Comparing AI with past technologies and the bottlenecks to their adoption.
(00:06:29) Jobs & Safety Nets: The uncertainty of AI's labor impact and investing in social protections.
(00:08:29) Augmentation vs. Replacement: Using AI as a tool to enhance, not replace, human work and creativity.
(00:11:41) Human-Centered AI & Policy: Shaping AI through universities, government, and global collaboration.
(00:15:58) Education Revolution: The potential for AI to revolutionize education by focusing on human capital.
(00:18:58) Balancing Regulation & Innovation: Balancing pragmatic, evidence-based AI policy with entrepreneurship.
(00:22:22) Competition & Market Power: The risks of monopolies and the role of open models in fair pricing.
(00:25:22) America's Economic Funk: How social media and innovation are shaping America's declining optimism.
(00:27:05) Future in a Minute: The panel shares what gives them hope and what they'd study today.
(00:30:49) Conclusion
This week on “Paul, Weiss Waking Up With AI,” Katherine Forrest and Anna Gressel explore the rapid advancements and real-world applications of AI world models, highlighting cutting-edge developments like Google DeepMind's Genie 3, Fei-Fei Li's World Labs and Microsoft Muse. Learn More About Paul, Weiss's Artificial Intelligence practice: https://www.paulweiss.com/industries/artificial-intelligence
In this episode of Become Your Own Boss, Monica welcomes back business partner and resident AI expert Ethan King to unpack the latest AI breakthroughs that every small business owner should be using right now. From AI agents that can take real actions on your behalf, to study modes that quiz your team (or your kids), to building full-blown apps without writing a line of code—this episode will open your eyes to what's really possible with today's AI tools. If you've ever thought “I wish I had an app for that,” or “there's got to be an easier way,” this conversation is for you.
Episode Quote: Artificial intelligence is not a substitute for human intelligence. It is a tool to amplify human creativity and ingenuity. ~Fei-Fei Li.
What You Will Learn in This Episode:
* How to use AI agents to handle tasks like a digital assistant
* How to train your team using ChatGPT's new study mode
* How to run deep research on people and companies in minutes
* How to build a working app or game using one simple prompt
* How to use AI to create videos, images, and even Super Bowl-worthy ads
* How to get your AI model to truly know you for smarter, faster results
Guest: Ethan King - Speaker, Entrepreneur, Author & more. Grab the book ChatGPT to Double your Business in 90 Days.
Episode Sponsor - Zeus' Closet
Helpful Entrepreneurial Resources from Become Your Own Boss:
* Subscribe to the Level Up Living Newsletter
* Monica FREE ebook
* Get your Become Your Own Boss Planner
Ways to reach Monica:
* Instagram: @becomeyourownbosspodcast
* Email: monica@monicaallen.com
Historian Andrew Roberts is joined by former U.S. Secretary of State and current Director of the Hoover Institution Condoleezza Rice for a deep dive into today's international hotspots, including Russia's invasion of Ukraine, rising tensions with China over Taiwan, and the complex relationships between Russia, China, Iran, and North Korea. Their discussion also covers how leaders draw lessons from history, what might tip the world into a new Cold War, and how nations might address these evolving challenges. Director Rice also gives her thoughts on the rapid rise of artificial intelligence, including her recommended read, The Worlds I See by Fei-Fei Li, the Founding Co-Director of Stanford's Human-Centered AI Institute.
In this episode, Carrol Chang, CEO of Andela, and Juliana Ospina, Global EdTech Lead at the International Finance Corporation (IFC), join Bob Hawkins from the World Bank to explore how digital skills and AI are unlocking new job opportunities for youth, especially in Africa. They unpack the rise of demand-driven, scalable training models, the power of global talent marketplaces and the importance of public-private collaboration. From coding in remote villages to the future of work shaped by AI, this conversation looks at how to build more inclusive, connected, and future-ready education systems.
Learn more:
* Andela – Global Talent Marketplace
* International Finance Corporation (IFC) – Education
* Open Talent by John Winsor (book)
* The Worlds I See by Fei-Fei Li (book)
* World Bank Podcast: Ethiopia's Education and Skills for Employability (EASE)
A podcast produced by Lucia Blasco.
No Priors: Artificial Intelligence | Machine Learning | Technology | Startups
In this episode of No Priors, Sarah and Elad are joined by Dr. Fei-Fei Li, AI pioneer, co-director of Stanford's Human-Centered AI Institute, and founder of World Labs. Fei-Fei shares why she's building at the intersection of embodiment and intelligence, and what today's AI systems are still missing. From the early days of ImageNet to her vision for the next generation of robotics, she unpacks the human and technical motivations behind World Labs. They also discuss the challenges of 3D world modeling, her approach to building exceptional teams, and the special qualities that have led her students like Andrej Karpathy to make major breakthroughs. Show Notes: 0:00 Why and what Dr. Fei-Fei Li is building 3:00 World models at World Labs 6:44 Missing gaps in the AI future 9:16 Robotics and physical intelligence 16:15 Greatest challenges of 3D 19:08 Fei-Fei's work in PhD in ImageNet 23:05 Special moments in Dr. Li's career 29:33 Building teams 32:05 Human-centered AI
What if the next leap in artificial intelligence isn't about better language—but better understanding of space? In this episode, a16z General Partner Erik Torenberg moderates a conversation with Fei-Fei Li, cofounder and CEO of World Labs, and a16z General Partner Martin Casado, an early investor in the company. Together, they dive into the concept of world models—AI systems that can understand and reason about the 3D, physical world, not just generate text. Often called the “godmother of AI,” Fei-Fei explains why spatial intelligence is a fundamental and still-missing piece of today's AI—and why she's building an entire company to solve it. Martin shares how he and Fei-Fei aligned on this vision long before it became fashionable, and why it could reshape the future of robotics, creativity, and computational interfaces. From the limits of LLMs to the promise of embodied intelligence, this conversation blends personal stories with deep technical insights—exploring what it really means to build AI that understands the real (and virtual) world.
Resources:
* Find Fei-Fei on X: https://x.com/drfeifei
* Find Martin on X: https://x.com/martin_casado
* Learn more about World Labs: https://www.worldlabs.ai/
Stay Updated:
* Let us know what you think: https://ratethispodcast.com/a16z
* Find a16z on Twitter: https://twitter.com/a16z
* Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
* Subscribe on your favorite podcast app: https://a16z.simplecast.com/
* Follow our host: https://x.com/eriktorenberg
Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Dr. Fei-Fei Li, known as the godmother of AI, talks to Margaret Hoover about the ethical development of artificial intelligence and the challenge of regulating the rapidly advancing technology. Li, who recently received a lifetime achievement award at the Webbys for her AI research, explains why she focuses her work on “human-centered AI” and how she believes human dignity can be protected as AI progresses. Li discusses the role of government funding in academic research and the importance of diversity in science, and she outlines a pragmatic approach to AI governance rooted in science, rather than science fiction. Li, co-founder of Stanford's Human-Centered AI Institute, comments on the AI race between the U.S. and China, the concerns raised by potential military applications of the technology, and whether it is safe to place AI in the hands of children. Support for “Firing Line for Margaret Hoover” is provided by Robert Granieri, Vanessa and Henry Cornell, The Fairweather Foundation, Peter and Mark Kalikow, Cliff and Laurel Asness, The Meadowlark Foundation, The Beth and Ravenel Curry Foundation, Charles R. Schwab, The Marc Haas Foundation, Katharine J. Rayner, Damon Button, Craig Newmark Philanthropies, The Philip I Kent Foundation, Annie Lamont through The Lamont Family Fund, Lindsay and George Billingsley, The Susan Rasinski McCaw Fund, Cheryl Cohen Effron and Blair Effron, and Al and Kathy Hubbard. Corporate funding is provided by Stephens Inc.
The late biologist E.O. Wilson said that “the real problem of humanity is the following: We have Paleolithic emotions, medieval institutions, and god-like technology. And it is terrifically dangerous.” Wilson said that back in 2011, long before any of us were talking about large language models or GPTs. A little more than a decade later, artificial intelligence is already completely transforming our world. Practitioners and experts have compared A.I. to the advent of electricity and fire itself. “God-like” doesn't seem that far off. Even sober experts predict disease cures and radically expanded lifespans, real-time disaster prediction and response, the elimination of language barriers, and other earthly miracles. A.I. is amazing, in the truest sense of that word. It is also leading some to predict nothing less than a crisis in what it means to be human in an age of brilliant machines. Others—including some of the people creating this technology—predict our possible extinction as a species. But you don't have to go quite that far to imagine the way it will transform our relationship toward information and our ability to pursue the truth. For tens of thousands of years, since humans started to stand upright and talk to each other, we've found our way to wisdom through disagreement and debate. But in the age of A.I., our sources of truth are machines that spit out the information we already have, reflecting our biases and our blind spots. What happens to truth when we no longer wrestle with it—and only receive it passively? When disagreeable, complicated human beings are replaced with A.I. chatbots that just tell us what we want to hear? It makes today's concerns about misinformation and disinformation seem quaint. Our ability to detect whether something is real or an A.I.-generated fabrication is approaching zero. And unlike social media—a network of people that we instinctively know can be wrong—A.I. 
systems have a veneer of omniscience, despite being riddled with the biases of the humans who trained them. Meanwhile, a global arms race is underway, with the U.S. and China competing to decide who gets to control the authoritative information source of the future. So last week Bari traveled to San Francisco to host a debate on whether this remarkable, revolutionary technology will enhance our understanding of the world and bring us closer to the truth . . .or do just the opposite. The resolution: The Truth Will Survive Artificial Intelligence! Aravind Srinivas argued yes—the truth will survive A.I. Aravind is the CEO of one of the most exciting companies in this field, Perplexity, which he co-founded in 2022 after working at OpenAI, Google, and DeepMind. Aravind was joined by Dr. Fei-Fei Li. Fei-Fei is a professor of computer science at Stanford, the founding co-director of the Stanford Institute for Human-Centered A.I., and the CEO and co-founder of World Labs, an A.I. company focusing on spatial intelligence and generative A.I. Jaron Lanier argued that no, the truth will not survive A.I. Jaron is a computer scientist, best-selling author, and the founder of VPL Research, the first company to sell virtual reality products. Jaron was joined by Nicholas Carr, the author of countless best-selling books on the human consequences of technology, including Pulitzer Prize finalist The Shallows, The Glass Cage, and, most recently, Superbloom. He also writes the wonderful Substack New Cartographies.
Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
* Check out Hugo's latest episode with Fei-Fei Li, on How Human-Centered AI Actually Gets Built
* Intro to Bayes Course (first 2 lessons free)
* Advanced Regression Course (first 2 lessons free)
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work! Visit our Patreon page to unlock exclusive Bayesian swag ;)
Takeaways:
* Computational cognitive science seeks to understand intelligence mathematically.
* Bayesian statistics is crucial for understanding human cognition.
* Inductive biases help explain how humans learn from limited data.
* Eliciting prior distributions can reveal implicit beliefs.
* The wisdom of individuals can provide richer insights than averaging group responses.
* Generative AI can mimic human cognitive processes.
* Human intelligence is shaped by constraints of data, computation, and communication.
* AI systems operate under different constraints than human cognition.
* Human intelligence differs fundamentally from machine intelligence.
* Generative AI can complement and enhance human learning.
* AI systems currently lack intrinsic human compatibility.
* Language training in AI helps align its understanding with human perspectives.
* Reinforcement learning from human feedback can lead to misalignment of AI goals.
* Representational alignment can improve AI's understanding of human concepts.
* AI can help humans make better decisions by providing relevant information.
* Research should focus on solving problems rather than just methods.
Chapters:
00:00 Understanding Computational Cognitive Science
13:52 Bayesian Models and Human Cognition
29:50 Eliciting Implicit Prior Distributions
38:07 The Relationship Between Human and AI Intelligence
45:15 Aligning Human and Machine Preferences
50:26 Innovations in AI and Human Interaction
55:35 Resource Rationality in Decision Making
01:00:07 Language Learning in AI Models
Fei-Fei Li (The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI) is a computer scientist, co-director of the Stanford Institute for Human-Centered Artificial Intelligence, and is considered by many to be the godmother of AI. Fei-Fei joins the Armchair Expert to discuss her initial reluctance to tell her personal story as a part of her book on AI, starting a laundromat with her parents to support themselves, and the high school teacher that changed the course of her life. Fei-Fei and Dax talk about how math is the closest thing there is to magic, why being fearlessly stupid is sometimes the best asset you can have, and the reason her north star is asking the audacious question. Fei-Fei explains her perspective on the tech-lash, why there is so much humanness in everything we do in technology, and how essential it is to put dignity into how we both create and govern AI. Follow Armchair Expert on the Wondery App or wherever you get your podcasts. Watch new content on YouTube or listen to Armchair Expert early and ad-free by joining Wondery+ in the Wondery App, Apple Podcasts, or Spotify. Start your free trial by visiting wondery.com/links/armchair-expert-with-dax-shepard/ now.