Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
AIE Europe Debrief + Agent Labs Thesis: Unsupervised Learning x Latent Space Crossover Special (2026)

Apr 23, 2026 · 54:52


Today, we check in a year after the first Unsupervised Learning x Latent Space Crossover special to discuss everything that has changed (there is a lot) in the world of AI. This episode was recorded just after AIE Europe, but before the Cursor-xAI deal.
Unsupervised Learning is a podcast that interviews the sharpest minds in AI about what's real today, what will be real in the future and what it means for businesses and the world - helping builders, researchers and founders deconstruct and understand the biggest breakthroughs.
Thanks to Jacob and the UL production team for hosting and editing this!
Jacob Effron
* LinkedIn: https://www.linkedin.com/in/jacobeffron/
* X: https://x.com/jacobeffron
Full Episode on Their YouTube
We discuss:
* swyx's view from the center of the AI engineering zeitgeist: OpenClaw, harness engineering, context engineering, evals, observability, GPUs, multimodality, and why conference tracks now reveal what matters most in AI
* Whether AI infrastructure has finally stabilized: why “skills” may be the minimal viable packaging format for agents, why infra companies have had to reinvent themselves every year, and why application companies have had an easier time surviving model volatility
* The vertical vs. 
horizontal AI startup debate: why application companies can act as the outsourced AI team for enterprises, why some horizontal companies still matter, and why sandboxes may be the clearest reinvention of classic cloud infrastructure for the AI era
* The “agent lab” playbook: starting with frontier models, specializing for your domain, then training your own models once you have enough data, workload, and user behavior to justify the cost and latency savings
* Why domain-specific model training is real, not just marketing: how companies like Cursor and Cognition can get users to choose their in-house models, and why search, domain specialization, and distillation are becoming more important
* Open models, custom chips, and alternative inference infrastructure: why swyx has turned more bullish on open source, why non-NVIDIA hardware is suddenly getting real attention, and why every 10x speedup can unlock new product experiences
* What it means to sell to agents instead of humans: why agent experience may mostly just be good developer experience by another name, why APIs and docs matter more than ever, and how pretraining-data incumbents are compounding advantages in an agent-first world
* Why memory and personalization may become the next big wedge: today's models mostly reward frequency of mentions, but in the future, swyx expects product choice to be shaped much more by personalized memory systems
* The state of the AI coding wars: why coding has become one of the largest and fastest-growing categories in AI, how Anthropic, OpenAI, Cursor, and Cognition have all ridden the wave, and why the category may still have more room to run
* Capability exploration vs. efficiency: why the industry is still in a token-maxing, experiment-heavy phase where people are rewarded for spending more rather than less
* Claude Code vs. 
Codex and the strange stickiness of coding products: why first magical product experiences may matter more than expected, and why the bigger mystery may be why only a few names have emerged as real winners so far
* What the end state of the coding market might look like: two major players, a longer tail of niche products, and possible disruption if Microsoft, Mistral, xAI, or the Chinese labs push harder into coding
* Where application companies still have room against the labs: why frontier labs are trying to expand into verticals like finance and healthcare, but still leave space for focused companies that own the workflow and the last mile
* Why coding may be a preview of every other AI market: the first category to truly go parabolic, the clearest example of foundation model companies colliding with application companies, and a template for how future vertical AI markets may develop
* Why AI valuations now feel unbounded: from billion-dollar ARR products built in a year to trillion-dollar market caps, swyx and Jacob unpack how the AI market has broken traditional startup intuitions about scale and durability
* Consumer AI vs. coding AI: why ChatGPT's consumer category may have plateaued on frequency and product design, while coding continues to feel like a daily-use category with real momentum
* The next product frontier beyond coding: consumer agents, computer use, and “coding agents breaking containment,” with swyx's thesis that 2025 was the year of coding agents and 2026 may be the year they begin to do everything else
* Whether foundation models are really killing startup categories: why swyx is less worried for early founders, more worried for mid-size startups and traditional SaaS, and why building something ambitious may now be the best job interview for a frontier lab
* AI vs. 
SaaS and the internal culture war around adoption: the tension between AI-native employees who want to rip out expensive software and skeptics who think quick AI-built replacements create fragile systems
* Why traditional SaaS may be under real pressure: swyx's own experience spending six figures on event and sponsor management software, the temptation to rebuild it cheaply with AI, and the broader question of whether teams will trust custom AI-native replacements
* Biosafety, security, and frontier model access: why swyx raised biosafety at a dinner with Anthropic's Mike Krieger, why Krieger argued security is the bigger issue, and what restricted model releases reveal about Anthropic vs. OpenAI
* The era of giant models: why 10T+ parameter systems may only be a temporary rationing phase before bigger clusters arrive, why labs may increasingly keep their most powerful models private for distillation, and why scale alone no longer feels like a complete answer
* Memory as the slowest scaling factor in AI: why context windows have improved far more slowly than people hoped, why million-token context still has not changed most real workflows, and why memory may be the key bottleneck for the next generation of systems
* What swyx changed his mind on in the past year: becoming more bullish on open models, more convinced that the top tier of agent startups behaves very differently from the median AI company, and more optimistic about fine-tuning and specialized model adaptation
* “Dark factories” and zero-human-review coding: the next frontier after zero human-written code, where models not only write the code but ship it without human review, forcing companies to rethink testing and verification from first principles
* Why RL and post-training may matter more than people assumed: even if the resulting models get thrown out every few months, the data, workflows, and domain-specific improvements persist
* Synthetic rubrics, Dr. GRPO, and multi-turn RL: why reinforcement 
learning is becoming much more domain-specific and multi-step than many people realize, opening the door to much deeper customization
* The next frontier after coding: memory, personalization, and world models, including why swyx thinks world models matter not just for robotics or gaming, but for giving AI something closer to lived understanding
* Fei-Fei Li, spatial intelligence, and the Good Will Hunting analogy: the idea that today's LLMs may know everything by reading it all, but still lack the lived experience that turns knowledge into a deeper kind of intelligence
Timestamps
* 00:00:00 Intro preview: AI coding wars, startup pressure, and market structure
* 00:00:28 Welcome to the Latent Space × Unsupervised Learning crossover
* 00:01:17 What AI builders are focused on now: OpenClaw, harnesses, and infra
* 00:04:33 Why AI infra is harder than apps, and where startups can still win
* 00:06:39 Should companies train their own models?
* 00:09:28 Open models, custom chips, and the new inference race
* 00:11:25 Designing products for agents, not just humans
* 00:16:49 The state of the AI coding wars in 2026
* 00:19:27 Capability exploration, token-maxing, and why coding is going parabolic
* 00:21:41 What the end state of the coding market could look like
* 00:23:50 Where app companies still have room against the labs
* 00:27:02 Why AI valuations and market swings feel unprecedented
* 00:28:56 Consumer AI vs. coding AI, and why sticky products still matter
* 00:32:28 What the next breakthrough product experience might be
* 00:32:53 2026 thesis: coding agents break containment and eat the world
* 00:35:27 Are foundation models wiping out startup categories?
* 00:37:33 AI vs. 
SaaS, vibe coding, and internal team tensions
* 00:40:01 Biosafety, security, and the politics of restricted model releases
* 00:42:19 Giant models, compute constraints, and the limits of scale
* 00:44:30 Memory as the real bottleneck in AI
* 00:44:57 Why swyx changed his mind on open models
* 00:47:44 Dark factories and the future of zero-human-review coding
* 00:49:36 Why post-training and RL may matter more than people think
* 00:51:50 Memory, world models, and the next frontier of intelligence
* 00:53:54 The Good Will Hunting analogy for LLMs
* 00:54:21 Outro
Transcript
[00:00:00] swyx: Isn't that crazy? That number is just mind boggling.
[00:00:03] Jacob Effron: What is the state of the AI coding wars today?
[00:00:05] swyx: We're in a phase of sort of like capability exploration. The general thesis that I have been pursuing now is that the same way that 2025 was the year of coding agents, 2026 is coding agents breaking containment to do everything else.
[00:00:16] Jacob Effron: Do you worry about the foundation models just getting into a bunch of these startup categories?
[00:00:21] swyx: Mid-size startups. Yes.
[00:00:23] Jacob Effron: What do you think the end state of this market is
[00:00:25] swyx: for the market structure to, to significantly change, there would be
[00:00:28] Jacob Effron: Today on Unsupervised Learning, we had a, a fun episode in what's really become an annual tradition, a crossover episode with our friends at Latent Space. swyx and I sat down and we talked about everything happening in the AI ecosystem today. What we thought of the various changes at the model layer, what's happening in the infra world, the coding wars, and a bunch of other things. It's a ton of fun to do this with someone I really respect and another great podcaster in the game. Without further ado, here's our episode. Well, swyx. 
This is, uh, super fun to be back with another Unsupervised Learning, uh, Latent Space crossover episode.
[00:01:02] swyx: Yeah,
[00:01:02] Jacob Effron: I feel like there's a lot of places we could start, but you know, one thing I always find fascinating, uh, about the way you spend your time is you obviously are like at the epicenter of this engineering movement and community, and you run these events and conferences and put on these awesome talks and, and I think just have a great pulse on the zeitgeist of what's going on.
[00:01:16] swyx: Yeah.
[00:01:17] Jacob Effron: Maybe to, to start, just what are the biggest topics people are thinking about right now?
[00:01:21] swyx: Yeah, so I just came back from London, uh, where we did AIE Europe and we're doing roughly one per quarter now, which Yeah, you've
[00:01:27] Jacob Effron: really up
[00:01:27] swyx: the, hopefully
[00:01:28] Jacob Effron: up the, up the pace.
[00:01:29] swyx: It's trying. We're trying to match AI speed, you know?
[00:01:30] Jacob Effron: Yeah, exactly. The topics would be completely different, I imagine. Uh,
[00:01:33] swyx: yeah. You know, I definitely curate the tracks, like you can see what I think when you see the track list and the, the speakers that I invite. Obviously OpenClaw is like the story of the last four or five months, and then, just below that, I would consider harness engineering, context engineering to be two related topics in agents and RAG. And then there's a long tail of evergreen stuff like evals, observability, GPUs, uh, and uh, LLM infra and just general, just in general. We also have other updates on like multimodality and, uh, generative media, let's call it. Um, but I definitely, the, the first three that I mentioned are top of mind for people. Yeah.
[00:02:13] Jacob Effron: I think harness in particular is, like, so interesting. 
Um, you know, there was this tweet from Harrison Chase, the, the LangChain CEO, that, that caught my eye recently where he said, you know, it finally feels like we have stability, uh, around the infrastructure for, uh, you know, around AI. And I think what he basically was implying is, like, look, over the past two, three years as a company at the epicenter of AI infrastructure, it was a bit like playing whack-a-mole, right? You were constantly moving around with however the building patterns were evolving
[00:02:36] swyx: for Harrison for sure. Right? Like he's basically had to reinvent the company every year since he started LangChain. Right? It was LangChain, LangGraph and Deep Agents and like, uh, I think he's like one of the most nimble, adept, sharp people about this. Yeah. Yeah.
[00:02:49] Jacob Effron: Saying now, now is finally the time of stability
[00:02:51] swyx: this. Yeah.
[00:02:52] Jacob Effron: Yeah. Um, do you buy that or what do you kind of make of that take?
[00:02:56] swyx: I think that it, it's very expensive to say "this time is different" sometimes, but when you're just writing code, like it's actually okay to just like try to make a call and I think it may not even matter if this call is right or not. Like I just don't even care that much because you can be right on a thesis, but if you don't, you don't figure out how to monetize the thesis, then who cares if you said something first? That said, um, it does feel like, for example, uh, we went through a lot of different ways of packaging integrations up with, uh, with agents. And it feels like we've landed at skills, which is like the minimal viable format. Yeah. Which is just a markdown file, uh, with some scripts attached to it, and I don't see how it can be more simple than that. And so there is some justification for the stability around harnesses. 
I feel like there may be more adaptation with regards to maybe like the real time elements or subagents or memory or any of those like agent disciplines, let's call it, in, in agent engineering. Uh, but if, if the thesis is that, okay, you just want, agents are LLMs with tools in the loop, with a file system where they can do retrieval, with, with skills and all this like standard tooling that now seems to be relatively consensus, then probably that makes sense. Um, I just think like there's no point trying to stake your reputation on this thesis that we're there because if it changes again, just change with it. It's fine.
[00:04:33] Jacob Effron: Yeah. It's always, you know, I've always been struck by how that is much more challenging for infrastructure companies than application companies. Like obviously I think, yeah. You know, on the application side you've seen, you know, Bret Taylor from Sierra, Max from Legora. Like, they're like, look, we build, you know, what's ahead of the models and we're willing to throw everything out every three months, you know, as the models get better and better. Exactly. Yeah. But the thing you at least have there is you have, uh, you have an end customer, right? That's like decently sticky. Um, you know, they will mostly stick, you know, they'll, they'll give you a shot at least of, of building these things. What I've always found more challenging, uh, at, at the kind of like, you know, reinvent yourself every three months at the infrastructure layer, it's like, you know, developers are definitely a, a pickier audience maybe than an accounting firm or, uh, you know, a bank. Yeah. And so it's definitely a, a, a more challenging position to be in to, to have to constantly reinvent yourself.
[00:05:17] swyx: Yeah. Yeah. Yeah. And, and like when they churn, it's like very complete. Like, they'll leave to like the, the hot new thing, uh, because there's like no defensibility, I guess. 
Like even, even if you are a database, like, uh, people can migrate workloads off databases. Like it's, it's a, it's a known thing. Uh, so I think like basically what we're talking about is the vertical versus horizontal, uh, debate in, in AI startups. And uh, the way I think about it also is just that like when you are, um, Legora, when you are Abridge, like you are the outsourced AI team, right? You, you are, your job is to apply whatever state of the art AI methods.
[00:05:55] Jacob Effron: Yeah. Like this translation layer between model capabilities and your
[00:05:57] swyx: own customers. Yeah. To, to the end customers and like, well, if they didn't have you, they would have to hire in-house and they're not gonna hire in-house so they have you. And like, I think that's like a reasonable, like very robust to any, whatever trends and, and discoveries that people make in, in the engineering layer. I do think like there is, um, like sort of useful horizontal companies being built, but they're all very much like, sort of like the reinventions of classic cloud in the AI era and the, the primary one being sandboxes. Yeah. Um, which like, it's another form of compute, guys, like, let's not get too excited about it. But I mean, like the, the workloads are enormous.
[00:06:38] Jacob Effron: Right.
[00:06:38] swyx: Yeah.
[00:06:39] Jacob Effron: It's interesting, and I feel like as, as part of this, you know, the questions that folks are asking around infrastructure, there's a lot around, you know, the extent to which companies should have their own AI teams and what they should be doing in-house. And, you know, uh, I think there's questions around should people be training their own models? Should people be doing, you know, RL, uh, in-house based on the data they have? I feel like, you know, one has to evolve their takes on this every, every three months with the pace. But where, where are you at on this today?
[00:07:00] swyx: I think, well, I mean actually own models have gone up. 
Um, and obviously I'm involved in Cognition, and also Cursor's doing, doing, uh, a lot of own model training. And I think that that is some part of the, what I've been calling the agent lab playbook, where you start off with the state of the art models from, uh, from the big labs and you, uh, specialize for your domain. But once you have enough workload and enough high quality data from your users, then you can obviously train your own models and like save a lot on cost and latency and all that, all that good stuff. Um, you also get like a marketing bonus of like calling it some fancy name and putting out some research
[00:07:38] Jacob Effron: From my seat, I can't tell how much of it is like actual, you know, value that's provided to the end user and how much of it is that marketing bonus. Right. It seems some combination of the
[00:07:45] swyx: I think it's both.
[00:07:46] Jacob Effron: Yeah.
[00:07:46] swyx: Um, no, no. There, there actually is real value. Um, and you, you know that for a number of reasons. Like one, even when it's not subsidized, people do choose it as like one of the top four or five. This is both Composer 2 and, uh, SWE-1.6, like one of the top five models. Like in a, in a fair market? In a free market, yeah. In a, in a, in a model switcher, people do choose it and like, it's not subsidized. Like, so that's as good as it gets. Uh, but beyond that, like domain specific models, for example, for search, which both companies have, absolutely makes, makes a ton of sense. Everyone says like, yeah, we should always, always do this. And honestly like, I think the infrastructure for that is becoming easier with, um, like Thinking Machines' Tinker thing as well as Prime Intellect's, uh, Lab stuff. 
Yeah, I mean like, this is one of those like reversal of the, the bitter lesson where you first bootstrap on the large models and the general purpose models to get big. And as you get very well-defined workloads that are just high quantity but not high variance, um, then you just distill down to a smaller model and run that on your own. Right. Which like totally makes sense.
[00:08:50] Jacob Effron: What I'm less clear on is the kind of DIY RL use case, which I think is really mostly around, you know, improved, uh, quality for, for different things. Obviously there's probably like more efficient ways to, you know, get a smaller model that's, that's faster and cheaper. And it'll be interesting to see whether, you know, obviously you had, you know, uh, two, three years ago this whole case of companies that were, you know, pre-training and claiming better outcomes in, in their domains, then getting kind of cooked as each model iteration improved. You know, I wonder whether that's a, a similar story plays out in the, uh, in, in the RL space. Yeah, for the focus on, on, on pure outcomes and quality, not the cost side, which clearly your own models for cost at scale makes a ton of sense.
[00:09:28] swyx: I think these are, these are two sides of the same coin. Like you basically always want to hold, uh, quality constant or trade off a little bit of quality for a drastic decrease in cost. And that's true for everyone. Uh, one element I wanted to bring out, which is very much in favor of open models, is custom chips. So this would be Cerebras, but also Taalas. And then there's a huge range of stuff in between. This has been a huge story this past year on just like everything non-NVIDIA is getting bid up, including like freaking MatX is working for, which is very, which is very rewarding for me, but I think one of those things where like, oh, like the suddenly, because the number of alternative 
Hard, uh, hardware is increasing and the inference that you can get is insanely high. Like, um, we're talking thousands of tokens per second instead of less than a hundred. So the trade-off for quality doesn't hold as much anymore because the speed is so high.
[00:10:24] Jacob Effron: Have you seen a lot of companies go all in on the alternative chip?
[00:10:26] swyx: So Cognition has, yeah, on Cerebras, uh, and, and so has OpenAI. Um, uh, and so no, I don't think so beyond that, uh, and that, do you think that's like a, that's mostly, that's foreshadowing of, that's, yeah. I used to be kind of a skeptic in terms of like, okay, so what if I get my inference at a hundred tokens per second sped up to 200 tokens per second. It's only 2x faster. It's not that big a deal. Um, but when you, uh, I think every 10x does unlock a different usage pattern. Um, and you, we have proof in Taalas and, and some of the others that you can actually, um, drastically improve inference speed, and what happens from there? I don't even really know, like it's, it's so hard to predict when entire applications just appear at once. Yeah. Uh, and it also isn't that expensive, right? So like, um, this is one of those things where like, I, I think the, the investment cycle is gonna be multi-year. Um, and I would caution people to not dismiss it too, too quickly.
[00:11:25] Jacob Effron: Yeah. I mean, one other like infra question I was curious to get your thoughts on is obviously it seems increasingly a lot of the cutting edge infra companies are building for agents as the buyers of their product or users of their product, right?
[00:11:35] swyx: Ooh,
[00:11:36] Jacob Effron: and
[00:11:37] swyx: another huge theme. Yeah. Yeah.
[00:11:38] Jacob Effron: And I'm trying to figure out like what, what, what do you have to do differently about selling into agents? Um, are they just the ultimate rational developers? Uh, or is there, you know,
[00:11:46] swyx: no, absolutely not. 
Um, I think they are easily prompt-injected and, uh, very tuned towards like, basically compounding existing winners.
[00:11:57] Jacob Effron: Yeah,
[00:11:57] swyx: so like if, like, congrats if you won the lottery for getting into the training data right before 2023, because now you're like installed in there for the foreseeable future. But yeah. Uh, you know, one stat that Vercel, uh, CTO Malte dropped at my conference was that there are now, uh, 60% of traffic to Vercel's, um, like admin app architecture for like configuring Vercel applications, uh, is bot. It's not, it's not human. Uh, so like your primary customer is agents now. Um, and it's mostly, like, mostly coding agents, mostly people using CLIs or MCP or whatever. But yeah, I mean, I think, more, I, I think step one, if it doesn't exist as an API that agents can use, it doesn't exist. Right, right. Which I think is like, uh, it's a good hygiene thing anyway, to, to make everything API available, but now it's like an extra, um, push on like product people to not only work on the UI, um, you should probably work on the, on the CLI stuff. Beyond that, I think honestly there is like, so I, I come from the sensibility of, I think everything that you are trying to do for agent experience now, which is the term that Matt Biilmann at Netlify is trying to coin, is the same thing that you should have been doing for developer experience. That you should have had good docs, you should have had a consistent API, uh, that is mostly stateless. Um, you should have, I guess, discoverable or progressive disclosure or like search or like whatever. And so now that people have energy in like finding these customers to do that, that's great. Um, do I believe in extending beyond that into something like AEO, um, for gaming the chatbots? Not necessarily, but obviously there's gonna be huge advantages for people who figure out the short term wins. Yeah. 
And short term wins can compound.
[00:13:43] Jacob Effron: Do you think these compounding advantages to like the, the pre-training data cutoff companies, like, you know, obviously over some period of time, I imagine that doesn't persist. And so as you think about like, I dunno, three, four years from now what the, you know, selection criteria end up being. Do you think it still mirrors exactly what you were saying before? Like it's exactly what you should have been doing all along to sell a good product to developers?
[00:14:01] swyx: It could be, except that I think in three, four years we'll probably have much better memory and personalization. So then general AEO or GEO doesn't really matter as much. So I think whatever memory or personalization system we end up with will probably determine what you end up choosing much more than, than what is currently the case, which is just frequency of mentions, let's call it. Yeah,
[00:14:26] Jacob Effron: yeah.
[00:14:26] swyx: Uh, so you just spam quantity and I think that's, I mean, that's something I'm looking forward to. I do think, like, like, you know, I, I think that the fundamental exercise to work through for yourself is if you start a new, um, sort of, uh, disruptor company now, there's a, there's a big incumbent that everyone knows, like, like Supabase. Supabase is like, kind of like the Postgres, like database, uh, incumbent. If you wanna start like a new Supabase, how would you compete with them? And I don't necessarily have the answer, but I, I, I do think like people, like Resend, like relatively new, I think they were started like 2023, and still there was, there was a recent survey where like, people checked what Claude recommends by default. If you just don't prompt it with anything, just say, gimme an email provider, and it says Resend in like 70, 70% of the cases. 
Like the fact that you can get in there with like such a relatively short existence, I think is, is encouraging.
[00:15:14] Jacob Effron: Yeah.
[00:15:14] swyx: I do think like, um, you do want to do whatever it is to, to like to, to get in that very short mentions list because, um, it's not gonna be 20 of them, it's gonna be like three.
[00:15:26] Jacob Effron: No, definitely. It feels like, uh, you know, probably more, more consolidation than ever. Uh, or, or kind of like, you know, uh, a winner take most market than maybe the, the, the physics of go-to market in the past, yeah, might have, uh, enabled.
[00:15:38] swyx: The other thing also is like, semantic association is gonna be very important, uh, in the sense that like, you want to do like the combo articles where you're like, use my thing with Vercel, with blah, blah. And like that all gets picked up in a, in a corpus. And so that's probably one thing that you, you wanna do well. I don't know what else. Uh, it's, it's, it's, it's one of those things where like, I think I feel, I feel I'm behind, uh, I don't know how you feel about this, but like,
[00:16:04] Jacob Effron: I think AI is just everyone constantly feeling like they're behind some, uh,
[00:16:08] swyx: yeah. With,
[00:16:09] Jacob Effron: I wanna meet the person that doesn't feel behind,
[00:16:11] swyx: but like with, with AX, right? Like, so, so like, my, my stance was that exactly what I said before, like everything that you, that you should do for agents is something that you should have done for humans anyway. Yeah. And so to the extent that you're just getting more energy to, to do things for agents, great. But like, uh, it's hard to articulate what new thing apart from just like more spam, um, that you should be doing. Anyway, that would be my take right now. Um, I, I, I do think like there, there will be more turns at this. I think the personalization turn that is coming, um, will be big. 
And I don't know what that looks like because like basically we're kind of, we feel kind of tapped out on the memory side of things.
[00:16:49] Jacob Effron: Yeah. I, I guess since we last chatted, you know, you, you took this role over at Cognition, um, and you obviously have a, have a front row seat to the AI coding space today. You know, I feel like coding in many ways, you know, people view it as this, like, I mean, besides being like the, the mother of all markets and this massive opportunity, I think it's kinda a preview of like, what's to come for many other spaces. Both, yeah, you know, I feel like agents are most advanced in coding. I also feel like the, you know, competition between foundation models and application companies, you know, and, uh, mirrors what we may see in other spaces. And so maybe for our listeners, can you just lay out like what is the state of the AI coding wars today?
[00:17:25] swyx: Um, it is massive, right? Like, uh, and I don't think necessarily, last time we talked about this, we appreciated the size of what
[00:17:32] Jacob Effron: No, I wish we did.
[00:17:33] swyx: State of AI coding wars today, um, both OpenAI and Anthropic have made it their P0s to compete in coding. Um, and Anthropic is like 2.5 billion in ARR just from Claude Code. The way they recognize ARR is up for debate. Uh, OpenAI, I don't think a public number is known, but let's call it 2 billion as well. And then Cursor is like, rumored to be 2 billion, you know? And, and those, those are like the public numbers that are known? Yeah. Um, so like huge markets that have just been created in the past one year. Like, like Anthropic, just like Claude Code just recently celebrated their one year anniversary, which is, yeah, pretty nice. 
Um, so, and then I think, like the other thing that I see is there's, there's some other people who are like, oh, here's like the, the sort of relative penetration of, uh, Claude use cases, right? Like, and it's like coding 50% and then legal, whatever, health, uh, it's like the, the remaining ones. And there was a very popular tweet that was like, okay, look at the, the empty space and all these other use cases. If you are a new founder today, you should be betting on the other stuff because on, on a sort of catch-up, yeah, theory. And my, my pushback is the same pushback that, uh, I had on app over Google, which is like, well, well why is this time different? Like, why, if it went from let's say 10 to 50% in the past year, why can't it keep going? Uh, and like getting that wrong is actually a very painful one because you could have just done the momentum bet instead of the mean reversion bet. So I, I, I think that that is the, the state of things now, that people are very, very much into psychosis. Um, they're getting rewarded for spending more rather than spending less. And I think we're not in that phase of efficiency. We're in a phase of sort of like capability exploration. So I think people who are more crazy, who are more, uh, creative, um, get rewarded comparatively. Yeah.
[00:19:27] Jacob Effron: Well, it's interesting. I mean, it feels like behind these like token maxing leaderboards and whatnot is this, it's like the first phase of this transition from a workforce perspective is you just gotta show your employer like, Hey, I, I use these tools.
[00:19:37] swyx: Here's my number of tokens I cost, and that's it. They don't care about the quality. Right. It is, uh, maybe distasteful to someone who cares about the craft and, and all that. Um, but directionally everyone just wants you to go up regardless. And so, um, there it is not very discerning. 
And it's probably very sloppy, but I think it's net fine because we're still probably underusing AI in general. So I think that's very interesting. We had Ryan Lopopolo from OpenAI on the podcast, who spends a billion tokens a day. For those counting at home, that's something like $10,000 worth of API tokens a day at market rates, and most of us can't afford that. And probably a lot of what he does is slop.

[00:20:25] Jacob Effron: Right.

[00:20:25] swyx: But if there were a new capability, he would discover it before you, because he was trying and you were not. Right. You only do things that work? Well, good for you. But the people who are going to discover the next hot thing are living at the edge.

[00:20:42] Jacob Effron: Right, and increasingly, living at the edge is just having the compute budget to run these experiments. It's kind of similar to what living at the edge on the research side has always been: it was constrained in many ways by the amount of compute you had to run these experiments. It feels similar now on the builder side, on actualizing these tools.

[00:20:56] swyx: Yeah. The other thing that's very obvious is that Anthropic is kind of the high-price premium player, where restricting limits, or even restricting model releases, is the name of the game. Whereas Codex is like: come on in, guys, use our SDK, use our login, we don't care, we're gonna reset limits, whatever. You do want to try to exploit the subsidies where you can get them. And Codex is definitely super subsidized right now. Gemini is also very subsidized.
Comparatively, I think you should make hay, I guess, while that's going on. It's not that bad to be a capabilities explorer on just the $200-a-month plan from Claude Code or from OpenAI. And my sense is that people aren't even there yet.

[00:21:41] Jacob Effron: How do you think this market ultimately plays out? It's obviously such a big market that any slice of it is interesting for anyone going after it. But I think what makes people so interested in the coding market in particular is that it feels like a foreshadowing of what will happen in any other application market that the foundation models eventually turn to, RL their models against, and gather data around. So does there end up being room for lots of different kinds of players? What do you think the end state of this market is, and do you think that's applicable to other markets?

[00:22:10] swyx: Status quo is probably the most likely outcome, which is that there are two big players and a small range of longer-tail people that fit other use cases that the two big players don't. That feels right to me. For the market structure to significantly change, there needs to be a significant change in the economics, or the brand building, or the value propositions of the companies involved, and I haven't seen any in the last six months that has really changed the stories materially. So I feel like they will just keep going until something else happens. Something else happens, meaning Microsoft wakes up and goes: guys, we have GitHub, we'll do something much bigger here than just Copilot. That would be a big change.
MSL has put out a model now, and I was at a breakfast with Alex Wang where they were like, yeah, we really, really want to go after the coding use case; we haven't done anything yet. But don't underestimate them. Right. And similarly for the Chinese labs: I think they're trying to go after it. Z.ai is doing stuff (Z.ai and GLM are the same thing). So everyone's trying to get a piece of that pie. I feel like the status quo has been pretty stable for almost a year, I'll say.

[00:23:39] Jacob Effron: Yeah. And is the room for the application companies more on the enterprise side? What surface area do the model companies leave for application companies?

[00:23:50] swyx: Yeah, that's a good one. It's very much evolving. I will say, because OpenAI did not have this level of attention on coding a year ago, we just don't have that much history. Right. And for example, the big push at OpenAI now is the super app. Is that a consumer thing? Is that a product portfolio rationalization thing? How much is that gonna take attention away from coding at a time when they actually do want to put more into coding? I think it's very unclear. So I do think both big labs (sorry, by both I mean Anthropic and OpenAI; DeepMind and xAI are separate cases) are trying to see the other TAM expansion areas. So Claude Code for finance, Claude Cowork, all those things. Whereas I think Cursor and Cognition are comparatively just focused on coding, so I do think they leave space. And I think for the other verticals that means the same thing, right? They're not gonna be that intensely focused on that domain.
Except I think I would mark out finance and healthcare as the next ones that they're clearly going after. I would say, comparatively, healthcare seems more thorny. There have been some announcements about it, but I would respect the finance work a lot more, just because the path to money is a lot clearer.

[00:25:12] Jacob Effron: Yeah. I think, maybe similar to the space that's being left in these other domains, there's obviously a lot that's required to actually implement these tools in enterprises, versus just giving folks model access out of the box.

[00:25:27] swyx: Yeah. So the agent lab thing is: we'll do the last mile for you. Whereas I think the model labs tend to just trust the model and be minimalist about it. Both of them work.

[00:25:38] Jacob Effron: Yeah.

[00:25:38] swyx: I don't necessarily think one beats the other for every use case. All I do know is that it does seem like the large enterprises want a dedicated partner that isn't just the model labs, which is kind of interesting.

[00:25:55] Jacob Effron: We've been in this phase of pure capability exploration, and I think nothing has been better for the large labs, right? They're always gonna be at the frontier of capability exploration, and so I think they have a very good relationship with a lot of these enterprises. But ultimately, over time, the incentive structure of these labs is always gonna be maximal token consumption for the end customers they work with. And there are just so few companies that have actually gotten to massive scale. Maybe coding again is the most interesting, as the first space that has really just completely gone, you know. Yeah. You must love it every day.
Like, absolutely insane. And I think it

[00:26:32] swyx: gets even... Okay. I mean, we say good things about Cursor and Cognition, but the sheer liftoff of both Anthropic and OpenAI, because they have independent valuations. And let's throw xAI in there, because it's now IPOing at 1.2 trillion. That number is just mind-boggling. I feel like in normal investing or normal startups there's kind of a ceiling market cap or valuation that you reach, and you go: all right, it's gonna be chiller from now on. And these guys are not slowing down. No.

[00:27:02] Jacob Effron: Well, I also think a fascinating dynamic about some of these later-stage companies is that, in the past, in venture world, if you got to a certain level of scale, the question around you was really more a valuation question. This is why there were different types of venture that people did, and the late-stage growth people were just incredible at a little bit of what's the ultimate market opportunity of this company, but also what's the right way to value it. We know it's in some band of outcomes; sure, there's some variance to it, but it's relatively understood what that band is, and then maybe over time you get surprised to the upside. Whereas for any later-stage company now, even the labs themselves, the band of what that company might be worth in a year or two years is so massive, because of how fast the ecosystem changes, that even for later-stage companies every three months could be an existential-level event to the upside or to the downside. And you are obviously seeing it in the positive with code. Think about a company like Anthropic.
For a while, it was unclear if they were going to have access to enough capital to really stay in the race, right? And then coding hit at the exact right time. They had the perfect model for it. They executed brilliantly. And now they are one of the most valuable companies in the world.

[00:28:13] swyx: At the same time, I have zero sympathy for OpenAI, because they're crushing it and they're all rich. This is a high-class champagne problem to have, to be number two at coding or whatever. Who cares? You're doing great.

[00:28:27] Jacob Effron: Yeah. It's funny, though. You would be closer to this, given that you're in the AI coding space, but a lot of people I talk to think Codex is just as good as, if not better than, Claude Code. Right. One thing that I've been really surprised by (and maybe Claude Code is a better product in some ways, I'm curious about your thoughts): in consumer AI with ChatGPT, you saw this big first-mover advantage, right? Admittedly, today Claude and Gemini are great products, and it's not abundantly clear ChatGPT is any better, but people stick with ChatGPT. It's the first thing that introduced them.

[00:28:56] swyx: They stay, but they're not growing anymore. I don't know if you've seen

[00:28:59] Jacob Effron: Right. But that to me is more of a product problem. It's not like they've lost share to someone else. My understanding is that the overall problem with consumer AI today is much more that, for folks like us, knowledge workers, it's this incredible magic tool, but it's not necessarily a daily-active-use tool for a lot of people around the world today. And what are the products? It's kind of a category-wide problem.
In coding, for example, the entire space has gone parabolic. There may be some relative growth in other consumer AI players, but it's not like consumer AI as a category is going parabolic and they're not capturing most of that. I think the larger problem is much more that the category has kind of hit a bit of a plateau, and people haven't figured out how to bring tons more users on board, or increase the frequency of those users. So it seems more of a category-wide problem than a massive market share change. I was gonna draw the comparison to the coding space, where Claude Code is obviously the first product to introduce people to this magical experience. By all accounts, Codex is pretty damn close to as good, if not better. You would've thought that first product would not be a super sticky product surface area, and it actually has been. It turns out that the first lab to introduce you to an experience really does keep a lot of the focus.

[00:30:12] swyx: I think maybe it's still early days. ChatGPT is three-plus years old, and Claude Code just turned a year. So give it time, you know? Definitely a lot of people have switched to Codex. Maybe that will keep going. It's really hard to tell. I do think that because we are in this high-volatility, high-temperature phase, the loyalty and stickiness to first movers and category creators is not as high as it might be in some other areas that we've looked at in our careers.

[00:30:47] Jacob Effron: Yeah. Though I've been surprised by the Claude Code thing. I would've thought that, in many ways, I always worried about the

[00:30:52] swyx: enterprise.
You think it would've been gone by now?

[00:30:53] Jacob Effron: Not gone. But I always worried that the consumer business of these companies would be quite sticky, and that the enterprise API business was, in some ways, your least loyal buyers, like they would move to,

[00:31:05] swyx: Right, right. But they worked out that it wasn't the enterprise API, it was the enterprise product.

[00:31:09] Jacob Effron: Totally. And maybe that was the secret. But the amount of lock-in, or just default behavior, that has happened in that space is more than I might've imagined with two products that by all accounts are pretty damn similar. Yeah.

[00:31:22] swyx: No fight there. I will say I do think that Codex is still in catch-up mode. In terms of personal experience, the only thing I like out of Codex is Spark. I feel like the skills integration is a little bit better. I feel like the speed is a bit better, maybe because it's written in Rust or whatever. Very minor things that you're almost telling yourself rather than objectively assessing between the two of them. Vibes-wise, I think that's what's going on. I feel like the missing question in this whole debate is: why is this so concentrated in only two names, right? Where is the Gemini presence? Where's the xAI presence?
And they are trying, it's just that they haven't made that much progress yet.

[00:32:12] Jacob Effron: But what the Claude Code moment does show, and in some ways it makes you a little more bullish on the potential for someone else to catch up, is that if you're the first to introduce some magical net-new product experience, that actually might be stickier than one might have imagined.

[00:32:27] swyx: Right, right, right. Okay. Yeah.

[00:32:28] Jacob Effron: And so everyone can believe they have a shot

[00:32:29] swyx: at that. What do you think that new product experience might be? This is a failure of imagination on my part. I always wonder, because people always say that the thing that will save us is being first to the next new thing. What is it?

[00:32:41] Jacob Effron: Yeah.

[00:32:42] swyx: It's like,

[00:32:45] Jacob Effron: I dunno, something around a consumer agent and computer use hybrid. Obviously, I think we're just scratching the surface on the consumer side.

[00:32:53] swyx: So my current theory is that OpenClaw is a vision of things to come.

[00:32:58] Jacob Effron: Totally.

[00:32:58] swyx: And it's good that OpenAI has the association with OpenClaw, but by no means do they have the right to win it. The general thesis that I have been pursuing is that, the same way that 2025 was the year of coding agents, 2026 is coding agents breaking containment to do everything else. So coding agents still continue to win, because they generate software and software eats the world. It's kind of like the transitive property: software eats the world, coding agents eat software, therefore coding agents eat the world.
Which is an interesting,

[00:33:30] Jacob Effron: Yeah, and breaking containment is always an easier phrase in the consumer context than the enterprise one. You've seen people run these really cool experiments in their own personal lives. I think,

[00:33:37] swyx: Yes.

[00:33:38] Jacob Effron: Obviously everyone's focused on the enterprise side now, on figuring out how you create these experiences. On the vibes: people love to have these narratives of, like, everything has completely shifted. But OpenAI, organizational volatility aside, has great products, a great team, great models. Everyone else in the world is incentivized for there to be two, three more; everyone would love more great model companies. So I feel like the natural forces of the world revolt when any one company is too much the star of the show, right? There are so many people in the ecosystem that are incentivized for that not to happen. So I'd be shocked if we don't have a reversion of vibes, maybe not completely the other way, but at least a little bit more equal, at some point over the next six to 12 months.

[00:34:24] swyx: I think there are just different stages. When you talk about the world wanting more model companies, I think about the neo labs.

[00:34:30] Jacob Effron: Yeah.

[00:34:31] swyx: And I don't know, is it fair to say none of them have really broken through in the past year?

[00:34:35] Jacob Effron: I think that's totally fair,

[00:34:37] swyx: Which is rough. And, well, how are we gonna grow that diversity in choice? This is it.

[00:34:46] Jacob Effron: Yeah.
It'll be really interesting to see what ends up happening with that. And you've seen folks like Nvidia be very incentivized to make sure there's a broader platform of other model providers.

[00:34:57] swyx: I don't know, people say this, but I don't think they try that hard. Nvidia tries harder to build neo clouds

[00:35:05] Jacob Effron: Yeah.

[00:35:06] swyx: than neo labs.

[00:35:07] Jacob Effron: Well, they try pretty damn hard to build neo clouds, so

[00:35:09] swyx: that's,

[00:35:09] Jacob Effron: Yeah.

[00:35:10] swyx: But let's call it the CoreWeaves of the world: they're in a much happier place than any neo lab built on top of them.

[00:35:18] Jacob Effron: Yeah. One might argue it's easier to enable a neo cloud to be successful. You can't will a neo lab into existence the same way, so

[00:35:25] swyx: Nvidia has more direct control over it, for sure.

[00:35:27] Jacob Effron: What else is catching your eye today on the startup side? There's obviously this whole narrative that the foundation models announce a product and every stock goes down 15%.

[00:35:36] swyx: Yeah.

[00:35:37] Jacob Effron: Do you worry about the foundation models just eating into a bunch of these startup categories?

[00:35:43] swyx: Not really. Okay, there's the point of view of being an investor in startups, and there's the point of view of, do you wanna start something? And I think, honestly, the downside for all these is so
minimal, in the sense that the worst you do is just get hired into one of these labs anyway. So I think, for people who just do things, try things, and try to execute in a competent way, even if it doesn't work out commercially, even if it just wasn't that great, that's your job interview to go into one of these labs anyway. So I don't feel that, from a very, very small startup perspective. For mid-size startups, yes. I will say there's been a lot of dead LLM infra, a lot of LLM infra consolidation, like the Langfuses of the world getting absorbed into ClickHouse. And I think people have maybe worked out the domain-specific playbook, and I think that's okay. So yeah, I'm not that worried about that. I would say I'd be more worried about traditional SaaS, like low-NPS SaaS. This is the whole AI versus SaaS debate that's been going on. And literally I'm going through that exact thing in my company, so I'm thinking through this on a very visceral level, right? On one hand you have the people who say: you vibe coders don't appreciate the amount of work that goes into a CRM. You think you can rip out Salesforce? So did the 30 entrepreneurs before you, right? You classically underestimate the things that you don't deeply know, and the target audience is not you. At the same time, we have never been able to build software so easily and customize software so easily, and you're not gonna use 90% of the things in Salesforce. So, yeah.

[00:37:33] Jacob Effron: What have you done internally?

[00:37:34] swyx: So there's the main SaaS that we use for event management and sponsor management, and we paid 200K a year for that.
Not huge, but chunky for my scale. And yeah, I could probably spend 2,000 and build a custom version of that. The trick has been dealing with the rest of my team and getting them on board, because I'm the most technical person on my team, but I can't make that decision by myself. And I've been saying the same to other CEOs and team leaders as well: you can be super Claude-pilled, you can be in super LLM psychosis and think that's okay, but you have to bring your team with you. And I think the widening disparity in LLM psychosis within companies is causing real rifts. On one hand, the people who are less AI native are not getting with the picture. They're actually behind. They're not waking up to the fact that everything you think is necessary is not actually that necessary, and in fact it would be better for you if you just held your nose, went in, and came out the other side only talking to agents in natural language. Your life would actually be better, and you're just closed-minded. There's that perspective. The other perspective is: oh, you vibe coder, you did this in a weekend and you got the 80% solution, and now the rest of your employees have to pick up the rest of your s**t, which you thought you were so amazing at, but actually you didn't figure it out. And actually LLMs are still useless at this, and blah, blah, blah. So I think there's this huge debate going on in every company right now. I have a small microcosm of it, and it's making me hesitate to pull the trigger. But I will at some point. Maybe I've put it off for one year, but not five.
Yeah. So SaaS is definitely getting squeezed. It does make me wonder: I do think that there's an opportunity for a more AI-native system-of-record thing that is not just Postgres or MongoDB, although both are very good. Maybe it's something like Convex; people bring up Convex a lot. I don't know. I just feel like the quote-unquote Firebase of AI apps isn't really a thing yet, beyond what we have. Which is fine. It's just that we could probably start in a more rapid iteration cycle first, before scaling up to a Postgres or MongoDB, which are more old tech.

I was at a dinner with Mike Krieger, the CPO of Anthropic, and we were going around the room asking what people are most worried about. And for me, instead of security, I brought up biosafety.

[00:40:21] Jacob Effron: Yeah, classic.

[00:40:22] swyx: Actually, like I said, it was cliche and classic, and the rest of the table was like, what do you mean, someone sitting at home can manufacture a virus that wipes out half of humanity?

[00:40:32] Jacob Effron: Almost like the OG Geoffrey Hinton. Like, this is why you should be scared.

[00:40:35] swyx: I'm like, yeah, read the risk reports. This is the thing. And Mike was just sitting there, knowing he was sitting on Mythos, and going: actually, it's security. Part of it is very good marketing. Like, too good. I would actually advise Anthropic to tone down the marketing, because it is just a very good model and you don't have to make so many marketing claims around it. At the same time, it is not really a private model if you give it to 40 companies, each of whom has 10,000 employees or whatever. Right.
It's not private. There are bad actors in there.

[00:41:18] Jacob Effron: Yeah. Hopefully not as bad as releasing it widely. But it's an interesting case study for how many model releases might go. This might be the first model release that looks like the rest of them from now on, right?

[00:41:31] swyx: So there's an overall product strategy for Anthropic of restricting access and bundling product with model, maybe. Whereas OpenAI has definitely been a lot more philosophically aligned on: we will just enable access everywhere, and we don't know what will come out of it. Right.

[00:41:51] Jacob Effron: Right. Though in this current moment, obviously, the cynical take is that it also just ties to the amount of compute that both companies

[00:41:56] swyx: Yeah. Right, right. I think that's true. I do think the dawn of larger-than-10-trillion-parameter models is very interesting. I think it's a temporary phenomenon, because we have much larger compute clusters coming online for everyone over the next three to five years. This is already written in the cards.

[00:42:18] Jacob Effron: Yeah.

[00:42:19] swyx: So to the extent that you ask, will we have rationing of models above 10 trillion in two years? I don't think so. I think everyone will have

[00:42:29] Jacob Effron: No, we'll just have rationing of the next phase.

[00:42:30] swyx: Right. Right. But that's as it should be, almost. My classic example (this is just me theorizing, not anything confirmed by Google): when Google announced Gemini, they actually announced three sizes, which were Flash, Pro, and Ultra. They never released Ultra. They only have Pro and Flash.
So my theory is they have Ultra sitting in a basement and they just keep distilling from it for Flash and Pro. Which, I actually think, is as it should be for any lab.

[00:43:02] Jacob Effron: Yeah. Just because those are the models that people actually wanna end up using, and it's just cost prohibitive.

[00:43:06] swyx: Yeah, it's cost. It's not the want, it's just the cost. I do think it is interesting that for a while I was considering the theory that models capped out at 2 trillion, and I think that's proving to be wrong. Well, then, if I'm wrong, how wrong am I? Do we do 200 trillion? Do we do two quadrillion, whatever? I don't think we have a straight answer to that. But it's interesting that we are continuing to scale the number of params when everyone can kind of see that we're not going to get the next thousand or 1 million x from this paradigm. So the others, like the Ilyas of the world, are working on other model architecture improvements. We need a different scaling law, I guess, because I feel like people already feel like we're tapped out on this. The end state of this is we turn most of the world into data centers, and I don't know if we want that.

[00:44:08] Jacob Effron: Yeah. I mean, if the returns to intelligence are there, maybe not so bad.

[00:44:13] swyx: I think there's just a sheer amount of unscalability that is rankling people's sensibilities right now, especially in terms of context lengths. My classic quote is that context length is the slowest scaling factor in LLMs.

[00:44:30] Jacob Effron: Yeah.

[00:44:30] swyx: We took maybe
three years to go from a 4,000 context length to a million, and that's about it. Gemini has had a million-token context length for two years now, and no one's using it. So yeah, memory is probably gonna be the biggest limiting constraint on all these things.

[00:44:50] Jacob Effron: Yeah. Certainly seems that way. I guess I'm curious: over the last year, since we recorded last, what's one thing you've changed your mind on?

[00:44:57] swyx: I feel like I was kind of bearish on open models last year, in the sense that I had just done the podcast with Ankur Goyal

[00:45:07] Jacob Effron: Yeah.

[00:45:08] swyx: of Braintrust, and he has a good cross section of all the top AI companies, and he said the market share of open source is 5% and going down. I think that's changed. I think it's going up. And even if,

[00:45:22] Jacob Effron: Even though the capability gap does seem to be increasing. Depending on the

[00:45:26] swyx: time. It's hard to tell. It's really hard to tell. Okay, for listeners: capability gap increasing is on public benchmarks. Let's say you're comparing Mythos versus, I don't know, GPT-OSS or GLM 5.1. It is really hard to tell, because even if they were closing, you would also not believe that they were closing that much, because it's very easy to game the benchmarks. So you just don't really know. All you know is that there are somewhat objective OpenRouter stats on what people choose in a free market. And people do choose some of these open models in significant volume, except that a lot of them are heavily discounted, so you need to kind of price-adjust these things. So even if that were true, which I'm not sure about, I feel like the number is just up now instead of down. I think the separation between what the top tier agent labs

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Marc Andreessen introspects on The Death of the Browser, Pi + OpenClaw, and Why "This Time Is Different"

Apr 3, 2026 · 76:20


Fresh off raising a monster $15B, Marc Andreessen has lived through multiple computing platform shifts firsthand, from Mosaic and Netscape to cofounding a16z. In this episode, Marc joins swyx and Alessio in a16z's legendary Sand Hill Road office to argue that AI is not just another hype cycle, but the payoff of an “80-year overnight success”: from neural nets and expert systems to transformers, reasoning models, coding, agents, and recursive self-improvement. He lays out why he thinks this moment is different, why AI is finally escaping the old boom-bust pattern, and why the real bottleneck may be less about models than about the messy institutions, incentives, and social systems that struggle to absorb technological change.

This episode was a dream come true for us, and many thanks to Erik Torenberg for the assist in setting this up. Full episode on YouTube!

We discuss:

* Marc's long view on AI: from the 1980s AI boom and expert systems to AlexNet, transformers, and why he sees today's moment as the culmination of decades of compounding technical progress
* Why “this time is different”: the jump from LLMs to reasoning, coding, agents, and recursive self-improvement, and why Marc thinks these breakthroughs make AI real in a way prior cycles were not
* AI winters vs. “80-year overnight success”: why the field repeatedly swings between utopianism and doom, and why Marc thinks the underlying researchers were mostly right even when the timelines were wrong
* Scaling laws, Moore's Law, and what to build: why he believes AI scaling laws will continue, why the outside world is messier than lab purists assume, and how startups can still create durable value on top of rapidly improving models
* The dot-com crash and AI infrastructure risk: Marc's comparison between today's AI capex boom and the fiber/data-center overbuild of 2000, plus why he thinks this cycle is different because the buyers are huge cash-rich incumbents and demand is already here
* Why old NVIDIA chips may be getting more valuable: the pace of software progress, chronic capacity shortages, and the idea that even current models are “sandbagged” by supply constraints
* Open source, edge inference, and the chip bottleneck: why Marc thinks local models, Apple Silicon, privacy, trust, and economics all point toward a major role for edge AI
* American vs. Chinese open source AI: DeepSeek as a “gift to the world,” why open models matter not just because they're free but because they teach the world how things work, and how open source strategies may shift as the market consolidates
* Why Pi and OpenClaw matter so much: Marc's claim that the combination of LLM + shell + filesystem + markdown + cron loop is one of the biggest software architecture breakthroughs in decades
* Agents as the new “Unix”: how agent state living in files allows portability across models and runtimes, and why self-modifying agents that can extend themselves may redefine what software even is
* The future of coding and programming languages: why Marc thinks software becomes abundant, why bots may translate freely across languages, and why “programming language” itself may stop being a salient concept
* Browsers, protocols, and human readability: lessons from Mosaic and the web, why text protocols and “view source” mattered, and how similar principles may shape AI-native systems
* Real-world OpenClaw use: health dashboards, sleep monitoring, smart homes, rewriting firmware on robot dogs, and why the most aggressive users are discovering both the power and danger of agents first
* Proof of human vs. proof of bot: why Marc thinks the internet's bot problem is now unsolvable via detection alone, and why biometric + cryptographic proof of human becomes necessary

Timestamps

* 00:00 Marc on AI's “80-Year Overnight Success”
* 00:01 A Quick Message From swyx
* 01:44 Inside a16z With Marc Andreessen
* 02:13 The Truth About a16z's AI Pivot
* 03:29 Why This AI Boom Is Not Like 2016
* 06:33 Marc on AI Winters, Hype Cycles, and What's Different Now
* 10:09 Reasoning, Coding, Agents, and the New AI Breakthroughs
* 12:13 What Founders Should Build as Models Keep Improving
* 16:33 AI Capex, GPU Shortages, and the Dot-Com Crash Analogy
* 24:54 Open Source AI, Edge Inference, and Why It Matters
* 33:03 Why OpenClaw and Pi Could Change Software Forever
* 41:37 Agents, the End of Interfaces, and Software for Bots
* 46:47 Do Programming Languages Even Have a Future?
* 54:19 AI Agents Need Money: Payments, Crypto, and Stablecoins
* 56:59 Proof of Human, Internet Bots, and the Drone Problem
* 01:06:12 AI, Management, and the Return of Founder-Led Companies
* 01:12:23 Why the Real Economy May Resist AI Longer Than Expected
* 01:15:53 Closing Thoughts

Transcript

Marc: Something about AI that causes the people in the field, I would say, to become both excessively utopian and excessively apocalyptic. Having said that, I think what's actually happened is an enormous amount of technical progress that built up over time. And like for, for example, we now know that neural network is the correct architecture.

And I, I will tell you like there was a 60 year run where that was like a, you know, or even 70 years where that was controversial.
And so, so the way I think about what's happening is basically, I think, I think about basically the, the, the period we're in right now is it's, I call it 80 year overnight success, right?

Which is like, it's an overnight success 'cause it's like bam, you know, ChatGPT hits and then, and then o1 hits, and then, you know, OpenClaw hits and like, you know, these are open, these are, these are like overnight, like radical, overnight transformative successes, but they're drawing on an 80 year sort of wellspring backlog, you know, of, of, of, of ideas and thinking. It's not just that it's all brand new, it's that it's an unlock of all of these decades of like very serious, hardcore research.

If I were 18, like this is a hundred, this is what I would be spending all of my time on. This is like such an incredible conceptual breakthrough.

swyx: Before we get into today's episode, I just have a small message for listeners. Thank you. We would not be able to bring you the AI engineering, science, and entertainment content that you so clearly want if you didn't choose to also click in and tune into our content.

We've been approached by sponsors on an almost daily basis, but fortunately enough of you actually subscribed to us to keep all this sustainable without ads, and we wanna keep it that way. But I just have one favor to ask all of you. The single, most powerful, completely free thing you can do is to click that subscribe button.

It's the only thing I'll ever ask of you, and it means absolutely everything to me and my team that works so hard to bring Latent Space to you each and every week. If you do it, I promise we will never stop working to make the show even better. Now, let's get into it.

Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, founder of Kernel Labs, and I'm joined by swyx, editor of Latent Space.

swyx: Hello. And we're in a16z with, uh, Marc Andreessen. Welcome.

Marc: Yes, yes. A and what, half of 16? Something like that. A one.
Exactly,

swyx: exactly. Uh, apparently this is the, the final few days in your, your current office.

You're moving across the road.

Marc: Uh, we're, yeah. We have a, we have some, we have some projects underway, but yeah, this is actually, oh, this is the original. We're in actually the original office. We're in the, we're in the, we're, we're in the whole thing.

swyx: It's beautiful. Yeah. Great.

Marc: Thank you.

swyx: So I have to come out, uh, this is a, you know, I wanted to pick a spicy start. In October 2022,

I just made friends with Roon and, uh, I wanted to give him something to sort of be spicy about. And I said, uh, it'll never not be funny that a16z was constantly going "the future is where the smart people choose to spend their time" and then going deep into crypto and not in AI. And that was in October 22nd, 2022.

And Roon says there was an internal meeting in a16z to reorient around gen AI. Obviously you have, but was there a meeting? What, what was that?

Marc: I mean, I don't, look, I've been doing AI since the late eighties.

swyx: Yeah.

Marc: So I, I don't know, like all that, as far as I'm concerned, this stuff is all Johnny-come-lately.

Yeah. You, I mean, look, we've been doing AI our entire existence. I mean, we've been doing AI, machine learning, deep, you know, deeply. We've been doing this stuff way from the beginning. Obviously AI is just core to computer science. I, I, I actually view them as like quite, uh, quite continuous. Um, you know, Ben and I both have computer science degrees.

Um, you know, we, we both, Ben, Ben and I actually both are old enough to remember the actual AI boom in the 1980s. Yeah. There was like a, there was a big AI boom at the time. Um, and there were names like expert systems. Um, and, like, Lisp and Lisp machines. Uh, I, I coded in Lisp. I was coding in Lisp in 1989.

When that was the, the language of the AI future. Um, yeah. So this is something that we're like completely, you know, completely comfortable with.
I've been doing the whole time and are very enthusiastic about.

swyx: Is there a strong, like, this time is different? Because, uh, my closest analog was 2016, '17. It was an AI boom.

Mm-hmm. And it petered out very, very quickly. Um, we, it just, it just in terms of investing

Marc: sort of, sort of,

swyx: yeah. Investment, investment excitement.

Marc: Although that's really when the, the, the Nvidia phenomenon really, it was, I would say it was in that period when it was very clear that at, at the time it, the vocabulary was more machine learning, but it, it was very clear at that time that machine learning was hitting some sort of takeoff point.

Alessio: Yeah.

Marc: Well, and as you guys, you guys have talked about this at length on, on your thing, but, you know, if you really track what happened, I think the real story is, it was, it was the AlexNet, uh, basically breakthrough in like 2013. That was the, that was the real knee in the curve. Um, and then it was obviously the transformer breakthrough in '17.

Alessio: Yeah.

Marc: Um, and then everything that followed. But, but, you know, look, machine learning, you know, there were, you know, look, uh, I mean look, I've been working, you know, I've been working with, uh, one of my, you know, kind of projects, working with Facebook since 2004. Um, and on the board since 2007, and of course, you know, they, they started using machine learning very early, um, and, you know, have used it basically, you know, for like 20 years for, you know, content, you know, feed optimization and advertising optimization.

And obviously many, you know, financial services. You know, many, many, many companies, many different sectors have been doing this. And so it's like one of these things, it's like, it's not a, it's not a single thing. Like it's, it's like, it's like layers, right? Yeah. Um, and, and the layers arrive at different paces and, but they kind of build up.

swyx: Yeah.

Marc: Uh, they kind of build up over time and then, and then, yeah.
And then look, in retrospect, it was 2017 was kind of the, you know, the key, the key point with the transformer and then, and then as you guys know, there was this really weird like four year period where it's like the, the transformer existed and then it was just like,

swyx: let's go.

Yeah.

Marc: Well, but, but it was just, but, but between 2020, but between 2017 and 2021, I mean, that was the era in which like companies like Google had internal chatbots, but they weren't letting anybody use them.

swyx: Yeah.

Marc: Right. And then, you know, and then OpenAI developed ChatGPT, or GPT-2, and then they told everybody, this is way too dangerous to deploy.

Right. Yeah. You know, we can't possibly let normal people, normal people use this thing. And then you, you guys, I'm sure remember AI Dungeon, um, mm-hmm. So for, there was like a year where like the only way for a normal person to use GPT-3 was in, in AI Dungeon.

Alessio: Yeah.

Marc: And so you, you, we would do this, you'd go in there and you'd pretend to play Dungeons and Dragons.

In reality, you're just trying to talk to, talk to GPT. And so there was this, you know, there was this long, you know, and I, you know, the big, big companies, you know, big companies are cautious and, you know, the big companies were cautious. It, it, by the way, it took OpenAI, you know, they, they, they talk about this, it took OpenAI time to actually adjust, you know, kind of re, redirect their research

swyx: path.

I, I think, uh, let's say Rosewood, right? Uh, the, the dinner that founded OpenAI was right there.

Marc: Right, right. But that, that dinner would've taken place in 20

swyx: 18

Marc: 19. The formation of OpenAI, uh-huh, as late as 2018.

swyx: Uh, uh, sorry. Uh, no, I'm, I'm, I'm, I'm wrong. Probably it should be 20. Yeah. They just celebrated a 10 year anniversary, so it, it is 2025.

Yeah, so, so 2015?

Marc: Yeah. 2015. Yeah. 2015. But then, uh, um, Alec Radford did GPT-1 in what, probably

swyx: mm-hmm.
17, 18,

Marc: yeah. 17, 18. So it, yeah. For, and then, and then they didn't really, and then GPT-3 was what? 2020? 2020.

swyx: 2020.

Marc: Because that became Copilot immediately. Even OpenAI, which has been, you know, the leader of, of this thing in the last decade, you know, e, even they had to adapt and, and, and lean into the new thing.

And so, um, yeah, I, I think it's just this process of basically sort of wave after wave, layer after layer, you know, building on itself. And then you kind of get these catalytic moments where, where the whole thing pops and, and obviously that's what's happening now.

swyx: Is it useful to think about, will there be any AI winter?

'Cause there's always these patterns. Like, is this the summer, is something I constantly think about, because do I get, do I just like, just get endlessly hyped and just trust that I will only be early and never wrong or right. Well, are we, will there be a winter?

Marc: So there's something about, say the following.

There's something about AI that has led to this repeated pattern. Um, and, and, and you guys know this,

swyx: it's summer, winter, summer,

Marc: winter, summer, winter, summer, winter. And it goes back 80 years. Yeah. 80 years. Uh, so the original neural network paper was 1943. Right. Which is, which is amazing. Uh, that it was, it was far back that long.

And then there was, I don't know if you guys have ever talked about this on your show, but there was this, uh, there was a big, uh, there was an AGI conference at Dartmouth University in 1950. 55. 55, yeah. And they got an NSF grant to, uh, for the, all the AI experts at the time to spend the summer together. And they figured if they had 10 weeks together, they could get AGI, uh, at the other end.

And they got their, by the way, they got the grant, they got the 10 weeks and then, you know, 1955, you know. No, no AGI. And like I said, I, I lived through the eighties version of this where there was a big, a big boom and a crash.
And so, so there is this thing, and there, there is something about AI that causes the people in the field, I would say, to become both excessively utopian and excessively apocalyptic.

Um, and, and it's probably on both sides of like the, the, the boom-bust cycle. You, you kind of see that play out. Having said that, I think what's actually happened is like just, and you know, and we now know in retrospect, like an enormous amount of technical progress that built up over time. And like for, for example, we now know that neural network is the correct architecture.

And I, I will tell you like there was a 60 year run where that was like a, you know, or even 70 years where that was controversial. And, and we now know that that's the case. And so we, we now, you know, everything we're building on today just sort of derives from the original idea in 1943. And so, so in retrospect, we, we now know that like, these, these guys were right.

They, they, you know, they would get the timing wrong and they thought, you know, capabilities would arrive faster, or they were, it could be turned into businesses sooner or whatever, but like, they were fundamentally, the, the scientists who worked on this over the course of decades were fundamentally correct about what they were doing.

And, and the, and the payoff from, from, from all their work is happening now. And so, so the way I think about what's happening is basically, I think, I think about basically the, the, the period we're in right now is it's, I call it 80 year overnight success, right?
Which is like, it's an overnight success.

'Cause it's like bam, you know, ChatGPT hits and then, and then o1 hits, and then, you know, OpenClaw hits and like, you know, these are open, these are, these are like overnight, like radical, overnight transformative successes, but they're drawing on an 80 year sort of wellspring backlog, you know, of, of, of, of ideas and thinking. It's not just that it's all brand new, it's that it's an unlock of all of these decades of like very serious, hardcore research.

Um, and thinking, and look, there were AI researchers who spent their entire lives. They got their PhD. They, they worked for, they've researched for 40 years. They retired, in a lot of cases they passed away, and they never actually saw it work.

swyx: Yeah. It's all sad.

Marc: It is. It is sad. It's sad. You know.

swyx: Geoff Hinton was like the last guy.

Marc: Yeah. Yeah. Well, there were the guys, uh, was a guy, Allen Newell. I mean, there's tons of, John McCarthy. You know, John McCarthy was like one of the inventors in the field. He's one of the guys who organized the Dartmouth Conference and you know, he taught at Stanford for 40 years. Wow. And passed, you know, passed away, I don't know, whatever, 10, 10 years ago or something.

Never, never actually got to see it happen. But like, it is amazing in retrospect, like, these guys were incredibly smart and they worked really hard and they were correct. So anyway, so then it's like, okay, you know, they say history doesn't repeat, but it rhymes. It's like, okay, does that mean that there's gonna be another, like, you know, basically boom-bust cycle.

And I, I will tell you, like, let, like in a sense, like yes, everything goes through cycles and, you know, people get overly enthusiastic and overly depressed and there's, there's a time, there's a timelessness to that. Having said that, there's just no question.
Um, so the four, the four most dangerous words in investing are, this time is different.

Do you know the 12 most dangerous words in investing? No. "The four most dangerous words in investing are this time is different." Yeah. Um, the 12 most dangerous words. And so like, I'll tell you what's different. Like now it's working like, like there's just no, I mean, look, there's just no question.

And by the way, I, I'll just give you guys my take. Like LLMs, like from, from basically the ChatGPT moment through to spring of '25, I think you could still, I think well-intentioned, well-informed skeptics could still say, oh, this is just pattern completion. And oh, these things don't really understand what they're doing.

And you know, the hallucination rates are way too high. And, you know, this is gonna be great for creative writing and creating, you know, Shakespearean sonnets and, you know, as, as rap lyrics or whatever, like, it's gonna be great and all that stuff, but we're not gonna be able to harness this to make this relevant in, you know, coding or in medicine or in law or in, you know, you know, kind of fields that, you know, kind of really, really matter.

And I think basically it was the reasoning breakthrough. It, it was o1 and then R1 that basically answered that question, basically said, oh no, we're gonna be able to actually turn this into something that's gonna work in the real world. And, and then obviously the coding breakthrough over the, over basically the coding breakthrough that kind of catalyzed over the holiday break was kind of the third step in that.

Mm-hmm. Where you're just like, alright, if, if, you know, if Linus Torvalds is saying that the AI coding is now better than he is, like, like, that's, that's never happened before. That's the

swyx: benchmark.

Marc: Yeah. That's never happened before.
And so now we know that it's, it's gonna sweep through coding and, and then, and then we, we know, you know, we know that if it's gonna work in coding, it's gonna work in everything else.

Right. It's just then, because that's, that's like, that's like, that's like the hardest, in many ways that's the hardest example. And how everything else is gonna be a, a derivative of that. And then on top of that, we just got the agent breakthrough, you know, with OpenClaw, which is fantastic. Which is amazing and incredibly powerful.

And then we just got the, the, um, the auto research, uh, you know, the, the self-improvement. You know, we're now into the self-improvement breakthrough. And so the, so the way I think about it is we've had four fundamental breakthroughs in functionality: LLMs, reasoning, uh, agents, um, and then, uh, and, and then now RSI, um, and, and they're all actually working.

Um, and so I'm, I'm just, as you like, you can tell I'm jumping outta my shoes. Like, like this is, like this is it, like this, this is the culmination of 80 years worth of, worth of work, and this is the time it's becoming real.

Alessio: Yeah.

Marc: I, I'm completely convinced.

Alessio: I think the anxiety that people feel is like, during the transistor era, you had Moore's Law, and it's like, all right, we understand why these things are getting better.

We understand the physics of it. Yeah. With AI, it's, it's so jagged in like the jumps where like, like you said, it's like in three months you have like this huge jump, like, and people are like, well, this can't keep happening. Right? But then it keeps happening,

Marc: it'll keep happening.

Alessio: And so like how do you think about also timelines of like what we're building?

I think we always have this question with guests, which is like, you know, should you spend time building harness for a model versus like the next model's just gonna do it one shot in the latent space. Right.
And how does that inform, like how you think about the shape of the technology? You know, you talk about how it's a new computing platform.

If you have a computing platform, then like every six months it like drastically changes in what it looks like. It's hard to build companies on top of it.

Marc: Yeah. So, so a couple things. So one is like, look, the, the Moore's Law was what we now call a scaling law. Like Moore's Law was a scaling law and for your younger viewers, Moore's Law was, chips either get twice as powerful or twice as cheap every, every 18 months.

And that, and that and that, you know, that it's gotten more complicated in the last few years. But like that, that was like the 50 year trajectory of, of, of the computer industry. And then, and then by the way, and that's what took the mainframe computer from a $25 million current dollar thing into, you know, the phone in your pocket being, you know, a million times more powerful than that.

Like that, you know, for, for 500 bucks. And so that, that was a scaling law. And then, and then, and then key to any scaling law, including Moore's Law and the AI scaling laws is, you know, they're not really laws, right? They're, they're, they're, they're predictions, but when they work, they become self-fulfilling predictions because they, they, they, they, they set a benchmark and, and then the entire industry, right?

All the smart people in the industry kind of work to make sure that, that, that actually happens. And so they, they kind of motivate the breakthroughs that are required to, to keep that going. And, and in and in chips, that was a 50 year, that was a 50 year run. Right. And it, it was amazing. And it's still happening in, in some areas of, of chips.

I think the same thing is happening with the, the core scaling laws. The core scaling laws in, in, in AI, you know, they're, they're not really laws, but like they, they are basically.
They're predictions and then they're motivating catalysts for the research work that is required to be. And, and, and, and by the way, also the investment, uh, dollars, um, uh, you know, required to basically keep, you know, keep the curves going and, and look, it, it is, it's gonna be complicated and it's gonna be variable and they're, you know, there're gonna be walls that are gonna look like they're fast approaching, and then they're gonna be, you know, engineers are gonna get to work and they're gonna figure out a way to punch through the walls.

And obviously that's, you know, that's been happening a lot, you know, and then look, there's gonna be times when it looks like the walls have, you know, the, the, the laws have petered out and then they're gonna, they're gonna pick up again and surge and then, and then, and then it, it appears what's happening in AI is there's now multiple, you know, multiple scaling laws.

Um, there's multiple areas of improvement. And, and I think, you know, I don't know how many more there are already yet to be discovered, but there are probably some more that we don't know about yet. You know, they, like, for example, there's probably some scaling law around, um, world models and robotics that we don't fully understand, you know, kind of acquisition of data at scale in the real world that we don't fully understand yet.

So that, that, that one will probably kick in at some point here. There's a bunch of really smart people working on that. Um, and so, yeah, I, I think the expectation is that, that, you know, the, the scaling laws generally are gonna continue. Yeah. The, the pace of improvement will continue to move really fast.

Um, to your question on like what to build. So, uh, I'm a complete believer the scaling laws are gonna continue. I'm a complete believer the capabilities are gonna keep getting amazing, um, you know, leaps and bounds.
Uh, the part where I kind of part ways a little bit with how, what I would describe as the AI purists, um, you know, which is, which I would characterize as like the people who are,

in many ways, the smartest people in the field, but also the people who spend their entire life, like at a lab, um, and have, have, I would say, have very little experience in the outside world. Um, the, the, the nuance I would offer is the outside world of 8 billion people and institutions and governments and companies and economic systems and social systems is really complicated.

Um, and, um, and doesn't, you know, it, it, 8 billion people making collective decisions on planet Earth is not a simple process of like, just like you see this happening now. It's like a bunch of AI CEOs have this thing, which is just like, well, there's just this, they just all have this kind of thing when they talk in public where they're just like, well, there's these, these obvious set of things that society should do.

Alessio: Mm-hmm.

Marc: And then they're like, society's not doing any of those things. Right. And it's like, how can society not, you know, what, whatever their theory is, how can society not see x, y, z? Mm-hmm. And the answer is, well, society is, number one, there's no single society, it's like 8 billion people. And they like all have a voice, and they all have a vote, like at the end of the day, of how they, they react to change.

And then, you know, it just like, it's just human reality is just really complicated and messy. Um, and, and, and so the specific answer to your question is like, as usual, it depends. Um, you know, it, it depends. Look, there's no question people are gonna, like, there's no question there are gonna be companies.

It's already happening. There are companies that think that they're building value on top of the models and then they're just gonna get blitzed by the, by the next model. There's no question that's happening.
But I think there's no question also that just the process of adaptation of any technology into the real and into the real messy world of humanity is, is just going to be messy and complicated.

It's, it's not going to be simple and straightforward. It's gonna be messy and complicated. And there are gonna be a lot of companies and a lot of products, um, uh, and in, in fact entire industries that are gonna get built to, to, to basically actually help all of this technology actually reach real people.

Alessio: The amount of capital going into these companies, I mean, Dario talked about it on the Dwarkesh podcast and Dwarkesh was like, why don't you just buy 10x more GPUs? And he is like, because I'm gonna go bankrupt if the model doesn't exactly hit the, the performance level. How do you think about that?

Also as a risk on, you know, you guys are investors in OpenAI and Thinking Machines and World Labs. It seems like we're leveraging the scaling laws at a pretty high rate, right? Like how comfortable, I guess, do you feel with the downside scenario, like, and say like things peter out, you think you can kind of like restructure, uh, these build outs and, uh, you know, capital investments.

Marc: Yeah. So should start by saying, so I lived through the dot-com crash, um, and I can tell you stories for hours about the dot-com crash and it was horrible. No, it was awful. It was, it was, it was apocalyptic. By the way, the, a lot of the dot-com crash was actually, at the time, it was actually a telecom crash. It was a bandwidth crash.

Like the, the thing that actually crashed, that wiped out all the money, was the tele, the telecom companies.

swyx: Global

Marc: Crossing. Global, Global, yeah.

swyx: I'm from Singapore and they, they laid so much cable o, over, over our oceans.

Marc: Actually there was a scaling law in the dot-com era.
And it was literally the, the US Commerce Department put out a report in 1996 and they said internet traffic was doubling every quarter.

Um, and, and actually in 1995 and 1996, internet traffic actually did double every quarter. And so that became the scaling law. And so what all these telecom entrepreneurs did was they went out and they raised money to build fiber, anticipating that the demand for bandwidth is gonna keep doubling every quarter.

Doubling every quarter though is like, you know, grains of rice on the chessboard, like at some point the numbers become extremely large. Right. And, and, and it really, and really what happened was the internet, the internet by the way, continuously kept growing basically since inception. And it's, you know, it's, it's continuously grown.

It's never shrunk. And it's grown really fast compared to anything else. Mm-hmm. You know, in, in, in human history. But it wasn't doubling every quarter as of 1998, 1999. And so there was this gap in the expectation of what they thought was a scaling law versus reality. And that's actually what caused the dot-com crash, which was the, it, they, they way over, companies like Global Crossing way overbuilt fiber, which is sort of the, and by the way, fiber, telecom equipment, you know, so all the, all the networking gear, you know, and then, and then by the way, the actual physical data centers, like that was the beginning of the, of the, of the data center build and then, and the data center overbuild.

And so you had that, but it was, it was literally, I think it was like $2 trillion got wiped out, right? It was like, Jesus, it was like a big, it was. And by the way, the other, the other subtlety in it was the internet companies themselves never really had any debt. 'Cause tech, tech companies generally don't run on debt, but the telecom companies run on debt.

Physical infrastructure companies run on debt.
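Editor's sketch (not part of the conversation): the doubling-every-quarter arithmetic Marc describes can be made concrete with a few lines of Python. The function name `growth_multiple` is made up for illustration, and the once-a-year doubling used for comparison is a hypothetical slower rate, not a figure quoted in the episode.

```python
def growth_multiple(doublings_per_year: float, years: float) -> float:
    """Total growth factor after compounding at a fixed doubling rate."""
    return 2.0 ** (doublings_per_year * years)


# Doubling every quarter (4 doublings per year) sustained for 5 years.
quarterly = growth_multiple(4, 5)

# Hypothetical comparison case: doubling only once a year for the same 5 years.
yearly = growth_multiple(1, 5)

# Capacity planned for the first curve exceeds demand on the second by this factor.
overbuild_factor = quarterly / yearly

# The chessboard picture: one grain on the first square, doubling on each of 64 squares.
chessboard_total = 2 ** 64 - 1

print(f"quarterly doubling, 5 years: {quarterly:,.0f}x")
print(f"yearly doubling, 5 years:    {yearly:,.0f}x")
print(f"gap between the two curves:  {overbuild_factor:,.0f}x")
print(f"grains on the chessboard:    {chessboard_total:,}")
```

Under these assumptions, anyone who builds for the first curve while demand follows the second ends up with capacity several orders of magnitude ahead of demand, which is the shape of the gap between expectation and reality described here.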
And so the companies like Global Crossing not only raised a lot of equity, they also raised a lot of debt. So they're highly levered. And so then you just do the thing: okay, you have a highly levered thing where you're overbuilding capacity.

Demand is growing, but not as fast as you hoped. And then boom, bankrupt. Right. And then it's like they say about the hotel industry: it's always the third owner of a hotel that makes money. It has to go bankrupt twice, right? You have to wash out all of the overoptimistic exuberance before it gets to a stable state.

And then it makes money. So by the way, all of those data centers and all the fiber, it's all in use today. But 25 years later. And actually the elapsed time was 15 years. It took 15 years, from 2000 to 2015, to actually fill up all that capacity.

The cautionary warning is that the overbuild can happen. And you get into this thing where basically everybody who has any sort of institutional capital is like, wow, I don't know how to invest in these crazy software things, but for sure I can build data centers, and for sure I can buy GPUs and deploy compute grids and all these things.

And so, if you're a pessimist, you could look at this and say, wow, this is really set up to replicate what we went through in 2000. Obviously that would be bad. The counterargument, which is the one I agree with, is a couple things.

One is that the companies that are investing the money are the bluest chip of companies. Back in the dot-com days, Global Crossing was like an entrepreneur.
It was like a new venture. But the money that's being deployed now at scale is Microsoft and Amazon and Google and Facebook and Nvidia, and now, by the way, OpenAI and Anthropic, which are now at really serious size as companies, with very serious revenue.

These are very large-scale companies with lots of cash and lots of debt capacity that they've never used. And so this is institutional in a way that it really wasn't at the time. And then the other is, at least for now, every dollar that's being put into anything that results in a running GPU is being turned into revenue right away.

And you guys know this: everybody's starved for compute capacity, and then all the associated things, memory and interconnect and everything else, data center space. And so every dollar right now that's being put into the ground is turning into revenue.

And in fact, I actually think there's an interesting thing happening, which is that because everybody's starved for capacity, the models that we actually have and can use today are inferior versions of what we would have if not for the supply constraints. If you pose a hypothetical universe in which GPUs were 10 times cheaper and 10 times more plentiful, the models would be much better.

'Cause you would just allocate a lot more money to training and you'd build better models. And so we're actually getting the sandbagged version of the technology.

swyx: Yeah. Everything we use is quantized, because the labs have to keep the full versions.

Marc: Right? We're not even getting the good stuff.

swyx: Yeah.

Marc: But getting the good stuff, it's just, even if technical progress stops.
Once there's a much bigger build of GPU manufacturing capacity and memory, all the things that have to happen in the course of the next five or 10 years, once it happens, even the current technology is gonna get much better.

And then as you know, there's just like a million ways to use this stuff. There's a million use cases for this. This isn't just sending packets across a thing and hoping that people find something to do with it.

This is just like, oh, we apply intelligence into every domain of human activity. And then it works incredibly well. Here's what I know. For somewhere between three and four years out, basically everything is selling out. The entire supply chain is sold out or selling out.

And so we're just gonna have chronic supply shortage for years to come. There's going to be a response from the market, and it's happening now: an enormous flood of investment in new fab capacity and everything else needed to do that. At some point the supply chain constraints will unlock, at least to some degree, and that will be another accelerant to industry growth when it happens.

'Cause the products will get better and everything will get cheaper. And so I know that's gonna happen. I know that the actual use cases are really compelling. And then, like I said, with reasoning and agents and so forth, I know they're just gonna get much, much better from here.

And so I know the capabilities are really real and serious. I also know that the technical progress is not going to stop. It is accelerating.
The breakthroughs are tremendous. I mean, even just month over month, the breakthroughs are really dramatic. And so I think if you were a cynic, and there are cynics, you can look at 2000, you can find echoes.

But I can't even imagine betting that this is gonna somehow disappoint. At least for years to come, I think it would be essentially suicidal to make that bet. It was that Michael Burry, uh, that's an interesting guy, huh? We'll pick on a guy. Let's pick on one guy. Well, 'cause he came out with, it was the...

swyx: He doesn't mind.

Marc: It was the Nvidia short, right? He came out with the Nvidia short. And you guys probably talked about this: the analysis now is that the current models are getting better at such a rate that if you're running an Nvidia inference chip today that's three years old, you're making more money on it today than you did three years ago, because the pace of improvement of the software is faster than the depreciation cycle of the chip.

And then my understanding is, and I don't know exactly, these are rumors that I've heard or maybe it's public, but I think Google's running very old TPUs very profitably for inference. And so it actually turns out, as far as I can tell, it's the opposite of the Burry thesis.

He was actually 180 degrees wrong. The old Nvidia chips are getting more valuable, which is something that's literally never happened before. It's never been the case that you have an older model chip that becomes more valuable, not less valuable. And again, that's an expression of the just ferocious pace of software progress.

Ferocious pace of capability payoff. Yeah.
That you're getting on the other side of this. And so the idea of betting against that...

swyx: Yeah. Well, one of my...

Marc: It seems like an invitation to get your face ripped off.

swyx: One of my early hits was modeling the lifespan of the H100s and H200s, and going like, you know, usually they advise four to seven years, and maybe you realistically haircut it down to two to three.

But actually it's going up and not down. And I think that's the dream. We are finding utilization, and I think utilization solves all problems. You can find use cases for even the poor... even memory, we're having a shortage, right? And even the shittier versions of memory that we do have, we are finding use cases for.

So that's great.

Marc: Yeah.

Alessio: How important is open source AI and edge inference in a world in which you have three years of supply crunch? If you fast forward five years, how do you think about inference in the data center versus at the edge?

Marc: Well, so just to start, I think open source is very important for a bunch of reasons. I think edge inference is very important for a bunch of reasons. Just practically speaking, if we're gonna have fundamental supply crunches for the next... I mean, you guys know, if you just project forward demand over the next three years relative to supply, one of the main predictions you can make is what's gonna happen to the cost of inference in the core over the next three years. And it may rise dramatically, right? And the big model companies are subsidizing heavily right now.

Right?
And so what will be the average person's per-day, per-month token cost three years from now to do all the things that they want to do? I don't know. I mean, you guys probably have friends, I have friends today who are paying a thousand dollars a day for Claude tokens to run OpenClaw.

Right? And so, okay, $30,000 a month. And by the way, those friends have like a thousand more ideas of things that they want their claw to do, right? And so you could imagine there's latent demand of up to, I don't know, five or $10,000 a day of tokens for a fully deployed personal agent.

And obviously consumers can't pay that, right? But it gives you a sense of the future scope of demand. And so even if there's a 10x improvement in price performance, that still goes to a hundred dollars a day, which is still way beyond what people can pay.

So there's just gonna be ferocious demand. By the way, the other interesting thing about the agent thing: up until now, a lot of the constraints have been GPU constraints. I think the agent thing now also translates into CPU constraints, right?

swyx: CPU, memory.

Marc: Yes. CPU, memory, right? And so the entire chip ecosystem is just gonna get...

swyx: Wait for network constraints. That will be the killer.

Marc: It's all bottlenecked, potentially for years. And so I think it's actually possible... I mean, generally inference costs are gonna keep coming down, but let's put it this way: the rate of decline, I think, may level out here for a bit because of these supply constraints.

And then at some point maybe the labs stop subsidizing so much, and that again will be an issue.
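The token-cost arithmetic here is easy to check. A minimal sketch, assuming a 30-day month (which is what the $30,000 figure implies); the function names are made up for illustration.

```python
def monthly_cost(daily_usd: float, days_per_month: int = 30) -> float:
    """Monthly spend for a given daily token bill (30-day month is an assumption)."""
    return daily_usd * days_per_month


def daily_cost_after_improvement(daily_usd: float, price_performance_gain: float) -> float:
    """Daily bill for the same workload after an N-fold price-performance gain."""
    return daily_usd / price_performance_gain


heavy_user_daily = 1_000                                # dollars per day today
heavy_user_monthly = monthly_cost(heavy_user_daily)     # the "$30,000 a month" figure

improved_daily = daily_cost_after_improvement(heavy_user_daily, 10)
improved_monthly = monthly_cost(improved_daily)

# The speculative latent-demand range of $5,000 to $10,000 a day, after the same 10x gain.
latent_low, latent_high = (
    daily_cost_after_improvement(daily, 10) for daily in (5_000, 10_000)
)

print(heavy_user_monthly, improved_daily, improved_monthly, latent_low, latent_high)
```

At a 10x gain, the current heavy-user workload still costs about $3,000 a month, and the latent-demand range lands at $500 to $1,000 a day.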
And so there's just gonna be so much more demand for inference than can be satisfied with the centralized model. And then, you guys know this, but just the dramatic innovations that have happened in Apple silicon to be able to do inference, it's quite amazing, the level of effort being put in.

The open source guys are putting incredible effort into it. You know this recurring pattern where the big model will never run on a PC, and then six months later, oh, it runs on a PC, right? It's amazing. And there's very smart people working on that. So there's all that. And then look, there are also other motivators.

Which is just like, okay, how much trust are the big centralized model providers building in the market, versus how much are people, at least in certain cases and for certain use cases, saying, well, I'm not willing to just turn everything over?

So there's all the trust issues. By the way, there's also just straight-up price optimization. There's many uses of AI where you don't need Einstein in the cloud. You just need a smart local model. There's also performance issues, where you're gonna want your doorknob to have an AI model in it.

Right, to be able to do access control. Obviously everything with a chip is gonna have an AI model in it. And a lot of those are gonna be local. And by the way, also wearable devices: you don't wanna do a complete round trip.

Whatever your smart devices are, you want them to be super low latency.
Yeah.

swyx: The question is, do we care who makes it? One of the biggest news items this week was the collapse of AI2, the Allen Institute, one of the actual American open source model labs. And I'm not that optimistic on American open source.

Like, you guys invested in Mistral, and Mistral's doing extremely well outside of China. That's about it.

Marc: Yeah. We'll see. Look, number one, I do think we care who makes it. I would say this: the previous presidential administration wanted to kill it in the US.

They wanted to drown it in the bathtub. So at least we have a government now that actually wants it to happen.

swyx: And you're on the council.

Marc: Yeah, the new PCAST. So this administration, for whatever other political issues people have, which are many, has I think a very enlightened view on AI, and in particular on open source AI.

And so they're very supportive. My read is that the various Chinese companies have a very specific reason to do open source, which is that they don't think they can sell commercial AI outside of China right now, or at least specifically not in the US, for a combination of reasons.

And so I think they kind of view open source AI as a bit of a loss leader against basically domestic paid services and ancillary products. They're very excited about it, by the way. I think it's great that they're doing it.

I think DeepSeek was like a gift to the world. The great thing about open source is that its impact is felt two ways.
One is you get the software for free, but the other is you get to learn how it works, right? The paper and the code, right?

And so, for example, I thought this was amazing. OpenAI comes out with o1 and it's an amazing technical breakthrough, and it's just absolutely fantastic. But of course they don't explain how it works in detail. And then of course they hide the reasoning traces, right?

And then everybody's like, okay, this is great, but who's gonna be able to replicate this? Are other people gonna be able to do this? Is there secret sauce in there? And then R1 comes out and it's just like, there's the code and there's the paper, and now the whole world knows how to do it.

And then, three months later, every other AI model is adding reasoning. And so you get this kind of double effect: even if the Chinese models themselves are not the models that get used, the education that's taken place for the rest of the world, the information diffusion, is incredibly powerful.

So that happens, and then, I don't know, we'll see. There are a bunch of American open source AI model companies. I mean, look, there's tremendous competition among the primary model companies.

Depending on how you count, there's like four or five big-co model companies now that are kind of neck and neck in different ways.
And then obviously both X and Meta have huge attempts to leapfrog underway.

And then you've got a whole fleet of startups, new companies, including a whole bunch that we're backing, that are trying to come out with different approaches. And then you've got, whatever it is, I don't know, how many mainline foundation model companies are there in China at this point?

It's probably six.

swyx: Five Tigers is what they call it. Qwen is questionable because there's a change in leadership.

Marc: Right.

swyx: Yeah.

Marc: But does that include Moonshot?

swyx: Yes. And DeepSeek, uh, Z.ai, um, Qwen. 01 is in there.

Marc: Right. And then ByteDance, and then Seedance...

swyx: Seedance would be like the next tier. They weren't as prominent. They didn't have a leading...

Marc: Yeah. But at least, you know, Seedance is very inspiring, and presumably they have more stuff coming, and Tencent probably has more stuff coming, and so forth. And so, look, here would be a thing you can anticipate: between the US and China right now, there's like a dozen primary foundation model companies that are at scale, at some level of critical mass.

It's not gonna be a dozen in three years, right? Because these industries don't bear a dozen. There's gonna be three or four big winners, or maybe one or two big winners. And so there's gonna be a whole bunch of those guys that are gonna have to figure out alternate strategies.

And I think open source is one of those strategies. And so I think the question is, like, who's gonna do open source?
I think that could change really fast. I think that's a very dynamic thing. I think it's very hard to predict what happens. And I think it's very important.

swyx: Nvidia's doing a lot.

Marc: Well, I was gonna say, exactly. And then you've got Nvidia. And, you know, there's an old thing in business strategy, which is called commoditize your complements. Commoditize the complement. That's right. And so if you're Jensen, it's just kind of obvious: of course you wanna commoditize the software.

And to his enormous credit, he's putting enormous resources behind that. And so maybe it's literally Nvidia, and I think that would be great.

Alessio: Yeah. Narrative violation: two European projects in the, uh... damn.

swyx: I'm hosting my Europe conference soon, and I got both of them.

Alessio: They got us. They got us, Marc.

Marc: They got us. Well, wait a minute. Where was Peter? Where was Steinberger when he did it? In Austria?

Alessio: He was, yeah.

Marc: He was in Vienna? Oh, he was in Vienna. And then where is he now?

swyx: He's moving to SF.

Marc: Okay. Alright. Okay, there we go. And then, yeah, the Pi guy, right? The Pi guys are European.

swyx: Yeah, they're also... they're buddies in...

Alessio: Austria. Mario's also there.

Marc: Right. And they haven't announced any sort of change yet, or have they?

Alessio: No, they have a company there.

Marc: Okay. Got it. Okay. Good.

Alessio: Good, good, good.

Marc: Yeah, good.

swyx: Anyways, I think Pi and OpenClaw are very important software things, and I just wanted you to go off on what you think.

Marc: Yeah. So I think the combination of the two of them is one of the 10 most important software...

swyx: OpenClaw got all the attention, but, right, talk about Pi.

Marc: Pi's kind of the, yeah.
Pi's kind of the architectural breakthrough. For those of us who are older, there was this whole thing that was very important in the world of software, basically from 1973 to, I don't know, it still is very important, but from 1973 to basically the creation of Linux, which is this thing we used to call the Unix mindset.

'Cause there were all these different theories. There were all these different operating systems, mainframes, and then Windows and Mac and all these things. But kind of behind it all was this idea of the Unix mindset. In the old days, the operating system that made the computer industry really work in the 1960s was this thing called OS/360, which was this big operating system that IBM developed that was supposed to basically run everything. And it was this giant monolithic architecture in the sky. It was like a giant castle of software. And by the way, it worked really well and they were very successful with it.

But it was this huge castle in the sky, and it was almost unapproachable: you had to be kind of inside IBM or very close to IBM, and you had to really understand every aspect of how the system worked. And then the Unix guys, originally out of AT&T and then out of Berkeley, came out and said, no, let's have a completely different architecture.

And the way the architecture's gonna work is we're gonna have a prompt and a shell. And all the functionality is gonna be in the form of these discrete modules, and then you're gonna be able to chain the modules together.
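The idea of small discrete modules chained together can be sketched in a few lines. The example below is an analogy written in Python rather than actual Unix tooling: `grep`, `cut`, `count_sorted`, and `pipeline` are hypothetical helpers named after the shell commands they loosely imitate, and the log lines are invented sample data.

```python
from collections import Counter
from functools import reduce
from typing import Callable, Iterable

# Each stage takes a stream of lines and returns a stream of lines, like a Unix filter.
Stage = Callable[[Iterable[str]], Iterable[str]]


def grep(pattern: str) -> Stage:
    """Keep only lines containing the pattern (a tiny stand-in for grep)."""
    return lambda lines: (line for line in lines if pattern in line)


def cut(field: int, sep: str = " ") -> Stage:
    """Keep one separator-delimited field per line (a tiny stand-in for cut)."""
    return lambda lines: (line.split(sep)[field] for line in lines)


def count_sorted(lines: Iterable[str]) -> Iterable[str]:
    """Count duplicates, most frequent first (roughly sort | uniq -c | sort -rn)."""
    counts = Counter(lines)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return (f"{n} {item}" for item, n in ordered)


def pipeline(*stages: Stage) -> Stage:
    """Chain stages left to right, the way the shell's | operator does."""
    return lambda lines: reduce(lambda data, stage: stage(data), stages, lines)


log = ["GET /home", "POST /login", "GET /home", "GET /about"]
top_paths = list(pipeline(grep("GET"), cut(1), count_sorted)(log))
print(top_paths)
```

None of the stages knows about the others; the behavior comes entirely from how they are composed, which is the property the Unix design relied on.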
And so it's almost like the operating system itself is gonna be a programming language.

And then that led to the centrality of the shell. And then that led to chaining together Unix tools. And then that led to the emergence of these scripting languages like Perl, where you could very easily do this, and then the shells got more sophisticated. And look, number one, that worked, and that was the world I grew up in.

I was a Unix guy, sort of from, call it, 1988, kind of all the way through my work, and it worked really well. It's in the background. Normal people didn't necessarily need to know about it, but if you were doing system architecture or application development, you knew all about it.

And it's been in the background ever since. Look, your Mac still has a Unix shell in there, and your iPhone still has a Unix shell buried in there somewhere. And the Windows shell is sort of a weird derivative of that.

But look, the internet runs on Unix, and smartphones, both iOS and Android, are Unix derivatives. And so Unix did end up winning. But anyway, we just started taking that for granted. And so the way I think about what happened with Pi and then with OpenClaw is basically what those guys figured out is... I always say the great breakthroughs are obvious in retrospect, right?

Which is the best kind. They weren't obvious at the time, or somebody else would've done them already.
And so there is a real conceptual leap, but then you look at it backwards and you're just like, oh, of course. To me those are always the best breakthroughs.

Well, actually, language models themselves are like that. It's just like, oh, next token completion. Oh, of course.

swyx: Yeah. What other objective mattered?

Marc: Yeah, exactly. But even so, it wasn't obvious until somebody actually did it, right? And so the conceptual breakthrough is real and deep and powerful and very important.

And so the way I think about Pi and OpenClaw is it's basically marrying the language model mindset to the Unix shell prompt mindset. So what is an agent, right? As you know, many smart people have been trying to figure out what an agent is for decades, and they've had many architectures to build agents and the whole thing.

And it turns out what we now know is an agent is the following. It's a language model. And then above that, it's a bash shell, so it's a Unix shell, and the agent has access to the shell, hopefully in a sandbox, maybe in a sandbox.

So it's the model. It's the shell. And then it's a file system, and the state is stored in files. And then there's the markdown format for the files themselves. And then there's basically what in Unix is called a cron job. There's a loop and there's a heartbeat, and the thing basically wakes up.

So it's basically LLM plus shell, plus file system, plus markdown, plus cron. And it turns out that's an agent.
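The "LLM plus shell, plus file system, plus markdown, plus cron" recipe can be sketched as a toy. This is not how Pi or OpenClaw is implemented: `stub_model` is a stand-in for a real model call, `MEMORY.md` is a hypothetical file name, and two manual calls stand in for the scheduler.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): always proposes one harmless command."""
    return "echo hello from the agent"


def heartbeat(workdir: Path, model: Callable[[str], str] = stub_model) -> str:
    """One wake-up: read the notes, ask the model for a command, run it, log the result."""
    memory = workdir / "MEMORY.md"  # hypothetical file name; state lives in markdown
    notes = memory.read_text() if memory.exists() else ""
    command = model(f"Notes so far:\n{notes}\nWhat shell command should run next?")
    # A real agent should execute inside a sandbox. This toy only uses a temporary
    # directory as the working directory, which is NOT a security boundary.
    result = subprocess.run(
        command, shell=True, cwd=workdir, capture_output=True, text=True, timeout=30
    )
    output = result.stdout.strip()
    with memory.open("a") as f:
        f.write(f"- ran `{command}`, got: {output}\n")
    return output


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    # cron (or any scheduler) would trigger each heartbeat; two manual calls stand in.
    first_output = heartbeat(home)
    heartbeat(home)
    memory_lines = (home / "MEMORY.md").read_text().splitlines()

print(first_output)
print(memory_lines)
```

Everything except `stub_model` is ordinary, well-understood plumbing, which is the point being made: the only new ingredient is the model.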
And every part of that, other than the model, is something that we already completely know and understand. And in fact, it turns out that the latent power of the Unix shell is extraordinary.

There's enormous numbers of Unix commands, there's an enormous number of command line interfaces into all kinds of things already. I mean, just to start with, your computer runs on a shell. If you're running a Mac or a phone, your computer's running on a shell already.

And so the full power of your computer is available at the command line level. And then it turns out it's really easy to expose other functions as a command line interface. And so this whole idea where we need MCP and these fancy protocols, whatever, it's like, no, we don't. We just need a command line thing.

So that's the architecture. And then it turns out, what is your agent? Your agent is a bunch of files stored in a file system. And then there's the thing that just completely blew my mind when I wrapped my head around it as a result of this, which is: okay, this means your agent is now actually independent of the model that it's running on.

Because you can swap out a different LLM underneath your agent, and your agent will change personality somewhat, 'cause the model is different, but all of the state stored in the files will be retained.

swyx: Yeah. Different instruction set, but you just recompile it.

Marc: Right, exactly. It's like swapping out a chip and recompiling, but it's still your agent with all of its memories and all of its capabilities.
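The claim that the agent is independent of its model follows from where the state lives. A minimal sketch under stated assumptions: `FileBackedAgent`, `terse_model`, `chatty_model`, and `memory.md` are all hypothetical names, and the two "models" are plain functions standing in for different LLMs.

```python
import tempfile
from pathlib import Path
from typing import Callable, List

Model = Callable[[str], str]


def terse_model(prompt: str) -> str:
    """Stand-in for one LLM (assumption)."""
    return "ok"


def chatty_model(prompt: str) -> str:
    """Stand-in for a different LLM with a different personality (assumption)."""
    return "Sure thing, happy to help!"


class FileBackedAgent:
    """Toy agent whose only durable state is a markdown file in its home directory."""

    def __init__(self, home: Path, model: Model) -> None:
        self.memory = home / "memory.md"  # hypothetical file name
        self.model = model

    def remember(self, fact: str) -> None:
        with self.memory.open("a") as f:
            f.write(f"- {fact}\n")

    def recall(self) -> List[str]:
        if not self.memory.exists():
            return []
        lines = self.memory.read_text().splitlines()
        return [line[2:] for line in lines if line.startswith("- ")]

    def reply(self, message: str) -> str:
        context = "\n".join(self.recall())
        return self.model(f"{context}\n{message}")


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    agent = FileBackedAgent(home, terse_model)
    agent.remember("user prefers tea")
    reply_before = agent.reply("hello")

    # Swap the model: a new agent object over the same files.
    swapped = FileBackedAgent(home, chatty_model)
    recalled = swapped.recall()
    reply_after = swapped.reply("hello")

print(recalled, reply_before, reply_after)
```

The replies change with the model, while everything the agent remembered survives the swap because it was never stored in the model.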
And then by the way, you can also swap out the shell, so you can move it to a different execution environment that is also a bash shell. By the way, you can also switch out the file system, right?

And you can swap out the heartbeat, the cron framework, the loop, the agent framework itself. And so your agent, basically, at the end of the day, is just its files. And then of course...

swyx: OpenClaw.

Marc: Yeah, it's basically just the files.

And then by the way, as a consequence of that, it turns out a couple important things. One is it can migrate itself, right? You can instruct your agent: migrate yourself to a different runtime environment, migrate yourself to a different file system, swap out the language model.

Your agent will do all that stuff for you. And then there's the final thing, which is just amazing, which is that the agent actually has full introspection. It knows about its own files and it can rewrite its own files, right? By the way, there's basically no widely deployed software system in history where the thing that you're using has full introspective knowledge of how it itself works and is able to modify itself.

I mean, there have been toy systems that have had that, but there's never been a widely deployed system that has that capability. And then that leads you to the capability that just completely blew my mind when I wrapped my head around it, which is you can tell the agent to add new functions and features to itself, and it can do that.

Extend yourself, right? Give yourself a new capability. Right?
And so literally it's just like, you run into somebody at a party and they're like, oh, I have my OpenClaw do whatever, connect to my Eight Sleep bed, and it gives me better advice on sleep.

And you go home at night, or if they're at the party, by the way, you tell your claw, oh, add this capability to yourself. And your claw will say, oh, okay, no problem. And it'll go out on the internet and figure out whatever it needs, and then it'll go out to Claude Code or whatever.

It'll write whatever it needs. And then the next thing you know, it has this new capability. And so you can have it upgrade itself without having to do anything other than tell it that you want it to do that. And so anyway, the combination of all this is just, I mean, it's just incredible.

If I were 18, this is a hundred percent what I would be spending all of my time on. This is such an incredible conceptual breakthrough. And again, people are gonna look at it, and they already get this response, and they're gonna say, oh, well, where's the breakthrough?

'Cause all of these components were already known before. But this is the key: by using all these components that were known before, you get all of the underlying capability that's buried in there. And so, for example, computer use all of a sudden just kind of becomes trivial.

Of course it's gonna be able to use your computer. It has full access to the shell, right? And then you give it access to a browser, and then you've got the computer and the browser, and off and away it goes. And then you've got all the abilities of the browser also.

And so the capability unlock here is profound.
My friends who are, you know, deepest into this, are having their claw do like a, like, literally like a thousand things in their lives. They have new ideas every day. They're just like constantly throwing new challenges at the thing. And by the way, it's early and, you know, these are, you know, these are prototypes and there are, you know, as you guys know, there's security issues.Yeah. And, and so, you know, there's a bunch of stuff to be ironed out, but the, the unlock of capability is just incredible.swyx: Yeah.Marc: And I, I have absolutely no doubt that everybody in the world is gonna, is gonna have at least, you know, an agent like this, if not an entire family of agents. And w

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

For a limited time, Latent Spacenauts can skip the waitlist to join Dreamer and also compete for a $10,000 cash prize for most useful tools for Dreamer! Thanks @dps!In 2024, David Singleton left Stripe and joined forces with Hugo Barra for a buzzy stealth startup named /dev/agents. This month they emerged as Dreamer, a consumer-first platform to discover, build, and use AI agents and agentic apps, centered on a personal “Sidekick” that helps users customize experiences via natural language. Sidekick is nothing less than an “agent that builds agents”, with all the complexity that that entails:You've seen many, many website builder, app builder, and even agent builder startups by now, but our favorite detail is the sheer amount of work that has gone into the “full stack” nature of the platform, including shipping their own SDK, logging, database, prompt management, serverless functions, and so on. Most platforms restrict the tech stack you can use just to get off the ground — Dreamer does it “right” by letting you push whatever arbitrary code you want to their VMs.Paying the BuildersOf course former leaders of Stripe and Android would not stop at just building the tools; they are also building the ecosystem. Dreamer is deeply aware of the 4-sided network effect it has going on and is ready to fund all of it - from hiring Builders in Residence to awarding $10,000 cash prizes to the best tool builders for the Dreamer ecosystem.It's time to Dream!Full Video Episode on YouTube.Transcript[00:00:00] Meet Dreamer Purple[00:00:00] swyx: Okay, we're here in the studio with David Singleton. Welcome.[00:00:08] David Singleton: Hey, swyx. It's great to be here.[00:00:09] swyx: It's great to have you. Uh, we're very simpatico in that your company color is the same as Latent Space's color.[00:00:15] David Singleton: That's right. Dreamer Purple.[00:00:17] swyx: It used to be /dev/agents, which I thought was very cool.
It's like a callback to /dev/payments.[00:00:22] David Singleton: Yeah.[00:00:22] swyx: And you were obviously CTO of Stripe. Talk to me about just the origin or thinking process behind Dreamer. Yeah. And maybe start with, like, what is Dreamer?[00:00:31] David Singleton: Yeah.[00:00:31] What Is Dreamer[00:00:31] David Singleton: So Dreamer is a new product, uh, which everyone can come and play with today. Um, it's a place where everyone, literally everyone, can discover, build, and enjoy and use AI agents and agentic apps.[00:00:45] And we really did design it for consumers, for folks who don't necessarily have any kind of technical background. It's really aimed at everyone. I think often of my sister. She's very smart. She's not in the slightest bit technical. She has lots of problems in her life that [00:01:00] she would like to be able to have great software and intelligent software to solve.[00:01:04] But you know, even with the rise of tools like Claude Code and so forth, she's got no way to get started. And Dreamer is a place where she can come in, grab some intelligent apps that other people in the community have built, start using them right away, and solve real problems in her life.[00:01:19] Sidekick And Waitlist[00:01:19] David Singleton: And at the core, we have a personal agent called the Sidekick.[00:01:24] Um, you can give your sidekick a name, you can give it its own personality, and it really helps you across your entire day, your life. It helps you use all of the agents on the platform, and it also helps you build anything you want. And we've been working on this for a little while. We recently launched in beta.[00:01:41] So anyone can go to dreamer.com, join the wait list. Um, and we have many, many, many people in the community now who are building really fun, really powerful, really useful agents and agentic apps for themselves.[00:01:54] swyx: I think we're gonna go right into a demo. Yeah.
I just wanna make an observation that, uh, you [00:02:00] put discover first before build.[00:02:02] Mm-hmm. But actually, at least for the engineers in the audience, ‘cause we are primarily engineers and you're primarily targeting consumers, right?[00:02:08] David Singleton: Yeah.[00:02:08] swyx: For engineers, like, there's a huge full stack of stuff, which we're gonna dive into. Let's write. It's so impressive. I'm like, holy s**t, this is what I've always wanted.[00:02:16] Cool. Uh, so I think that's really good, and in some ways, I think given your background, given, uh, Hugo's, is it Hugo? Hugo.[00:02:24] David Singleton: Hugo. Hugo Barra. Yeah.[00:02:25] swyx: Hugo, it's not surprising that you can basically kind of build an app store, yeah, for agents.[00:02:30] David Singleton: Yeah. So Hugo was my co-founder. Yeah. Um, Hugo and I met with our other co-founder Nicholas Jitkoff in the very early days of Android at Google, where we were building Google's first mobile apps.[00:02:41] Uh, we then contributed to very core pieces of Android itself. And you're right, we were really excited about building two things. One, solving a bunch of problems that this breakthrough technology, here I'm talking about mobile, needed to have solved in order to make it work for real people at scale. And then secondly, building this ecosystem, um, [00:03:00] of third party developers using the Play Store, um, able to deliver way more value on the platform than we could have delivered on our own.[00:03:08] And we think about Dreamer in exactly the same way. So I was working at Stripe, as you mentioned, and we had the opportunity to put some of the very first AI agent systems in the world into production.
And from the moment we did the first of those, I was just struck with a strong sense of conviction that this is breakthrough technology that's gonna change how all of us work with computers and phones and so forth, all of the technology in our lives, but[00:03:34] there's a lot of problems to be solved for real people, to make this approachable. Um, and it really is kind of a direct analog for what we were solving back in the early days of mobile apps at Google and Android. So it's been fun to bring that to life.[00:03:47] swyx: Yeah. Uh, let's look at it.[00:03:48] David Singleton: Yeah, let's take a look.[00:03:49] Dashboard And Daily Briefing[00:03:49] David Singleton: So, uh, dreamer.com, this is our homepage. This is where you can come and, uh, watch some videos about what is here and sign up for the wait list. Once[00:03:57] swyx: you, I just wanna say for those listening, ‘cause we have a lot, you [00:04:00] know, switch to YouTube, look at the animations. So much care.[00:04:03] David Singleton: We really care about, uh, this product being fun[00:04:07] and interesting to use. Obviously a lot of people are using it to do real important stuff. You can do real work here, uh, but also you can build fun things too. Once you get off of our wait list, you'll come into the product. The first thing that happens is you'll have a conversation with your Sidekick, which is this little friendly, uh, character here.[00:04:27] And Sidekick will seek to get to know you and understand you. What do you care about? And will help you discover and build your first AI agents or agentic apps. After that, you're gonna have a dashboard. This is my dashboard. Everyone's is different. Um, you can see I have a few things here. I have a feed.[00:04:42] So a lot of our agents do things in the background when you're not looking and the feed is how they let you know what they've been up to.
I have, uh, some widgets, uh, from apps that I have built. Uh, this one is called Calendar Hero. Uh, this is something that I installed from the gallery. Uh, so built by someone in our community.[00:04:59] It's a [00:05:00] really powerful calendar app because for each of my meetings, if it's with someone I don't already know, well it'll actually go off and research it, um, and give me both a history of my interactions with those people and also a bunch of, you know, public useful information to, to get started. One of the things I love about this particular app is that every day it generates a podcast, um, a daily briefing.[00:05:24] And one of the things that we've done with the platform is we've made it possible for all the things that agents do to show up in places that you care about. So if you look over here, this is the screen in my phone, and if I go ahead and open my Apple Podcasts, you can see right here. Your Daily briefing podcast is ready.[00:05:39] This was produced by an agent running in my Dreamer account, and it was very easy by scanning a QR code to connect it to my Apple podcast. That's what I listened to in the car now every morning. Yeah. On my way to work.[00:05:50] swyx: It, it[00:05:50] David Singleton: preps me for, for my day.[00:05:52] swyx: So one additional bit of context. I asked you immediately after seeing this was like, what, what about, I wanna talk back to my agent and you said you actually started with voice and then you went to [00:06:00] podcasts.[00:06:00] ‘cause it's nice to have it pre downloaded[00:06:02] David Singleton: that, right? That's right. Um, yeah, we, you, you can talk to your sidekick. So, you know, on mobile we have, uh, a dreamer app and you can talk to the sidekick right here. 
Um, but we've actually found that making things, uh, show up in the other apps that you already use in your life is incredibly powerful.[00:06:19] So let's take a look at what's kind of under the hood here.[00:06:21] Gallery Tools And Payouts[00:06:21] David Singleton: So I already mentioned that we have a gallery, so this is where you'll find a lot of agents from our community. Uh, there's. Many at this point, hundreds. And they are solving all kinds of, uh, use cases. I'd say the the top use cases are on personal productivity, but also a lot of information management that can range from personal information like docs and so forth, managing your emails.[00:06:42] It also ranges out to public information that you might be interested in, but you need something to help manage the, the kind of fire hose of stuff that's coming at you. For instance, I have, um, an agent which looks at all the AI news, um, all the time. There's a lot of it and it finds the stuff that I would actually be [00:07:00] interested in, um, and I find it incredibly useful.[00:07:03] So these are agents that you can install that other people have built. Anything that you install on Dreamer, you can actually just say, I wanna start making some changes, and we'll look at that in a second. But in natural language, with the sidekicks help, you can change any of these experiences to work just the way you want them.[00:07:18] But the base layer of the system are tools. So you know, as well as anyone swyx, that any AI system is only as good as the quality of data that it can pull in and the quality of action it can take. So before we launched our beta, we worked very hard to make sure that we seeded our tools with a bunch of very high quality and powerful integrations.[00:07:39] So, you know, for instance, this is real Google search, this is actual Gmail. Um, and you can do very useful things with those. But also this is a platform for everyone. 
And as we got started talking to people in our alpha community, a whole bunch of sports use cases popped out and we realized if you want to build something cool for sports with AI, you need really high quality live data.[00:07:58] So look at these: [00:08:00] Formula One, MLB, NFL. Uh, these are tools, uh, that we've built. These are not data scraped off the web. This is a direct data feed integration. And because it's live and ‘cause it's high quality, you can build really powerful stuff. But tools is not something that we are just going to kind of control ourselves.[00:08:19] The platform is open for tool builders to contribute tools that anyone on Dreamer can use. So, um, this is actually the place in the platform where I think software engineers, um, well number one, we would love for you to come and play with it. Uh, but software engineers are really gonna build, um, a lot of powerful stuff into the system.[00:08:38] And we are actually sharing something for the first time on this podcast, which is that tool builders on Dreamer get paid. So if you publish a tool to the platform and a lot of agents use it, you'll actually get paid, uh, in proportion to their usage. And we'd love for folks to come and give this a try.[00:08:54] We've got good docs that help you get started and you can build things that, you know, scratch your own itch. For instance, someone built this [00:09:00] Ski Bum tool, which provides live snow conditions for a bunch of, uh, ski resorts. I'd love to show you how I've used that in a second. And also we have some tools partners where the tools themselves are pay-per-use.[00:09:12] So for instance, Parallel Web Systems is a premium tool. Uh, you can do really cool stuff with it. Um, it's an agentic web research tool. And that one, because it's expensive to operate, is paid on a per-usage basis.
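The "paid in proportion to usage" idea can be sketched as a simple proportional split. This is only an illustration under stated assumptions: the transcript does not describe Dreamer's actual payout formula, so the fixed payout pool, the `split_payouts` function, the rounding rule, and the call counts are all hypothetical.

```python
def split_payouts(pool_cents: int, usage: dict[str, int]) -> dict[str, int]:
    """Split a payout pool across tools in proportion to their call counts.

    Hypothetical model: a fixed pool (in cents) is divided by usage share.
    """
    total = sum(usage.values())
    if total == 0:
        return {name: 0 for name in usage}
    # Floor each tool's share so the pool is never overspent.
    shares = {name: pool_cents * calls // total for name, calls in usage.items()}
    # Hand any leftover cents, one each, to the most-used tools.
    leftover = pool_cents - sum(shares.values())
    for name in sorted(usage, key=usage.get, reverse=True)[:leftover]:
        shares[name] += 1
    return shares
```

For example, `split_payouts(1000, {"ski_bum": 3, "formula_one": 1})` gives the first tool three quarters of the pool. Working in integer cents avoids floating-point drift, and the leftover step keeps the shares summing exactly to the pool.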
But if you're coming in to build agents on the platform, even the premium tools, you get a free trial.[00:09:29] So you get a chance to actually try them out, make sure that the use case is good for you before you decide to, to to sign up. So that's tools. So we have the gallery, we have tools, and then the sidekick helps us put all of this together to build agents. We do that in the agents studio. You can also do this on your phone, but if I open up Agent Studio here on Desktop psychic's, just gonna start a conversation about what you want to build together.[00:09:51] I'd love to show you one that I made recently.[00:09:53] swyx: Let's do[00:09:53] David Singleton: it.[00:09:53] Building A Conference App[00:09:53] David Singleton: Um, let's look at something that hopefully is kind of near and dear to your heart. So one of the things I love about Dreamer and this kind of moment in technology is that if you think about it. There are all these things in your life where, have you ever gone to a conference?[00:10:09] I know you have. Right? And, uh, big conferences have apps. Um, and these apps are usually built by agencies and they're, they're usually actually quite expensive to build. I've been involved in running some of these myself. And how many conferences have you been to where the app was good? Zero. Honestly.[00:10:23] swyx: Exactly. Zero,[00:10:24] David Singleton: maybe one. I, I've, I've been to one conference. That was pretty good. Wait, wait session sessions. Um, but, but the point is, they're rarely great pieces of software. Right. And they're also expensive to build, but they're, they're interesting ‘cause they're episodic, they last for this one thing. Um, and then they're, they're not relevant anymore.[00:10:43] Um,[00:10:43] swyx: and so it's the worst feeling to invest in them because, you know, it's like, it's got a limited. Date?[00:10:48] David Singleton: Absolutely. So I decided to build, uh, a conference app for your AI engineer conference. 
Amazing. Uh, on Dreamer. One of the things that swyx has done, uh, which I [00:11:00] thought was very forward-looking, is actually put a whole bunch of data about the conference on the webpage in an LLM-readable way.[00:11:06] There's an llms.txt file, there's a feed of all of the sessions in JSON. So I used the data from your conference last year and built this intelligent app, uh, just by talking to our sidekick, uh, in Dreamer. So just to give you a quick tour, this is my Dream Conference app. What I always wanna do for conferences is I wanna be able to search for speakers.[00:11:28] I'm usually there because, uh, there is a speaker I care about. So, you know, swyx, you're the speaker I care about. I can actually see here who you're on stage with. So here's Greg Brockman of OpenAI, uh, and this is his session. And look, Greg and swyx are the speakers. So let's add that to my schedule.[00:11:45] Great. And then maybe there's a couple others I might see here. Like on day two, I remember there were some keynotes. So, uh, Building the Open Agentic Web, that sounds fun. So I add that to my schedule.[00:11:55] swyx: She's now CEO of Xbox.[00:11:56] David Singleton: Awesome.[00:11:57] swyx: Which is interesting. So cool. So,[00:11:59] David Singleton: so I've [00:12:00] gone through and picked out a couple of sessions that I cared about.[00:12:03] That's as far as I usually get with any conference app. But of course you've got the whole of the rest of the conference to figure out what to do. So here is where the native intelligence of these things you build on Dreamer can come in. So I'm gonna click guide me. So Dreamer's Sidekick actually parsed out the whole schedule and figured out what some of the themes are and I can choose what I'm interested in here.[00:12:23] I'm definitely interested in agents. Uh, I'm definitely interested in code generation and also reasoning and RL. So now I'm gonna say build my schedule. So what this is doing is.
It's going across every time slot for the conference. And it's choosing among the things I could go to, which one it thinks is best for me based on my interests.[00:12:41] It also uses its own memory of me that's part of Dreamer, uh, to understand what I might like best. And you know, there's an LLM prompt running for each one of these time slots. So this is, it's not super fast, but it'll be done in about 30 or 40 seconds. And I'm gonna have a special custom schedule for the conference.[00:12:57] This, like I said, is my [00:13:00] dream conference app is exactly what I've always wanted and I was able to build this yesterday morning. Um, I did it between some meetings. I think I spent a total of 25 minutes of wall clock time on it. I did it over the course of a couple of hours. And, uh, here is my schedule for the conference.[00:13:15] I can see it in a calendar view. This is what I should do on Tuesday, this is what I should do on Wednesday. Oof, no conflicts, but, you know, I may not go to every single thing. And there you have it built in, you know, dreamer. So let's take a look at what the building experience actually looks like. So this is the, the actual account that I made it on.[00:13:32] Oh, of course I should say anything you build on Dreamer also works on your phone. So, uh, here is my AI engineer conference app right here on my phone. Got all the same functionality, and of course this is the best place to jump into my schedule.[00:13:46] swyx: Yeah.[00:13:46] David Singleton: Um,[00:13:46] swyx: so you could generate a podcast about it just completely multimodal, absolute thing, right?[00:13:51] To me, I mean, this is why I outsource, I mean, well, I, I posted the L-M-T-X-T, the JSON because you cannot run an engineer conference in 2025 [00:14:00] and not let engineers. 
Do whatever they want.[00:14:02] David Singleton: Yeah.[00:14:03] swyx: And since all conference apps suck, I'm just gonna put up a bare minimum viable app and just let people do whatever they want.[00:14:09] David Singleton: Totally. And the cool thing about this on Dreamer is I published this to the gallery and you can use it, so you've got one that's built to my taste of conference apps. I think it's pretty cool. But you might want something different. Yeah. In which case you just start telling the sidekick how to change it.[00:14:23] So let's just very quickly look[00:14:24] swyx: at our, what sports grid is also, you can fork it, right? That I can publish. That's right. I can publish your one and go, this is the base starter. It's got good defaults, but go customize, whatever.[00:14:32] David Singleton: That's right. That's right.[00:14:33] swyx: Yeah.[00:14:33] Agent Studio Under The Hood[00:14:33] David Singleton: So let's take a look at how I actually built this.[00:14:34] This is real. So I'm gonna say make changes. This experience we're looking at now is our, uh, agent development studio. Um, like I said, you can do this on your phone as well. And in fact, this one I started out on desktop. Let's look at my actual prompts. I said, let's make an agent called AI Engineer Schedule Planner. It should be a custom schedule planner for the AI Engineer conference.[00:14:53] I'm not gonna read this all out. You get the point, and I told it where to get the data from. So that was the first prompt. And actually after I gave it that [00:15:00] prompt, I had a simple version of this app working, um, after the sidekick took one turn. So the Sidekick is like a professional software engineer, and we've worked very hard to make this work and build functional apps for folks that might not have any engineering experience whatsoever.[00:15:14] So, you know, down here we have build logs that are technical, but you can hide those away.
And sidekick, as it is building, will actually translate everything that is coming out of, uh, of the, the harness into English that you can actually read. And by the way, this English is in the personality of your sidekick, which is fun.[00:15:32] Um. And the way that we build agents and agent apps, it's a little different to what you might have seen in some other platforms for a couple of reasons. One, just the build process. The very first thing that Sidekick does, it understands all the agents you've got set up. It understands all the tools and it will come up with a plan for how to realize your goal, how to make sure it actually has the data and the capabilities to complete it.[00:15:54] It will occasionally refuse. If it can't do what you're asking, it will tell you I can't do that. It needs another tool. And that's a good [00:16:00] jumping off point for any of the tool builders out there to build a new tool. So it'll fi first figure out how, then it will build it, and then it will actually test it.[00:16:07] So it will actually make sure that the thing that it has generated is realizing your goal. And you probably know as well as anybody that anytime you can get any. Modern state-of-the-art coding model into a loop where it can make changes and perceive its own output and then fix bugs. Magic happens. So these builds, the first build will often take 10 to 15 minutes on Dreamer, which is a little bit longer than you might've seen on some other platforms.[00:16:31] But the first thing that it creates will work most of the time. And then of course, as you start making smaller changes, you can like ask it to tweak the UI in any way that you like. Those are much faster. And just to give you a sense, uh, for this one, here's something I asked. Put a logo, I gave it a logo file in static files.[00:16:48] Use that as the title. So for folks that actually really want to dig, uh, into a bit more detail, we've provided a powerful IDE here. 
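The plan, build, and test loop described above can be sketched roughly as follows. This is a hedged illustration, not Dreamer's implementation: `plan_build_test`, `BuildResult`, and the `build` and `test` callables are hypothetical stand-ins for the coding model and its test harness.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BuildResult:
    ok: bool
    artifact: str
    rounds: int
    message: str = ""


def plan_build_test(
    needed_tools: set[str],
    available_tools: set[str],
    build: Callable[[list[str]], str],
    test: Callable[[str], list[str]],
    max_rounds: int = 5,
) -> BuildResult:
    # Plan: refuse early if the goal needs a tool the platform lacks.
    missing = needed_tools - available_tools
    if missing:
        return BuildResult(False, "", 0, f"need another tool: {sorted(missing)}")
    feedback: list[str] = []
    artifact = ""
    for round_no in range(1, max_rounds + 1):
        artifact = build(feedback)  # build, or rebuild using the last failures
        feedback = test(artifact)   # test returns a list of failure messages
        if not feedback:            # no failures: the goal is realized
            return BuildResult(True, artifact, round_no)
    return BuildResult(False, artifact, max_rounds, "tests still failing")
```

The early refusal mirrors the "it needs another tool" case, and feeding test failures back into the next build is the loop in which, as David puts it, "magic happens".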
So I can actually see here's the code that was generated and some pieces of the [00:17:00] code are more accessible than others, like the prompts. So this is the prompt that's used by a powerful LLM in order to do that schedule picking.[00:17:08] And I can actually read it here directly. I can edit it without having to ask the sidekick if I want to do that.[00:17:12] swyx: So this is very nice.[00:17:13] David Singleton: This is for the more, the more, uh, sophisticated users.[00:17:16] swyx: Yeah. This is other people's entire startup is prop management.[00:17:21] David Singleton: This is true. The other thing that is different about Dreamer is once you've built something here, it's ready to go.[00:17:28] We host it. So you don't have to worry about getting a database from a database provider signing up, getting API keys. You don't have to worry about your LLM provider tokens. All of that is hosted on the platform. And you can use it yourself. You can share it to the gallery for other people to, to riff on it.[00:17:46] You can also share it with your friends and coworkers to use your instance of the agent or agentic app. And we're seeing that happen a lot in our community. We've seen a whole bunch of folks who built little applications for their personal life [00:18:00] and shared them with their significant other. We've seen people who are building little productivity apps for their team at work and sharing it, uh, among them.[00:18:07] And we actually do this a lot inside of the company. So at this point we, we pretty much run the company on Dreamer agents for all kinds of important things. Uh, maybe a good example of that is, um, our wait list. People are signing up every time someone signs up for our wait list. 
A dreamer agent will actually research, uh, that person.[00:18:25] And we're looking for folks who are builders, not super technical to build agents and come in, uh, and give us a lot of feedback and we're prioritized bringing those people off of the wait list First,[00:18:35] swyx: just a quick question on that one is there's, it may not come up again. Do you find enrichment APIs to be useful like the ZoomInfo?[00:18:42] Uh, clear bit[00:18:43] David Singleton: enrichment is a very, uh, common use case. Um, on dreamer. Any application on Dreamer can kick off a sub-agent to do a particular task. Um, so this actually is a powerful agentic harness that runs inside of its own [00:19:00] vm. Uh, we call them sidekick tasks ‘cause they actually run in the context of the sidekick.[00:19:04] I'll talk more about Sidekick in a second and. Enrichment is a very common use case. And the cool thing about a sidekick task is that it has access to all the tools on the platform, but also public data as well. And so very frequently enrichment on our platform happens using public data that it can be found in the web.[00:19:24] There are some tools for getting people data, uh, from, uh, from various bespoke systems. And so that works pretty well. But actually, you'd be surprised. I mean, we would love if someone out there would like to build a ZoomInfo tool, we don't have one today. We'd love to see that on the platform, and I'm sure it'll be very powerful.[00:19:39] But we're also seeing that this powerful agent harness can pull a lot of data in on that note of tools that make experiences better, we're constantly adding more tools because people in the community are building them and publishing them. We review the tools carefully and then they go live for everybody.[00:19:54] Yesterday we added granola. And that was pretty cool. 
So I was talking to actually, uh, Sarah on my team was [00:20:00] talking to, uh, someone building on the platform this morning and they actually, they have an agentic app that they built, which is a kind of magic to-do list. So they put stuff on their to-do list and for each thing it kicks off one of these, uh, sidekick tasks to figure out how to move the ball forward thing.[00:20:14] Sometimes it'll complete it[00:20:15] swyx: entirely. Yeah.[00:20:16] David Singleton: Often by calling another agent on the platform and sometimes it just kind of researches it and helps ‘em take the first step.[00:20:21] swyx: Yeah. Do you know, this is Sam Altman's number one, ask for an AI app. It's the self-completing to-do list.[00:20:26] David Singleton: Yeah. The self-completing to-do list is something that a lot of people have built on Dreamer and are getting a lot of use out of.[00:20:32] Yeah. And, and finding it actually genuinely I shouldn't, I should, I should try that. Mm-hmm. Please do. And you'll even find some in the gallery that you can remix. So he was saying this morning that he's, he built this self completing to-do list, uh, on Dreamer already. But he connected the granola tool yesterday and now something really magical happens, which is when he says in meetings that he's gonna do a thing, it magically shows up on his to-do list and then it can magically get completed.[00:20:56] And then, as I mentioned, all the agents, all the [00:21:00] apps on Dreamer can actually work together. So our coding agent, as it builds them, does something very special where it exposes the internals of each of the experiences to the system. And then Sidekick can manipulate those to get stuff done. So he has built another agent, which he uses for recruiting.[00:21:18] It kind of keeps track of candidates and also it's got a kinda mini CRM function, so he's able to introduce candidates to each other. 
He told us this morning that something he'd committed to do in a meeting that was recorded on Granola yesterday showed up in his magic to-do list, and his magic to-do list[00:21:34] (it was like, introduce a person for recruiting) used his recruiting agent to get it done.[00:21:39] swyx: Ah,[00:21:39] David Singleton: um, and this is the dream. This is why we started the company. It really is the case that you can build and use these very powerful, bespoke experiences that can automate your life by working together. And I'd love to talk a little bit about how they work together.[00:21:55] Ecosystem Trust And Monetization[00:21:55] David Singleton: So obviously it's really cool to have [00:22:00] software that will work on your behalf, but it's only useful if you can trust it, right? So privacy and security is very important to us. Making these things accessible while also being trustworthy is hard. So the model that we have, which is working very well, is that the sidekick is at the core of everything here.[00:22:22] So it is both your companion, your helper, but it's also the traffic cop in the system. So when one agent wants to work with another agent in Dreamer, it doesn't do it directly, it does it via the sidekick. It will ask the sidekick to do the thing. And the sidekick understands everything: all the expectations that have been set with me as a user about what agents can do, which tools I've given them permission to use.[00:22:45] And it will make sure that whatever is going on is actually aligned with my own interests. And you know, that's part of the background that I bring to this problem domain. I've worked for years, uh, keeping very important information safe and secure. And [00:23:00] so as we started to think about this problem, we realized that we actually had to build something that's a bit like an operating system.[00:23:06] You know, the sidekick is like the kernel, the agents and apps are like users. Yeah.
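The "traffic cop" model, where agents never call each other directly, can be sketched as a small broker. This is a minimal sketch under stated assumptions: the `Sidekick` class, its methods, and the grant model below are hypothetical and are not Dreamer's actual API.

```python
from typing import Callable


class PermissionDenied(Exception):
    """Raised when the user has not approved this agent-to-agent call."""


class Sidekick:
    def __init__(self) -> None:
        self._agents: dict[str, Callable[[str], str]] = {}
        self._grants: set[tuple[str, str]] = set()  # (caller, callee) pairs

    def register(self, name: str, handler: Callable[[str], str]) -> None:
        self._agents[name] = handler

    def grant(self, caller: str, callee: str) -> None:
        # Records that the user allowed `caller` to use `callee`.
        self._grants.add((caller, callee))

    def call(self, caller: str, callee: str, request: str) -> str:
        # Every cross-agent request passes through this single checkpoint.
        if callee not in self._agents:
            raise KeyError(f"unknown agent: {callee}")
        if (caller, callee) not in self._grants:
            raise PermissionDenied(f"{caller} may not call {callee}")
        return self._agents[callee](request)
```

Routing every call through one mediator is what makes the kernel analogy work: permissions are enforced in one place rather than in each app.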
Different rings. Exactly. Because if you try to pick off just one piece of this, you can't actually make it work for people at scale. Uh, because you could build little vibe coded apps, but they're gonna grab all your data willy-nilly.[00:23:23] They won't be able to work together. You actually have to invest in the fundamental core in order to make it work well for people. And that's what we've been doing and it's, uh, it's been a lot of fun. One other thing I wanted to mention is, um, I've obviously talked about two things, tools and agentic apps.[00:23:42] We really designed Dreamer to be an ecosystem and a platform, and one of my favorite quotes about platforms, I think it's from Bill Gates, is that you can only be a platform. If you create more value for the folks participating and using the platform than, than the platform itself creates. [00:24:00] And that's our goal here.[00:24:01] So we at every step have been thinking about how do we make sure that other people are deriving even more value from Dreamer than we are? So in that vein, I already mentioned tool builders get paid and people can build agents that solve their needs and share them with others, and we are already thinking about ways that they can actually monetize those as well.[00:24:24] Against that backdrop, one of the things that we are launching today is our Builders in Residence program. So there are tons of people building really cool stuff and contributing it to the gallery already, but we've been really inspired by programs we've seen at other companies where artists might be in residence, people that are very creative.[00:24:43] And might have ideas outside of what the, the folks at the company or in the ecosystem already have. And so we are looking for creative people who have fun ideas and, you know, want to really figure out how to apply their creativity at the cutting edge [00:25:00] of technology today to come and work with us. 
So, uh, if you go to dreamer.com/latent space, you'll find, ooh, well, we love Latent space.[00:25:09] Uh, you'll find a link both to, uh, our tool Builder information and our builder in residence program. And for builders and residents, we'll let you in off the wait list quickly, build an agent, and then for a small number of, of the most creative folks, we're going to pay you to build agents. Uh, you can work directly with our team.[00:25:29] You know, this is like building Legos. So, you know, we've got some of the basic blocks together already, but if you need a Ron steering wheel and we don't have one already, like we'll build it for you. Yeah. Um, we really want to be inspired by, by these, uh, these builders in residence.[00:25:43] swyx: This Legos thing is pretty common as an analogy.[00:25:46] And there's a, there's a thing I call the master builder. Uh, we, the actual Lego company has master builders that they employ Yeah. To inspire people and post on socials.[00:25:56] David Singleton: That is exactly what inspired us as well. Honestly, we talked about the Lego Master [00:26:00] Builder program, so that's our builder in residence program.[00:26:02] swyx: Yeah.[00:26:03] David Singleton: Um, and then, uh, finally back on, on tools. Like I said, anyone can come in and build tools today. If you follow the latent space link dreamer.com/latent space, again, we'll get you off. Directly off the wait list. So you can build right away, you can monetize by publishing onto the platform. That's for everyone, the very best tool that gets added to the platform by mid-April.[00:26:23] Uh, we have a $10,000 prize that we want to give out really, because we just want to seed the creativity of everyone out there. So we're excited to do that.[00:26:31] swyx: Yeah. And you know, uh, this is completely a flywheel, right? 
Like the more tools, the more builders, the more, the third thing, agents, you know, it just feeds into each other.[00:26:39] David Singleton: That's right.[00:26:39] swyx: Yeah. Just on the payments thing, because we probably won't touch on that again, but I have to ask the former CTO of Stripe on payments, as presumably you're using Stripe Connect.[00:26:48] David Singleton: Yeah.[00:26:48] swyx: Um. Any pain points that you're, people are very interested in agent commerce and micropayments and all these things.[00:26:55] Presumably stablecoins get into the conversation at some point, but maybe not now.[00:26:58] David Singleton: Yeah, we are [00:27:00] really, really excited about agentic commerce. The first step we are taking is to help people in the world who have never been able to build these kinds of experiences and software before to build stuff that meets their passions, share it with the world and get paid.[00:27:14] So that's all commerce that happens on our platform, and so we don't need anything new to facilitate that. Stripe Connect has existed for quite a while and is the perfect solution for this kind of stuff, so, um, we're excited about that first and foremost. However, a lot of the things that people are already doing on Dreamer, we just talked about a self-completing to-do list.[00:27:34] A lot of the ways that you want to complete to-dos is by actually closing the loop in the real world, and that's going to involve the exchange of value. So we have some folks that are building tools already that actually do have money move in order to, to complete that, that loop. So far, we just want to be open and agnostic to all the protocols out there.[00:27:54] I honestly think this moment in time is a little bit like the early web. So I personally started coding as a kid [00:28:00] and I think I got access to the internet in about 1995, 1996. 
And back then, uh, the web existed, you know, HTTP was a protocol, but there were also other protocols I was using all the time, like Gopher and UUCP and uh, various others.[00:28:15] So the point is like the web, HTTP and HTML, was just one among many protocols. And of course it became the winner and it's awesome. Yeah. Um, but the others were also kind of interesting and viable at the time as well. And I think the world of agentic commerce is like this right now. Also,[00:28:30] swyx: ACP.[00:28:31] David Singleton: ACP, exactly.[00:28:32] All the, all the CPs, you know, on Dreamer. We hope that folks will build tools that kinda make use of all of these things, but I'm sure that at a certain point one or two will emerge as the winners, and then we'll be able to build like really deep support in,[00:28:44] swyx: yeah. This is like maybe a complete tangent, but I do think about how a lot of these companies, AI companies in particular, have to switch from seat-based to usage-based, because of course, but then, then they end up, end up having to sort of [00:29:00] obscure the margins a little bit and then they end up inventing their equivalent of Robux.[00:29:04] David Singleton: Mm-hmm.[00:29:04] swyx: Uh, where they're like, well, okay, well every company should have their own currency. And it's, it's like a very short leap to a token.[00:29:11] David Singleton: Yeah.[00:29:11] swyx: Or, and I'm like, okay, well where does this end? I can't really play out the next step as to like, is this chaos? Is this,[00:29:18] David Singleton: yeah.[00:29:18] swyx: Okay.[00:29:18] David Singleton: Well, I think it is kind of like the wild west.[00:29:21] I don't mean that in a completely, it's all completely disorganized way, but there's just so many things that could happen from here. The Overton window is very wide, right? Not far how this might land. 
And I'm just very excited to be building a platform that can take advantage of all of those opportunities and we're just gonna be there,[00:29:36] uh, working for our users to make sure that things that emerge work,[00:29:39] swyx: you're gonna own the consumers, you're gonna be up the OS for the app store for everything.[00:29:43] David Singleton: So one of the ways to think about this is, um, Dreamer actually uses all of the state-of-the-art models. As a user, you don't have to think about should I be using, you know, Opus 4.6, or should I be using the 5.4 model from [00:30:00] OpenAI?[00:30:00] We are continually doing evals and so forth to make sure that the best things are there for you. You can just build on the platform and know that as the world shifts around, you're gonna get the right stuff for you. Um, and I think that's something that is needed to actually have folks take advantage of this technology at scale.[00:30:19] I'd love to show you another example of something I built.[00:30:21] swyx: Let's do it.[00:30:22] David Singleton: This is another example of software that just lasts for a certain moment in time. So recently I went on a ski trip with a bunch of friends,[00:30:31] ski[00:30:31] David Singleton: Bum. Uh, so it uses Ski Bum. Yes. I went on a ski trip to Big Sky. I'd never been there before.[00:30:38] And I made this little intelligent app for us. And you can see it says it's loading Big Sky conditions. So it's actually calling the Ski Bum tool that I just showed you, which is, uh, published in our, uh, in our gallery. So what is this? This is a little app that was just for our weekend trip. It shows the current status of all the lifts of Big Sky[00:30:54] using that tool from the ecosystem. It shows the forecast for the upcoming weekend. It shows our [00:31:00] accommodation. This is just like where my group was staying. 
This is just for us and also a bunch of dining information that one of our friends, uh, put together who, who's an expert on Big Sky. So I was able to take this app, share the link with my friends.[00:31:12] They weren't on Dreamer yet, just send it to them on iMessage and they get a version they can use on their phone. And of course, here's the real kicker. So I've been on ski trips before and other weekend adventures with my friends. Yeah, people pay for different things and at the end of the weekend it's always a pain to figure out who needs to pay, who to settle up.[00:31:29] So we use this during the weekend. We added all of our expenses in here. Uh, too close are it's drill data. It's only too closely. And then at the end of the trip, we press split. And we're, we settled up and we're done. So there's another dreamer. This was all through dreamer. So the, the actual payment? No, no.[00:31:47] We, it happened because, because we paid for stuff in the real world, it was like, okay, this person needs to pay that person 20 bucks. Right? Right. This person already paid in that. Right. So it just helped us all settle up. We didn't move the money on Dreamer. You could do that. And in fact, if you're a tool builder [00:32:00] thinking about this and getting excited, like come build a tool to do that stuff.[00:32:02] We really think of our tool builders as design partners.[00:32:05] swyx: Yeah. I got, I got the tool. Uh, what, like, I hate, I use Bank of America. I hate bank, I hate the app. Mm-hmm. I hate the web. All banking websites just horrible.[00:32:13] David Singleton: Yeah.[00:32:13] swyx: So just build me, like build a thing on top of Plaid.[00:32:15] David Singleton: Yeah. Right. And then just So[00:32:17] swyx: five code by banking app,[00:32:18] David Singleton: there's already a tool for that.[00:32:20] Oh. So, um, attain Finance is a tool, a builder in our community built. Okay. Um, and it uses a secure system like Plaid. 
To access your, uh, financial data and you can build powerful personal finance agents on Dreamer today using this tool. And like I said, we review tools carefully. So when bringing Attain Finance onto the platform, we did actually quite a detailed security review with that company to make sure that if folks build stuff with it, it's, it's gonna work well.[00:32:49] So yeah, check that out. I think, uh, I'm, I'm pretty certain it connects to Bank of America. So you'll be able to build the, the app that you wanted already?[00:32:55] swyx: Yeah. There's a couple of points I wanted to sort of dive in on, maybe highlight to folks, [00:33:00] because I, obviously, I spent more time with Dreamers. So we're making a point where you choose on behalf of your users because they're meant to be consumers.[00:33:07] So maybe less technical,[00:33:08] David Singleton: right?[00:33:08] swyx: But obviously people can, how users can override. If you read that's, but it's not just lms, it is also the, the transcription. It, it's like all, like there's, there's a first party curated set of here's the house opinion. That's right. On what?[00:33:21] David Singleton: That's[00:33:21] swyx: right. The thing is, that's right.[00:33:22] Is what's the list? Is there like,[00:33:24] David Singleton: yeah, so actually if you look in the tool gallery, the first party kind of curated set are all the ones that have these grayscale icons. So we have a built in tool for image understanding, for image generation, for RSS, exploration, text to speech and so forth.[00:33:38] swyx: Recipes.[00:33:39] David Singleton: Uh, we actually do have a built in recipes tool.[00:33:41] It turns out that a lot of people in our alpha wanted to do stuff for cooking. Yeah. Um, and you know, you can scrape the web to get good recipes, but we were able to quite quickly find a good repository of recipes. It works great here. 
Yeah.[00:33:55] Stable Tool Interfaces[00:33:55] David Singleton: So the point behind these though is that we'll keep the interfaces stable, so they'll always work.[00:34:00] But you know, the best translation model and, you know, there are people using this translation tool to translate Chinese podcasts into English. It's, it's pretty powerful. It can deal with very long text, but the best translation tool today might be different from the best translation tool sometime next year.[00:34:15] And we're just gonna make sure that that translation tool is always pretty close to state of the art. So you can build something and you know it's gonna continue to work well. Of course, some of our tools are branded. You may actually have a preferred way of buying groceries, like maybe you prefer Instacart and that's great.[00:34:29] You can use the Instacart tool specifically.[00:34:31] swyx: Yeah.[00:34:32] Partnerships And Ecosystem[00:34:32] swyx: Your partnerships, uh, I mean, I don't know if you ever hit of partnerships, but this is gonna be a bonanza for anyone on to do deals.[00:34:38] David Singleton: We have an amazing person who, uh, works on all of our partnerships. Um, and it's part of what you have to do to build a platform like this that's gonna work for people.[00:34:46] Like, we've gone and done that. 
Schlep has a lot of work, one talks lots of different companies, um, in order to make sure that you've got good tools at the core.[00:34:54] swyx: Yeah.[00:34:54] David Singleton: And then of course, because we're open to tool builders contributing to the platform, this is only gonna get better and better and [00:35:00] better.[00:35:00] swyx: Yeah.[00:35:01] Agent Lab Routing Layer[00:35:01] swyx: One observation I have this, this is gonna master a thesis I've been pursuing, which is, uh, what I've been calling an agent lab[00:35:05] David Singleton: mm-hmm.[00:35:06] swyx: Where you sort of different than a model lab in, in, in the sense that you never train your own models, but you are the router evaluation layer, ex subject domain expert for choosing between, uh, models.[00:35:18] David Singleton: Yeah.[00:35:18] swyx: And you're explicitly doing these things. And so like in my sort of construction, every agent lab does some version of this where like, here's the image understanding endpoint and we will route for you and don't worry about it. Yeah. Sally, I think it's kind of cool.[00:35:32] David Singleton: I, I think it makes total sense. Um, and again, to make this work for folks that don't follow the AI news every day, it's an actually, it's a, it's a really important thing to do.[00:35:42] Yeah. And it, it's been, it's been a real pleasure. I mean, I'm a, I'm personally a total geek for this stuff. I love it. And being able to go and dive into all those details in order to make it work well for other people. It's a true pleasure. I cannot imagine working at anything else right now. It's just so much fun.[00:35:56] swyx: The tricky part is multimodality when some of these things do [00:36:00] merge.[00:36:00] David Singleton: Mm-hmm.[00:36:01] swyx: And you are, you're sort of, this is your imposing structure on things that fundamentally don't want to be structured. 
And so sometimes that might work against you, but for 99% of these cases, this is fine.[00:36:10] David Singleton: Yeah. I mean, I think it's gonna be very interesting to see how the, the, the world matures because a lot of the power of Dreamer is the ability to kick off these subagents, so these powerful agent harnesses, which can actually change how they work based on the data.[00:36:25] I actually think that we will be able to kind of keep up with and stay at the forefront of the changing landscape of how tools and systems work together. And that's, that's new. You know, software didn't used to work like this and now it does. Um, so even, even just figuring out how to design the right primitives to make that possible has itself been a lot of fun.[00:36:44] Builders Can Publish Tools[00:36:44] swyx: This is, is a sort of maybe two part question: why can't Dreamer make its own tools? And then why don't you let your builders maybe stand up their own routing group? I call this a routing group, right? Like where it's like collect Yeah. Things.[00:36:58] David Singleton: So two things, to [00:37:00] some extent, Dreamer does make its own tools in that agents appear to the system as tools.[00:37:05] So they can be, they can be used to accomplish things. So you can build an agent that is essentially a tool. Yeah. Um, and it,[00:37:12] swyx: which is to me very useful for reuse.[00:37:14] David Singleton: Right.[00:37:14] swyx: Right. Exactly. ‘cause I, I like, this is the way I like it. Now my next five apps, I don't want to do this whole series of back and forth again.[00:37:20] David Singleton: Right.[00:37:21] swyx: Yeah.[00:37:21] David Singleton: Um. Then at the tool layer of the system, it's open to anyone. So it's actually quite powerful and flexible. So if you wanted to add a tool, which was, uh, imagine that you were training your own foundation model, Swyx. That might be fun. 
And imagine you wanted people to be able to play with, I don't know, maybe you make like, you know, nano chat or whatever and you want to Yeah.[00:37:42] Let people play with your own nano chat and see how I change themselves.[00:37:44] swyx: Now.[00:37:45] David Singleton: You could, you could publish a tool that is Nano Chat and it nano image generation behind a tool, and it could be your own writer if you wanted to. I see. And honestly, if that's the kind of thing that gets you excited as a builder, please come and do it.[00:37:57] Like we, we really are [00:38:00] believers in this idea that we aren't going to figure out every single detail ourselves. We're gonna make sure it's a safe and fun place to build this stuff, but we're really open to these ideas coming from other people. Um, and so I'd like nothing more than you come in and build a tool that does some of that cool stuff that you, that you have in mind.[00:38:15] swyx: Yeah. Awesome.[00:38:16] David Singleton: And just as a reminder, if you'd like to do that, the way to find the links is dreamer.com/latent space. Um, and for a limited time on that page, um, anyone who's listening to this podcast will also get directly off of our wait list. Uh, it's quite long right now. We are working hard to bring Zika.[00:38:32] Wait, so skip the wait list.[00:38:33] swyx: You know, I think, I think that's fantastic. I, I think it's, it is really sort of probuild way to do it. I wanted to jump back to the, the bar. Yeah. You know, you know, I get excited about this.[00:38:41] David Singleton: Yes. Okay. Let's set it back in there.[00:38:43] swyx: Like, let's, you know, this is the engineer podcast that's get[00:38:46] David Singleton: Yeah.[00:38:46] swyx: As technical as you can.[00:38:47] David Singleton: Yeah.[00:38:47] swyx: On everything you've built, like have a show off.[00:38:50] David Singleton: Yeah. 
Okay.[00:38:51] Under The Hood Debugging[00:38:51] David Singleton: So let's go wild in the aisles in the agent studio. So as you can see, over on the left here is a conversation with the sidekick where you ask it what to do and it will explain in English that anyone can understand what's going on.[00:39:03] But, um, if you want to pull back the covers and look under the hood, um, if you're, uh, an engineer like me, then we have this, uh, this kind of debug drawer at the bottom. So you can see the full build logs here, but you can actually also dig in and see the files and prompts that have been generated. Uh, you can upload files from your computer in static files.[00:39:24] Um,[00:39:24] swyx: very important,[00:39:25] David Singleton: uh, indeed. You can actually read the prompts that have been generated for you. We intentionally put an example in here just so that you can see what the format looks like. And then, you know, we already looked at this one that was generated for this particular, um, app, but if you actually want to bring the code out of Dreamer and work on your own local machine, you can.[00:39:45] So at the core of everything here is an SDK with a powerful command line interface and we built that first. It's actually possible to build agents on Dreamer without talking to the sidekick. You can write code with your fingers on a keyboard if you want to. I know that's very [00:40:00] antiquated now, but actually this can be a lot of fun.[00:40:02] So if you wanna pull it out onto your laptop, you can use our, our CLI and, uh, you can edit it in Cursor or in Claude Code. You know, you don't have to use our sidekick. And the CLI actually has full access to the rest of the platform with you as the user. So, you know, obviously it is, uh, secure and privacy sensitive, and this is a way that, um, some of our most technical builders do build stuff on the platform.[00:40:24] The really cool thing is the sidekick. 
When it's in coding mode, it uses exactly the same CLI. So the way it builds stuff on Dreamer is using the same tools that you might as an engineer. Um, and that's actually a very powerful abstraction because it turns out that the right way to give a lot of context to agents to use CLIs is to write great documentation.[00:40:46] Make sure that all of the things that you could do are actually possible. And guess what? That makes it a delightful developer experience for real humans as well.[00:40:53] swyx: Yeah. So that's pretty cool. We've been telling developers to do this and they ignore this until now they have to for content.[00:40:58] David Singleton: I, I've been saying this for a [00:41:00] long time.[00:41:00] Uh, we actually Stripe docs.[00:41:02] swyx: I mean, come on. Absolutely. Come on.[00:41:03] David Singleton: Absolutely. But actually, I was chatting with folks at Stripe last week and saying, Hey, you gotta make the Stripe CLI actually tell agents what they can do on Stripe because that way they're gonna use more stuff on Stripe. I think this is a real trend for the entire industry.[00:41:16] swyx: Yeah.[00:41:16] David Singleton: So we, we've been doing that.[00:41:17] swyx: To me, this, this download and, uh, git push mm-hmm. Everything is complete confidence in that you're not hacking it. Right. Because there's other, let's call them AI builder platforms that impose their stack on you and if you, if you, and so therefore they don't allow you to do this because they cannot.[00:41:34] Right. ‘cause they, they impose some degrees of freedom, uh, restrictions so that they can get it to work. Yours is a fully general like VM running the full code. Correct. Do whatever you want. Correct. Any language you want. Correct. Yeah.[00:41:46] David Singleton: Correct. Well, in terms of language, if you use the SDK, you could build stuff in other languages.[00:41:51] We've actually found that TypeScript is the best language for building these experiences. Yes. 
Because it's strongly typed. So you find out at compile time if you've made mistakes [00:42:00] and there's nothing better than getting a coding agent in a loop where it can see its mistakes and fix them. So TypeScript is the language that everything gets built in by default here.[00:42:08] swyx: And did you see that TypeScript overtook Python? I did. I did. Yeah.[00:42:12] David Singleton: And for what it's worth, when we started the company, we started writing stuff in Python, and I love Python. Um, if I do, uh, Advent of Code, I always write it in Python. It's my favorite language as a developer with my fingers on the keyboard.[00:42:23] Um, but TypeScript is an amazing language for AI because there's tons of training data in the models, um, and it's strongly typed. And actually at the company we built most of the stack in TypeScript, and we have this amazing property, which is, we have type safety all the way from the database to the front end.[00:42:40] And there's nothing better for working with coding agents than being able to have them check their correctness at compile time. So the same ideas behind building the company's code base, we've put into the agent SDK here as well.[00:42:51] swyx: Yeah. Do you know if you'd use one of those tools, like Prisma or whatever, or is it Tool Lab for you?[00:42:55] David Singleton: We, we actually have crafted most of our own tools. Um. For [00:43:00] instance, we had LLM-driven code review, uh, before the thing that got published from Anthropic this week. You know, we, we've been doing this stuff, uh, on our own bat[00:43:07] swyx: email, we'll pay $25 per review.[00:43:09] David Singleton: We, we pay a lot less than that. However, I hear that those reviews are excellent and possibly worth $25.[00:43:14] swyx: Yeah. You know, it's an option. Right. It's good, good to have it.[00:43:17] David Singleton: Just to give you a tour of some other stuff here. So, um, I can also see all the versions. Yeah. 
Um, this is not Git, this is not Git, this is built into Dreamer. I can see all the versions that have been pushed before. Why is it[00:43:27] swyx: not Git?[00:43:28] David Singleton: It's not Git because we can make it work more efficiently than Git.[00:43:32] And we actually, we do some work behind the scenes to kind of understand what's in each of these versions. Yeah. Um,[00:43:37] swyx: so one of the things I'm pursuing, and I have a lot of theses, right? Mm-hmm. One of the theses is like, does Git go away? Does GitHub go away? And like, what, what is the active reinvent[00:43:46] David Singleton: you for, for what it's worth to some extent.[00:43:48] And anything you build, there's a lot of path dependency. If we started over, we might make this Git. There's, uh, you know, within the company we use, uh. For our, you know, platform source code. And we like it and it [00:44:00] works well with coding agents as well. The very first versions of this, we wanted to be able to make it possible for the sidekick to manipulate it easily.[00:44:06] Um, and this, this was an expedient way to do it.[00:44:08] swyx: Yeah.[00:44:08] Workflows Logs And Databases[00:44:08] David Singleton: Um, you can also see all the activity that has happened in the workflows that you build. A lot of agents you'll build on Dreamer do things in the background, so they run on triggers. These are stimuli from the outside to kick them off, and this is a nice way to see all of the things that might have kicked off your agent.[00:44:24] You know, you can have an agent that kicks off on a webhook, so you can plug it into external systems. You can have an agent that runs when you receive certain emails that match filters, including LLM filters. And so here you can see, oh, when did it run? What did it do? You know, if I open up one of these guide me prompts or guide me, uh, events.[00:44:41] Oh my can see God. Well, I told you it was calling an LLM for every one of those time slots. 
Here's all of the LLM calls, here's the actual prompts.[00:44:49] swyx: And you don't mind exposing all of this, right?[00:44:51] David Singleton: No. We want builders to see what's going on under the hood. It's Haiku to,[00:44:53] swyx: okay. Yeah. So,[00:44:54] David Singleton: okay. Right now that one was Haiku.[00:44:56] Like I said, we work with all the models and sidekick will actually pick the best one [00:45:00] for the job. And you saw that was pretty high quality and pretty fast. So Haiku 4.5 is the one that it picked for that job. Exactly. Uh, we also have logs, as I mentioned, there's a database spun up on demand for every, uh, agent.[00:45:12] You don't have to go and figure out how to do your own hosting. This is a SQLite. This is a SQLite database. Yeah. Um, it's a multi-user SQLite database. And then, uh, but, but each one is you, you get a database that is unique to this agent. But then if you share the agent with multiple people, we take care of like who are the owners in each row.[00:45:31] And all of that stuff is just there outta the box. Um,[00:45:34] swyx: and again, in-house?[00:45:35] David Singleton: In-house.[00:45:36] swyx: Oh my God.[00:45:37] David Singleton: Yeah. Um, well we do work with a bunch of infrastructure providers, but the technology for how to manipulate this is in-house. Fun fact. We actually did a lot of our own infrastructure development early on at the company and realized we need to spend our energy in the stuff that we're uniquely doing in the world.[00:45:53] So we're very delighted to partner with a bunch of great designer and some of this stuff. And then finally, um, I mentioned that agentic apps agents [00:46:00] expose all of their internals to the system so the sidekick can manipulate them and use them just like a user can. So you can see how it's decided to break this problem up into functions.[00:46:09] Some of the functions, the ones with the little I here are exported. 
That means that they're probably visible from outside. Exactly. And others are internal. And if you want to, you can dig right in here and call individual functions and see what happens. But mostly. You don't need to think about that at all.[00:46:24] Yeah. Uh, you can keep that little drawer closed and you can talk to your sidekick and build really powerful and enchanting experiences.[00:46:30] swyx: Yeah. I mean, to me, like showing this gives the engineer a complete mental model of what you've done and what you can do with it. Yeah. For example, the first thing I, I, I look for.[00:46:39] A mental checklist of things, right? Like is auth in the database? Auth looks like it's not, right? So that's a separate layer. That probably means it's hard to do multi-user apps on the same app, right?[00:46:50] David Singleton: So you actually, we've solved that. So, um, see, yes, the platform builds in auth, so you as a user sign into the platform, if you're using an [00:47:00] agent that was published by someone else, then your identity is, is kind of taken care of by the system.[00:47:05] And when you query the database, you're gonna get the stuff that is for you. Unless the builder specifically said, this is public data that everyone should see. So they, they actually get a chance to think about that. And again, sidekick can guide you through building, uh, agents and apps that work that way.[00:47:19] So you're right, that's another thing that people have to think about when they're trying to figure out how to build software experiences. On Dreamer, it's built in. You talk to the sidekick as if it were a human being about what you want and that's what you get. So, you know, my, my Big Sky app that I just showed you, that was designed for multiple people to use it.[00:47:38] And of course the things that we were putting in as expenses were supposed to be visible to everybody, and I just told the sidekick that's the way I wanted it. 
Uh, but by default, if I built an app like that, the data from each user would not be visible to the others.[00:47:49] swyx: Yeah. Yeah. Uh, this is, I presume this is a moot question, but basically you've had to build your own coding agent, right?[00:47:55] Which is sidekick slash whatever is inside Sidekick. Obviously there's a lot of [00:48:00] people with a lot of desire for Claude Code and Codex and attachment to it. Mm-hmm. I know under the hood data basically reduced to a loop, but like, would you let people use Claude Code and Codex or is the harness too specialized?[00:48:12] David Singleton: Yeah. If you, if you want to use, um, Claude Code and Codex, then you go down here. Yeah. Hit get the SDK. And we even say this right here, edit to your heart's content in Cursor or Claude Code.[00:48:22] swyx: Like people want to use it inside of Sidekick, right? Yeah. They want to switch the engine.[00:48:26] David Singleton: Yeah.[00:48:26] swyx: That's the coding engine.[00:48:27] David Singleton: Yeah. We are not doing that right now.[00:48:29] Um, you know, again, the goal really is to abstract the complexity. Yeah. Um, because the real target for building agentic apps is folks who can't do this already today. I can't tell you how many users in our community I've spoken to who are like, Dreamer has changed my life because I used to have all these ideas.[00:48:50] If only I could find an engineer to help me implement them, I'd be able to get them done. They're free, and now I can talk to my sidekick and, and get it built. I think that's like really how we think [00:49:00] about the people that should get a ton of value and fun, um, out of the platform. And so they're not asking to be able to plug in their own, you know, coding agent.[00:49:11] And for those folks, the opportunity is massive. If you've never been able to do stuff in code, now you can build stuff for you, for your friends, for your family, for your coworkers. 
And also there's a huge opportunity for folks who do build stuff in code to actually contribute to this ecosystem. So that's how we think about it.
[00:49:28] swyx: Yeah. Amazing.
[00:49:28] Personalization And Memory
[00:49:28] swyx: That's most of what I wanted to cover Dreamer-wise. I think personalization and memory, yeah, is probably the single most important job of, uh, of the OS. Maybe we could talk about that, and then I wanted to zoom out on company building stuff.
[00:49:40] David Singleton: Yeah, yeah. Sounds good.
[00:49:41] swyx: Yeah. So how do you handle memory?
[00:49:43] What, yeah, what have you found? What have you tried and failed?
[00:49:45] David Singleton: Yeah. Okay. So, uh, first of all, at the core of Dreamer is the sidekick. The sidekick gets to know you and it builds up a memory about you over time, and that turns out to be very important. So Dreamer, that's

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Retrieval After RAG: Hybrid Search, Agents, and Database Design — Simon Hørup Eskildsen of Turbopuffer

Mar 12, 2026 60:32


Turbopuffer came out of a reading app.

In 2022, Simon was helping his friends at Readwise scale their infra for a highly requested feature: article recommendations and semantic search. Readwise was paying ~$5k/month for their relational database, and vector search would cost ~$20k/month, making the feature too expensive to ship. In 2023, after mulling over the problem from Readwise, Simon decided he wanted to “build a search engine,” which became Turbopuffer.

We discuss:
• Simon's path: Denmark → Shopify infra for nearly a decade → “angel engineering” across startups like Readwise, Replicate, and Causal → turbopuffer almost accidentally becoming a company
• The Readwise origin story: building an early recommendation engine right after the ChatGPT moment, seeing it work, then realizing it would cost ~$30k/month for a company spending ~$5k/month total on infra, and getting obsessed with fixing that cost structure
• Why turbopuffer is “a search engine for unstructured data”: Simon's belief that models can learn to reason, but can't compress the world's knowledge into a few terabytes of weights, so they need to connect to systems that hold truth in full fidelity
• The three ingredients for building a great database company: a new workload, a new storage architecture, and the ability to eventually support every query plan customers will want on their data
• The architecture bet behind turbopuffer: going all in on object storage and NVMe, avoiding a traditional consensus layer, and building around the cloud primitives that only became possible in the last few years
• Why Simon hated operating Elasticsearch at Shopify: years of painful on-call experience shaped his obsession with simplicity, performance, and eliminating state spread across multiple systems
• The Cursor story: launching turbopuffer as a scrappy side project, getting an email from Cursor the next day, flying out after a 4am call, and helping cut Cursor's costs by 95% while fixing their per-user economics
• The Notion story: buying dark fiber, tuning TCP windows, and eating cross-cloud costs because Simon refused to compromise on architecture just to close a deal faster
• Why AI changes the build-vs-buy equation: it's less about whether a company can build search infra internally, and more about whether they have time, especially if an external team can feel like an extension of their own
• Why RAG isn't dead: coding companies still rely heavily on search, and Simon sees hybrid retrieval (semantic, text, regex, SQL-style patterns) becoming more important, not less
• How agentic workloads are changing search: the old pattern was one retrieval call up front; the new pattern is one agent firing many parallel queries at once, turning search into a highly concurrent tool call
• Why turbopuffer is reducing query pricing: agentic systems are dramatically increasing query volume, and Simon expects retrieval infra to adapt to huge bursts of concurrent search rather than a small number of carefully chosen calls
• The philosophy of “playing with open cards”: Simon's habit of being radically honest with investors, including telling Lachy Groom he'd return the money if turbopuffer didn't hit PMF by year-end
• The “P99 engineer”: Simon's framework for building a talent-dense company, rejecting by default unless someone on the team feels strongly enough to fight for the candidate

Simon Hørup Eskildsen
• LinkedIn: https://www.linkedin.com/in/sirupsen
• X: https://x.com/Sirupsen
• https://sirupsen.com/about

turbopuffer
• https://turbopuffer.com/

Full Video Pod

Timestamps
00:00:00 The PMF promise to Lachy Groom
00:00:25 Intro and Simon's background
00:02:19 What turbopuffer actually is
00:06:26 Shopify, Elasticsearch, and the pain behind the company
00:10:07 The Readwise experiment that sparked turbopuffer
00:12:00 The insight Simon couldn't stop thinking about
00:17:00 S3 consistency, NVMe, and the architecture bet
00:20:12 The Notion story: latency, dark fiber, and conviction
00:25:03 Build vs. buy in the age of AI
00:26:00 The Cursor story: early launch to breakout customer
00:29:00 Why code search still matters
00:32:00 Search in the age of agents
00:34:22 Pricing turbopuffer in the AI era
00:38:17 Why Simon chose Lachy Groom
00:41:28 Becoming a founder on purpose
00:44:00 The “P99 engineer” philosophy
00:49:30 Bending software to your will
00:51:13 The future of turbopuffer
00:57:05 Simon's tea obsession
00:59:03 Tea kits, X Live, and P99 Live

Transcript
Simon Hørup Eskildsen: I don't think I've said this publicly before, but I just called Lachy and was like, look, Lachy, if this doesn't have PMF by the end of the year, we'll just return all the money to you. It's just like, Justine and I don't wanna work on this unless it's really working. So we want to give it the best shot this year and we're really gonna go for it. We're gonna hire a bunch of people. We're just gonna be honest with everyone. Like, when I don't know how to play a game, I just play with open cards. Lachy was the only person that didn't freak out. He was like, I've never heard anyone say that before.
Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, founder of Kernel Labs, and I'm joined by swyx, editor of Latent Space.
swyx: Hello. Hello, uh, we're still, uh, recording in the Kernel studio for the first time. Very excited. And today we are joined by Simon Eskildsen of turbopuffer. Welcome.
Simon Hørup Eskildsen: Thank you so much for having me.
swyx: Turbopuffer has really gone on a huge tear, and I do have to mention that you're now my newest member of the Danish Aarhus mafia, where there's a lot of legendary programmers that have come out of it, like, uh, Bjarne Stroustrup, Rasmus Lerdorf, and the V8 team and the Google Maps team. Uh, you're mostly a Canadian now, but isn't that interesting? 
There's so much strong Danish presence.
Simon Hørup Eskildsen: Yeah, I was writing a post, um, not that long ago about sort of the influences. So I grew up in Denmark, right? I left when I was 18 to go to Canada to work at Shopify. Um, and so I would still say that I feel more Danish than Canadian. This is also the weird accent. I can't say "th" because, you know, my wife is also Canadian, um, and I think one of the things in Denmark is just, there's such a ruthless pragmatism and there's also a big focus on just aesthetics. Like, people really care about what things look like. Um, and Canada has a lot of attributes, the US has a lot of attributes, but I think there's been lots of great things to carry. I don't know what's in the water in Aarhus though. Um, and I don't know that I could be considered part of the mafia quite yet, uh, compared to the phenomenal individuals we just mentioned. Rasmus Lerdorf is also, uh, Danish Canadian. Okay. Yeah. I don't know where he lives now, but he's the PHP guy.
swyx: Yeah. And obviously Tobi is German, but moved to Canada as well. Yes. Like, this is import, uh, that is an interesting, um, talent move.
Alessio: I think I would love to get from you the definition of turbopuffer, because I think you could be a vector DB, which is maybe a bad word now in some circles, you could be a search engine. It's like, let's just start there and then we'll maybe run through the history of how you got to this point.
Simon Hørup Eskildsen: For sure. Yeah. So turbopuffer is, at this point in time, a search engine, right? We do full text search and we do vector search, and that's really what we're specialized in. 
If you're trying to do much more than that, then this might not be the right place yet, but turbopuffer is all about search.
The other way that I think about it is that we can take all of the world's knowledge, all of the exabytes and exabytes of data that there is, and we can use those tokens to train a model, but we can't compress all of that into a few terabytes of weights, right? We can compress into a few terabytes of weights how to reason with the world, how to make sense of the knowledge. But we have to somehow connect it to something external that actually holds that in full fidelity and truth. Um, and that's the thing that we intend to become. Right? That's a very holier-than-thou kind of phrasing, right? But being the search engine for unstructured data is the focus of turbopuffer at this point in time.
Alessio: And let's break down. So people might say, well, didn't Elasticsearch already do this? And then some other people might say, is this search on my data, is this closer to RAG than to, like, an Exa, like a public search thing? Like, how do you segment the different types of search?
Simon Hørup Eskildsen: The way that I generally think about this is, there's a lot of database companies, and I think if you wanna build a really big database company, you need a couple of ingredients to be in the air, which only happens roughly every 15 years. You need a new workload. You basically need the ambition that every single company on earth is gonna have data in your database multiple times. You look at a company like Oracle, right? I don't think you can find a company on earth with a digital presence that doesn't somehow have some data in an Oracle database. Right? And I think at this point, that's also true for Snowflake and Databricks, right? 15 years later, or even more than that, there's not a company on earth that doesn't indirectly or directly consume Snowflake or Databricks or any of the big analytics databases. Um, and I think we're in that kind of moment now, right? I don't think you're gonna find a company over the next few years that doesn't directly or indirectly, um, have all their data available for search and connect it to AI. So you need that new workload, you need something to be happening where there's a new workload that causes that to happen, and that new workload is connecting very large amounts of data to AI.
The second thing you need, the second condition to build a big database company, is that you need some new underlying change in the storage architecture that is not possible from the databases that have come before you. If you look at Snowflake and Databricks, right, commoditized, massive fleets of HDDs, that was not possible. It just wasn't in the air in the nineties, right? So we just didn't build these systems. S3 and so on was not around. And I think the architecture that is now possible that wasn't possible 15 years ago is to go all in on NVMe SSDs. It requires a particular type of architecture for the database that's difficult to retrofit onto the databases that are already there, including the ones you just mentioned. The second thing is to go all in on object storage, more so than we could have done 15 years ago. Like, we don't have a consensus layer, we don't really have anything. In fact, you could turn off all the servers that turbopuffer has, and we would not lose any data, because we have gone completely all in on object storage. And this means that our architecture is just so simple. So that's the second condition, right? First being a new workload, which means that every company on earth, either indirectly or directly, is using your database. Second being, there's some new storage architecture. 
That means that the companies that have come before you can't do what you're doing.
I think the third thing you need to do to build a big database company is that over time you have to implement more or less every query plan on the data. What that means is that you can't just get stuck in, like, this is the one thing that a database does. It has to be ever evolving, because when someone has data in the database, they over time expect to be able to ask it more or less every question. So you have to do that to get the storage architecture to the limit of what it's capable of. Those are the three conditions.
swyx: I just wanted to get a little bit of the motivation, right? Like, so you left Shopify, you're like principal engineer, infra guy. Um, you also head of kernel labs, uh, inside of Shopify, right? And then you consulted for Readwise and that kind of gave you that idea. I just wanted you to tell that story. Um, maybe you've told it before, but, uh, just introduce people to the new workload, the sort of aha moment for turbopuffer.
Simon Hørup Eskildsen: For sure. So yeah, I spent almost a decade at Shopify. I was on the infrastructure team, um, from the fairly early days around 2013. Um, at the time it felt like it was growing so quickly and all the metrics were, you know, doubling year on year. Compared to what companies are contending with today, it's very cute growth. I feel like some companies are seeing that month over month. Um, of course, Shopify has been compounding for a very long time now, but I spent a decade doing that and the majority of that was just: make sure the site is up today and make sure it's up a year from now. 
And a lot of that was really just the, um, you know, uh, the Kardashians would drive very, very large amounts of data to, uh, to Shopify as they were rotating through all the merch and building out their businesses. And we just needed to make sure we could handle that. Right. And sometimes these were events of a million requests per second. And so, you know, we had our own data centers back in the day and we were moving to the cloud and there was so much sharding work and all of that that we were doing. So I spent a decade just scaling databases, 'cause that's fundamentally the most difficult thing to scale about these sites.
The database that was the most difficult for me to scale during that time, and that was the most aggravating to be on call for, was Elasticsearch. It was very, very difficult to deal with. And I saw a lot of projects that were just being held back in their ambition by using it.
swyx: And I mean, self-hosted. Self-hosted. 'Cause
Simon Hørup Eskildsen: it's, yeah, and it commercial, this is like 2015, right? So it's a very particular vintage. Right. It's probably better at a lot of these things now. Um, it was difficult to contend with, and I just think about it: it's an inverted index. It should be good at these kinds of queries and do all of this. And we often couldn't get it to do exactly what we needed it to do, or basically get Lucene to do, like, expose Lucene raw to what we needed to do. Um, so that was just something that we did on the side and just panic-scaled when we needed to, but not a particular focus of mine. So I left, and when I left, I, um, wasn't sure exactly what I wanted to do. I mean, I'd spent like a decade inside of the same company. I'd grown up there. I started working there when I was 18.
swyx: You only do Rails?
Simon Hørup Eskildsen: Yeah. I mean, yeah. Rails. And he's a Rails guy. Uh, love Rails. So good. 
Um,
Alessio: We all wish we could still work in Rails.
swyx: I know. I know, but I tried learning Ruby. It's just too much, like too many options to do the same thing. That's my, I know there's a way to do it.
Simon Hørup Eskildsen: I love it. I don't know that I would use it now, given Claude Code and Cursor and everything, but, um, still, if I'm just sitting down and writing a little code, that's how I think. But anyway, I left, and I talked to a couple companies and I was like, I need to see a little bit more of the world here to know what I'm gonna focus on next. Um, and so what I decided is, I called it angel engineering, where I just hopped around in my friends' companies in three-month increments and just helped them out with something. Right. And just vested a bit of equity and solved some interesting infrastructure problem. So I worked with a bunch of companies at the time. Um, Readwise was one of them. Replicate was one of them. Um, Causal, I dunno if you've tried this, it's a spreadsheet engine, yeah, where you can do distributions. They sold recently. Yeah. Um, we used that in FP&A at, um, at turbopuffer. Um, so a bunch of companies like this, and it was super fun. And so when the ChatGPT moment happened, I was with Readwise for a stint. We were preparing for the Reader launch, right? Which is where you queue articles and read them later. And I was just getting their Postgres up to snuff, which basically boils down to tuning autovacuum. So I was doing that and then this happened, and we were like, oh, maybe we should build a little recommendation engine and some features to try to hook in the LLMs. They were not that good yet, but it was clear there was something there. And so I built a small recommendation engine, just, okay, let's take the articles that you've recently read, right? 
Like, embed all the articles and then do recommendations. It was good enough that when I ran it on one of the co-founders of Readwise, I found that I got articles about having a child. I'm like, oh my God, I didn't know that they were having a child. I wasn't sure what to do with that information, but the recommendation engine was good enough that it was suggesting articles, um, about that. And so there was recommendations and, uh, it actually worked really well. But this was a company that was spending maybe five grand a month in total on all their infrastructure. And when I did the napkin math on running the embeddings of all the articles, putting them into a vector index, putting it in prod, it's gonna be like 30 grand a month. That just wasn't tenable. Right? Like, Readwise is a proudly bootstrapped company, and it's paying 30 grand for infrastructure for one feature versus five. It just wasn't tenable. So it went sort of in the bucket of: this is useful, it's pretty good, but let's return to it when the costs come down.
swyx: Did you say it grows by feature? So from five to 30 is by the number of, like, what's the scaling factor? It scales by the number of articles that you embed.
Simon Hørup Eskildsen: It does, but what I meant by that is, like, five grand for all of the other stuff, like the Heroku dynos, Postgres, all the other, and this then storage is 30. Yeah. And then like 30 grand for one feature. Right. Which is, what other articles are related to this one. Um, so it was just too much, right, to power everything. Their budget would've been maybe a few thousand dollars, which still would've been a lot. And so we put it in a bucket of, okay, we're gonna do that later. We will wait for the cost to come down. And that haunted me. I couldn't stop thinking about it. I was like, okay, there's clearly some latent demand here. If the cost had been a 10th, we would've shipped it.
This was really the only data point that I had. Right. I didn't go out and talk to anyone else. So I started reading. Right. I couldn't help myself. Like, I didn't know what a vector index is. I generally barely knew how to generate the vectors. This is early 2023. There was a lot of hype about vector databases. They were raising a lot of money, and I really didn't know anything about it. It's like, you know, trying these little models, fine-tuning them. I was just trying to get sort of a lay of the land. So I just sat down. I have this GitHub repository called Napkin Math. And on Napkin Math, there's just, um, rows of, like, this is how much bandwidth, you know, you can do 25 gigabytes per second on average to DRAM. You can do, you know, five gigabytes per second of writes to an SSD, blah blah. All of these numbers, right? And S3, how much bandwidth can you drive per connection? I was just sitting down, I was like, why hasn't anyone built a database where you just put everything on object storage and then you puff it into NVMe when you use the data and you puff it into DRAM if you're querying it a lot? It's just like, this seems fairly obvious, and the only real downside to that is that if you go all in on object storage, every write will take a couple hundred milliseconds of latency, but from there it's really all upside, right? You do the first go, it takes half a second. And it sort of occurred to me, well, the architecture is really good for that. It's really good for object storage, it's really good for NVMe SSDs. Well, you just couldn't have done that 10 years ago. Back to what we were talking about before, you really have to build a database where you have as few round trips as possible, right? This is how CPUs work today. It's how NVMe SSDs work. 
It's how, um, S3 works: you want to have a very large amount of outstanding requests, right? Like, basically go to S3, do like a thousand requests to ask for data in one round trip. Wait for that. Get that, make a new decision. Do it again, and try to do that maybe a maximum of three times. But no databases were designed that way. With NVMe SSDs, you can drive, you know, within a very low multiple of DRAM bandwidth if you use it that way. And same with S3, right? You can fully max out the network card, which generally is not maxed out. You get very, very good bandwidth. But no one had built a database like that. So I was like, okay, well, can't you just, you know, take all the vectors, right? And plot them in the proverbial coordinate system. Get the clusters, put a file on S3 called clusters.json, and then put another file for every cluster, you know, cluster1.json, cluster2.json, you know, and it's two round trips, right? So you get the clusters, you find the closest clusters, and then you download the cluster files, like the closest n. And you could do this in two round trips.
swyx: You do nearest neighbors locally.
Simon Hørup Eskildsen: Yes. Yes. And you would build this file, right? It's just ultra simplistic, but it's not a far shot from what the first version of turbopuffer was. Why hasn't anyone done that?
Alessio: In that moment, from a workload perspective, you're thinking this is gonna be like a read-heavy thing because they're doing recommend. Like, is the fact that writes are so expensive now? 
Oh, with AI you're actually not writing that much.
Simon Hørup Eskildsen: At that point I hadn't really thought too much about, well, no, actually it was always clear to me that there was gonna be a lot of writes, because at Shopify, the search clusters were doing, you know, I don't know, tens or hundreds of QPS, right? 'Cause you just have to have a human sit and type it in. But we did, you know, I don't know how many updates there were per second. I'm sure it was in the millions, right, into the cluster. So I always knew there was like a 10 to 100 ratio on the read-write. Um, even in the Readwise use case, there'd probably be a lot fewer reads than writes, right? There's just a lot of churn on the amount of stuff that was going through versus the amount of queries. Um, I wasn't thinking too much about that. I was mostly just thinking about what's the fundamentally cheapest way to build a database in the cloud today using the primitives that you have available. And this is it, right? Now you have one machine and, you know, let's say you have a terabyte of data in S3, you pay the $200 a month for that, and then maybe five to 10% of that data needs to be on NVMe SSDs, and less than that in DRAM. Well, you're paying very, very little to inflate the data.
swyx: By the way, when you say no one else has done that, uh, would you consider Neon, uh, to be on a similar path in terms of being sort of S3-first and, uh, separating the compute and storage?
Simon Hørup Eskildsen: Yeah, I think what I meant with that is, uh, just build a completely new database. I don't know if we were the first. I mean, I just looked at the napkin math and was like, this seems really obvious. So I'm sure like a hundred people came up with it at the same time. Like the light bulb and every invention ever. Right. It was just in the air. I think Neon was first to it. 
And they retrofitted it onto Postgres, right? And then they built this whole architecture where you have it in memory and then you sort of, you know, mmap back to S3. And I think that was very novel at the time to do it for OLTP, but I hadn't seen a database that was truly all in, right? Not retrofitting it. A database built purely for this: no consensus layer, even using compare-and-swap on object storage to do consensus. I hadn't seen anyone go that all in. And I mean, I'm sure there was someone that did that before us. I don't know. I was just looking at the napkin math.
swyx: And when you say consensus layer, uh, are you strongly relying on S3 strong consistency? You are. Okay. So
Simon Hørup Eskildsen: that is your consensus layer. It is the consistency layer. And I think also, this is something that most people don't realize, but S3 only became consistent in December of 2020.
swyx: I remember this coming out during COVID and people were like, oh, it was just like a free upgrade.
Simon Hørup Eskildsen: Yeah.
swyx: They just announced it. We solved consistency, guys, and like, okay, cool.
Simon Hørup Eskildsen: And I'm sure that they probably had it in prod for a while and they're just like, it's done, right? And people were like, okay, cool. But that's a big moment, right? Like, NVMe SSDs were also not in the cloud until around 2017, right? So you just sort of had, like, 2017, NVMe SSDs, and people were like, okay, cool. There's like one SKU that does this, whatever, right? Takes a few years. And then the second thing is, S3 becomes consistent in 2020. So now it means you don't have to have this big FoundationDB or ZooKeeper or whatever sitting there contending with the keys, which is what Snowflake and others have to do.
swyx: So much, gone.
Simon Hørup Eskildsen: Exactly. Just gone. Right? 
And so just push to the, you know, however many hundreds of people they have working on S3. Solved. And then compare-and-swap was not in S3 at this point in time.
swyx: By the way, uh, I don't know what that is, so maybe you wanna explain. Yes. Yeah.
Simon Hørup Eskildsen: Yes. So, um, what compare-and-swap is, is basically, you can imagine that if you have a database, it might be really nice to have a file called metadata.json. And metadata.json could say things like, hey, these keys are here and this file means that, and there's lots of metadata that you have to operate in the database, right? But that's the simplest way to do it. So now you might have a lot of servers that wanna change the metadata. They might have written a file and want the metadata to contain that file. But you have a hundred nodes that are trying to contend with this metadata.json. Well, what compare-and-swap allows you to do is basically, you download the file, you make the modifications, and then you write it only if it hasn't changed while you did the modification, and if not, you retry. Right? You just have this retry loop. Now, you can imagine if you have a hundred nodes doing that, it's gonna be really slow, but it will converge over time. That primitive was not available in S3. It wasn't available in S3 until late 2024, but it was available in GCP. The real story of this is certainly not that I sat down and big-brained it, like, okay, we're gonna start on GCS, S3 is gonna get it later. It was really not that. We got really lucky. We started on GCP because, um, Shopify ran on GCP. And so that was the platform I was most familiar with. Right. Um, and I knew the Canadian team there 'cause I'd worked with them at Shopify, and so it was natural for us to start there. 
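The download, modify, write-if-unchanged, retry loop described above can be sketched as follows. This is an illustration under stated assumptions, not turbopuffer's code: `FakeObjectStore`, `put_if_version`, and `add_file_to_metadata` are hypothetical names, and the integer version plays the role that a conditional-request precondition (for example a generation or ETag match) plays in a real object store.

```python
import json


class PreconditionFailed(Exception):
    """Raised when the object changed since it was read."""


class FakeObjectStore:
    """Hypothetical in-memory object store with a conditional write."""

    def __init__(self):
        self._objects = {}  # key -> (version, bytes)

    def get(self, key):
        return self._objects.get(key, (0, b"{}"))

    def put_if_version(self, key, data, expected_version):
        current_version, _ = self._objects.get(key, (0, b"{}"))
        if current_version != expected_version:
            raise PreconditionFailed(key)
        self._objects[key] = (current_version + 1, data)


def add_file_to_metadata(store, filename, max_attempts=10):
    """Compare-and-swap loop: returns the number of attempts it took."""
    for attempt in range(1, max_attempts + 1):
        version, raw = store.get("metadata.json")       # download
        metadata = json.loads(raw)
        files = metadata.setdefault("files", [])
        if filename not in files:
            files.append(filename)                      # modify
        try:
            # Write only if nobody else wrote in the meantime.
            store.put_if_version(
                "metadata.json", json.dumps(metadata).encode(), version)
            return attempt
        except PreconditionFailed:
            continue                                    # lost the race: retry
    raise RuntimeError("too much contention on metadata.json")


class RacyStore(FakeObjectStore):
    """Simulates one competing node committing right after our first read."""

    def __init__(self):
        super().__init__()
        self.raced = False

    def get(self, key):
        result = super().get(key)
        if not self.raced:
            self.raced = True
            version, raw = result
            other = json.loads(raw)
            other.setdefault("files", []).append("segment-from-other-node")
            super().put_if_version(key, json.dumps(other).encode(), version)
        return result


racy = RacyStore()
attempts = add_file_to_metadata(racy, "segment-002")
final_files = json.loads(racy.get("metadata.json")[1])["files"]
```

In this run the first write is rejected because the simulated competitor committed first, and the second attempt succeeds with both entries preserved. With many contending writers the loop gets slower, as Simon notes, but no update is silently lost.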
And so when we started building the database, we really thought we had to build a consensus layer, like have a ZooKeeper or something to do this. But then we discovered the compare-and-swap. It's like, oh, we can kick the can. We'll just do metadata.json and, it's fine. It's probably fine. Um, and we just kept kicking the can until we had very, very strong conviction in the idea. Um, and then we kind of just hinged the company on the fact that S3 probably was gonna get this. It started getting really painful in like mid 2024, 'cause we were closing deals with, um, Notion actually, that was running in AWS, and we're like, trust us, you really want us to run this in GCP. And they're like, no, I don't know about that. Like, we're running everything in AWS. And the latencies across the clouds were so big, and we had so much conviction, that we bought, you know, dark fiber between the AWS region in Oregon, like in the internet exchange, and GCP. And GCP is like, we've never seen a startup do, like, what's going on here? And we're just like, no, we don't wanna do this. We were tuning TCP windows, everything, to get the latency down, 'cause we had such high conviction in not doing a metadata layer on S3. So those were the three conditions, right? Compare-and-swap to do metadata, which wasn't in S3 until late 2024. S3 being consistent, which didn't happen until December 2020. And then NVMe SSDs, which didn't land in the cloud until 2017.
swyx: I mean, in some ways, a very big cloud success story that you were able to, uh, put this all together, but also doing things like, uh, buying dark fiber. That actually is something I've never heard.
Simon Hørup Eskildsen: I mean, it's very common when you're a big company, right? You're like connecting your own data center or whatever. 
But it was uniquely just a pain with Notion because, um, like, if you're buying in Ashburn, Virginia, right? Like US East, the GCP and AWS data centers are within a millisecond of each other on the public exchanges. But in Oregon, uniquely, the GCP data center sits like a couple hundred kilometers east of Portland, and the AWS region sits in Portland, but the network exchange they go through is through Seattle. So it's a full, like, 14 milliseconds or something like that. And so anyway, yeah. So we were like, okay, we have to go through an exchange in Portland. Yeah.
swyx: And you'd rather do this than, like, run your ZooKeeper and like
Simon Hørup Eskildsen: Yes. Way rather. It doesn't have state. I don't want state in two systems. Um, and I think all that is just informed by Justine, my co-founder, and I having been on call for so long. And the worst outages are the ones where you have state in multiple places that's not syncing up. So it really came from a very pure source of pain, of just imagining what we would be okay being woken up at 3:00 AM about, and having something in ZooKeeper was not one of them.
swyx: You're talking to, like, a Notion or something. Do they care or do they just
Simon Hørup Eskildsen: They just care about latency.
swyx: Latency, cost. That's it.
Simon Hørup Eskildsen: They just cared about latency. Right. And we just absorbed the cost. We're just like, we have high conviction in this. At some point we can move them to AWS. Right. And so we'll buy the fiber, it doesn't matter. Right. Um, and it's like $5,000. Usually when you buy fiber, you buy multiple lines. And we're like, we can only afford one, but we will just test it, that when it goes over the public internet, it's super smooth. 
And so we did a lot of... anyway, yeah, that's cool.

Alessio: You can imagine talking to the GCP rep and it's like, no, we're gonna buy this, because we know we're gonna churn from you guys and go to AWS in like six months. But in the meantime we'll do this.

Simon Hørup Eskildsen: I mean, this workload still runs on GCP, for what it's worth, right? 'Cause it was just so reliable. So it was never about moving off GCP. Honestly, it was just about giving Notion the latency that they deserved. And we didn't want them to have to care about any of this. They were also like, oh, egress is gonna be bad. And we're like, okay, screw it, we're just gonna VPC peer with you in AWS and we'll eat the cost. Whatever needs to be done.

Alessio: And what were the actual workloads? Because I think when you think about AI, 14 milliseconds doesn't really matter in the scheme of a model generation.

Simon Hørup Eskildsen: Yeah. We were told the latency that we had to beat. So we were just looking at the traces, and then sort of hand-drawing, looking at the trace and thinking about what the other extensions of the trace are. And there's a lot more to it, because if you have 14 versus seven milliseconds, you can fit in another round trip. So we had to tune TCP to try to send as much data as possible in every round trip, and prewarm all the connections. There are a lot of things that compound from having these kinds of round trips, but in the grand scheme it was just, well, we have to beat the latency of whatever we're up against.

swyx: I mean, Notion is a database company. They could have done this themselves. They do lots of database engineering themselves. How do you even get in the door?
Like Yeah, just like talk through that kind of.Simon Hørup Eskildsen: Last time I was in San Francisco, I was talking to one of the engineers actually, who, who was one of our champions, um, at, AT Notion.And they were, they were just trying to make sure that the, you know, per user cost matched the economics that they needed. You know, Uhhuh like, it's like the way I think about, it's like I have to earn a return on whatever the clouds charge me and then my customers have to earn a return on that. And it's like very simple, right?And so there has to be gross margin all the way up and that's how you build the product. And so then our customers have to make the right set of trade off the turbo Puffer makes, and if they're happy with that, that's great.swyx: Do you feel like you're competing with build internally versus buy or buy versus buy?Simon Hørup Eskildsen: Yeah, so, sorry, this was all to build up to your question. So one of the notion engineers told me that they'd sat and probably on a napkin, like drawn out like, why hasn't anyone built this? And then they saw terrible. It was like, well, it literally that. So, and I think AI has also changed the buy versus build equation in terms of, it's not really about can we build it, it's about do we have time to build it?I think they like, I think they felt like, okay, if this is a team that can do that and they, they feel enough like an extension of our team, well then we can go a lot faster, which would be very, very good for them. And I mean, they put us through the, through the test, right? Like we had some very, very long nights to to, to do that POC.And they were really our biggest, our second big customer off the cursor, which also was a lot of late nights. Right.swyx: Yeah. That, I mean, should we go into that story? The, the, the sort of Chris's story, like a lot, um, they credit you a lot for. Working very closely with them. 
So I just wanna hear, I've heard this, uh, story from Sole's point of view, but like, I'm curious what, what it looks like from your side.Simon Hørup Eskildsen: I actually haven't heard it from Sole's point of view, so maybe you can now cross reference it. The way that I remember it was that, um, the day after we launched, which was just, you know, I'd worked the whole summer on, on the first version. Justine wasn't part of it yet. ‘cause I just, I didn't tell anyone that summer that I was working on this.I was just locked in on building it because it's very easy otherwise to confuse talking about something to actually doing it. And so I was just like, I'm not gonna do that. I'm just gonna do the thing. I launched it and at this point turbo puffer is like a rust binary running on a single eight core machine in a T Marks instance.And me deploying it was like looking at the request log and then like command seeing it or like control seeing it to just like, okay, there's no request. Let's upgrade the binary. Like it was like literally the, the, the, the scrappiest thing. You could imagine it was on purpose because just like at Shopify, we did that all the time.Like, we like move, like we ran things in tux all the time to begin with. Before something had like, at least the inkling of PMF, it was like, okay, is anyone gonna hear about this? Um, and one of the cursor co-founders Arvid reached out and he just, you know, the, the cursor team are like all I-O-I-I-M-O like, um, contenders, right?So they just speak in bullet points and, and facts. It was like this amazing email exchange just of, this is how many QPS we have, this is what we're paying, this is where we're going, blah, blah, blah. And so we're just conversing in bullet points. And I tried to get a call with them a few times, but they were, so, they were like really writing the PMF bowl here, just like late 2023.And one time Swally emails me at like five. 
What was it like 4:00 AM Pacific time saying like, Hey, are you open for a call now? And I'm on the East coast and I, it was like 7:00 AM I was like, yeah, great, sure, whatever. Um, and we just started talking and something. Then I didn't know anything about sales.It was something that just comp compelled me. I have to go see this team. Like, there's something here. So I, I went to San Francisco and I went to their office and the way that I remember it is that Postgres was down when I showed up at the office. Did SW tell you this? No. Okay. So Postgres was down and so it's like they were distracting with that.And I was trying my best to see if I could, if I could help in any way. Like I knew a little bit about databases back to tuning, auto vacuum. It was like, I think you have to tune out a vacuum. Um, and so we, we talked about that and then, um, that evening just talked about like what would it look like, what would it look like to work with us?And I just said. Look like we're all in, like we will just do what we'll do whatever, whatever you tell us, right? They migrated everything over the next like week or two, and we reduced their cost by 95%, which I think like kind of fixed their per user economics. Um, and it solved a lot of other things. And we were just, Justine, this is also when I asked Justine to come on as my co-founder, she was the best engineer, um, that I ever worked with at Shopify.She lived two blocks away and we were just, okay, we're just gonna get this done. Um, and we did, and so we helped them migrate and we just worked like hell over the next like month or two to make sure that we were never an issue. And that was, that was the cursor story. Yeah.swyx: And, and is code a different workload than normal text?I, I don't know. Is is it just text? Is it the same thing?Simon Hørup Eskildsen: Yeah, so cursor's workload is basically, they, um, they will embed the entire code base, right? 
So they will chunk it up in whatever way they do. They have their own embedding model, which they've been public about. And they find that on their evals... there's one of their evals where it's like a 25% improvement on a very particular workload. They have a bunch of blog posts about it. I think it works best on larger codebases, but they've trained their own embedding model to do this. And so you'll see it if you use the Cursor agent: it will do searches. And they've also been public about how, I think, they post-trained their model to be very good at semantic search as well. And that's how they use it. And so it's very good at, like, can you find me all the code that's similar to this, or code that does this? And in these queries, they also use grep to supplement it.

swyx: Yeah.

Simon Hørup Eskildsen: Of course.

swyx: It's been a big topic of discussion, like, is RAG dead because of grep, you know.

Simon Hørup Eskildsen: And I mean, we see lots of demand from the coding companies to...

swyx: Search in every part. Yes.

Simon Hørup Eskildsen: We see demand. And so, I mean, I like case studies. I don't like just doing thought pieces on "this is where it's going" and trying to be all macroeconomic about AI. That has turned out to be a giant waste of time, because no one can really predict any of this. So I just collect case studies. And Cursor has done a great job talking about what they're doing, and I hope some of the other coding labs that use turbopuffer will do the same. But it does seem to make a difference for particular queries. I mean, we can also do text, we can also do regex. But I should also say that Cursor's security posture with turbopuffer is exceptional, right? They have their own embedding model, which makes it very difficult to reverse engineer. They obfuscate the file paths.
It's very difficult to learn anything about a codebase by looking at it. And the other thing they do is that, for their customers, they encrypt it with their encryption keys in turbopuffer's bucket. So it's really, really well designed.

swyx: And so this is extra stuff they did to work with you, because you are not part of Cursor. Exactly. And this is just best practice when working with any database, not just you guys. Okay. Yeah, that makes sense. I think for me the learning is kind of that all workloads are hybrid. You want the semantic, you want the text, you want the regex, you want SQL, I dunno. But it's silly to be all in on one particular query pattern.

Simon Hørup Eskildsen: I really like the way that Sualeh at Cursor talks about it, which is, and I'm gonna butcher it here... you know, I'm a database scalability person. I don't know anything about training models other than what the internet tells me. The way he describes it is that this is just cached compute, right? You have a point in time where you're looking at some particular context and focused on some chunk, and you say, this is the layer of the neural net at this point in time. It seems fundamentally really useful to do cached compute like that. How the value of that will change over time, I'm not sure, but there seems to be a lot of value in that.

Alessio: Maybe talk a bit about the evolution of the workload, because even with search, maybe two years ago it was one search at the start of an LLM query to build the context. Now you have agentic search, or however you wanna call it, where the model is both writing and changing the code, and it's searching it again later. Yeah.
What are maybe some of the new types of workloads, or changes you've had to make to your architecture for it?

Simon Hørup Eskildsen: I think you're right. When I think of RAG, I think of: hey, there's an 8,000-token context window and you'd better make it count. And search was a way to do that. Now everything is moving towards just letting the agent do its thing, right? And so, back to the thing before: the LLM is very good at reasoning with the data, and so we're just the tool call. And that's increasingly what we see our customers doing. What we're seeing more demand for from our customers now is a lot of concurrency. Notion does a ridiculous amount of queries in every round trip, just because they can. And now, when I use the Cursor agent, I also see them doing more concurrency than I've ever seen before. So it's a bit similar to how we designed the database to drive as much concurrency in every round trip as possible. That's also what the agents are doing. So that's new. It means just an enormous amount of queries all at once to the dataset while it's warm, in as few turns as possible.

swyx: Can I clarify one thing on that?

Simon Hørup Eskildsen: Yes.

swyx: Are they batching multiple users, or is one user driving multiple?

Simon Hørup Eskildsen: One user driving multiple. One agent driving.

swyx: It's parallel searching a bunch of things.

Simon Hørup Eskildsen: Exactly.

swyx: Yeah, exactly. So Cognition also did this for the fast context thing, like eight in parallel at once.

Simon Hørup Eskildsen: Yes.

swyx: And an interesting problem is, well, how do you make sure you have enough diversity so you're not making the same request eight times?

Simon Hørup Eskildsen: And I think that's probably also where the hybrid comes in. That's another way to diversify. It's a completely different way to do the search. That's a big change, right?
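[Editor's sketch] The pattern discussed above (one agent issuing many searches in a single turn, with different search modes as a source of diversity) might look roughly like this. It is a toy under stated assumptions: `CORPUS`, `search`, and `fan_out` are hypothetical names, the "keyword" mode is a crude stand-in for ranked or semantic search, and the `asyncio.sleep` call only simulates one network round trip. None of it reflects any vendor's actual API.

```python
import asyncio
import re
from typing import Dict, List, Tuple

# Tiny in-memory "codebase" so the example is self-contained.
CORPUS = {
    "auth.py": "def login(user, password): check_password(user, password)",
    "db.py": "def connect(url): return Pool(url)",
    "session.py": "def refresh_token(session): return sign(session.user)",
}


async def search(query: str, mode: str) -> List[str]:
    """Hypothetical search tool call; the sleep stands in for one round trip."""
    await asyncio.sleep(0.01)
    if mode == "regex":
        return [doc for doc, text in CORPUS.items() if re.search(query, text)]
    # "keyword" mode: match any whitespace-separated term in the document text.
    terms = query.lower().split()
    return [doc for doc, text in CORPUS.items() if any(t in text.lower() for t in terms)]


async def fan_out(requests: List[Tuple[str, str]]) -> List[str]:
    """Run all distinct (query, mode) requests concurrently and merge the hits."""
    unique = list(dict.fromkeys(requests))  # drop identical requests, keep order
    batches = await asyncio.gather(*(search(q, m) for q, m in unique))
    merged: Dict[str, None] = {}
    for batch in batches:
        for doc in batch:
            merged.setdefault(doc, None)  # de-duplicate hits, keep first-seen order
    return list(merged)


if __name__ == "__main__":
    turn = [
        ("login password", "keyword"),
        (r"def \w+_token", "regex"),
        ("login password", "keyword"),  # duplicate request, issued only once
    ]
    print(asyncio.run(fan_out(turn)))
```

Because the requests are awaited together with `asyncio.gather`, the simulated round trips overlap instead of adding up, which is the point of packing many queries into one turn; mixing modes is one simple way to avoid sending the same request several times.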
So before it was really just one call and then, you know, the LLM took however many seconds to return, but now we just see an enormous amount of queries. So we've reduced query pricing. This is probably the first time I'm actually saying that, but the query pricing is being reduced, like 5x. And we'll probably try to reduce it even more to accommodate some of these workloads that just do very large amounts of queries. That's one thing that's changed. I think the write ratio is still very high, right? There's still an enormous amount of writes per read, but we're probably starting to see that change if people really lean into this pattern.

Alessio: Can we talk a little bit about the pricing? I'm curious, because traditionally a database would charge on storage, but now you have token generation that is so expensive, where the actual value of a good search query is much higher because it's saving inference time down the line. How do you structure that? And what are people receptive to on the other side, too?

Simon Hørup Eskildsen: Yeah. The turbopuffer pricing in the beginning was just very simple. The pricing for search engines before turbopuffer was very serverful, right? It was like, here's the VM, here's the per-hour cost. Great. And I just sat down with a piece of paper and said, if turbopuffer was really good, this is probably what it would cost, with a little bit of margin. And that was the first pricing of turbopuffer. I just sat down and I was like, okay, this is probably the storage amp, whatever, on a piece of paper. It was vibe pricing. It was very vibe priced, and I got it wrong. Well, I didn't get it wrong, but turbopuffer wasn't at the first-principles pricing, right? So when Cursor came on turbopuffer, it was like...
Like, I didn't know any VCs. I didn't know, like I was just like, I don't know, I didn't know anything about raising money or anything like that.I just saw that my GCP bill was, was high, was a lot higher than the cursor bill. So Justine and I was just like, well, we have to optimize it. Um, and I mean, to the chagrin now of, of it, of, of the VCs, it now means that we're profitable because we've had so much pricing pressure in the beginning. Because it was running on my credit card and Justine and I had spent like, like tens of thousands of dollars on like compute bills and like spinning off the company and like very like, like bad Canadian lawyers and like things like to like get all of this done because we just like, we didn't know.Right. If you're like steeped in San Francisco, you're just like, you just know. Okay. Like you go out, raise a pre-seed round. I, I never heard a word pre-seed at this point in time.swyx: When you had Cursor, you had Notion you, you had no funding.Simon Hørup Eskildsen: Um, with Cursor we had no funding. Yeah. Um, by the time we had Notion Locke was, Locke was here.Yeah. So it was really just, we vibe priced it 100% from first Principles, but it wasn't, it, it was not performing at first principles, so we just did everything we could to optimize it in the beginning for that, so that at least we could have like a 5% margin or something. So I wasn't freaking out because Cursor's bill was also going like this as they were growing.And so my liability and my credit limit was like actively like calling my bank. It was like, I need a bigger credit. Like it was, yeah. Anyway, that was the beginning. Yeah. But the pricing was, yeah, like storage rights and query. Right. And the, the pricing we have today is basically just that pricing with duct tape and spit to try to approach like, you know, like a, as a margin on the physical underlying hardware.And we're doing this year, you're gonna see more and more pricing changes from us. 
Yeah.

swyx: And how much does stuff like VPC peering matter? Because you're working in AWS land, where egress is charged and all that, you know.

Simon Hørup Eskildsen: We probably don't... we have an enterprise plan that just has a base fee, because we haven't had time to figure out SKU pricing for all of this. But I mean, yeah, you can run turbopuffer in SaaS, right? That's what Cursor does. You can run it in a single-tenant cluster, so it's just you. That's what Notion does. And then you can run it in BYOC, where everything is inside the customer's VPC. That's what, for example, Anthropic does.

swyx: What I'm hearing is that this is probably the best CRO job for somebody who can come in and...

Simon Hørup Eskildsen: I mean...

swyx: ...help you with this.

Simon Hørup Eskildsen: turbopuffer hired... I don't know what number this was, but we had a full-time CFO as like the 12th hire or something at turbopuffer. I hear of a lot of companies, and I don't know how they do it. They have a hundred employees and not a CFO. Having a CFO is like a running...

swyx: ...business, man. Like, you know.

Simon Hørup Eskildsen: It's so good. Yeah, like Money Mike, he just handles the money and a lot of the business stuff, and so he came in and just helped with a lot of the operational side of the business. So, like, COO/CFO, somewhere in between.

swyx: Just a quick mention of Lachy, just 'cause I'm curious. I've met Lachy, and he's obviously a very good investor, and now at Physical Intelligence. I call him a generalist super angel, right? He invests in everything. And I always wonder, you know, is there something appealing about focusing on developer tooling, focusing on databases, going like, "I've invested for 10 years in databases," versus being like a Lachy, where he can maybe connect you to all the customers that you need?

Simon Hørup Eskildsen: This is an excellent question.
No, no one's asked me this. Um, why lockey? Because. There was a couple of people that we were talking to at the time and when we were raising, we were almost a little, we were like a bit distressed because one of our, one of our peers had just launched something that was very similar to Turbo Puffer.And someone just gave me the advice at the time of just choose the person where you just feel like you can just pick up the phone and not prepare anything. And just be completely honest, and I don't think I've said this publicly before, but I just called Lockey and was like local Lockie. Like if this doesn't have PMF by the end of the year, like we'll just like return all the money to you.But it's just like, I don't really, we, Justine and I don't wanna work on this unless it's really working. So we want to give it the best shot this year and like we're really gonna go for it. We're gonna hire a bunch of people and we're just gonna be honest with everyone. Like when I don't know how to play a game, I just play with open cards and.Lockey was the only person that didn't, that didn't freak out. He was like, I've never heard anyone say that before. As I said, I didn't even know what a seed or pre-seed round was like before, probably even at this time. So I was just like very honest with him. And I asked him like, Lockie, have you ever have, have you ever invested in database company?He was just like, no. And at the time I was like, am I dumb? Like, but I think there was something that just like really drew me to Lockie. He is so authentic, so honest, like, and there was something just like, I just felt like I could just play like, just say everything openly. And that was, that was, I think that that was like a perfect match at the time, and, and, and honestly still is.He was just like, okay, that's great. This is like the most honest, ridiculous thing I've ever heard anyone say to me. But like that, like that, whyswyx: is this ridiculous? 
Say competitor launch, this may not work out. It wasSimon Hørup Eskildsen: more just like. If this doesn't work out, I'm gonna close up shop by the end of the mo the year, right?Like it was, I don't know, maybe it's common. I, I don't know. He told me it was uncommon. I don't know. Um, that's why we chose him and he'd been phenomenal. The other people were talking at the, at the time were database experts. Like they, you know, knew a lot about databases and Locke didn't, this turned out to be a phenomenal asset.Right. I like Justine and I know a lot about databases. The people that we hire know a lot about databases. What we needed was just someone who didn't know a lot about databases, didn't pretend to know a lot about databases, and just wanted to help us with candidates and customers. And he did. Yeah. And I have a list, right, of the investors that I have a relationship with, and Lockey has just performed excellent in the number of sub bullets of what we can attribute back to him.Just absolutely incredible. And when people talk about like no ego and just the best thing for the founder, I like, I don't think that anyone, like even my lawyer is like, yeah, Lockey is like the most friendly person you will find.swyx: Okay. This is my most glow recommendation I've ever heard.Alessio: He deserves it.He's very special.swyx: Yeah. Yeah. Yeah. Okay. Amazing.Alessio: Since you mentioned candidates, maybe we can talk about team building, you know, like, especially in sf, it feels like it's just easier to start a company than to join a company. Uh, I'm curious your experience, especially not being n SF full-time and doing something that is maybe, you know, a very low level of detail and technical detail.Simon Hørup Eskildsen: Yeah. So joining versus starting, I never thought that I would be a founder. I would start with it, like Turbo Puffer started as a blog post, and then it became a project and then sort of almost accidentally became a company. 
And now it feels like it's, it's like becoming a bigger company. That was never the intention.The intentions were very pure. It's just like, why hasn't anyone done this? And it's like, I wanna be the, like, I wanna be the first person to do it. I think some founders have this, like, I could never work for anyone else. I, I really don't feel that way. Like, it's just like, I wanna see this happen. And I wanna see it happen with some people that I really enjoy working with and I wanna have fun doing it and this, this, this has all felt very natural on that, on that sense.So it was never a like join versus versus versus found. It was just dis found me at the right moment.Alessio: Well I think there's an argument for, you should have joined Cursor, right? So I'm curious like how you evaluate it. Okay, I should actually go raise money and make this a company versus like, this is like a company that is like growing like crazy.It's like an interesting technical problem. I should just build it within Cursor and then they don't have to encrypt all this stuff. They don't have to obfuscate things. Like was that on your mind at all orSimon Hørup Eskildsen: before taking the, the small check from Lockie, I did have like a hard like look at myself in the mirror of like, okay, do I really want to do this?And because if I take the money, I really have to do it right. And so the way I almost think about it's like you kind of need to ha like you kind of need to be like fucked up enough to want to go all the way. And that was the conversation where I was like, okay, this is gonna be part of my life's journey to build this company and do it in the best way that I possibly can't.Because if I ask people to join me, ask people to get on the cap table, then I have an ultimate responsibility to give it everything. And I don't, I think some people, it doesn't occur to me that everyone takes it that seriously. And maybe I take it too seriously, I don't know. 
But that was like a very intentional moment.And so then it was very clear like, okay, I'm gonna do this and I'm gonna give it everything.Alessio: A lot of people don't take it this seriously. But,swyx: uh, let's talk about, you have this concept of the P 99 engineer. Uh, people are 10 x saying, everyone's saying, you know, uh, maybe engineers are out of a job. I don't know.But you definitely see a P 99 engineer, and I just want you to talk about it.Simon Hørup Eskildsen: Yeah, so the P 99 engineer was just a term that we started using internally to talk about candidates and talk about how we wanted to build the company. And you know, like everyone else is, like we want a talent dense company.And I think that's almost become trite at this point. What I credit the cursor founders a lot with is that they just arrived there from first principles of like, we just need a talent dense, um, talent dense team. And I think I've seen some teams that weren't talent dense and like seemed a counterfactual run, which if you've run in been in a large company, you will just see that like it's just logically will happen at a large company.Um, and so that was super important to me and Justine and it's very difficult to maintain. And so we just needed, we needed wording for it. And so I have a document called Traits of the P 99 Engineer, and it's a bullet point list. And I look at that list after every single interview that I do, and in every single recap that we do and every recap we end with.End with, um, some version of I'm gonna reject this candidate completely regardless of what the discourse was, because I wanna see people fight for this person because the default should not be, we're gonna hire this person. The default should be, we're definitely not hiring this person. 
And you know, if everyone was like, ah, maybe throw a punch, then this is not the right.swyx: Do, do you operate, like if there's one cha there must have at least one champion who's like, yes, I will put my career on, on, on the line for this. You know,Simon Hørup Eskildsen: I think career on the line,swyx: maybe a chair, butSimon Hørup Eskildsen: yeah. You know, like, um, I would say so someone needs to like, have both fists up and be like, I'd fight.Right? Yeah. Yeah. And if one person said, then, okay, let's do it. Right?swyx: Yeah.Simon Hørup Eskildsen: Um. It doesn't have to be absolutely everyone. Right? And like the interviews are always the sign that you're checking for different attributes. And if someone is like knocking it outta the park in every single attribute, that's, that's fairly rare.Um, but that's really important. And so the traits of the P 99 engineer, there's lots of them. There's also the traits of the p like triple nine engineer and the quadruple nine engineer. This is like, it's a long list.swyx: Okay.Simon Hørup Eskildsen: Um, I'll give you some samples, right. Of what we, what we look for. I think that the P 99 engineer has some history of having bent, like their trajectory or something to their will.Right? Some moment where it was just, they just, you know, made the computer do what it needed to do. There's something like that, and it will, it will occur to have them at some point in their career. And, uh. Hopefully multiple times. Right.swyx: Gimme an example of one of your engineers that like,Simon Hørup Eskildsen: I'll give an eng.Uh, so we, we, we launched this thing called A and NV three. Um, we could, we're also, we're working on V four and V five right now, but a and NV three can search a hundred billion vectors with a P 50 of around 40 milliseconds and a p 99 of 200 milliseconds. 
Um, maybe other people have done this, I'm sure Google and others have done this, but, uh, we haven't seen anyone, um, at least not in like a public consumable SaaS that can do this.And that was an engineer, the chief architect of Turbo Puffer, Nathan, um, who more or less just bent this, the software was not capable of this and he just made it capable for a very particular workload in like a, you know, six to eight week period with the help of a lot of the team. Right. It's been, been, there's numerous of examples of that, like at, at turbo puff, but that's like really bending the software and X 86 to your will.It was incredible to watch. Um. You wanna see some moments like that?swyx: Isn't that triple nine?Simon Hørup Eskildsen: Um, I think Nathan, what's calledAlessio: group nine, that was only nine. I feel like this is too high forSimon Hørup Eskildsen: Nathan. Nathan is, uh, Nathan is like, yeah, there's a lot of nines. Okay. After that p So I think that's one trait. I think another trait is that, uh, the P 99 spends a lot of time looking at maps.Generally it's their preferred ux. They just love looking at maps. You ever seen someone who just like, sits on their phone and just like, scrolls around on a map? Or did you not look at maps A lot? You guys don't look atswyx: maps? I guess I'm not feeling there. I don't know, butSimon Hørup Eskildsen: you just dis What about trains?Do you like trains?swyx: Uh, I mean they, not enough. Okay. This is just like weapon nice. Autism is what I call it. Like, like,Simon Hørup Eskildsen: um, I love looking at maps, like, it's like my preferred UX and just like I, you know, I likeswyx: lotsAlessio: of, of like random places, soswyx: like,youswyx: know.Alessio: Yes. Okay. There you go. So instead of like random places, like how do you explore the maps?Simon Hørup Eskildsen: No, it's, it's just a joke.swyx: It's autism laugh. 
It's like you are just obsessed by something and you like studying a thing.Simon Hørup Eskildsen: The origin of this was that at some point I read an interview with some IOI gold medalistswyx: Uhhuh,Simon Hørup Eskildsen: and it's like, what do you do in your spare time? I was just like, I like looking at maps.I was like, I feel so seen. Like, I just like love, like swirling out. I was like, oh, Canada is so big. Where's Baffin Island? I don't know. I love it. Yeah. Um, anyway, so the traits of P 99, P 99 is obsessive, right? Like, there's just like, you'll, you'll find traits of that we do an interview at, at, at, at turbo puffer or like multiple interviews that just try to screen for some of these things.Um, so. There's lots of others, but these are the kinds of traits that we look for.swyx: I'll tell you, uh, some people listen for like some of my dere stuff. Uh, I do think about derel as maps. Um, you draw a map for people, uh, maps show you the, uh, what is commonly agreed to be the geographical features of what a boundary is.And it shows also shows you what is not doing. And I, I think a lot of like developer tools, companies try to tell you they can do everything, but like, let's, let's be real. Like you, your, your three landmarks are here, everyone comes here, then here, then here, and you draw a map and, and then you draw a journey through the map.And like that. To me, that's what developer relations looks like. So I do think about things that way.Simon Hørup Eskildsen: I think the P 99 thinks in offs, right? The P 99 is very clear about, you know, hey, turbo puffer, you can't run a high transaction workload on turbo puffer, right? It's like the right latency is a hundred milliseconds.That's a clear trade off. I think the P 99 is very good at articulating the trade offs in every decision. Um. Which is exactly what the map is in your case, right?swyx: Uh, yeah, yeah. My, my, my world. 
My world.Alessio: How, how do you reconcile some of these things when you're saying you bend the will the computer versus like the trade

JSEDirect with Simon Brown
Value on the JSE? We find the Cheapest Shares


Play Episode Listen Later Oct 7, 2025 25:20


Nerdelandslaget
#280: What's that snowboard game called... SWIX?? (WITH MULTIGURU!)


Play Episode Listen Later Jun 18, 2025 131:04


Like a rescuing, perfectly timed angel from above, the very topical Martin "Multiguru" Belgen Isaksen comes sliding in from the side for this week's episode! He brings piping-hot NRK reviews of Dune: Awakening, Skogdal, and SpreadCheat, and has plenty of lovely thoughts about all of them.

Mellepodden
Mellepodden S7. Sports. Part 1. Cross-country skiing. Kjell Berge Melbybråten


Play Episode Listen Later Apr 11, 2025 51:12


In this season of Mellepodden, we run a series on Valdres sports with interviews with local experts in the major sports of the Valdres region. Here you get the best of both worlds: tips about the sport, plus great destinations in the valley where you can practice it! The first part covers cross-country skiing, presented by Kjell Berge Melbybråten, stadium manager at the Beitostølen World Cup stadium. He is also a ski waxer for active racers and a former ski coach. And, yes, a former mayor and entrepreneur. You can hear the rest in the episode! You can subscribe to Mellepodden in your podcast player. Made by Mellepodden Podkast Forening. Produced at Lydkåken Rockeverkstad. Theme tune by Lars Isachsen Jemterud. Mellepodden is registered for Grasrotandel, Norsk Tipping.

Mellepodden
Mellepodden Uncut - Episode Guest - Harald Øygard (R)


Play Episode Listen Later Apr 3, 2025 17:26


In this rerun from 2020, you can hear the interview Mellepodden did with biathlete Harald Øygard, who later took the Norwegian championship (NM) title at Geilo. Øygard became junior world champion in biathlon with 19 hits and high speed. Hear how he achieved this through focused training and a great love of sport. You can subscribe to Mellepodden in your podcast player. Made by Mellepodden Podkast Forening. Produced at Lydkåken Rockeverkstad. Theme tune by Lars Isachsen Jemterud. Mellepodden is registered for Grasrotandel, Norsk Tipping.

Skisporet
Season premiere: A mysterious illness struck Lars Haugvad. This summer he surprised everyone at Norseman. Hear the powerful story here.


Play Episode Listen Later Aug 26, 2024 62:25


Lars Haugvad went from running marathons in well under three hours to becoming so ill that he eventually dreamed of being able to stand upright in the shower. In 2011, a long and mysterious period of illness began when, one day, he suddenly developed painful red rashes all over his body. From there he was in and out of hospital over a period of several years, not knowing what was wrong with his body. What was certain was that Lars was unrecognizable, and that on some days he feared he would die. But the turning point would come, and in August this year he completed Norseman, one of the world's toughest triathlons, in impressive fashion. Along the way he has raised money for Aktiv mot kreft and its initiative "Pusterommet", which builds training centers in hospitals. "Lars's story gives us perspective on life and elite sport," says skier Mikael Gunnulfsen, who also joins this episode together with host Håvard Rønning. In this episode of Skisporet, you get up close to a truly raw athletic achievement that has flown under the radar for many this summer. The Skisporet podcast is published by Swix. You can read more about Lars and find out how to support the fundraiser on this page: https://swixsport.com/no/artikkel/ambassadorer/lars-haugvad

Skisporet
Here is the "new" Team Swix: "We are the world's second-best distance team in cross-country skiing"


Play Episode Listen Later May 9, 2024 31:08


Henrik Dønnestad, David Thorvik, and Sondre Ramse are joining Team Swix ahead of the 24/25 World Championship season. They team up with Mattis Stenshagen, Mikael Gunnulfsen, and Jonas Vika on what will be one of the strongest distance teams of the coming season. In this podcast you get to know all six skiers, why they chose Team Swix, and what makes this a truly unique cross-country team. Team Swix is fully equipped by Swix and works closely together to optimize the ski wax, poles, roller skis, and all the other equipment of the future needed to succeed at the top level. The Skisporet podcast is published by Swix.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

We are reuniting for the 2nd AI UX demo day in SF on Apr 28. Sign up to demo here! And don't forget tickets for the AI Engineer World's Fair, for early birds who join before keynote announcements!

About a year ago there was a lot of buzz around prompt engineering techniques to force structured output. Our friend Simon Willison tweeted a bunch of tips and tricks, but the most iconic one is Riley Goodside making it a matter of life or death.

Guardrails (friend of the pod and AI Engineer speaker), Marvin (AI Engineer speaker), and jsonformer had also come out at the time. In June 2023, Jason Liu (today's guest!) open sourced his “OpenAI Function Call and Pydantic Integration Module”, now known as Instructor, which quickly turned prompt engineering black magic into a clean, developer-friendly SDK. A few months later, model providers started to add function calling capabilities to their APIs as well as structured outputs support like “JSON Mode”, which was announced at OpenAI Dev Day (see recap here). In just a handful of months, we went from threatening to kill grandmas to first-class support from the research labs. And yet, Instructor was still downloaded 150,000 times last month. Why?

What Instructor looks like

Instructor patches your LLM provider SDKs to offer a new response_model option to which you can pass a structure defined in Pydantic. It currently supports OpenAI, Anthropic, Cohere, and a long tail of models through LiteLLM.

What Instructor is for

There are three core use cases to Instructor:
* Extracting structured data: Taking an input like an image of a receipt and extracting structured data from it, such as a list of checkout items with their prices, fees, and coupon codes.
* Extracting graphs: Identifying nodes and edges in a given input to extract complex entities and their relationships. For example, extracting relationships between characters in a story or dependencies between tasks.
* Query understanding: Defining a schema for an API call and using a language model to resolve a request into a more complex one that an embedding could not handle. For example, creating date intervals from queries like “what was the latest thing that happened this week?” to then pass onto a RAG system or similar.

Jason called all these different ways of getting data from LLMs “typed responses”: taking strings and turning them into data structures.

Structured outputs as a planning tool

The first wave of agents was all about open-ended iteration and planning, with projects like AutoGPT and BabyAGI. Models would come up with a possible list of steps, and start going down the list one by one. It's really easy for them to go down the wrong branch, or get stuck on a single step with no way to intervene.

What if these planning steps were returned to us as DAGs using structured output, and then managed as workflows? This also makes it easier to train models to create these plans, as they are much more structured than a bullet point list. Once you have this structure, each piece can be modified individually by different specialized models. You can read some of Jason's experiments here.

While LLMs will keep improving (Llama3 just got released as we write this), having a consistent structure for the output will make it a lot easier to swap models in and out. Jason's overall message on how we can move from ReAct loops to more controllable Agent workflows mirrors the “Process” discussion from our Elicit episode.

Watch the talk

As a bonus, here's Jason's talk from last year's AI Engineer Summit. 
He'll also be a speaker at this year's AI Engineer World's Fair!

Timestamps
* [00:00:00] Introductions
* [00:02:23] Early experiments with Generative AI at StitchFix
* [00:08:11] Design philosophy behind the Instructor library
* [00:11:12] JSON Mode vs Function Calling
* [00:12:30] Single vs parallel function calling
* [00:14:00] How many functions is too many?
* [00:17:39] How to evaluate function calling
* [00:20:23] What is Instructor good for?
* [00:22:42] The Evolution from Looping to Workflow in AI Engineering
* [00:27:03] State of the AI Engineering Stack
* [00:28:26] Why Instructor isn't VC backed
* [00:31:15] Advice on Pursuing Open Source Projects and Consulting
* [00:36:00] The Concept of High Agency and Its Importance
* [00:42:44] Prompts as Code and the Structure of AI Inputs and Outputs
* [00:44:20] The Emergence of AI Engineering as a Distinct Field

Show notes
* Jason on the UWaterloo mafia
* Jason on Twitter, LinkedIn, website
* Instructor docs
* Max Woolf on the potential of Structured Output
* swyx on Elo vs Cost
* Jason on Anthropic Function Calling
* Jason on Rejections, Advice to Young People
* Jason on Bad Startup Ideas
* Jason on Prompts as Code
* Rysana's inversion models
* Bryan Bischof's episode
* Hamel Husain

Transcript
Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO-in-Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.
Swyx [00:00:16]: Hello, we're back in the remote studio with Jason Liu from Instructor. Welcome Jason.
Jason [00:00:21]: Hey there. Thanks for having me.
Swyx [00:00:23]: Jason, you are extremely famous, so I don't know what I'm going to do introducing you, but you're one of the Waterloo clan. There's like this small cadre of you that's just completely dominating machine learning. 
Actually, can you list like Waterloo alums that you're like, you know, are just dominating and crushing it right now?Jason [00:00:39]: So like John from like Rysana is doing his inversion models, right? I know like Clive Chen from Waterloo. When I started the data science club, he was one of the guys who were like joining in and just like hanging out in the room. And now he was at Tesla working with Karpathy, now he's at OpenAI, you know.Swyx [00:00:56]: He's in my climbing club.Jason [00:00:58]: Oh, hell yeah. I haven't seen him in like six years now.Swyx [00:01:01]: To get in the social scene in San Francisco, you have to climb. So both in career and in rocks. So you started a data science club at Waterloo, we can talk about that, but then also spent five years at Stitch Fix as an MLE. You pioneered the use of OpenAI's LLMs to increase stylist efficiency. So you must have been like a very, very early user. This was like pretty early on.Jason [00:01:20]: Yeah, I mean, this was like GPT-3, okay. So we actually were using transformers at Stitch Fix before the GPT-3 model. So we were just using transformers for recommendation systems. At that time, I was very skeptical of transformers. I was like, why do we need all this infrastructure? We can just use like matrix factorization. When GPT-2 came out, I fine tuned my own GPT-2 to write like rap lyrics and I was like, okay, this is cute. Okay, I got to go back to my real job, right? Like who cares if I can write a rap lyric? When GPT-3 came out, again, I was very much like, why are we using like a post request to review every comment a person leaves? Like we can just use classical models. So I was very against language models for like the longest time. And then when ChatGPT came out, I basically just wrote a long apology letter to everyone at the company. I was like, hey guys, you know, I was very dismissive of some of this technology. I didn't think it would scale well, and I am wrong. This is incredible. 
And I immediately just transitioned to go from computer vision recommendation systems to LLMs. But funny enough, now that we have RAG, we're kind of going back to recommendation systems.Swyx [00:02:21]: Yeah, speaking of that, I think Alessio is going to bring up the next one.Alessio [00:02:23]: Yeah, I was going to say, we had Bryan Bischof from Hex on the podcast. Did you overlap at Stitch Fix?Jason [00:02:28]: Yeah, he was like one of my main users of the recommendation frameworks that I had built out at Stitch Fix.Alessio [00:02:32]: Yeah, we talked a lot about RecSys, so it makes sense.Swyx [00:02:36]: So now I have adopted that line, RAG is RecSys. And you know, if you're trying to reinvent new concepts, you should study RecSys first, because you're going to independently reinvent a lot of concepts. So your system was called Flight. It's a recommendation framework with over 80% adoption, servicing 350 million requests every day. Wasn't there something existing at Stitch Fix? Why did you have to write one from scratch?Jason [00:02:56]: No, so I think because at Stitch Fix, a lot of the machine learning engineers and data scientists were writing production code, sort of every team's systems were very bespoke. It's like, this team only needs to do like real time recommendations with small data. So they just have like a fast API app with some like pandas code. This other team has to do a lot more data. So they have some kind of like Spark job that does some batch ETL that does a recommendation. And so what happens is each team writes their code differently. And I have to come in and refactor their code. And I was like, oh man, I'm refactoring four different code bases, four different times. Wouldn't it be better if all the code quality was my fault? Let me just write this framework, force everyone else to use it. And now one person can maintain five different systems, rather than five teams having their own bespoke system. 
And so it was really a need of just sort of standardizing everything. And then once you do that, you can do observability across the entire pipeline and make large sweeping improvements in this infrastructure, right? If we notice that something is slow, we can detect it on the operator layer. Just, hey, like this team, this operation you guys are doing is worsening our latency by like 30%. If you just optimize your Python code here, we can probably make an extra million dollars. So let's jump on a call and figure this out. And then a lot of it was doing all this observability work to figure out what the heck is going on and optimize this system from not only just a code perspective, but sort of like harassingly saying like, we need to add caching here. We're doing duplicated work here. Let's go clean up the systems. Yep.
Swyx [00:04:22]: Got it. One more system that I'm interested in finding out more about is your similarity search system using CLIP and GPT-3 embeddings and FAISS, where you saved over $50 million in annual revenue. So of course they gave all that to you, right?
Jason [00:04:34]: No, no, no. I mean, it's not going up and down, but you know, I got a little bit, so I'm pretty happy about that. But there, you know, that was when we were fine tuning like ResNets to do image classification. And so a lot of it was, given an image, if we could predict the different attributes we have in the merchandising and we can predict the text embeddings of the comments, then we can kind of build an image vector or image embedding that can capture both descriptions of the clothing and sales of the clothing. And then we would use these additional vectors to augment our recommendation system. And so the recommendation system really was just around like, what are similar items? What are complementary items? What are items that you would wear in a single outfit? And being able to say on a product page, let me show you like 15, 20 more things. 
And then what we found was like, hey, when you turn that on, you make a bunch of money.Swyx [00:05:23]: Yeah. So, okay. So you didn't actually use GPT-3 embeddings. You fine tuned your own? Because I was surprised that GPT-3 worked off the shelf.Jason [00:05:30]: Because I mean, at this point we would have 3 million pieces of inventory over like a billion interactions between users and clothes. So any kind of fine tuning would definitely outperform like some off the shelf model.Swyx [00:05:41]: Cool. I'm about to move on from Stitch Fix, but you know, any other like fun stories from the Stitch Fix days that you want to cover?Jason [00:05:46]: No, I think that's basically it. I mean, the biggest one really was the fact that I think for just four years, I was so bearish on language models and just NLP in general. I'm just like, none of this really works. Like, why would I spend time focusing on this? I got to go do the thing that makes money, recommendations, bounding boxes, image classification. Yeah. Now I'm like prompting an image model. I was like, oh man, I was wrong.Swyx [00:06:06]: So my Stitch Fix question would be, you know, I think you have a bit of a drip and I don't, you know, my primary wardrobe is free startup conference t-shirts. Should more technology brothers be using Stitch Fix? What's your fashion advice?Jason [00:06:19]: Oh man, I mean, I'm not a user of Stitch Fix, right? It's like, I enjoy going out and like touching things and putting things on and trying them on. Right. I think Stitch Fix is a place where you kind of go because you want the work offloaded. I really love the clothing I buy where I have to like, when I land in Japan, I'm doing like a 45 minute walk up a giant hill to find this weird denim shop. That's the stuff that really excites me. But I think the bigger thing that's really captured is this idea that narrative matters a lot to human beings. Okay. And I think the recommendation system, that's really hard to capture. 
It's easy to use AI to sell like a $20 shirt, but it's really hard for AI to sell like a $500 shirt. But people are buying $500 shirts, you know what I mean? There's definitely something that we can't really capture just yet that we probably will figure out how to in the future.
Swyx [00:07:07]: Well, it'll probably output in JSON, which is what we're going to turn to next. Then you went on a sabbatical to South Park Commons in New York, which is unusual because it's based in SF.
Jason [00:07:17]: Yeah. So basically in 2020, really, I was enjoying working a lot as I was like building a lot of stuff. This is where we were making like the tens of millions of dollars doing stuff. And then I had a hand injury. And so I really couldn't code anymore for like a year, two years. And so I kind of took sort of half of it as medical leave, the other half I became more of like a tech lead, just like making sure the systems' lights were on. And then when I went to New York, I spent some time there and kind of just like wound down the tech work, you know, did some pottery, did some jujitsu. And after GPT came out, I was like, oh, I clearly need to figure out what is going on here because something feels very magical. I don't understand it. So I spent basically like five months just prompting and playing around with stuff. And then afterwards, it was just my startup friends going like, hey, Jason, you know, my investors want us to have an AI strategy. Can you help us out? And it just snowballed more and more until I was making this my full time job. Yeah, got it.
Swyx [00:08:11]: You know, you had YouTube University and a journaling app, you know, a bunch of other explorations. But it seems like the most productive or the best known thing that came out of your time there was Instructor. Yeah.
Jason [00:08:22]: Written on the bullet train in Japan. I think at some point, you know, tools like Guardrails and Marvin came out. 
Those are kind of tools that use XML and Pydantic to get structured data out. But they really were doing things sort of in the prompt. And these are built with sort of the instruct models in mind. Like I'd already done that in the past. Right. At Stitch Fix, you know, one of the things we did was we would take a request note and turn that into a JSON object that we would use to send it to our search engine. Right. So if you said like, I want, you know, skinny jeans that were this size, that would turn into JSON that we would send to our internal search APIs. But it always felt kind of gross. A lot of it is just like you read the JSON, you like parse it, you make sure the names are strings and ages are numbers and you do all this like messy stuff. But when function calling came out, it was very much sort of a new way of doing things. Right. Function calling lets you define the schema separate from the data and the instructions. And what this meant was you can kind of have a lot more complex schemas and just map them in Pydantic. And then you can just keep those very separate. And then once you add like methods, you can add validators and all that kind of stuff. The one issue I really had with a lot of these libraries, though, was that they were doing a lot of the string formatting themselves, which was fine when it was the instruct models. You just have a string. But when you have these new chat models, you have these chat messages. And I just didn't really feel like not being able to access that for the developer was sort of a good benefit that they would get. And so I just said, let me write like the most simple SDK around the OpenAI SDK, a simple wrapper on the SDK, just handle the response model a bit and kind of think of myself more like requests than an actual framework that people can use. And so the goal is like, hey, like this is something that you can use to build your own framework. But let me just do all the boring stuff that nobody really wants to do. 
People want to build their own frameworks, but people don't want to build like JSON parsing.Swyx [00:10:08]: And the retrying and all that other stuff.Jason [00:10:10]: Yeah.Swyx [00:10:11]: Right. We had this a little bit of this discussion before the show, but like that design principle of going for being requests rather than being Django. Yeah. So what inspires you there? This has come from a lot of prior pain. Are there other open source projects that inspired your philosophy here? Yeah.Jason [00:10:25]: I mean, I think it would be requests, right? Like, I think it is just the obvious thing you install. If you were going to go make HTTP requests in Python, you would obviously import requests. Maybe if you want to do more async work, there's like future tools, but you don't really even think about installing it. And when you do install it, you don't think of it as like, oh, this is a requests app. Right? Like, no, this is just Python. The bigger question is, like, a lot of people ask questions like, oh, why isn't requests like in the standard library? Yeah. That's how I want my library to feel, right? It's like, oh, if you're going to use the LLM SDKs, you're obviously going to install instructor. And then I think the second question would be like, oh, like, how come instructor doesn't just go into OpenAI, go into Anthropic? Like, if that's the conversation we're having, like, that's where I feel like I've succeeded. Yeah. It's like, yeah, so standard, you may as well just have it in the base libraries.Alessio [00:11:12]: And the shape of the request stayed the same, but initially function calling was maybe equal structure outputs for a lot of people. I think now the models also support like JSON mode and some of these things and, you know, return JSON or my grandma is going to die. All of that stuff is maybe to decide how have you seen that evolution? Like maybe what's the metagame today? 
Should people just forget about function calling for structure outputs or when is structure output like JSON mode the best versus not? We'd love to get any thoughts given that you do this every day.Jason [00:11:42]: Yeah, I would almost say these are like different implementations of like the real thing we care about is the fact that now we have typed responses to language models. And because we have that type response, my IDE is a little bit happier. I get autocomplete. If I'm using the response wrong, there's a little red squiggly line. Like those are the things I care about in terms of whether or not like JSON mode is better. I usually think it's almost worse unless you want to spend less money on like the prompt tokens that the function call represents, primarily because with JSON mode, you don't actually specify the schema. So sure, like JSON load works, but really, I care a lot more than just the fact that it is JSON, right? I think function calling gives you a tool to specify the fact like, okay, this is a list of objects that I want and each object has a name or an age and I want the age to be above zero and I want to make sure it's parsed correctly. That's where kind of function calling really shines.Alessio [00:12:30]: Any thoughts on single versus parallel function calling? So I did a presentation at our AI in Action Discord channel, and obviously showcase instructor. One of the big things that we have before with single function calling is like when you're trying to extract lists, you have to make these funky like properties that are lists to then actually return all the objects. How do you see the hack being put on the developer's plate versus like more of this stuff just getting better in the model? And I know you tweeted recently about Anthropic, for example, you know, some lists are not lists or strings and there's like all of these discrepancies.Jason [00:13:04]: I almost would prefer it if it was always a single function call. 
Obviously, there is like the agents workflows that, you know, Instructor doesn't really support that well, but are things that, you know, ought to be done, right? Like you could define, I think maybe like 50 or 60 different functions in a single API call. And, you know, if it was like get the weather or turn the lights on or do something else, it makes a lot of sense to have these parallel function calls. But in terms of an extraction workflow, I definitely think it's probably more helpful to have everything be a single schema, right? Just because you can sort of specify relationships between these entities that you can't do in a parallel function calling, you can have a single chain of thought before you generate a list of results. Like there's like small like API differences, right? Where if it's for parallel function calling, if you do one, like again, really, I really care about how the SDK looks and says, okay, do I always return a list of functions or do you just want to have the actual object back out and you want to have like auto complete over that object? Interesting.Alessio [00:14:00]: What's kind of the cap for like how many function definitions you can put in where it still works well? Do you have any sense on that?Jason [00:14:07]: I mean, for the most part, I haven't really had a need to do anything that's more than six or seven different functions. I think in the documentation, they support way more. I don't even know if there's any good evals that have over like two dozen function calls. I think if you're running into issues where you have like 20 or 50 or 60 function calls, I think you're much better having those specifications saved in a vector database and then have them be retrieved, right? So if there are 30 tools, like you should basically be like ranking them and then using the top K to do selection a little bit better rather than just like shoving like 60 functions into a single. Yeah.Swyx [00:14:40]: Yeah. 
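The retrieve-then-select idea Jason describes here (store tool specifications as vectors, rank them against the request, and pass only the top K to the model) can be sketched roughly as follows. This is a minimal illustration and not Instructor's API: the tool names, the three-dimensional vectors, and the helper `top_k_tools` are all hypothetical, and a real system would get its embeddings from an embedding model and store them in a vector database.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_tools(query_vec, tool_vecs, k=3):
    """Return the names of the k tools whose vectors are closest to the query."""
    ranked = sorted(
        tool_vecs,
        key=lambda name: cosine(query_vec, tool_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical, hand-written "embeddings" of tool descriptions.
TOOL_VECS = {
    "get_weather": [0.9, 0.1, 0.0],
    "turn_lights_on": [0.1, 0.9, 0.0],
    "turn_lights_off": [0.0, 0.8, 0.2],
    "send_email": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a request such as "switch on the lamp".
query_vec = [0.05, 0.95, 0.05]

# Only these tool definitions would be sent along with the model call.
selected = top_k_tools(query_vec, TOOL_VECS, k=2)
print(selected)
```

With dozens or hundreds of tools, only the `selected` subset would be included as function definitions in the request, which keeps the prompt small.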
Well, I mean, so I think this is relevant now because previously I think context limits prevented you from having more than a dozen tools anyway. And now that we have million token context windows, you know, Claude recently with their new function calling release said they can handle over 250 tools, which is insane to me. That's, that's a lot. You're saying like, you know, you don't think there's many people doing that. I think anyone with a sort of agent like platform where you have a bunch of connectors, they would run into that problem. Probably you're right that they should use a vector database and kind of RAG their tools. I know Zapier has like a few thousand, like 8,000, 9,000 connectors that, you know, obviously don't fit anywhere. So yeah, I mean, I think that would be it unless you need some kind of intelligence that chains things together, which is, I think what Alessio is coming back to, right? Like there's this trend about parallel function calling. I don't know what I think about that. Anthropic's version was, I think they use multiple tools in sequence, but they're not in parallel. I haven't explored this at all. I'm just like throwing this open to you as to like, what do you think about all these new things? Yeah.
Jason [00:15:40]: It's like, you know, do we assume that all function calls could happen in any order? In which case, like we either can assume that, or we can assume that like things need to happen in some kind of sequence as a DAG, right? But if it's a DAG, really that's just like one JSON object that is the entire DAG rather than going like, okay, the order of the functions that return doesn't matter. That's definitely just not true in practice, right? Like if I have a thing that's like turn the lights on, like unplug the power, and then like turn the toaster on or something, like the order does matter. And it's unclear how well you can describe the importance of that reasoning to a language model yet. 
I mean, I'm sure you can do it with like good enough prompting, but I just haven't had any use cases where the function sequence really matters. Yeah.
Alessio [00:16:18]: To me, the most interesting thing is the models are better at picking than your ranking is usually. Like I'm incubating a company around system integration. For example, with one system, there are like 780 endpoints. And if you're actually trying to do vector similarity, it's not that good because the people that wrote the specs didn't have in mind making them like semantically apart. You know, they're kind of like, oh, create this, create this, create this. Versus when you give it to a model, like Opus, and you put them all in, it's quite good at picking which ones you should actually run. And I'm curious to see if the model providers actually care about some of those workflows or if the agent companies are actually going to build very good rankers to kind of fill that gap.
Jason [00:16:58]: Yeah. My money is on the rankers because you can do those so easily, right? You could just say, well, given the embeddings of my search query and the embeddings of the description, I can just train XGBoost and just make sure that I have very high like MRR, which is like mean reciprocal rank. And so the only objective is to make sure that the tools you use are in the top-N filtered. Like that feels super straightforward and you don't have to actually figure out how to fine tune a language model to do tool selection anymore. Yeah. I definitely think that's the case because for the most part, I imagine you either have like less than three tools or more than a thousand. I don't know what kind of company said, oh, thank God we only have like 185 tools and this works perfectly, right? 
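For readers unfamiliar with the metric Jason names, mean reciprocal rank (MRR) averages 1/rank of the correct item over a set of queries, counting 0 when the correct item is missing from the ranking. A minimal sketch, assuming hypothetical tool names and hand-written rankings rather than the output of any real ranker:

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """Average of 1/rank of the correct tool per query (0 if it is absent)."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranking, target in zip(ranked_lists, relevant):
        for position, tool in enumerate(ranking, start=1):
            if tool == target:
                total += 1.0 / position
                break
    return total / len(ranked_lists)


# Hypothetical ranker output for three queries (best guess first).
rankings = [
    ["create_invoice", "create_user", "delete_user"],  # correct tool at rank 1
    ["create_user", "create_invoice", "delete_user"],  # correct tool at rank 2
    ["delete_user", "create_user", "create_invoice"],  # correct tool not listed
]
correct = ["create_invoice", "create_invoice", "get_report"]

mrr = mean_reciprocal_rank(rankings, correct)
print(mrr)  # (1 + 1/2 + 0) / 3 = 0.5
```

A ranker tuned to push this number toward 1.0 places the right tool near the top of the list, which is the property needed before truncating to a top-N shortlist.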
That's right.
Alessio [00:17:39]: And before we maybe move on just from this, it was interesting to me, you retweeted this thing about Anthropic function calling and it was Joshua Brown retweeting some benchmark that it's like, oh my God, Anthropic function calling so good. And then you retweeted it and then you tweeted it later and it's like, it's actually not that good. What's your flow? How do you actually test these things? Because obviously the benchmarks are lying, right? Because the benchmarks say it's good and you said it's bad and I trust you more than the benchmark. How do you think about that? And then how do you evolve it over time?
Jason [00:18:09]: It's mostly just client data. I actually have been mostly busy with enough client work that I haven't been able to reproduce public benchmarks. And so I can't even share some of the results in Anthropic. I would just say like in production, we have some pretty interesting schemas where it's like iteratively building lists where we're doing like updates of lists, like we're doing in place updates. So like upserts and inserts. And in those situations we're like, oh yeah, we have a bunch of different parsing errors. Numbers are being returned as strings. We were expecting lists of objects, but we're getting strings that are like the strings of JSON, right? So we had to call JSON parse on individual elements. Overall, I'm like super happy with the Anthropic models compared to the OpenAI models. Sonnet is very cost effective. Haiku, in function calling, is actually better, but I think they just had to sort of file down the edges a little bit where like our tests pass, but then when we actually deployed to production, we got half a percent of traffic having issues where if you ask for JSON, it'll try to talk to you. Or if you use function calling, you know, we'll have like a parse error. And so I think those are definitely going to be things that are fixed in like the upcoming weeks. 
But in terms of like the reasoning capabilities, man, it's hard to beat like 70% cost reduction, especially when you're building consumer applications, right? If you're building something for consultants or private equity, like you're charging $400, it doesn't really matter if it's a dollar or $2. But for consumer apps, it makes products viable. If you can go from GPT-4 to Sonnet, you might actually be able to price it better. Yeah.Swyx [00:19:31]: I had this chart about the Elo versus the cost of all the models. And you could put trend graphs on each of those things about like, you know, higher Elo equals higher cost, except for Haiku. Haiku kind of just broke the lines, or the iso-Elos, if you want to call it. Cool. Before we go too far into your opinions on just the overall ecosystem, I want to make sure that we map out the surface area of Instructor. I would say that most people would be familiar with Instructor from your talks and your tweets and all that. You had the number one talk from the AI Engineer Summit.Jason [00:20:03]: Two Liu. Jason Liu and Jerry Liu. Yeah.Swyx [00:20:06]: Yeah. Until I actually went through your cookbook, I didn't realize the surface area. How would you categorize the use cases? You have LLM self-critique, you have knowledge graphs in here, you have PII data sanitation. How do you characterize to people what is the surface area of Instructor? Yeah.Jason [00:20:23]: This is the part that feels crazy because really the difference is LLMs give you strings and Instructor gives you data structures. And once you get data structures, again, you can do every LeetCode problem you ever thought of. Right. And so I think there's a couple of really common applications. The first one obviously is extracting structured data. This would just be, okay, well, like I want to put in an image of a receipt. I want to give it back out a list of checkout items with a price and a fee and a coupon code or whatever. That's one application. 
Another application really is around extracting graphs out. So one of the things we found out about these language models is that not only can you define nodes, it's really good at figuring out what are nodes and what are edges. And so we have a bunch of examples where, you know, not only do I extract that, you know, this happens after that, but also like, okay, these two are dependencies of another task. And you can do, you know, extracting complex entities that have relationships. Given a story, for example, you could extract relationships of families across different characters. This can all be done by defining a graph. The last really big application really is just around query understanding. The idea is that like any API call has some schema and if you can define that schema ahead of time, you can use a language model to resolve a request into a much more complex request. One that an embedding could not do. So for example, I have a really popular post called like rag is more than embeddings. And effectively, you know, if I have a question like this, what was the latest thing that happened this week? That embeds to nothing, right? But really like that query should just be like select all data where the date time is between today and today minus seven days, right? What if I said, how did my writing change between this month and last month? Again, embeddings would do nothing. But really, if you could do like a group by over the month and a summarize, then you could again like do something much more interesting. And so this really just calls out the fact that embeddings really is kind of like the lowest hanging fruit. And using something like instructor can really help produce a data structure. And then you can just use your computer science and reason about the data structure. Maybe you say, okay, well, I'm going to produce a graph where I want to group by each month and then summarize them jointly. You can do that if you know how to define this data structure. 
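The query-understanding point above can be made concrete with a small sketch. It assumes a hypothetical `DateRangeQuery` structure that a language model would be asked to fill in for a question like "what was the latest thing that happened this week?"; here the structure is built by hand, and the documents are invented. The point is only that once the request is a data structure, ordinary code can filter and group.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class DateRangeQuery:
    """Hypothetical schema a model would populate from a natural-language question."""
    start: date
    end: date

def last_n_days(today, n):
    """The structured form of 'between today and today minus n days'."""
    return DateRangeQuery(start=today - timedelta(days=n), end=today)

def run_query(query, documents):
    """Select documents whose date falls inside the requested range."""
    return [doc for doc in documents if query.start <= doc["date"] <= query.end]

def group_by_month(documents):
    """Group titles by (year, month) so each month can be summarized separately."""
    groups = defaultdict(list)
    for doc in documents:
        groups[(doc["date"].year, doc["date"].month)].append(doc["title"])
    return dict(groups)

# Invented documents for illustration.
DOCS = [
    {"title": "Launch notes", "date": date(2024, 4, 2)},
    {"title": "Old roadmap", "date": date(2024, 1, 15)},
    {"title": "Bug triage", "date": date(2024, 4, 5)},
]

today = date(2024, 4, 6)
hits = [doc["title"] for doc in run_query(last_n_days(today, 7), DOCS)]
by_month = group_by_month(DOCS)
print(hits)
print(by_month)
```

The grouping helper corresponds to the "group by over the month and a summarize" step; the summarizing itself would be a further model call per group.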
Yeah.Swyx [00:22:29]: So you kind of run up against like the LangChains of the world that used to have that. They still do have like the self querying, I think they used to call it when we had Harrison on in our episode. How do you see yourself interacting with the other LLM frameworks in the ecosystem? Yeah.Jason [00:22:42]: I mean, if they use Instructor, I think that's totally cool. Again, it's like, it's just Python, right? It's like asking like, oh, how does like Django interact with requests? Well, you just might make a requests.get in a Django app, right? But no one would say, I like went off of Django because I'm using requests now. That should ideally be like sort of the wrong comparison in terms of especially like the agent workflows. I think the real goal for me is to go down like the LLM compiler route, which is instead of doing like a ReAct-type reasoning loop. I think my belief is that we should be using like workflows. If we do this, then we always have a request and a complete workflow. We can fine tune a model that has a better workflow. Whereas it's hard to think about like, how do you fine tune a better ReAct loop? Yeah. You always train it to have less looping, in which case like you wanted to get the right answer the first time, in which case it was a workflow to begin with, right?Swyx [00:23:31]: Can you define workflow? Because I used to work at a workflow company, but I'm not sure this is a good term for everybody.Jason [00:23:36]: I'm thinking workflow in terms of like the Prefect/Zapier workflow. Like I want to build a DAG, I want you to tell me what the nodes and edges are. And then maybe the edges are also put in with AI. But the idea is that like, I want to be able to present you the entire plan and then ask you to fix things as I execute it, rather than going like, hey, I couldn't parse the JSON, so I'm going to try again. I couldn't parse the JSON, I'm going to try again. 
And then next thing you know, you spent like $2 on OpenAI credits, right? Yeah. Whereas with the plan, you can just say, oh, the edge between node like X and Y does not run. Let me just iteratively try to fix that, fix the one that sticks, go on to the next component. And obviously you can get into a world where if you have enough examples of the nodes X and Y, maybe you can use like a vector database to find a good few shot examples. You can do a lot if you sort of break down the problem into that workflow and executing that workflow, rather than looping and hoping the reasoning is good enough to generate the correct output. Yeah.Swyx [00:24:35]: You know, I've been hammering on Devin a lot. I got access a couple of weeks ago. And obviously for simple tasks, it does well. For the complicated, like more than 10, 20 hour tasks, I can see- That's a crazy comparison.Jason [00:24:47]: We used to talk about like three, four loops. Only once it gets to like hour tasks, it's hard.Swyx [00:24:54]: Yeah. Less than an hour, there's nothing.Jason [00:24:57]: That's crazy.Swyx [00:24:58]: I mean, okay. Maybe my goalposts have shifted. I don't know. That's incredible.Jason [00:25:02]: Yeah. No, no. I'm like sub one minute executions. Like the fact that you're talking about 10 hours is incredible.Swyx [00:25:08]: I think it's a spectrum. I think I'm going to say this every single time I bring up Devin. Let's not reward them for taking longer to do things. Do you know what I mean? I think that's a metric that is easily abusable.Jason [00:25:18]: Sure. Yeah. You know what I mean? But I think if you can monotonically increase the success probability over an hour, that's winning to me. Right? Like obviously if you run an hour and you've made no progress. Like I think when we were in like AutoGPT land, there was that one example where it's like, I wanted it to like buy me a bicycle overnight. I spent $7 on credit and I never found the bicycle. Yeah.Swyx [00:25:41]: Yeah. Right. 
I wonder if you'll be able to purchase a bicycle. Because it actually can do things in real world. It just needs to suspend to you for auth and stuff. The point I was trying to make was that I can see it changing plans. I think one of the agents loopholes or one of the things that is a real barrier for agents is LLMs really like to get stuck into a lane. And you know what you're talking about, what I've seen Devin do is it gets stuck in a lane and it will just kind of change plans based on the performance of the plan itself. And it's kind of cool.Jason [00:26:05]: I feel like we've gone too much in the looping route and I think a lot of more plans and like DAGs and data structures are probably going to come back to help fill in some holes. Yeah.Alessio [00:26:14]: What do you think of the interface to that? Do you see it's like an existing state machine kind of thing that connects to the LLMs, the traditional DAG players? Do you think we need something new for like AI DAGs?Jason [00:26:25]: Yeah. I mean, I think that the hard part is going to be describing visually the fact that this DAG can also change over time and it should still be allowed to be fuzzy. I think in like mathematics, we have like plate diagrams and like Markov chain diagrams and like recurrent states and all that. Some of that might come into this workflow world. But to be honest, I'm not too sure. I think right now, the first steps are just how do we take this DAG idea and break it down to modular components that we can like prompt better, have few shot examples for and ultimately like fine tune against. But in terms of even the UI, it's hard to say what will likely win. I think, you know, people like Prefect and Zapier have a pretty good shot at doing a good job.Swyx [00:27:03]: Yeah. You seem to use Prefect a lot. I actually worked at a Prefect competitor at Temporal and I'm also very familiar with Dagster. 
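The plan-as-a-DAG idea from this part of the conversation (present the whole plan, then repair only the step that fails instead of re-running a reasoning loop) can be sketched with the standard library's `graphlib`. Everything here is an illustrative assumption: the node names, the step functions, and the simulated failure are invented, and a real step would call a model rather than return a fixed string.

```python
from graphlib import TopologicalSorter

def run_plan(plan, steps, max_attempts=3):
    """Execute steps in dependency order, retrying only the step that fails.

    `plan` maps each node to the set of nodes it depends on.
    `steps` maps each node to a function taking a dict of upstream results.
    """
    results = {}
    attempts = {}
    for node in TopologicalSorter(plan).static_order():
        inputs = {dep: results[dep] for dep in plan.get(node, ())}
        for attempt in range(1, max_attempts + 1):
            attempts[node] = attempt
            try:
                results[node] = steps[node](inputs)
                break  # this node is fixed; move on to the next component
            except ValueError:
                if attempt == max_attempts:
                    raise  # give up on this node after the last attempt
    return results, attempts

# Hypothetical three-step plan: extract -> summarize -> email.
PLAN = {"extract": set(), "summarize": {"extract"}, "email": {"summarize"}}

calls = {"summarize": 0}

def extract(inputs):
    return "3 meetings"

def summarize(inputs):
    # Simulate one unparseable model response before succeeding.
    calls["summarize"] += 1
    if calls["summarize"] == 1:
        raise ValueError("could not parse model output")
    return "Summary: " + inputs["extract"]

def email(inputs):
    return "Sent -> " + inputs["summarize"]

STEPS = {"extract": extract, "summarize": summarize, "email": email}

results, attempts = run_plan(PLAN, STEPS)
print(results["email"])
print(attempts)
```

Because the failure is localized to one node, the upstream `extract` result is reused rather than regenerated, which is the cost argument made against open-ended retry loops.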
What else would you call out as like particularly interesting in the AI engineering stack?Jason [00:27:13]: Man, I almost use nothing. I just use Cursor and like pytest. Okay. I think that's basically it. You know, a lot of the observability companies have... The more observability companies I've tried, the more I just use Postgres.Swyx [00:27:29]: Really? Okay. Postgres for observability?Jason [00:27:32]: But the issue really is the fact that these observability companies aren't actually doing observability for the system. They're just doing the LLM thing. Like I still end up using like Datadog or like, you know, Sentry to do like latency. And so I just have those systems handle it. And then the like prompt in, prompt out, latency, token costs. I just put that in like a Postgres table now.Swyx [00:27:51]: So you don't need like 20 funded startups building LLM ops? Yeah.Jason [00:27:55]: But I'm also like an old, tired guy. You know what I mean? Like I think because of my background, it's like, yeah, like the Python stuff, I'll write myself. But you know, I will also just use Vercel happily. Yeah. Yeah. So I'm not really into that world of tooling, whereas I think, you know, I spent three good years building observability tools for recommendation systems. And I was like, oh, compared to that, Instructor is just one call. I just have to put time start, time end, and then count the prompt tokens, right? Because I'm not doing a very complex looping behavior. I'm doing mostly workflows and extraction. Yeah.Swyx [00:28:26]: I mean, while we're on this topic, we'll just kind of get this out of the way. You famously have decided to not be a venture backed company. You want to do the consulting route. The obvious route for someone as successful as Instructor is like, oh, here's hosted Instructor with all tooling. Yeah. You just said you had a whole bunch of experience building observability tooling. You have the perfect background to do this and you're not.Jason [00:28:43]: Yeah. 
Isn't that sick? I think that's sick.Swyx [00:28:44]: I mean, I know why, because you want to go free dive.Jason [00:28:47]: Yeah. Yeah. Because I think there's two things. Right. Well, one, if I tell myself I want to build requests, requests is not a venture backed startup. Right. I mean, one could argue whether or not Postman is, but I think for the most part, it's like having worked so much, I'm more interested in looking at how systems are being applied and just having access to the most interesting data. And I think I can do that more through a consulting business where I can come in and go, oh, you want to build perfect memory. You want to build an agent. You want to build like automations over construction or like insurance and supply chain, or like you want to handle writing private equity, mergers and acquisitions reports based off of user interviews. Those things are super fun. Whereas like maintaining the library, I think is mostly just kind of like a utility that I try to keep up, especially because if it's not venture backed, I have no reason to sort of go down the route of like trying to get a thousand integrations. In my mind, I just go like, okay, 98% of the people use OpenAI. I'll support that. And if someone contributes another platform, that's great. I'll merge it in. Yeah.Swyx [00:29:45]: I mean, you only added Anthropic support this year. Yeah.Jason [00:29:47]: Yeah. You couldn't even get an API key until like this year, right? That's true. Okay. If I'd added it like last year, I would have been trying to like double the code base to service, you know, half a percent of all downloads.Swyx [00:29:58]: Do you think the market share will shift a lot now that Anthropic has like a very, very competitive offering?Jason [00:30:02]: I think it's still hard to get API access. 
I don't know if it's fully GA now, if it's GA, if you can get a commercial access really easily.Alessio [00:30:12]: I got commercial after like two weeks to reach out to their sales team.Jason [00:30:14]: Okay.Alessio [00:30:15]: Yeah.Swyx [00:30:16]: Two weeks. It's not too bad. There's a call list here. And then anytime you run into rate limits, just like ping one of the Anthropic staff members.Jason [00:30:21]: Yeah. Then maybe we need to like cut that part out. So I don't need to like, you know, spread false news.Swyx [00:30:25]: No, it's cool. It's cool.Jason [00:30:26]: But it's a common question. Yeah. Surely just from the price perspective, it's going to make a lot of sense. Like if you are a business, you should totally consider like Sonnet, right? Like the cost savings is just going to justify it if you actually are doing things at volume. And yeah, I think the SDK is like pretty good. Back to the instructor thing. I just don't think it's a billion dollar company. And I think if I raise money, the first question is going to be like, how are you going to get a billion dollar company? And I would just go like, man, like if I make a million dollars as a consultant, I'm super happy. I'm like more than ecstatic. I can have like a small staff of like three people. It's fun. And I think a lot of my happiest founder friends are those who like raised a tiny seed round, became profitable. They're making like 70, 60, 70, like MRR, 70,000 MRR and they're like, we don't even need to raise the seed round. Let's just keep it like between me and my co-founder, we'll go traveling and it'll be a great time. I think it's a lot of fun.Alessio [00:31:15]: Yeah. 
like say LLMs / AI and they build some open source stuff and it's like I should just raise money and do this and I tell people a lot it's like look you can make a lot more money doing something else than doing a startup like most people that do a company could make a lot more money just working somewhere else than the company itself do you have any advice for folks that are maybe in a similar situation they're trying to decide oh should I stay in my like high paid FAANG job and just tweet this on the side and do this on github should I go be a consultant like being a consultant seems like a lot of work so you got to talk to all these people you know there's a lot to unpackJason [00:31:54]: I think the open source thing is just like well I'm just doing it purely for fun and I'm doing it because I think I'm right but part of being right is the fact that it's not a venture backed startup like I think I'm right because this is all you need right so I think a part of the philosophy is the fact that all you need is a very sharp blade to sort of do your work and you don't actually need to build like a big enterprise so that's one thing I think the other thing too that I've kind of been thinking around just because I have a lot of friends at google that want to leave right now it's like man like what we lack is not money or skill like what we lack is courage you should like you just have to do this a hard thing and you have to do it scared anyways right in terms of like whether or not you do want to do a founder I think that's just a matter of optionality but I definitely recognize that the like expected value of being a founder is still quite low it is right I know as many founder breakups and as I know friends who raised a seed round this year right like that is like the reality and like you know even in from that perspective it's been tough where it's like oh man like a lot of incubators want you to have co-founders now you spend half the time like fundraising and then 
trying to like meet co-founders and find co-founders rather than building the thing this is a lot of time spent out doing uh things I'm not really good at. I do think there's a rising trend in solo founding yeah.Swyx [00:33:06]: You know I am a solo I think that something like 30 percent of like I forget what the exact status something like 30 percent of starters that make it to like series B or something actually are solo founder I feel like this must have co-founder idea mostly comes from YC and most everyone else copies it and then plenty of companies break up over co-founderJason [00:33:27]: Yeah and I bet it would be like I wonder how much of it is the people who don't have that much like and I hope this is not a diss to anybody but it's like you sort of you go through the incubator route because you don't have like the social equity you would need is just sort of like send an email to Sequoia and be like hey I'm going on this ride you want a ticket on the rocket ship right like that's very hard to sell my message if I was to raise money is like you've seen my twitter my life is sick I've decided to make it much worse by being a founder because this is something I have to do so do you want to come along otherwise I want to fund it myself like if I can't say that like I don't need the money because I can like handle payroll and like hire an intern and get an assistant like that's all fine but I really don't want to go back to meta I want to like get two years to like try to find a problem we're solving that feels like a bad timeAlessio [00:34:12]: Yeah Jason is like I wear a YSL jacket on stage at AI Engineer Summit I don't need your accelerator moneyJason [00:34:18]: And boots, you don't forget the boots. 
But I think that is a part of it right I think it is just like optionality and also just like I'm a lot older now I think 22 year old Jason would have been probably too scared and now I'm like too wise but I think it's a matter of like oh if you raise money you have to have a plan of spending it and I'm just not that creative with spending that much money yeah I mean to be clear you just celebrated your 30th birthday happy birthday yeah it's awesome so next week a lot older is relative to some some of the folks I think seeing on the career tipsAlessio [00:34:48]: I think Swix had a great post about are you too old to get into AI I saw one of your tweets in January 23 you applied to like Figma, Notion, Cohere, Anthropic and all of them rejected you because you didn't have enough LLM experience I think at that time it would be easy for a lot of people to say oh I kind of missed the boat you know I'm too late not gonna make it you know any advice for people that feel like thatJason [00:35:14]: Like the biggest learning here is actually from a lot of folks in jiu-jitsu they're like oh man like is it too late to start jiu-jitsu like I'll join jiu-jitsu once I get in more shape right it's like there's a lot of like excuses and then you say oh like why should I start now I'll be like 45 by the time I'm any good and say well you'll be 45 anyways like time is passing like if you don't start now you start tomorrow you're just like one more day behind if you're worried about being behind like today is like the soonest you can start right and so you got to recognize that like maybe you just don't want it and that's fine too like if you wanted you would have started I think a lot of these people again probably think of things on a too short time horizon but again you know you're gonna be old anyways you may as well just start now you knowSwyx [00:35:55]: One more thing on I guess the um career advice slash sort of vlogging you always go viral for this post that you wrote on 
advice to young people and the lies you tell yourself oh yeah yeah you said you were writing it for your sister.Jason [00:36:05]: She was like bummed out about going to college and like stressing about jobs and I was like oh and I really want to hear okay and I just kind of like text-to-sweep the whole thing it's crazy it's got like 50,000 views like I'm mind I mean your average tweet has more but that thing is like a 30-minute read nowSwyx [00:36:26]: So there's lots of stuff here which I agree with I you know I'm also of occasionally indulge in the sort of life reflection phase there's the how to be lucky there's the how to have high agency I feel like the agency thing is always a trend in sf or just in tech circles how do you define having high agencyJason [00:36:42]: I'm almost like past the high agency phase now now my biggest concern is like okay the agency is just like the norm of the vector what also matters is the direction right it's like how pure is the shot yeah I mean I think agency is just a matter of like having courage and doing the thing that's scary right you know if people want to go rock climbing it's like do you decide you want to go rock climbing then you show up to the gym you rent some shoes and you just fall 40 times or do you go like oh like I'm actually more intelligent let me go research the kind of shoes that I want okay like there's flatter shoes and more inclined shoes like which one should I get okay let me go order the shoes on Amazon I'll come back in three days like oh it's a little bit too tight maybe it's too aggressive I'm only a beginner let me go change no I think the higher agent person just like goes and like falls down 20 times right yeah I think the higher agency person is more focused on like process metrics versus outcome metrics right like from pottery like one thing I learned was if you want to be good at pottery you shouldn't count like the number of cups or bowls you make you should just weigh the amount of clay you 
use right like the successful person says oh I went through 100 pounds of clay right the less agency was like oh I've made six cups and then after I made six cups like there's not really what are you what do you do next no just pounds of clay pounds of clay same with the work here right so you just got to write the tweets like make the commits contribute open source like write the documentation there's no real outcome it's just a process and if you love that process you just get really good at the thing you're doingSwyx [00:38:04]: yeah so just to push back on this because obviously I mostly agree how would you design performance review systems because you were effectively saying we can count lines of code for developers rightJason [00:38:15]: I don't think that would be the actual like I think if you make that an outcome like I can just expand a for loop right I think okay so for performance review this is interesting because I've mostly thought of it from the perspective of science and not engineering I've been running a lot of engineering stand-ups primarily because there's not really that many machine learning folks the process outcome is like experiments and ideas right like if you think about outcome is what you might want to think about an outcome is oh I want to improve the revenue or whatnot but that's really hard but if you're someone who is going out like okay like this week I want to come up with like three or four experiments I might move the needle okay nothing worked to them they might think oh nothing worked like I suck but to me it's like wow you've closed off all these other possible avenues for like research like you're gonna get to the place that you're gonna figure out that direction really soon there's no way you try 30 different things and none of them work usually like 10 of them work five of them work really well two of them work really really well and one thing was like the nail in the head so agency lets you sort of capture the volume of 
experiments and like experience lets you figure out like oh that other half it's not worth doing right I think experience is going like half these prompting papers don't make any sense just use chain of thought and just you know use a for loop that's basically right it's like usually performance for me is around like how many experiments are you running how oftentimes are you trying.Alessio [00:39:32]: When do you give up on an experiment because a StitchFix you kind of give up on language models I guess in a way as a tool to use and then maybe the tools got better you were right at the time and then the tool improved I think there are similar paths in my engineering career where I try one approach and at the time it doesn't work and then the thing changes but then I kind of soured on that approach and I don't go back to it soonJason [00:39:51]: I see yeah how do you think about that loop so usually when I'm coaching folks and as they say like oh these things don't work I'm not going to pursue them in the future like one of the big things like hey the negative result is a result and this is something worth documenting like this is an academia like if it's negative you don't just like not publish right but then like what do you actually write down like what you should write down is like here are the conditions this is the inputs and the outputs we tried the experiment on and then one thing that's really valuable is basically writing down under what conditions would I revisit these experiments these things don't work because of what we had at the time if someone is reading this two years from now under what conditions will we try again that's really hard but again that's like another skill you kind of learn right it's like you do go back and you do experiments you figure out why it works now I think a lot of it here is just like scaling worked yeah rap lyrics you know that was because I did not have high enough quality data if we phase shift and say okay you don't 
even need training data oh great then it might just work a different domainAlessio [00:40:48]: Do you have anything in your list that is like it doesn't work now but I want to try it again later? Something that people should maybe keep in mind you know people always like agi when you know when are you going to know the agi is here maybe it's less than that but any stuff that you tried recently that didn't work thatJason [00:41:01]: You think will get there I mean I think the personal assistance and the writing I've shown to myself it's just not good enough yet so I hired a writer and I hired a personal assistant so now I'm gonna basically like work with these people until I figure out like what I can actually like automate and what are like the reproducible steps but like I think the experiment for me is like I'm gonna go pay a person like thousand dollars a month that helped me improve my life and then let me get them to help me figure like what are the components and how do I actually modularize something to get it to work because it's not just like a lot gmail calendar and like notion it's a little bit more complicated than that but we just don't know what that is yet those are two sort of systems that I wish gb4 or opus was actually good enough to just write me an essay but most of the essays are still pretty badSwyx [00:41:44]: yeah I would say you know on the personal assistance side Lindy is probably the one I've seen the most flow was at a speaker at the summit I don't know if you've checked it out or any other sort of agents assistant startupJason [00:41:54]: Not recently I haven't tried lindy they were not ga last time I was considering it yeah yeah a lot of it now it's like oh like really what I want you to do is take a look at all of my meetings and like write like a really good weekly summary email for my clients to remind them that I'm like you know thinking of them and like working for them right or it's like I want you to notice that like my monday 
is like way too packed and like block out more time and also like email the people to do the reschedule and then try to opt in to move them around and then I want you to say oh jason should have like a 15 minute prep break after form back to back those are things that now I know I can prompt them in but can it do it well like before I didn't even know that's what I wanted to prompt for us defragging a calendar and adding break so I can like eat lunch yeah that's the AGI test yeah exactly compassion right I think one thing that yeah we didn't touch on it before butAlessio [00:42:44]: I think was interesting you had this tweet a while ago about prompts should be code and then there were a lot of companies trying to build prompt engineering tooling kind of trying to turn the prompt into a more structured thing what's your thought today now you want to turn the thinking into DAGs like do prompts should still be code any updated ideasJason [00:43:04]: It's the same thing right I think you know with Instructor it is very much like the output model is defined as a code object that code object is sent to the LLM and in return you get a data structure so the outputs of these models I think should also be code objects and the inputs somewhat should be code objects but I think the one thing that instructor tries to do is separate instruction data and the types of the output and beyond that I really just think that most of it should be still like managed pretty closely to the developer like so much of is changing that if you give control of these systems away too early you end up ultimately wanting them back like many companies I know that I reach out or ones were like oh we're going off of the frameworks because now that we know what the business outcomes we're trying to optimize for these frameworks don't work yeah because we do rag but we want to do rag to like sell you supplements or to have you like schedule the fitness appointment the prompts are kind of too baked into 
the systems to really pull them back out and like start doing upselling or something it's really funny but a lot of it ends up being like once you understand the business outcomes you care way more about the promptSwyx [00:44:07]: Actually this is fun in our prep for this call we were trying to say like what can you as an independent person say that maybe me and Alessio cannot say or me you know someone at a company say what do you think is the market share of the frameworks the LangChain, the LlamaIndex, the everything...Jason [00:44:20]: Oh massive because not everyone wants to care about the code yeah right I think that's a different question to like what is the business model and are they going to be like massively profitable businesses right making hundreds of millions of dollars that feels like so straightforward right because not everyone is a prompt engineer like there's so much productivity to be captured in like back office optim automations right it's not because they care about the prompts that they care about managing these things yeah but those would be sort of low code experiences you yeah I think the bigger challenge is like okay hundred million dollars probably pretty easy it's just time and effort and they have the manpower and the money to sort of solve those problems again if you go the vc route then it's like you're talking about billions and that's really the goal that stuff for me it's like pretty unclear but again that is to say that like I sort of am building things for developers who want to use infrastructure to build their own tooling in terms of the amount of developers there are in the world versus downstream consumers of these things or even just think of how many companies will use like the adobes and the ibms right because they want something that's fully managed and they want something that they know will work and if the incremental 10% requires you to hire another team of 20 people you might not want to do it and I think that kind 
of organization is really good for, uh, those are bigger companies.
Swyx [00:45:32]: I just want to capture your thoughts on one more thing, which is you said you wanted most of the prompts to stay close to the developer. And Hamel Husain wrote this post, which I really love, called "F You, Show Me the Prompt." Yeah. I think he cites you in one part of the blog post. And I think DSPy is kind of like the complete antithesis of that, which I think is interesting, because I also hold the strong view that AI is a better prompt engineer than you are, and I don't know how to square that. Wondering if you have thoughts.
Jason [00:45:58]: I think something like DSPy can work because there are, like, very short-term metrics to measure success, right? It is like, did you find the PII, or like, did you write the multi-hop question the correct way? But in these workflows that I've been managing, a lot of it is, are we minimizing churn and maximizing retention? Yeah, that's a very long loop. It's not really like an Optuna-like training loop, right? Like, those things are much harder to capture, so we don't actually have those metrics for that, right? And obviously we can figure out, like, okay, is the summary good? But like, how do you measure the quality of the summary? It's like, that feedback loop ends up being a lot longer. And then again, when something changes, it's really hard to make sure that it works across these, like, newer models, or again, like, changes to work for the current process. Like, when we migrate from, like, Anthropic to OpenAI, like, there's just a ton of changes that are, like, infrastructure related, not necessarily around the prompt itself. Yeah, cool. Any other AI engineering startups that you think should not exist before we wrap up? I mean, oh my gosh, I mean, a lot of it, again, it's just like, every time of investors, like, how does this make a billion dollars? Like, it doesn't. I'm gonna go back to just, like, tweeting and holding my breath underwater. Yeah, like, I don't really pay attention too much to
most of this. Like, most of the stuff I'm doing is around, like, the consumer of, like, LLM calls. Yep. I think people just want to move really fast and they will end up picking these vendors, but I don't really know if anything has really, like, blown me out of the water. Like, I only trust myself, but that's also a function of just being an old man. Like, I think, you know, many companies are definitely very happy with using most of these tools anyways, but I definitely think I occupy a very small space in the engineering ecosystem.
Swyx [00:47:41]: Yeah, I would say one of the challenges here, you know, you talk about dealing in the consumer of LLMs space. I think that's where AI engineering differs from ML engineering, and I think a constant disconnect or cognitive dissonance in this field, in the AI engineers that have sprung up, is that they are not as good as the ML engineers, they are not as qualified. I think that, you know, you are someone who has credibility in the MLE space and you are also a very authoritative figure in the AI space. And I think so, and, you know, I think you've built the de facto leading library. I think yours, I think Instructor should be part of the standard lib, even though I try to not use it. Like, I basically also end up rebuilding Instructor, right? Like, that's a lot of the back and forth that we had over the past two days. I think that's the fundamental thing that we're trying to figure out. Like, there's very small supply of MLEs. Not everyone's going to have that experience that you had, but the global demand for AI is going to far outstrip the existing MLEs.
Jason [00:48:36]: So what do we do? Do we force everyone to go through the standard MLE curriculum, or do we make a new one? I'

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Latent Space Chats: NLW (Four Wars, GPT5), Josh Albrecht/Ali Rohde (TNAI), Dylan Patel/Semianalysis (Groq), Milind Naphade (Nvidia GTC), Personal AI (ft. Harrison Chase — LangFriend/LangMem)

Apr 6, 2024 121:17


Our next 2 big events are AI UX and the World's Fair. Join and apply to speak/sponsor!
Due to timing issues we didn't have an interview episode to share with you this week, but not to worry, we have more than enough “weekend special” content in the backlog for you to get your Latent Space fix, whether you like thinking about the big picture, or learning more about the pod behind the scenes, or talking Groq and GPUs, or AI Leadership, or Personal AI. Enjoy!
AI Breakdown
The indefatigable NLW had us back on his show for an update on the Four Wars, covering Sora, Suno, and the reshaped GPT-4 Class Landscape:
and a longer segment on AI Engineering trends covering the future LLM landscape (Llama 3, GPT-5, Gemini 2, Claude 4), Open Source Models (Mistral, Grok), Apple and Meta's AI strategy, new chips (Groq, MatX) and the general movement from baby AGIs to vertical Agents:
Thursday Nights in AI
We're also including swyx's interview with Josh Albrecht and Ali Rohde to reintroduce swyx and Latent Space to a general audience, and engage in some spicy Q&A:
Dylan Patel on Groq
We hosted a private event with Dylan Patel of SemiAnalysis (our last pod here):
Not all of it could be released so we just talked about our Groq estimates:
Milind Naphade - Capital One
In relation to conversations at NeurIPS and Nvidia GTC and upcoming at World's Fair, we also enjoyed chatting with Milind Naphade about his AI Leadership work at IBM, Cisco, Nvidia, and now leading the AI Foundations org at Capital One.
We covered:
* Milind's learnings from ~25 years in machine learning
* His first paper citation was 24 years ago
* Lessons from working with Jensen Huang for 6 years and being CTO of Metropolis
* Thoughts on relevant AI research
* GTC takeaways and what makes NVIDIA special
If you'd like to work on building solutions rather than platform (as Milind put it), his Applied AI Research team at Capital One is hiring, which falls under the Capital One Tech team.
Personal AI Meetup
It all started with a meme:
Within days of each other, BEE, FRIEND, EmilyAI, Compass, Nox and LangFriend were all launching personal AI wearables and assistants. So we decided to put together the world's first Personal AI meetup featuring creators and enthusiasts of wearables. The full video is live now, with full show notes within.
Timestamps
* [00:01:13] AI Breakdown Part 1
* [00:02:20] Four Wars
* [00:13:45] Sora
* [00:15:12] Suno
* [00:16:34] The GPT-4 Class Landscape
* [00:17:03] Data War: Reddit x Google
* [00:21:53] Gemini 1.5 vs Claude 3
* [00:26:58] AI Breakdown Part 2
* [00:27:33] Next Frontiers: Llama 3, GPT-5, Gemini 2, Claude 4
* [00:31:11] Open Source Models - Mistral, Grok
* [00:34:13] Apple MM1
* [00:37:33] Meta's $800b AI rebrand
* [00:39:20] AI Engineer landscape - from baby AGIs to vertical Agents
* [00:47:28] Adept episode - Screen Multimodality
* [00:48:54] Top Model Research from January Recap
* [00:53:08] AI Wearables
* [00:57:26] Groq vs Nvidia month - GPU Chip War
* [01:00:31] Disagreements
* [01:02:08] Summer 2024 Predictions
* [01:04:18] Thursday Nights in AI - swyx
* [01:33:34] Dylan Patel - Semianalysis + Latent Space Live Show
* [01:34:58] Groq
Transcript
[00:00:00] swyx: Welcome to the Latent Space Podcast Weekend Edition. This is Charlie, your AI co-host. Swyx and Alessio are off for the week, making more great content. We have exciting interviews coming up with Elicit, Chroma, Instructor, and our upcoming series on NSFW, Not Safe for Work AI.
In today's episode, we're collating some of Swyx and Alessio's recent appearances, all in one place for you to find.
[00:00:32] swyx: In part one, we have our first crossover pod of the year. In our listener survey, several folks asked for more thoughts from our two hosts. In 2023, Swyx and Alessio did crossover interviews with other great podcasts like the AI Breakdown, Practical AI, Cognitive Revolution, ThursdAI, and ChinaTalk, all of which you can find on the Latent Space About page.
[00:00:56] swyx: NLW of the AI Breakdown asked us back to do a special on the Four Wars framework and the AI engineer scene. We love AI Breakdown as one of the best examples of daily podcasts to keep up on AI news, so we were especially excited to be back on. Watch out and take
[00:01:12] NLW: care
[00:01:13] AI Breakdown Part 1
[00:01:13] NLW: today on the AI Breakdown. Part one of my conversation with Alessio and Swyx from Latent Space.
[00:01:19] NLW: All right, fellas, welcome back to the AI Breakdown. How are you doing? I'm good. Very good. With the last, the last time we did this show, we were like, oh yeah, let's do check ins like monthly about all the things that are going on and then, of course, six months later, and, you know, the, the, the world has changed in a thousand ways.
[00:01:36] NLW: It's just, it's too busy to even, to even think about podcasting sometimes. But I, I'm super excited to, to be chatting with you again. I think there's, there's a lot to, to catch up on, just to tap in, I think in the, you know, in the beginning of 2024.
And, and so, you know, we're gonna talk today about just kind of a, a, a broad sense of where things are in some of the key battles in the AI space.[00:01:55] NLW: And then the, you know, one of the big things that I, that I'm really excited to have you guys on here for us to talk about where, sort of what patterns you're seeing and what people are actually trying to build, you know, where, where developers are spending their, their time and energy and, and, and any sort of, you know, trend trends there, but maybe let's start I guess by checking in on a framework that you guys actually introduced, which I've loved and I've cribbed a couple of times now, which is this sort of four wars of the, of the AI stack.[00:02:20] Four Wars[00:02:20] NLW: Because first, since I have you here, I'd love, I'd love to hear sort of like where that started gelling. And then and then maybe we can get into, I think a couple of them that are you know, particularly interesting, you know, in the, in light of[00:02:30] swyx: some recent news. Yeah, so maybe I'll take this one. So the four wars is a framework that I came up around trying to recap all of 2023.[00:02:38] swyx: I tried to write sort of monthly recap pieces. And I was trying to figure out like what makes one piece of news last longer than another or more significant than another. And I think it's basically always around battlegrounds. Wars are fought around limited resources. And I think probably the, you know, the most limited resource is talent, but the talent expresses itself in a number of areas.[00:03:01] swyx: And so I kind of focus on those, those areas at first. So the four wars that we cover are the data wars, the GPU rich, poor war, the multi modal war, And the RAG and Ops War. And I think you actually did a dedicated episode to that, so thanks for covering that. Yeah, yeah.[00:03:18] NLW: Not only did I do a dedicated episode, I actually used that.[00:03:22] NLW: I can't remember if I told you guys. 
I did give you big shoutouts. But I used it as a framework for a presentation at Intel's big AI event that they hold each year, where they have all their folks who are working on AI internally. And it totally resonated. That's amazing. Yeah, so, so, what got me thinking about it again is specifically this Inflection news that we recently had, this sort of, you know, basically, I can't imagine that anyone who's listening wouldn't have thought about it, but, you know, Inflection is one of the big contenders, right?
[00:03:53] NLW: I think probably most folks would have put them, you know, just a half step behind the Anthropics and OpenAIs of the world in terms of labs, but it's a company that raised 1.3 billion last year, less than a year ago. Reid Hoffman's a co-founder, Mustafa Suleyman, who's a co-founder of DeepMind, you know, so it's like, this is not a small startup, let's say, at least in terms of perception.
[00:04:13] NLW: And then we get the news that basically most of the team, it appears, is heading over to Microsoft and they're bringing in a new CEO. And you know, I'm interested in, in, in kind of your take on how much that reflects, like hold aside, I guess, you know, all the other things that it might be about, how much it reflects this sort of the, the stark,
[00:04:32] NLW: brutal reality of competing in the frontier model space right now. And, you know, just the access to compute.
[00:04:38] Alessio: There are a lot of things to say. So first of all, there's always somebody who's more GPU rich than you. So Inflection is GPU rich by startup standard. I think about 22,000 H100s, but obviously that pales compared to the, to Microsoft.
[00:04:55] Alessio: The other thing is that this is probably good news, maybe for the startups. It's like being GPU rich, it's not enough. You know, like I think they were building something pretty interesting in, in Pi of their own model of their own kind of experience.
But at the end of the day, you're the interface that people consume as end users.
[00:05:13] Alessio: It's really similar to a lot of the others. So and we'll tell, talk about GPT-4 and Claude 3 and all this stuff. GPU poor, doing something that the GPU rich are not interested in, you know, we just had our AI center of excellence at Decibel and one of the AI leads at one of the big companies was like, Oh, we just saved 10 million and we use these models to do a translation, you know, and that's it.
[00:05:39] Alessio: It's not, it's not AGI, it's just translation. So I think like the Inflection part is maybe a wake-up call to a lot of startups to then say, Hey, you know, trying to get as much capital as possible, try and get as many GPUs as possible. Good. But at the end of the day, it doesn't build a business, you know, and maybe what Inflection, I don't, I don't, again, I don't know the reasons behind the Inflection choice, but if you say, I don't want to build my own company that has
[00:06:05] Alessio: 1.3 billion and I want to go do it at Microsoft, it's probably not a resources problem. It's more of strategic decisions that you're making as a company. So yeah, that was kind of my take on it.
[00:06:15] swyx: Yeah, and I guess on my end, two things actually happened yesterday. It was a little bit quieter news, but Stability AI had some pretty major departures as well.
[00:06:25] swyx: And you may not be considering it, but Stability is actually also a GPU rich company in the sense that they were the first new startup in this AI wave to brag about how many GPUs that they have. And you should join them. And you know, Emad is definitely a GPU trader in some sense from his hedge fund days.
[00:06:43] swyx: So Robin Rombach and like the most of the Stable Diffusion 3 people left Stability yesterday as well. So yesterday was kind of like a big news day for the GPU rich companies, both Inflection and Stability having sort of wind taken out of their sails.
I think, yes, it's a data point in the favor of, like, just because you have the GPUs doesn't mean you can, you automatically win.
[00:07:03] swyx: And I think, you know, kind of I'll echo what Alessio says there. But in general also, like, I wonder if this is like the start of a major consolidation wave, just in terms of, you know, I think that there was a lot of funding last year and, you know, the business models have not been, you know, not all of these things worked out very well.
[00:07:19] swyx: Even Inflection couldn't do it. And so I think maybe that's the start of a small consolidation wave. I don't think that's like a sign of AI winter. I keep looking for AI winter coming. I think this is kind of like a brief cold front. Yeah,
[00:07:34] NLW: it's super interesting. So I think a bunch of, a bunch of stuff here.
[00:07:38] NLW: One is, I think, to both of your points, there, in some ways, there, there had already been this very clear demarcation between these two sides where, like, the GPU poors, to use the terminology, like, just weren't trying to compete on the same level, right? You know, the vast majority of people who have started something over the last year, year and a half, call it, were racing in a different direction.
[00:07:59] NLW: They're trying to find some edge somewhere else. They're trying to build something different. If they're, if they're really trying to innovate, it's in different areas. And so it's really just this very small handful of companies that are in this like very, you know, it's like the Coheres and Jaspers of the world that like this sort of, you know, that are that are just sort of a little bit less resourced than, you know, than the other set that I think that this potentially even applies to, you know, everyone else that could clearly demarcate it into these two, two sides.
[00:08:26] NLW: And there's only a small handful kind of sitting uncomfortably in the middle, perhaps.
Let's, let's come back to the idea of, of the sort of AI winter or, you know, a cold front or anything like that. So this is something that I, I spent a lot of time kind of thinking about and noticing. And my perception is that the vast majority of the folks who are trying to call for sort of, you know, a trough of disillusionment or, you know, a shifting of the phase to that are people who either, A, just don't like AI for some other reason, there's plenty of that, you know, people who are saying, look, they're doing way worse than they ever thought.
[00:09:03] NLW: You know, there's a lot of sort of confirmation bias kind of thing going on. Or two, media that just needs a different narrative, right? Because they're sort of sick of, you know, telling the same story. Same thing happened last summer, when every, every outlet jumped on the ChatGPT at its first down month story to try to really like kind of hammer this idea that that the hype was too much.
[00:09:24] NLW: Meanwhile, you have, you know, just ridiculous levels of investment from enterprises, you know, coming in. You have, you know, huge, huge volumes of, you know, individual behavior change happening. But I do think that there's nothing incoherent sort of to your point, Swyx, about that and the consolidation period.
[00:09:42] NLW: Like, you know, if you look right now, for example, there are, I don't know, probably 25 or 30 credible, like, build your own chatbot platforms that, you know, a lot of which have, you know, raised funding. There's no universe in which all of those are successful across, you know, even with a, even, even with a total addressable market of every enterprise in the world, you know, you're just inevitably going to see some amount of consolidation.
[00:10:08] NLW: Same with, you know, image generators. There are, if you look at A16Z's top 50 consumer AI apps, just based on, you know, web traffic or whatever, they're still like I don't know, a half dozen or 10 or something, like, some ridiculous number of like, basically things like Midjourney or DALL-E 3. And it just seems impossible that we're gonna have that many, you know, ultimately as, as, as sort of, you know, going, going concerns.
[00:10:33] NLW: So, I don't know. I, I, I think that the, there will be inevitable consolidation 'cause you know, it's, it's also what kind of like venture rounds are supposed to do. You're not, not everyone who gets a seed round is supposed to get to series A and not everyone who gets a series A is supposed to get to series B.
[00:10:46] NLW: That's sort of the natural process. I think it will be tempting for a lot of people to try to infer from that something about AI not being as sort of big or as as sort of relevant as, as it was hyped up to be. But I, I kind of think that's the wrong conclusion to come to.
[00:11:02] Alessio: I I would say the experimentation
[00:11:04] Alessio: surface is a little smaller for image generation. So if you go back maybe six, nine months, most people will tell you, why would you build a coding assistant when like Copilot and GitHub are just going to win everything because they have the data and they have all the stuff. If you fast forward today, a lot of people use Cursor, everybody was excited about the Devin release on Twitter.
[00:11:26] Alessio: There are a lot of different ways of attacking the market that are not completion of code in the IDE. And even Cursor, like they evolved beyond single line to like chat, to do multi line edits and, and all that stuff. Image generation, I would say, yeah, as a, just as from what I've seen, like maybe the product innovation has slowed down at the UX level and people are improving the models.
[00:11:50] Alessio: So the race is like, how do I make better images? It's not like, how do I make the user interact with the generation process better? And that gets tough, you know? It's hard to like really differentiate yourselves.
So yeah, that's kind of how I look at it. And when we think about multimodality, maybe the reason why people got so excited about Sora is like, oh, this is like a completely It's not a better image model.[00:12:13] Alessio: This is like a completely different thing, you know? And I think the creative mind It's always looking for something that impacts the viewer in a different way, you know, like they really want something different versus the developer mind. It's like, Oh, I, I just, I have this like very annoying thing I want better.[00:12:32] Alessio: I have this like very specific use cases that I want to go after. So it's just different. And that's why you see a lot more companies in image generation. But I agree with you that. If you fast forward there, there's not going to be 10 of them, you know, it's probably going to be one or[00:12:46] swyx: two. Yeah, I mean, to me, that's why I call it a war.[00:12:49] swyx: Like, individually, all these companies can make a story that kind of makes sense, but collectively, they cannot all be true. Therefore, they all, there is some kind of fight over limited resources here. Yeah, so[00:12:59] NLW: it's interesting. 
We wandered very naturally into sort of another one of these wars, which is the multimodality kind of idea, which is, you know, basically a question of whether it's going to be these sort of big everything models that end up winning or whether, you know, you're going to have really specific things, you know, like something, you know, DALL-E 3 inside of sort of OpenAI's larger models versus, you know, a Midjourney or something like that.
[00:13:24] NLW: And at first, you know, I was kind of thinking like, for most of the last, call it six months or whatever, it feels pretty definitively both and in some ways, you know, and that you're, you're seeing just like great innovation on sort of the everything models, but you're also seeing lots and lots happen at sort of the level of kind of individual use cases.
[00:13:45] Sora
[00:13:45] NLW: But then Sora comes along and just like obliterates what I think anyone thought you know, where we were when it comes to video generation. So how are you guys thinking about this particular battle or war at the moment?
[00:13:59] swyx: Yeah, this was definitely a both and story, and Sora tipped things one way for me, in terms of scale being all you need.
[00:14:08] swyx: And the benefit, I think, of having multiple models being developed under one roof. I think a lot of people aren't aware that Sora was developed in a similar fashion to DALL-E 3. And DALL-E 3 had a very interesting paper out where they talked about how they sort of bootstrapped their synthetic data based on GPT-4 Vision and GPT-4.
[00:14:31] swyx: And, and it was just all, like, really interesting, like, if you work on one modality, it enables you to work on other modalities, and all that is more, is, is more interesting.
I think it's beneficial if it's all in the same house, whereas the individual startups who don't, who sort of carve out a single modality and work on that, definitely won't have the state of the art stuff on helping them out on synthetic data.
[00:14:52] swyx: So I do think like, the balance is tilted a little bit towards the God model companies, which is challenging for the, for the, for the sort of dedicated modality companies. But everyone's carving out different niches. You know, like we just interviewed Suno AI, the sort of music model company, and, you know, I don't see OpenAI pursuing music anytime soon.
[00:15:12] Suno
[00:15:12] swyx: Yeah,
[00:15:13] NLW: Suno's been phenomenal to play with. Suno has done that rare thing where, which I think a number of different AI product categories have done, where people who don't consider themselves particularly interested in doing the thing that the AI enables find themselves doing a lot more of that thing, right?
[00:15:29] NLW: Like, it'd be one thing if just musicians were excited about Suno and using it but what you're seeing is tons of people who just like music all of a sudden like playing around with it and finding themselves kind of down that rabbit hole, which I think is kind of like the highest compliment that you can give one of these startups at the
[00:15:45] swyx: early days of it.
[00:15:46] swyx: Yeah, I, you know, I, I asked them directly, you know, in the interview about whether they consider themselves Midjourney for music. And he had a more sort of nuanced response there, but I think that probably the business model is going to be very similar because he's focused on the B2C element of that.
So yeah, I mean, you know, just to, just to tie back to the question about, you know, large multi modality companies versus small dedicated modality companies.
[00:16:10] swyx: Yeah, highly recommend people to read the Sora blog posts and then read through to the DALL-E blog posts because they, they strongly correlated themselves with the same synthetic data bootstrapping methods as DALL-E. And I think once you make those connections, you're like, oh, like it, it, it is beneficial to have multiple state of the art models in house that all help each other.
[00:16:28] swyx: And these, this, that's the one thing that a dedicated modality company cannot do.
[00:16:34] The GPT-4 Class Landscape
[00:16:34] NLW: So I, I wanna jump, I wanna kind of build off that and, and move into the sort of like updated GPT-4 class landscape. 'Cause that's obviously been another big change over the last couple months. But for the sake of completeness, is there anything that's worth touching on with with sort of the quality,
[00:16:46] NLW: quality data or sort of a RAG Ops wars just in terms of, you know, anything that's changed, I guess, for you fundamentally in the last couple of months about where those things stand.
[00:16:55] swyx: So I think we're going to talk about RAG for the Gemini and Claude discussion later. And so maybe briefly discuss the data piece.
[00:17:03] Data War: Reddit x Google
[00:17:03] swyx: I think maybe the only new thing was this Reddit deal with Google for like a 60 million dollar deal just ahead of their IPO, very conveniently turning Reddit into an AI data company. Also, very, very interestingly, a non exclusive deal, meaning that Reddit can resell that data to someone else. And it probably does become table stakes.
[00:17:23] swyx: A lot of people don't know, but a lot of the web text dataset that originally started for GPT 1, 2, and 3 was actually scraped from Reddit, at least the sort of vote scores.
And I think, I think that's a, that's a very valuable piece of information. So like, yeah, I think people are figuring out how to pay for data.
[00:17:40] swyx: People are suing each other over data. This, this, this war is, you know, definitely very, very much heating up. And I don't think, I don't see it getting any less intense. I, you know, next to GPUs, data is going to be the most expensive thing in, in a model stack company. And, you know, a lot of people are resorting to synthetic versions of it, which may or may not be kosher based on how far along or how commercially blessed the, the forms of creating that synthetic data are.
[00:18:11] swyx: I don't know if Alessio, you have any other interactions with like data source companies, but that's my two cents.
[00:18:17] Alessio: Yeah yeah, I actually saw Quentin Anthony from EleutherAI at GTC this week. He's also been working on this. I saw Teknium. He's also been working on the data side. I think especially in open source, people are like, okay, if everybody is putting the gates up, so to speak, to the data we need to make it easier for people that don't have 50 million a year to get access to good data sets.
[00:18:38] Alessio: And Jensen, at his keynote, he did talk about synthetic data a little bit. So I think that's something that we'll definitely hear more and more of in the enterprise, which never bodes well, because then all the, all the people with the data are like, Oh, the enterprises want to pay now? Let me, let me put a pay here Stripe link so that they can give me 50 million.
[00:18:57] Alessio: But it worked for Reddit. I think the stock is up 40 percent today after opening. So yeah, I don't know if it's all about the Google deal, but it's obviously Reddit has been one of those companies where, hey, you got all this like great community, but like, how are you going to make money? And like, they try to sell the avatars.
[00:19:15] Alessio: I don't know if that it's a great business for them.
The, the data part sounds as an investor, you know, the data part sounds a lot more interesting than, than consumer
[00:19:25] swyx: cosmetics. Yeah, so I think, you know there's more questions around data you know, I think a lot of people are talking about the interview that Mira Murati did with the Wall Street Journal, where she, like, just basically had no, had no good answer for where they got the data for Sora.
[00:19:39] swyx: I, I think this is where, you know, there's, it's in nobody's interest to be transparent about data, and it's, it's kind of sad for the state of ML and the state of AI research but it is what it is. We, we have to figure this out as a society, just like we did for music and music sharing. You know, in, in sort of the Napster to Spotify transition, and that might take us a decade.
[00:19:59] swyx: Yeah, I
[00:20:00] NLW: do. I, I agree. I think, I think that you're right to identify it, not just as that sort of technical problem, but as one where society has to have a debate with itself. Because I think that there's, if you look at it rationally, there's great kind of points on all sides, not to be the sort of, you know, person who sits in the middle constantly, but it's why I think a lot of these legal decisions are going to be really important because, you know, the job of judges is to listen to all this stuff and try to come to things and then have other judges disagree.
[00:20:24] NLW: And, you know, and have the rest of us all debate at the same time. By the way, as a total aside, I feel like the synthetic data right now is like eggs in the 80s and 90s. Like, whether they're good for you or bad for you, like, you know, we, we get one study that's like synthetic data, you know, there's model collapse.
[00:20:42] NLW: And then we have like a hint that Llama, you know, to the most high performance version of it, which was one they didn't release was trained on synthetic data. So maybe it's good.
It's like, I just feel like every, every other week I'm seeing something sort of different about whether it's a good or bad for, for these models.
[00:20:56] swyx: Yeah. The branding of this is pretty poor. I would kind of tell people to think about it like cholesterol. There's good cholesterol, bad cholesterol. And you can have, you know, good amounts of both. But at this point, it is absolutely without a doubt that most large models from here on out will all be trained on some kind of synthetic data and that is not a bad thing.
[00:21:16] swyx: There are ways in which you can do it poorly. Whether it's commercial, you know, in terms of commercial sourcing or in terms of the model performance. But it's without a doubt that good synthetic data is going to help your model. And this is just a question of like where to obtain it and what kinds of synthetic data are valuable.
[00:21:36] swyx: You know, if even like AlphaGeometry, you know, was, was a really good example from like earlier this year.
[00:21:42] NLW: If you're using the cholesterol analogy, then my, then my egg thing can't be that far off. Let's talk about the sort of the state of the art and the, and the GPT-4 class landscape and how that's changed.
[00:21:53] Gemini 1.5 vs Claude 3
[00:21:53] NLW: 'Cause obviously, you know, sort of the, the two big things or a couple of the big things that have happened since we last talked were one, you know, Gemini first announcing that a model was coming and then finally it arriving, and then very soon after a sort of a different model arriving from Gemini and and Claude 3.
[00:22:11] NLW: So I guess, you know, I'm not sure exactly where the right place to start with this conversation is, but, you know, maybe very broadly speaking which of these do you think have made a bigger impact? Thank you.
[00:22:20] Alessio: Probably the one you can use, right? So, Claude.
Well, I'm sure Gemini is going to be great once they let me in, but so far I haven't been able to.

[00:22:29] Alessio: I use, so I have this small podcaster thing that I built for our podcast, which does chapters creation, like named entity recognition, summarization, and all of that. Claude 3 is better than GPT-4. Claude 2 was unusable, so I used GPT-4 for everything. And then when Opus came out, I tried them again side by side and I posted it on, on Twitter as well.

[00:22:53] Alessio: Claude is better. It's very good, you know, it's much better, it seems to me, it's much better than GPT-4 at doing writing that is more, you know, I don't know, it just got good vibes, you know, like the GPT-4 text, you can tell it's like GPT-4, you know, it's like, it always uses certain types of words and phrases and, you know, maybe it's just me because I've now done it for, you know, I've read like 75, 80 generations of these things next to each other.

[00:23:21] Alessio: Claude is really good. I know everybody is freaking out on Twitter about it; my only experience of "this is much better" has been on the podcast use case. But I know that, you know, Karan from, from Nous Research is a very big Opus pro, pro-Opus person. So, I think that's also, it's great to have people that actually care about other models.

[00:23:40] Alessio: You know, I think so far to a lot of people, maybe Anthropic has been the sibling in the corner, you know, it's like Claude releases a new model and then OpenAI releases Sora and like, you know, there are like all these different things, but yeah, the new models are good. It's interesting.

[00:23:55] NLW: My, my perception is definitely that just, just observationally, Claude 3 is certainly the first thing that I've seen where lots of people,

[00:24:06] NLW: they're, no one's debating evals or anything like that.
They're talking about the specific use cases that they have, that they used to use chat GPT for every day, you know, day in, day out, that they've now just switched over. And that has, I think, shifted a lot of the sort of like vibe and sentiment in the space too.[00:24:26] NLW: And I don't necessarily think that it's sort of a A like full you know, sort of full knock. Let's put it this way. I think it's less bad for open AI than it is good for anthropic. I think that because GPT 5 isn't there, people are not quite willing to sort of like, you know get overly critical of, of open AI, except in so far as they're wondering where GPT 5 is.[00:24:46] NLW: But I do think that it makes, Anthropic look way more credible as a, as a, as a player, as a, you know, as a credible sort of player, you know, as opposed to to, to where they were.[00:24:57] Alessio: Yeah. And I would say the benchmarks veil is probably getting lifted this year. I think last year. People were like, okay, this is better than this on this benchmark, blah, blah, blah, because maybe they did not have a lot of use cases that they did frequently.[00:25:11] Alessio: So it's hard to like compare yourself. So you, you defer to the benchmarks. I think now as we go into 2024, a lot of people have started to use these models from, you know, from very sophisticated things that they run in production to some utility that they have on their own. Now they can just run them side by side.[00:25:29] Alessio: And it's like, Hey, I don't care that like. The MMLU score of Opus is like slightly lower than GPT 4. It just works for me, you know, and I think that's the same way that traditional software has been used by people, right? Like you just strive for yourself and like, which one does it work, works best for you?[00:25:48] Alessio: Like nobody looks at benchmarks outside of like sales white papers, you know? And I think it's great that we're going more in that direction. 
We have an episode with Adept coming out this weekend. And in some of their model releases, they specifically say, we do not care about benchmarks, so we didn't put them in, you know, because we, we don't want to look good on them.

[00:26:06] Alessio: We just want the product to work. And I think more and more people will, will

[00:26:09] swyx: go that way. Yeah. I, I would say like, it does take the wind out of the sails for GPT-5, which I know we're, you know, curious about later on. I think anytime you put out a new state of the art model, you have to break through in some way.

[00:26:21] swyx: And what Claude and Gemini have done is effectively take away any advantage to saying that you have a million token context window. Now everyone's just going to be like, oh, okay, now you just match the other two guys. And so that puts an insane amount of pressure on what GPT-5 is going to be, because it's just going to have, like, the only option it has now, because all the other models are multimodal, all the other models are long context, all the other models have perfect recall, GPT-5 has to match everything and do more to, to not be a flop.

[00:26:58] AI Breakdown Part 2

[00:26:58] NLW: Hello friends, back again with part two. If you haven't heard part one of this conversation, I suggest you go check it out, but to be honest they are kind of actually separable. In this conversation, we get into a topic that I think Alessio and Swyx are very well positioned to discuss, which is what developers care about right now, what people are trying to build around.

[00:27:16] NLW: I honestly think that one of the best ways to see the future in an industry like AI is to try to dig deep on what developers and entrepreneurs are attracted to build, even if it hasn't made it to the news pages yet. So consider this your preview of six months from now, and let's dive in.
Let's bring it to the GPT-5 conversation.

[00:27:33] Next Frontiers: Llama 3, GPT-5, Gemini 2, Claude 4

[00:27:33] NLW: I mean, so, so I think that that's a great sort of assessment of just how the stakes have been raised, you know, is your, I mean, so I guess maybe, maybe I'll, I'll frame this less as a question, just sort of something that, that I, that I've been watching. Right now, the only thing that makes sense to me with how

[00:27:50] NLW: fundamentally unbothered and unstressed OpenAI seems about everything is that they're sitting on something that does meet all that criteria, right? Because, I mean, even in the Lex Fridman interview that, that Altman recently did, you know, he's talking about other things coming out first. He's talking about, he's just like, he, listen, he, he's good and he could play nonchalant, you know, if he wanted to.

[00:28:13] NLW: So I don't want to read too much into it, but, you know, they've had so long to work on this, like unless we are like really meaningfully running up against some constraint, it just feels like, you know, there's going to be some massive increase, but I don't know. What do you guys think?

[00:28:28] swyx: Hard to speculate.

[00:28:29] swyx: You know, at this point, they're, they're pretty good at PR and they're not going to tell you anything that they don't want to. And they can tell you one thing and change their minds the next day. So it's, it's, it's really, you know, I've always said that model version numbers are just marketing exercises, like they have something and it's always improving and at some point you just cut it and decide to call it GPT-5.

[00:28:50] swyx: And it's more just about defining an arbitrary level at which they're ready, and it's up to them on what ready means. We definitely did see some leaks on GPT-4.5, as I think a lot of people reported, and I'm not sure if you covered it. So it seems like there might be an intermediate release.
But I did feel, coming out of the Lex Fridman interview, that GPT-5 was nowhere near.

[00:29:11] swyx: And you know, it was kind of a sharp contrast to Sam talking at Davos in February, saying that, you know, it was his top priority. So I find it hard to square. And honestly, like, there's also no point reading too much tea leaves into what any one person says about something that hasn't happened yet or a decision that hasn't been taken yet.

[00:29:31] swyx: Yeah, that's, that's my 2 cents about it. Like, calm down, let's just build.

[00:29:35] Alessio: Yeah. The, the February rumor was that they were gonna work on AI agents, so I don't know, maybe they're like, yeah,

[00:29:41] swyx: they had two, I think two agent projects, right? One desktop agent and one sort of more general, yeah, sort of GPTs-like agent, and then Andrej left, so he was supposed to be the guy on that.

[00:29:52] swyx: What did Andrej see? What did he see? I don't know. What did he see?

[00:29:56] Alessio: I don't know. But again, it's just like the rumors are always floating around, you know, but I think like, this is, you know, we're not going to get to the end of the year without GPT-5, you know, that's definitely happening. I think the biggest question is like, are Anthropic and Google

[00:30:13] Alessio: increasing the pace, you know, like, is Claude 4 coming out like in 12 months, like nine months? What's the, what's the deal? Same with Gemini. They went from like 1 to 1.5 in like five days or something. So when's Gemini 2 coming out, you know, is that going to be soon? I don't know.

[00:30:31] Alessio: There, there are a lot of speculations, but the good thing is that now you can see a world in which OpenAI doesn't rule everything. You know, so that, that's the best, that's the best news that everybody got, I would say.

[00:30:43] swyx: Yeah, and Mistral Large also dropped in the last month.
And, you know, not as, not quite GPT-4 class, but very good from a new startup.

[00:30:52] swyx: So yeah, we, we have now slowly changed the landscape, you know. In my January recap, I was complaining that nothing's changed in the landscape for a long time. But now we do exist in a world, sort of a multipolar world, where Claude and Gemini are legitimate challengers to GPT-4, and hopefully more will emerge as well, hopefully from Meta.

[00:31:11] Open Source Models - Mistral, Grok

[00:31:11] NLW: So speak, let's actually talk about sort of the open source side of this for a minute. So Mistral Large, notable because it's, it's not available open source in the same way that other things are, although I think my perception is that the community has largely given them, like, the community largely recognizes that they want them to keep building open source stuff and they have to find some way to fund themselves so that they're going to do that.

[00:31:27] NLW: And so they kind of understand that there's like, they got to figure out how to eat, but we've got, so, you know, there, there's Mistral, there's, I guess, Grok now, which is, you know, Grok-1 is from, from October, is, is open

[00:31:38] swyx: sourced at, yeah. Yeah, sorry, I thought you, thought you meant Groq the chip company.

[00:31:41] swyx: No, no, no, yeah, you mean Twitter Grok.

[00:31:43] NLW: Although Groq the chip company, I think, is even more interesting in some ways, but, and then there's the, you know, obviously Llama 3 is the one that sort of everyone's wondering about too.
And, you know, my, my sense of that, the little bit that, you know, Zuckerberg was talking about Llama 3 earlier this year, suggested that, at least from an ambition standpoint, he was not thinking about how do I make sure that, you know, Meta, kind of, you know, keeps, keeps the open source throne, you know, vis-a-vis Mistral.

[00:32:09] NLW: He was thinking about how you go after, you know, how, how he, you know, releases a thing that's, you know, every bit as good as whatever OpenAI is on at that point.

[00:32:16] Alessio: Yeah. From what I heard in the hallways at, at GTC, Llama 3, the, the biggest model will be, you know, 260 to 300 billion parameters, so that, that's quite large.

[00:32:26] Alessio: That's not an open source model. You know, you cannot give people a 300 billion parameter model and ask them to run it. You know, it's very compute intensive. So I think it is, it

[00:32:35] swyx: can be open source. It's just, it's going to be difficult to run, but that's a separate question.

[00:32:39] Alessio: It's more like, as you think about what they're doing it for, you know, it's not like empowering the person running

[00:32:45] Alessio: Llama on, on their laptop, it's like, oh, you can actually now use this to go after OpenAI, to go after Anthropic, to go after some of these companies at like the middle complexity level, so to speak. Yeah. So obviously, you know, we had Soumith Chintala on the podcast, they're doing a lot here, they're making PyTorch better.

[00:33:03] Alessio: You know, they want to, that's kind of like maybe a little bit of a shot at NVIDIA, in a way, trying to get some of the CUDA dominance out of it. Yeah, no, it's great. The, I love the Zuck destroying a lot of monopolies arc. You know, it's, it's been very entertaining.
Let's bridge

[00:33:18] NLW: into the sort of big tech side of this, because this is obviously like, so I think actually when I did my episode, this was one of the, I added this as one of, as an additional war that, that's something that I'm paying attention to.

[00:33:29] NLW: So we've got Microsoft's moves with Inflection, which I think potentially are being read as a shift vis-a-vis the relationship with OpenAI, which also the sort of Mistral Large relationship seems to reinforce as well. We have Apple potentially entering the race, finally, you know, giving up Project Titan and, and kind of trying to spend more effort on this.

[00:33:50] NLW: Although, counterpoint, we also have them talking about it, or there being reports of a deal with Google, which, you know, is interesting to sort of see what their strategy there is. And then, you know, Meta's been largely quiet. We kind of just talked about the main piece, but, you know, there's, and then there's spoilers like Elon.

[00:34:07] NLW: I mean, you know, what, what of those things has sort of been most interesting to you guys as you think about what's going to shake out for the rest of this

[00:34:13] Apple MM1

[00:34:13] swyx: year? I'll take a crack. So the reason we don't have a fifth war for the Big Tech Wars is that's one of those things where I just feel like we don't cover differently from other media channels, I guess.

[00:34:26] swyx: Sure, yeah. In our anti-interests, we actually say, like, we try not to cover the Big Tech Game of Thrones, or it's proxied through Twitter. You know, all the other four wars anyway, so there's just a lot of overlap. Yeah, I think absolutely, personally, the most interesting one is Apple entering the race.

[00:34:41] swyx: They actually released, they announced their first large language model that they trained themselves. It's like a 30 billion multimodal model.
People weren't that impressed, but it was like the first time that Apple has kind of showcased that, yeah, we're training large models in house as well. Of course, like, they might be doing this deal with Google.

[00:34:57] swyx: I don't know. It sounds very sort of rumor-y to me. And it's probably, if it's on device, it's going to be a smaller model. So something like a Gemma. It's going to be smarter autocomplete. I don't know what to say. I'm still here dealing with, like, Siri, which hasn't, probably hasn't been updated since God knows when it was introduced.

[00:35:16] swyx: It's horrible. I, you know, it, it, it makes me so angry. So I, I, one, as an Apple customer and user, I, I'm just hoping for better AI on Apple itself. But two, they are the gold standard when it comes to local devices, personal compute and, and trust, like you, you trust them with your data. And I think that's what a lot of people are looking for in AI, that they have, they love the benefits of AI, they don't love the downsides, which is that you have to send all your data to some cloud somewhere.

[00:35:45] swyx: And some of this data that we're going to feed AI is just the most personal data there is. So Apple being like one of the most trusted personal data companies, I think it's very important that they enter the AI race, and I hope to see more out of them.

[00:35:58] Alessio: To me, the, the biggest question with the Google deal is like, who's paying who?

[00:36:03] Alessio: Because for the browsers, Google pays Apple like 18, 20 billion every year to be the default browser. Is Google going to pay you to have Gemini or is Apple paying Google to have Gemini? I think that's, that's like what I'm most interested to figure out because with the browsers, it's like, it's the entry point to the thing.

[00:36:21] Alessio: So it's really valuable to be the default. That's why Google pays. But I wonder if like the perception in AI is going to be like, hey,
you just have to have a good local model on my phone to be worth me purchasing your device. And that would, that's kind of drive Apple to be the one buying the model. But then, like Shawn said, they're doing the MM1 themselves.

[00:36:40] Alessio: So are they saying we do models, but they're not as good as the Google ones? I don't know. The whole thing is, it's really confusing, but it makes for great meme material on, on Twitter.

[00:36:51] swyx: Yeah, I mean, I think, like, they are, possibly more than OpenAI and Microsoft and Amazon, they are the most full stack company there is in computing, and so, like, they own the chips, man.

[00:37:05] swyx: Like, they manufacture everything, so if, if, if there was a company that could, you know, seriously challenge the other AI players, it would be Apple. And it's, I don't think it's as hard as self driving. So like maybe they've, they've just been investing in the wrong thing this whole time. We'll see.

[00:37:21] swyx: Wall Street certainly thinks

[00:37:22] NLW: so. Wall Street loved that move, man. There's a big, a big sigh of relief. Well, let's, let's move away from, from sort of the big stuff. I mean, the, I think to both of your points, it's going to.

[00:37:33] Meta's $800b AI rebrand

[00:37:33] NLW: Can I, can

[00:37:34] swyx: I, can I, can I jump on factoid about this, this Wall Street thing? I went and looked at when Meta went from being a VR company to an AI company.

[00:37:44] swyx: And I think the stock, I'm trying to look up the details now. The stock has gone up 187% since Llama 1. Yeah. Which is $830 billion in market value created in the past year. Yeah. Yeah.

[00:37:57] NLW: It's, it's, it's like, remember if you guys haven't, yeah.
If you haven't seen the chart, it's actually like remarkable.

[00:38:02] NLW: If you draw a little

[00:38:03] swyx: arrow on it, it's like, no, we're an AI company now and forget the VR thing.

[00:38:10] NLW: It's, it, it is an interesting, no, it's, I, I think, Alessio, you called it sort of like Zuck's Disruptor Arc or whatever. He, he really does. He is in the midst of a, of a total, you know, I don't know if it's a redemption arc or it's just, it's something different where, you know, he, he's sort of the spoiler.

[00:38:25] NLW: Like people loved him just freestyle talking about why he thought they had a better headset than Apple. But even if they didn't agree, they just loved it. He was going direct to camera and talking about it for, you know, five minutes or whatever. So that, that's a fascinating shift that I don't think anyone had on their bingo card, you know, whatever, two years ago.

[00:38:41] NLW: Yeah. Yeah,

[00:38:42] swyx: we still

[00:38:43] Alessio: didn't see him fight Elon though, so

[00:38:45] swyx: that's what I'm really looking forward to. I mean, hey, don't, don't, don't write it off, you know, maybe just these things take a while to happen. But we need to see him fight in the Coliseum. No, I think, you know, in terms of like self management, life leadership, I think he has, there's a lot of lessons to learn from him.

[00:38:59] swyx: You know, he might, you know, you might kind of quibble with, like, the social impact of Facebook, but just himself as a, in terms of personal growth and, and, you know, perseverance through like a lot of change and, you know, everyone throwing stuff his way. I think there's a lot to say about, like, to learn from, from Zuck, which is crazy 'cause he's my age.

[00:39:18] swyx: Yeah. Right.

[00:39:20] AI Engineer landscape - from baby AGIs to vertical Agents

[00:39:20] NLW: Awesome.
Well, so, so one of the big things that I think you guys have, you know, distinct and, and unique insight into, being where you are and what you work on, is, you know, what developers are getting really excited about right now. And by that, I mean, on the one hand, certainly, you know, like startups who are actually kind of formalized and formed to startups, but also, you know, just in terms of like what people are spending their nights and weekends on, what they're, you know, coming to hackathons to do.

[00:39:45] NLW: And, you know, I think it's a, it's a, it's, it's such a fascinating indicator for, for where things are headed. Like if you zoom back a year, right now was right when everyone was getting so, so excited about AI agent stuff, right? AutoGPT and BabyAGI. And these things were like, if you dropped anything on YouTube about those, like instantly tens of thousands of views.

[00:40:07] NLW: I know because I had like a 50,000 view video, like the second day that I was doing the show on YouTube, you know, because I was talking about AutoGPT. And so anyways, you know, obviously that's sort of not totally come to fruition yet, but what are some of the trends in what you guys are seeing in terms of people's, people's interest and, and, and what people are building?

[00:40:24] Alessio: I can start maybe with the agents part and then I know Shawn is doing a diffusion meetup tonight. There's a lot of, a lot of different things. The, the agent wave has been the most interesting kind of like dream to reality arc. So AutoGPT, I think they went from zero to like 125,000 GitHub stars in six weeks, and then one year later, they have 150,000 stars.

[00:40:49] Alessio: So there's kind of been a big plateau. I mean, you might say there are just not that many people that can star it. You know, everybody already starred it. But the promise of, hey, I'll just give you a goal, and you do it, I think it's like, amazing to get people's imagination going.
You know, they're like, oh, wow, this, this is awesome.

[00:41:08] Alessio: Everybody, everybody can try this to do anything. But then as technologists, you're like, well, that's, that's just like not possible, you know, we would have like solved everything. And I think it takes a little bit to go from the promise and the hope that people show you to then try it yourself and going back to say, okay, this is not really working for me.

[00:41:28] Alessio: And David Luan from Adept, you know, they, in our episode, he specifically said, we don't want to do a bottom-up product. You know, we don't want something that everybody can just use and try because it's really hard to get it to be reliable. So we're seeing a lot of companies doing vertical agents that are narrow for a specific domain, and they're very good at something.

[00:41:49] Alessio: Mike Conover, who was at Databricks before, is also a friend of Latent Space. He's doing this new company called BrightWave doing AI agents for financial research, and that's it, you know, and they're doing very well. There are other companies doing it in security, doing it in compliance, doing it in legal.

[00:42:08] Alessio: All of these things that like, people, nobody just wakes up and says, oh, I cannot wait to go on AutoGPT and ask it to do a compliance review of my thing. You know, just not what inspires people. So I think the gap on the developer side has been, the more bottoms-up hacker mentality is trying to build this like very generic agents that can do a lot of open ended tasks.

[00:42:30] Alessio: And then the more business side of things is like, hey, if I want to raise my next round, I cannot just like sit around and mess, mess around with like super generic stuff. I need to find a use case that really works.
And I think that that is worth for, for a lot of folks in parallel, you have a lot of companies doing evals.[00:42:47] Alessio: There are dozens of them that just want to help you measure how good your models are doing. Again, if you build evals, you need to also have a restrained surface area to actually figure out whether or not it's good, right? Because you cannot eval anything on everything under the sun. So that's another category where I've seen from the startup pitches that I've seen, there's a lot of interest in, in the enterprise.[00:43:11] Alessio: It's just like really. Fragmented because the production use cases are just coming like now, you know, there are not a lot of long established ones to, to test against. And so does it, that's kind of on the virtual agents and then the robotic side it's probably been the thing that surprised me the most at NVIDIA GTC, the amount of robots that were there that were just like robots everywhere.[00:43:33] Alessio: Like, both in the keynote and then on the show floor, you would have Boston Dynamics dogs running around. There was, like, this, like fox robot that had, like, a virtual face that, like, talked to you and, like, moved in real time. There were industrial robots. NVIDIA did a big push on their own Omniverse thing, which is, like, this Digital twin of whatever environments you're in that you can use to train the robots agents.[00:43:57] Alessio: So that kind of takes people back to the reinforcement learning days, but yeah, agents, people want them, you know, people want them. I give a talk about the, the rise of the full stack employees and kind of this future, the same way full stack engineers kind of work across the stack. In the future, every employee is going to interact with every part of the organization through agents and AI enabled tooling.[00:44:17] Alessio: This is happening. 
It just needs to be a lot more narrow than maybe the first approach that we took, which is just put a string in AutoGPT and pray. But yeah, there's a lot of super interesting stuff going on.

[00:44:27] swyx: Yeah. Well, Alessio covered a lot of stuff there. I'll separate the robotics piece because I feel like that's so different from the software world.

[00:44:34] swyx: But yeah, we do talk to a lot of engineers and, you know, that, this is our sort of bread and butter. And I do agree that vertical agents have worked out a lot better than the horizontal ones. I think all, you know, the point I'll make here is just the reason AutoGPT and BabyAGI, you know, it's in the name, like they were promising AGI.

[00:44:53] swyx: But I think people are discovering that you cannot engineer your way to AGI. It has to be done at the model level, and all these engineering, prompt engineering hacks on top of it weren't really going to get us there in a meaningful way without much further, you know, improvements in the models. I would say, I'll go so far as to say, even Devin, which is, I would, I think the most advanced agent that we've ever seen, still requires a lot of engineering and still probably falls apart a lot in terms of, like, practical usage.

[00:45:22] swyx: Or it's just way too slow and expensive for, you know, what it's, what it's promised compared to the video. So yeah, that's, that's what, that's what happened with agents from, from last year. But I, I do, I do see, like, vertical agents being very popular and, and sometimes you, like, I think the word agent might even be overused sometimes.

[00:45:38] swyx: Like, people don't really care whether or not you call it an AI agent, right? Like, does it replace boring menial tasks that I do, that I might hire a human to do, or that the human who is hired to do it, like, actually doesn't really want to do?
And I think there's absolutely ways in sort of a vertical context that you can actually go after very routine tasks that can be scaled out to a lot of, you know, AI assistants.

[00:46:01] swyx: So, so yeah, I mean, and I would, I would sort of basically plus one what Alessio said there. I think it's, it's very, very promising and I think more people should work on it, not less. Like there's not enough people. Like, we, like, this should be the, the, the main thrust of the AI engineer, is to look out, look for use cases and, and go to production with them instead of just always working on some AGI promising thing that never arrives.

[00:46:21] swyx: I,

[00:46:22] NLW: I, I can only add that, so I've been fiercely making tutorials behind the scenes around basically everything you can imagine with AI. We've probably done, we've done about 300 tutorials over the last couple of months. And the verticalized anything, right, like this is a solution for your particular job or role, even if it's way less interesting or kind of sexy, it's like so radically more useful to people in terms of intersecting with how, like those are the ways that people are actually

[00:46:50] NLW: adopting AI in a lot of cases, is just a, a, a thing that I do over and over again. By the way, I think that's the same way that even the generalized models are getting adopted. You know, it's like, I use Midjourney for lots of stuff, but the main thing I use it for is YouTube thumbnails every day. Like day in, day out, I will always do a YouTube thumbnail, you know, or two with, with Midjourney, right?

[00:47:09] NLW: And it's like you can, you can start to extrapolate that across a lot of things and all of a sudden, you know, AI, it looks revolutionary because of a million small changes rather than one sort of big dramatic change.
And I think that the verticalization of agents is sort of a great example of how that's

[00:47:26] swyx: going to play out too.

[00:47:28] Adept episode - Screen Multimodality

[00:47:28] swyx: So I'll have one caveat here, which is I think that because multimodal models are now commonplace, like Claude, Gemini, OpenAI, all very, very easily multimodal, Apple's easily multimodal, all this stuff, there is a switch for agents for sort of general desktop browsing that I think people [inaudible]

[00:48:04] swyx: version of the, the agent where they're not specifically taking in text or anything. They're just watching your screen just like someone else would, and, and piloting it by vision. And, you know, in the, the episode with David that will have dropped by the time that this, this airs, I think, I think that is the promise of Adept, and that is a promise of what a lot of these sort of desktop agents are, and that is the more general purpose system that could be as big as the browser, the operating system, like, people really want to build that foundational piece of software in AI.

[00:48:38] swyx: And I would see, like, the potential there for desktop agents being that, that you can have sort of self driving computers. You know, don't write the horizontal piece out. I just think we took a while to get there.

[00:48:48] NLW: What else are you guys seeing that's interesting to you? I'm looking at your notes and I see a ton of categories.

[00:48:54] Top Model Research from January Recap

[00:48:54] swyx: Yeah, so I'll take the next two as like, as one category, which is basically alternative architectures, right? The two main things that everyone following AI kind of knows now is, one, the diffusion architecture, and two, the, let's just say the decoder-only transformer architecture that is popularized by GPT.

[00:49:12] swyx: You can read, you can look on YouTube for thousands and thousands of tutorials on each of those things.
What we are talking about here is what's next, what people are researching, and what could be on the horizon that takes the place of those other two things. So first of all, we'll talk about transformer architectures and then diffusion.

[00:49:25] swyx: So transformers, the, the two leading candidates are effectively RWKV and the state space models, the most recent one of which is Mamba, but there's others like StripedHyena and the S4, H3 stuff coming out of Hazy Research at Stanford. And all of those are non-quadratic language models that promise to scale a lot better than the, the traditional transformer.

[00:49:47] swyx: This might be too theoretical for most people right now, but it's, it's gonna be, it's gonna come out in weird ways, where, imagine if like, right now the talk of the town is that Claude and Gemini have a million tokens of context and like, whoa, you can put in like, you know, two hours of video now, okay. But like what if you put, what if we could like throw in, you know, two hundred thousand hours of video?

[00:50:09] swyx: Like how does that change your usage of AI? What if you could throw in the entire genetic sequence of a human and like synthesize new drugs? Like, well, how does that change things? Like, we don't know because we haven't had access to this capability being so cheap before. And that's the ultimate promise of these two models.

[00:50:28] swyx: They're not there yet, but we're seeing very, very good progress. RWKV and Mamba are probably the, like, the two leading examples, both of which are open source, so you can try them today, and, and have a lot of progress there. And the, the, the main thing I'll highlight for RWKV is that at, at the 7B level, they seem to have beat Llama 2 in all benchmarks that matter at the same size for the same amount of training as an open source model.

[00:50:51] swyx: So that's exciting. You know, they're there, they're 7B now. They're not at 70B. We don't know if it'll.
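
As a rough illustration of the "non-quadratic" point above, here is a minimal sketch, not taken from any of the models named. It assumes a deliberately simplified cost model: full self-attention compares every token with every other token, so its work grows with the square of the sequence length, while a recurrent or state-space style model does one state update per token. The function names are hypothetical, and the counts ignore constants, layers, and hidden sizes.

```python
def attention_pairs(n_tokens: int) -> int:
    """Token-to-token comparisons in full self-attention (grows as n^2)."""
    return n_tokens * n_tokens


def recurrent_steps(n_tokens: int) -> int:
    """State updates in a linear-time recurrent/state-space model (grows as n)."""
    return n_tokens


def cost_ratio(n_tokens: int) -> float:
    """How many times more work the quadratic model does at this length."""
    return attention_pairs(n_tokens) / recurrent_steps(n_tokens)


if __name__ == "__main__":
    # Assumed context lengths, chosen only to show how the gap widens.
    for n in (1_000, 100_000, 1_000_000):
        print(f"{n:>9} tokens: quadratic is {cost_ratio(n):,.0f}x the linear cost")
```

Under this simplified model the gap equals the sequence length itself, which is why very long inputs, such as hours of video or a whole genome, are where linear-time architectures would matter most if they hold up at scale.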
We don't know if it'll. And then the other thing is diffusion. Diffusion and transformers are kind of on a collision course. The original Stable Diffusion already used transformers in parts of its architecture.

[00:51:06] swyx: It seems that transformers are eating more and more of those layers, particularly the sort of VAE layer. So the Diffusion Transformer is what Sora is built on. The guy who wrote the Diffusion Transformer paper, Bill Peebles, is the lead tech guy on Sora. So you'll just see a lot more Diffusion Transformer stuff going on.

[00:51:25] swyx: But there's more sort of experimentation with diffusion. I'm holding a meetup actually here in San Francisco that's gonna be like the state of diffusion, which I'm pretty excited about. Stability's doing a lot of good work. And if you look at the architecture of how they're creating Stable Diffusion 3, Hourglass Diffusion, and the consistency models, or SDXL Turbo.

[00:51:45] swyx: All of these are very, very interesting innovations on the original idea of what Stable Diffusion was. So if you think that it is expensive or slow to create Stable Diffusion or AI-generated art, you are not up to date with the latest models. If you think it is hard to create text and images, you are not up to date with the latest models.

[00:52:02] swyx: And people still are kind of far behind. The last piece of which is the wildcard I always kind of hold out, which is text diffusion. So instead of using autoregressive transformers, can you diffuse text? So you can use diffusion models to create entire chunks of text all at once instead of token by token.

[00:52:22] swyx: And that is something that Midjourney confirmed today, because it was only rumored the past few months. But they confirmed today that they were looking into it.
So all those things are very exciting new model architectures that are maybe something that you'll see in production two to three years from now.

[00:52:37] NLW: So a couple of the trends that I want to just get your takes on, because they're sort of something that seems like they're coming up, are, one, these wearable, you know, kind of passive AI experiences where they're absorbing a lot of what's going on around you and then kind of bringing things back.

[00:52:53] NLW: And then the other one that I wanted to see if you guys had thoughts on was sort of this next generation of chip companies. Obviously there's a huge amount of emphasis on hardware and silicon and different ways of doing things, but, y

Skisporet
Season finale: Johaug comeback, Birken recap, and national team selection for the World Championship season

Skisporet

Play Episode Listen Later Mar 21, 2024 52:49


As the winter season, and with it the ski season, draws to a close, we welcome you to the very last episode of the Skisporet podcast for this time around. Join us as we discuss national team selection, reflect on Birken, and look ahead to the Norwegian Championships (NM) and the World Championships (VM). With Mikael Gunnulfsen and Håvard Rønning. The Skisporet podcast is published by Swix and Skisporet.no.

Skisporet
Petter Northug is back! Will there be a World Championship comeback? How much is he putting into his skiing, and what does it take to succeed in Birken? You'll get the answers here

Skisporet

Play Episode Listen Later Mar 6, 2024 62:09


FINALLY, Petter Northug is back on the Skisporet podcast. This time he talks about everything from comeback speculation to his own season and his goals as a long-distance skier. Here you get a unique look into the daily training and mindset of the greatest of the greats in cross-country skiing. Whether you are a young skier or a keen recreational skier, Petter has tips and advice that will make you a better skier. Petter Northug joins the podcast together with Team Swix skier Mikael Gunnulfsen. The host is Håvard Rønning. The podcast is published by Swix and Skisporet.no.

Skisporet
Help, I'm about to race, but I've gotten sick. What do I do now? | Experiences with illness before ski races, Vasaloppet and Birken

Skisporet

Play Episode Listen Later Feb 28, 2024 41:20


Every skier at the top level has experienced getting sick before an important race. Sure, missing a World Championship, an Olympics, or a World Cup debut is a bigger crisis than you, as a recreational skier, missing Birken or Vasaloppet because of a fever. But the common denominator, whatever the skier's level, is the same: you have trained and worked toward a goal, but lose the opportunity because of illness. What should you do then? Should you start Birken and hope for the best? Or should you throw in the towel and turn your focus to next year? In this podcast we have a good talk about the topic. You'll hear Team Swix skier Mikael Gunnulfsen, who in 2020 had to turn back on the way to his first World Cup because of illness. You'll also hear recreational skier Erik Gundersen, who is bedridden with a fever right now, only days before his big goal: Vasaloppet. The podcast is published by Swix and Skisporet.no. The host is Håvard Rønning.

Skisporet
How you are tested for fluorine in Birken and other mass-participation races, and how easily you can make sure your skis are good and fluorine-free - With Petter Skinstad and Swix Racing Service

Skisporet

Play Episode Listen Later Feb 20, 2024 48:53


The fluorine ban is here to stay, and in recent weekends both recreational skiers and citizen racers have experienced fluorine testing up close. At the Trysil Skimaraton, ten percent of the tested participants were in fact disqualified for having skis with too high a fluorine content. With increased focus on testing in well-known races such as Birkebeinerrennet and Vasaloppet, it is essential to make sure your skis are clean and free of fluorine if you plan to take part. Fortunately, getting fluorine-free skis is not a challenge. In this episode we take on questions and clear up misconceptions about fluorine, how it is tested, and how you ensure clean skis. In the podcast you'll hear Team Coop Madshus skier Petter Skinstad and Henrik Johnsen from Swix Racing Service. The host is Håvard Rønning. The podcast is published by Swix and Skisporet.no. In this article you'll learn how to clean fluorine from your skis: https://www.swixsport.com/no/tips/artikler/skismoring/vask-og-rens-av-fluor/

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

We're writing this one day after the monster release of OpenAI's Sora and Gemini 1.5. We covered this on the ThursdAI space, so head over there for our takes.

IRL: We're ONE WEEK away from Latent Space: Final Frontiers, the second edition and anniversary of our first ever Latent Space event! Also: join us on June 25-27 for the biggest AI Engineer conference of the year!

Online: All three Discord clubs are thriving. Join us every Wednesday/Friday!

Almost 12 years ago, while working at Spotify, Erik Bernhardsson built one of the first open source vector databases, Annoy, based on ANN search. He also built Luigi, one of the predecessors to Airflow, which helps data teams orchestrate and execute data-intensive and long-running jobs. Surprisingly, he didn't start yet another vector database company, but instead in 2021 founded Modal, the “high-performance cloud for developers”. In 2022 they opened doors to developers after their seed round, and in 2023 announced their GA with a $16m Series A.

More importantly, they have won fans among both household names like Ramp, Scale AI, Substack, and Cohere, and newer startups like (upcoming guest!) Suno.ai and individual hackers (Modal was the top tool of choice in the Vercel AI Accelerator).

We've covered the nuances of GPU workloads, and how we need new developer tooling and runtimes for them (see our episodes with Chris Lattner of Modular and George Hotz of tiny to start). In this episode, we run through the major limitations of the actual infrastructure behind the clouds that run these models, and how Erik envisions the “postmodern data stack”. In his 2021 blog post “Software infrastructure 2.0: a wishlist”, Erik had “Truly serverless” as one of his points:

* The word cluster is an anachronism to an end-user in the cloud! I'm already running things in the cloud where there's elastic resources available at any time. Why do I have to think about the underlying pool of resources? Just maintain it for me.
* I don't ever want to provision anything in advance of load.
* I don't want to pay for idle resources. Just let me pay for whatever resources I'm actually using.
* Serverless doesn't mean it's a burstable VM that saves its instance state to disk during periods of idle.

Swyx called this Self Provisioning Runtimes back in the day. Modal doesn't put you in YAML hell, preferring to colocate infra provisioning right next to the code that utilizes it, so you can just add GPU (and disk, and retries…).

After 3 years, we finally have a big market push for this: running inference on generative models is going to be the killer app for serverless, for a few reasons:

* AI models are stateless: even in conversational interfaces, each message generation is a fully-contained request to the LLM. There's no knowledge that is stored in the model itself between messages, which means that tear down / spin up of resources doesn't create any headaches with maintaining state.
* Token-based pricing is better aligned with serverless infrastructure than fixed monthly costs of traditional software.
* GPU scarcity makes it really expensive to have reserved instances that are available to you 24/7. It's much more convenient to build with a serverless-like infrastructure.

In the episode we covered a lot more topics like maximizing GPU utilization, why Oracle Cloud rocks, and how Erik has never owned a TV in his life. Enjoy!

Show Notes

* Modal
* ErikBot
* Erik's Blog
* Software Infra 2.0 Wishlist
* Luigi
* Annoy
* Hetzner
* CoreWeave
* Cloudflare FaaS
* Poolside AI
* Modular Inference Engine

Chapters

* [00:00:00] Introductions
* [00:02:00] Erik's OSS work at Spotify: Annoy and Luigi
* [00:06:22] Starting Modal
* [00:07:54] Vision for a "postmodern data stack"
* [00:10:43] Solving container cold start problems
* [00:12:57] Designing Modal's Python SDK
* [00:15:18] Self-Provisioning Runtime
* [00:19:14] Truly Serverless Infrastructure
* [00:20:52] Beyond model inference
* [00:22:09] Tricks to maximize GPU utilization
* [00:26:27] Differences in AI and data science workloads
* [00:28:08] Modal vs Replicate vs Modular and lessons from Heroku's "graduation problem"
* [00:34:12] Creating Erik's clone "ErikBot"
* [00:37:43] Enabling massive parallelism across thousands of GPUs
* [00:39:45] The Modal Sandbox for agents
* [00:43:51] Thoughts on the AI Inference War
* [00:49:18] Erik's best tweets
* [00:51:57] Why buying hardware is a waste of money
* [00:54:18] Erik's competitive programming background
* [00:59:02] Why does Sweden have the best Counter Strike players?
* [00:59:53] Never owning a car or TV
* [01:00:21] Advice for infrastructure startups

Transcript

Alessio [00:00:00]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO-in-Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.

Swyx [00:00:14]: Hey, and today we have in the studio Erik Bernhardsson from Modal. Welcome.

Erik [00:00:19]: Hi. It's awesome being here.

Swyx [00:00:20]: Yeah. Awesome seeing you in person. I've seen you online for a number of years as you were building on Modal and I think you're just making a San Francisco trip just to see people here, right? I've been to like two Modal events in San Francisco here.

Erik [00:00:34]: Yeah, that's right.
We're based in New York, so I figured sometimes I have to come out to capital of AI and make a presence.Swyx [00:00:40]: What do you think is the pros and cons of building in New York?Erik [00:00:45]: I mean, I never built anything elsewhere. I lived in New York the last 12 years. I love the city. Obviously, there's a lot more stuff going on here and there's a lot more customers and that's why I'm out here. I do feel like for me, where I am in life, I'm a very boring person. I kind of work hard and then I go home and hang out with my kids. I don't have time to go to events and meetups and stuff anyway. In that sense, New York is kind of nice. I walk to work every morning. It's like five minutes away from my apartment. It's very time efficient in that sense. Yeah.Swyx [00:01:10]: Yeah. It's also a good life. So we'll do a brief bio and then we'll talk about anything else that people should know about you. Actually, I was surprised to find out you're from Sweden. You went to college in KTH and your master's was in implementing a scalable music recommender system. Yeah.Erik [00:01:27]: I had no idea. Yeah. So I actually studied physics, but I grew up coding and I did a lot of programming competition and then as I was thinking about graduating, I got in touch with an obscure music streaming startup called Spotify, which was then like 30 people. And for some reason, I convinced them, why don't I just come and write a master's thesis with you and I'll do some cool collaborative filtering, despite not knowing anything about collaborative filtering really. But no one knew anything back then. So I spent six months at Spotify basically building a prototype of a music recommendation system and then turned that into a master's thesis. And then later when I graduated, I joined Spotify full time.Swyx [00:02:00]: So that was the start of your data career. You also wrote a couple of popular open source tooling while you were there. 
Is that correct?Erik [00:02:09]: No, that's right. I mean, I was at Spotify for seven years, so this is a long stint. And Spotify was a wild place early on and I mean, data space is also a wild place. I mean, it was like Hadoop cluster in the like foosball room on the floor. It was a lot of crude, like very basic infrastructure and I didn't know anything about it. And like I was hired to kind of figure out data stuff. And I started hacking on a recommendation system and then, you know, got sidetracked in a bunch of other stuff. I fixed a bunch of reporting things and set up A-B testing and started doing like business analytics and later got back to music recommendation system. And a lot of the infrastructure didn't really exist. Like there was like Hadoop back then, which is kind of bad and I don't miss it. But I spent a lot of time with that. As a part of that, I ended up building a workflow engine called Luigi, which is like briefly like somewhat like widely ended up being used by a bunch of companies. Sort of like, you know, kind of like Airflow, but like before Airflow. I think it did some things better, some things worse. I also built a vector database called Annoy, which is like for a while, it was actually quite widely used. In 2012, so it was like way before like all this like vector database stuff ended up happening. And funny enough, I was actually obsessed with like vectors back then. Like I was like, this is going to be huge. Like just give it like a few years. I didn't know it was going to take like nine years and then there's going to suddenly be like 20 startups doing vector databases in one year. So it did happen. In that sense, I was right. I'm glad I didn't start a startup in the vector database space. I would have started way too early. But yeah, that was, yeah, it was a fun seven years as part of it. It was a great culture, a great company.Swyx [00:03:32]: Yeah. 
Just to take a quick tangent on this vector database thing, because we probably won't revisit it but like, has anything architecturally changed in the last nine years?Erik [00:03:41]: I'm actually not following it like super closely. I think, you know, some of the best algorithms are still the same as like hierarchical navigable small world.Swyx [00:03:51]: Yeah. HNSW.Erik [00:03:52]: Exactly. I think now there's like product quantization, there's like some other stuff that I haven't really followed super closely. I mean, obviously, like back then it was like, you know, it's always like very simple. It's like a C++ library with Python bindings and you could mmap big files and into memory and like they had some lookups. I used like this kind of recursive, like hyperspace splitting strategy, which is not that good, but it sort of was good enough at that time. But I think a lot of like HNSW is still like what people generally use. Now of course, like databases are much better in the sense like to support like inserts and updates and stuff like that. I know I never supported that. Yeah, it's sort of exciting to finally see like vector databases becoming a thing.Swyx [00:04:30]: Yeah. Yeah. And then maybe one takeaway on most interesting lesson from Daniel Ek?Erik [00:04:36]: I mean, I think Daniel Ek, you know, he started Spotify very young. Like he was like 25, something like that. And that was like a good lesson. But like he, in a way, like I think he was a very good leader. Like there was never anything like, no scandals or like no, he wasn't very eccentric at all. It was just kind of like very like level headed, like just like ran the company very well, like never made any like obvious mistakes or I think it was like a few bets that maybe like in hindsight were like a little, you know, like took us, you know, too far in one direction or another. 
But overall, I mean, I think he was a great CEO, like definitely, you know, up there, like generational CEO, at least for like Swedish startups.Swyx [00:05:09]: Yeah, yeah, for sure. Okay, we should probably move to make our way towards Modal. So then you spent six years as CTO of Better. You were an early engineer and then you scaled up to like 300 engineers.Erik [00:05:21]: I joined as a CTO when there was like no tech team. And yeah, that was a wild chapter in my life. Like the company did very well for a while. And then like during the pandemic, yeah, it was kind of a weird story, but yeah, it kind of collapsed.Swyx [00:05:32]: Yeah, laid off people poorly.Erik [00:05:34]: Yeah, yeah. It was like a bunch of stories. Yeah. I mean, the company like grew from like 10 people when I joined at 10,000, now it's back to a thousand. But yeah, they actually went public a few months ago, kind of crazy. They're still around, like, you know, they're still, you know, doing stuff. So yeah, very kind of interesting six years of my life for non-technical reasons, like I managed like three, four hundred, but yeah, like learning a lot of that, like recruiting. I spent all my time recruiting and stuff like that. And so managing at scale, it's like nice, like now in a way, like when I'm building my own startup. It's actually something I like, don't feel nervous about at all. Like I've managed a scale, like I feel like I can do it again. It's like very different things that I'm nervous about as a startup founder. But yeah, I started Modal three years ago after sort of, after leaving Better, I took a little bit of time off during the pandemic and, but yeah, pretty quickly I was like, I got to build something. I just want to, you know. Yeah. And then yeah, Modal took form in my head, took shape.Swyx [00:06:22]: And as far as I understand, and maybe we can sort of trade off questions. So the quick history is started Modal in 2021, got your seed with Sarah from Amplify in 2022. 
You just announced your Series A with Redpoint. That's right. And that brings us up to mostly today. Yeah. Most people, I think, were expecting you to build for the data space.Erik: But it is the data space.Swyx:: When I think of data space, I come from like, you know, Snowflake, BigQuery, you know, Fivetran, Nearby, that kind of stuff. And what Modal became is more general purpose than that. Yeah.Erik [00:06:53]: Yeah. I don't know. It was like fun. I actually ran into like Edo Liberty, the CEO of Pinecone, like a few weeks ago. And he was like, I was so afraid you were building a vector database. No, I started Modal because, you know, like in a way, like I work with data, like throughout my most of my career, like every different part of the stack, right? Like I thought everything like business analytics to like deep learning, you know, like building, you know, training neural networks, the scale, like everything in between. And so one of the thoughts, like, and one of the observations I had when I started Modal or like why I started was like, I just wanted to make, build better tools for data teams. And like very, like sort of abstract thing, but like, I find that the data stack is, you know, full of like point solutions that don't integrate well. And still, when you look at like data teams today, you know, like every startup ends up building their own internal Kubernetes wrapper or whatever. And you know, all the different data engineers and machine learning engineers end up kind of struggling with the same things. So I started thinking about like, how do I build a new data stack, which is kind of a megalomaniac project, like, because you kind of want to like throw out everything and start over.Swyx [00:07:54]: It's almost a modern data stack.Erik [00:07:55]: Yeah, like a postmodern data stack. And so I started thinking about that. And a lot of it came from like, like more focused on like the human side of like, how do I make data teams more productive? 
And like, what is the technology tools that they need? And like, you know, drew out a lot of charts of like, how the data stack looks, you know, what are different components. And it shows actually very interesting, like workflow scheduling, because it kind of sits in like a nice sort of, you know, it's like a hub in the graph of like data products. But it was kind of hard to like, kind of do that in a vacuum, and also to monetize it to some extent. I got very interested in like the layers below at some point. And like, at the end of the day, like most people have code to have to run somewhere. So I think about like, okay, well, how do you make that nice? Like how do you make that? And in particular, like the thing I always like thought about, like developer productivity is like, I think the best way to measure developer productivity is like in terms of the feedback loops, like how quickly when you iterate, like when you write code, like how quickly can you get feedback. And at the innermost loop, it's like writing code and then running it. And like, as soon as you start working with the cloud, like it's like takes minutes suddenly, because you have to build a Docker container and push it to the cloud and like run it, you know. So that was like the initial focus for me was like, I just want to solve that problem. Like I want to, you know, build something less, you run things in the cloud and like retain the sort of, you know, the joy of productivity as when you're running things locally. And in particular, I was quite focused on data teams, because I think they had a couple unique needs that wasn't well served by the infrastructure at that time, or like still is in like, in particular, like Kubernetes, I feel like it's like kind of worked okay for back end teams, but not so well for data teams. And very quickly, I got sucked into like a very deep like rabbit hole of like...Swyx [00:09:24]: Not well for data teams because of burstiness. 
Yeah, for sure.Erik [00:09:26]: So like burstiness is like one thing, right? Like, you know, like you often have this like fan out, you want to like apply some function over very large data sets. Another thing tends to be like hardware requirements, like you need like GPUs and like, I've seen this in many companies, like you go, you know, data scientists go to a platform team and they're like, can we add GPUs to the Kubernetes? And they're like, no, like, that's, you know, complex, and we're not gonna, so like just getting GPU access. And then like, I mean, I also like data code, like frankly, or like machine learning code like tends to be like, super annoying in terms of like environments, like you end up having like a lot of like custom, like containers and like environment conflicts. And like, it's very hard to set up like a unified container that like can serve like a data scientist, because like, there's always like packages that break. And so I think there's a lot of different reasons why the technology wasn't well suited for back end. And I think the attitude at that time is often like, you know, like you had friction between the data team and the platform team, like, well, it works for the back end stuff, you know, why don't you just like, you know, make it work. But like, I actually felt like data teams, you know, or at this point now, like there's so much, so many people working with data, and like they, to some extent, like deserve their own tools and their own tool chains, and like optimizing for that is not something people have done. So that's, that's sort of like very abstract philosophical reason why I started Model. 
And then I got sucked into this like rabbit hole of like container cold start and, you know, like whatever, Linux, page cache, you know, file system optimizations.

Swyx [00:10:43]: Yeah, tell people, I think the first time I met you, I think you told me some numbers, but I don't remember, like, what are the main achievements that you were unhappy with the status quo? And then you built your own container stack?

Erik [00:10:52]: Yeah, I mean, like, in particular, it was like, in order to have that loop, right? You want to be able to start, like take code on your laptop, whatever, and like run in the cloud very quickly, and like running in custom containers, and maybe like spin up like 100 containers, 1000, you know, things like that. And so container cold start was the initial like, from like a developer productivity point of view, it was like, really, what I was focusing on is, I want to take code, I want to stick it in a container, I want to execute it in the cloud, and like, you know, make it feel fast. And when you look at like, how Docker works, for instance, like Docker, you have this like, fairly convoluted, like very resource inefficient way, you know, you build a container, you upload the whole container, and then you download it, and you run it. And Kubernetes is also like, not very fast at like starting containers. So like, I started kind of like, you know, going a layer deeper. Like Docker is actually like, you know, there's like a couple of different primitives, but like a lower level primitive is runc, which is like a container runner. And I was like, what if I just take the container runner, like runc, and I point it to like my own root file system, and then I built like my own virtual file system that exposes files over a network instead.
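The idea described above, a root file system whose files are fetched over the network only when something reads them, can be sketched in a few lines of Python. This is a toy illustration and not Modal's implementation: `RemoteBlobStore`, `LazyImage`, and `build_image` are hypothetical names, the "network" is an in-memory dictionary, and keying the host cache by content hash is an assumption made for the sketch.

```python
import hashlib


class RemoteBlobStore:
    """Hypothetical stand-in for a network file server: content hash -> bytes."""

    def __init__(self):
        self._blobs = {}
        self.fetch_count = 0  # number of blobs that crossed the simulated network

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def fetch(self, digest: str) -> bytes:
        self.fetch_count += 1
        return self._blobs[digest]


class LazyImage:
    """A container image held as a path -> content-hash index, with no file bytes."""

    def __init__(self, index, store, cache):
        self._index = index
        self._store = store
        self._cache = cache  # shared by every image on the same host

    def read(self, path: str) -> bytes:
        digest = self._index[path]
        if digest not in self._cache:
            # Only files that are actually read are ever transferred.
            self._cache[digest] = self._store.fetch(digest)
        return self._cache[digest]


def build_image(files, store, cache):
    """Upload file contents once and keep only the lightweight index."""
    index = {path: store.put(data) for path, data in files.items()}
    return LazyImage(index, store, cache)


store = RemoteBlobStore()
host_cache = {}
image_a = build_image(
    {"/lib/libtorch.so": b"torch-bytes", "/usr/share/zoneinfo/Asia/Tashkent": b"tz"},
    store,
    host_cache,
)
image_b = build_image(
    {"/lib/libtorch.so": b"torch-bytes", "/app/main.py": b"print('hi')"},
    store,
    host_cache,
)

image_a.read("/lib/libtorch.so")  # first read: fetched over the "network"
image_b.read("/lib/libtorch.so")  # same content in another image: cache hit
image_b.read("/app/main.py")      # new content: fetched
print(store.fetch_count)          # 2; the timezone file was never transferred
```

Starting a container in this model only requires the index, so startup cost depends on the files a program touches rather than on the total image size.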
And that was like the sort of very crude version of Modal. It's like, now I can actually start containers very quickly, because it turns out like when you start a Docker container, like, first of all, like most Docker images are like several gigabytes, and like 99% of that is never going to be consumed, like there's a bunch of like, you know, like timezone information for like Uzbekistan, like no one's going to read it. And then there's a very high overlap between the files that are going to be read, there's going to be like libtorch or whatever, like it's going to be read. So you can also cache it very well. So that was like the first sort of stuff we started working on was like, let's build this like container file system. And you know, coupled with like, you know, just using runc directly. And that actually enabled us to like, get to this point of like, you write code, and then you can launch it in the cloud within like a second or two, like something like that. And you know, there's been many optimizations since then, but that was sort of the starting point.

Alessio [00:12:33]: Can we talk about the developer experience as well? I think one of the magic things about Modal is at the very basic layers, like a Python function decorator, it's just like stub and whatnot. But then you also have a way to define a full container. What were kind of the design decisions that went into it? Where did you start? How easy did you want it to be? And then maybe how much complexity did you then add on to make sure that every use case fit?

Erik [00:12:57]: I mean, Modal, I almost feel like it's like almost like two products kind of glued together. Like there's like the low level like container runtime, like file system, all that stuff like in Rust. And then there's like the Python SDK, right? Like how do you express applications?
And I think, I mean, Swix, like I think your blog was like the self-provisioning runtime was like, to me, always like to sort of, for me, like an eye-opening thing. It's like, so I didn't think about like...Swyx [00:13:15]: You wrote your post four months before me. Yeah? The software 2.0, Infra 2.0. Yeah.Erik [00:13:19]: Well, I don't know, like convergence of minds. I guess we were like both thinking. Maybe you put, I think, better words than like, you know, maybe something I was like thinking about for a long time. Yeah.Swyx [00:13:29]: And I can tell you how I was thinking about it on my end, but I want to hear you say it.Erik [00:13:32]: Yeah, yeah, I would love to. So to me, like what I always wanted to build was like, I don't know, like, I don't know if you use like Pulumi. Like Pulumi is like nice, like in the sense, like it's like Pulumi is like you describe infrastructure in code, right? And to me, that was like so nice. Like finally I can like, you know, put a for loop that creates S3 buckets or whatever. And I think like Modal sort of goes one step further in the sense that like, what if you also put the app code inside the infrastructure code and like glue it all together and then like you only have one single place that defines everything and it's all programmable. You don't have any config files. Like Modal has like zero config. There's no config. It's all code. And so that was like the goal that I wanted, like part of that. And then the other part was like, I often find that so much of like my time was spent on like the plumbing between containers. 
And so my thing was like, well, if I just build this like Python SDK and make it possible to like bridge like different containers, just like a function call, like, and I can say, oh, this function runs in this container and this other function runs in this container and I can just call it just like a normal function, then, you know, I can build these applications that may span a lot of different environments. Maybe they fan out, start other containers, but it's all just like inside Python. You just like have this beautiful kind of nice like DSL almost for like, you know, how to control infrastructure in the cloud. So that was sort of like how we ended up with the Python SDK as it is, which is still evolving all the time, by the way. We keep changing syntax quite a lot because I think it's still somewhat exploratory, but we're starting to converge on something that feels like reasonably good now.

Swyx [00:14:54]: Yeah. And along the way you, with this expressiveness, you enabled the ability to, for example, attach a GPU to a function. Totally.

Erik [00:15:02]: Yeah. It's like you just like say, you know, on the function decorator, you're like GPU equals, you know, A100 and then or like GPU equals, you know, A10 or T4 or something like that. And then you get that GPU and like, you know, you just run the code and it runs like you don't have to, you know, go through hoops to, you know, start an EC2 instance or whatever.

Swyx [00:15:18]: Yeah. So it's all code. Yeah. So one of the reasons I wrote Self-Provisioning Runtimes was I was working at AWS and we had AWS CDK, which is kind of like, you know, the Amazon Basics Pulumi. Yeah, totally. And then like it compiles to CloudFormation. Yeah. And then on the other side, you have to like get all the config stuff and then put it into your application code and make sure that they line up. So then you're writing code to define your infrastructure, then you're writing code to define your application.
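As a rough illustration of the pattern discussed here (resource requirements declared on the function itself, and functions in different environments called like ordinary functions), the sketch below uses a plain Python decorator. It is a toy stand-in, not the Modal SDK: `function`, `RemoteFunction`, `call`, and `fan_out` are hypothetical names, and everything runs locally in threads where a real platform would start containers.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RemoteFunction:
    """Hypothetical wrapper pairing a function with its infrastructure needs."""

    fn: Callable
    gpu: Optional[str] = None  # e.g. "A100", "A10", "T4"
    retries: int = 0

    def call(self, *args, **kwargs):
        # A real platform would run self.fn in a container with self.gpu attached;
        # this sketch runs it in-process and only models the retry policy.
        last_error = None
        for _ in range(self.retries + 1):
            try:
                return self.fn(*args, **kwargs)
            except Exception as exc:
                last_error = exc
        raise last_error

    def fan_out(self, inputs):
        # Stand-in for starting many containers in parallel: one call per input.
        with ThreadPoolExecutor() as pool:
            return list(pool.map(self.call, inputs))


def function(gpu: Optional[str] = None, retries: int = 0):
    """Decorator that records infrastructure requirements next to the code."""

    def wrap(fn: Callable) -> RemoteFunction:
        return RemoteFunction(fn=fn, gpu=gpu, retries=retries)

    return wrap


@function(gpu="A100", retries=2)
def embed(text: str) -> int:
    # Placeholder for GPU work: count whitespace-separated tokens.
    return len(text.split())


@function()
def pipeline(docs):
    # One "container" calling another, written as a normal function call.
    return sum(embed.fan_out(docs))


print(embed.gpu)                        # A100
print(pipeline.call(["a b c", "d e"]))  # 5
```

The point of the design is that there is no separate config file to keep in sync: changing `gpu="A100"` to `gpu="T4"` is a one-line code change reviewed alongside the function it affects.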
And I was just like, it's obvious that this is going to converge, right? Yeah, totally.
Erik [00:15:48]: But isn't there, I might be wrong, but was it SAM or Chalice or one of those? Isn't that an AWS thing where they actually kind of did that? I feel like there's one.
Swyx [00:15:57]: SAM. Yeah. Still very clunky. It's not as elegant as Modal.
Erik [00:16:03]: I love AWS for the stuff it's built historically, you know, for what it enables me to build, but AWS has always struggled with developer experience.
Swyx [00:16:11]: I mean, they have to not break things.
Erik [00:16:15]: Yeah. Yeah. Totally. And they have to build products for a very wide range of use cases. And I think that's hard.
Swyx [00:16:21]: Yeah. Yeah. So it's easier to design for. Yeah. So anyway, I was pretty convinced that this would happen. I wrote that thing. And then, you know, imagine my surprise that you guys had it on your landing page at some point. I think Akshat was just like, just throw that in there.
Erik [00:16:34]: Did you trademark it?
Swyx [00:16:35]: No, I didn't. But I definitely got sent a few pitch decks with my post on there and it was really interesting. This was my first time kind of putting a name to a phenomenon. And I think this is a useful skill for people to just communicate what they're trying to do.
Erik [00:16:48]: Yeah. No, I think it's a beautiful concept.
Swyx [00:16:50]: Yeah. Yeah. Yeah. But I mean, obviously you implemented it. What became more clear in your explanation today is that actually you're not that tied to Python.
Erik [00:16:57]: No. I mean, I think that all the lower-level stuff is, you know, just running containers and scheduling things and serving container data and stuff. So one of the benefits of data teams is obviously that they're all using Python, right?
And so that made it a lot easier. I think if we had focused on other workloads, like, for various reasons we've been kind of half thinking about CI or things like that. But in a way that's harder, because then you also have to have multiple SDKs, whereas focusing on data teams, Python covers like 95% of all teams. That made it a lot easier. But definitely in the future, we're going to be supporting other languages. JavaScript for sure is the obvious next language. But you know, who knows, Rust, Go, R, whatever, PHP, Haskell, I don't know.
Swyx [00:17:42]: You know, I think for me, I actually am a person who kind of liked the idea of programming language advancements being improvements in developer experience. But all I saw out of the academic sort of PLT type people is just type-level improvements. And I always think, for me, one of the core reasons for self-provisioning runtimes and then why I like Modal is, this is actually a productivity increase, right? It's a language-level thing. You managed to stick it on top of an existing language, but it is your own language, a DSL on top of Python. And so it's a language-level increase on the order of automatic memory management. You could sort of make that analogy that maybe you lose some level of control, but most of the time you're okay with whatever Modal gives you. And that's fine. Yeah.
Erik [00:18:26]: Yeah. Yeah. I mean, that's how I look at it too.
You look at developer productivity over the last number of decades, and it's come in small increments. Dynamic typing is one thing, because now suddenly for a lot of use cases you don't need to care about type systems. Or better compiler technology, or the cloud, or relational databases. And I think if you look at that history, developers have been getting probably 10X more productive every decade for the last four decades or something, which is kind of crazy. On an exponential scale, we're talking about, what is that, a 10,000X improvement in developer productivity. What we can build today is arguably a fraction of the cost of what it took to build it in the eighties. Maybe it wasn't even possible in the eighties. So that to me is so fascinating. I think it's going to keep going for the next few decades. Yeah.
Alessio [00:19:14]: Yeah. Another big thing in the Infra 2.0 wishlist was truly serverless infrastructure. On your landing page, you called them native cloud functions, something like that. I think the issue I've seen with serverless has always been that people really wanted it to be stateful, even though stateless was much easier to do. And I think now with AI, most model inference is stateless, you know, outside of the context. So that's kind of made it a lot easier to just put a model, like an AI model, on Modal to run. How do you think about how that changes how people think about infrastructure too? Yeah.
Erik [00:19:48]: I mean, I think Modal is definitely going in the direction of doing more stateful things and working with data and high-IO use cases.
I do think one massive serendipitous thing that happened halfway, you know, a year and a half into building Modal, was that Gen AI started exploding, and the IO pattern of Gen AI fits the serverless model so well. You send this tiny piece of information, like a prompt, right, or something like that. And then you have this GPU that does trillions of flops, and then it sends back a tiny piece of information, right. And that turns out to be something where, if you can get serverless working with GPUs, it just works really well, right. So from that point of view, serverless always felt to me a little bit like a solution looking for a problem. I don't actually think backend is the problem that needs serverless, or not as much. But I look at data, and in particular things like Gen AI and model inference, and it's clearly a good fit. So I think that to a large extent explains why we saw the initial sort of killer app for Modal being model inference, which actually wasn't necessarily what we were focused on. But that's where we've seen by far the most usage. Yeah.
Swyx [00:20:52]: And this was before you started offering fine-tuning of language models. It was mostly Stable Diffusion. Yeah.
Erik [00:20:59]: Yeah. I mean, Modal, I always built it to be a very general-purpose compute platform, something where you can run everything. And I used to call Modal a better Kubernetes for data teams for a long time. What we realized was, you know, a year and a half in, we barely had any users or any revenue. And we were like, well, maybe we should look at some use case, trying to think of a use case. And that was around the same time Stable Diffusion came out.
And the beauty of Modal is you can run almost anything on Modal, right? Model inference turned out to be the place where we found initially, well, clearly this has 10x better ergonomics than anything else. But we're also, you know, going back to my original vision, thinking a lot about, okay, now we do inference really well. What about training? What about fine-tuning? What about end-to-end lifecycle deployment? What about data pre-processing? What about, I don't know, real-time streaming? What about large data munging, data observability? I think there are so many things, kind of going back to what I said about redefining the data stack, starting with the foundation of compute. One of the exciting things about Modal is we've been working on that for three years and it's maturing, but there are so many things you can do with just a better compute primitive, and also go up the stack and do all this other stuff on top of it.
Alessio [00:22:09]: I would love to learn more about the underlying infrastructure and how you make that happen, because with fine-tuning and training, it's static memory. You know exactly what you're going to load in memory, and it's kind of a set amount of compute, versus inference, where the data is very bursty. How do you make batches work with a serverless developer experience? What are some fun technical challenges you solved to make sure you get max utilization on these GPUs? What we hear from people is, we have GPUs, but we can really only get like 30, 40, 50% utilization maybe.
What's some of the fun stuff you're working on to get a higher number there?
Erik [00:22:48]: Yeah, I think on the inference side, that's where, from a cost perspective and a utilization perspective, we've seen very good numbers. In particular, it's our ability to start containers and stop containers very quickly. That means we can autoscale extremely fast and scale down very quickly, which means we can always adjust the capacity, the number of GPUs running, to the exact traffic volume. And so in many cases that actually leads to a sort of interesting thing. We obviously run our things on the public cloud, like AWS and GCP, and we run on Oracle. But in many cases, for users who do inference on those clouds, even though we charge a slightly higher price per GPU hour, a lot of users moving their large-scale inference use cases to Modal end up saving a lot of money, because we only charge for the time the GPU is actually running. And that's a hard problem, right? If you have to constantly adjust the number of machines, if you have to start containers and stop containers, that's a very hard problem. Starting containers quickly is a very difficult thing. I mentioned we had to build our own file system for this. We also built our own container scheduler for that. We've recently implemented CPU memory checkpointing, so we can take running containers and snapshot the entire CPU state, including registers and everything, and restore it from that point, which means we can restore it from an initialized state. We're looking at GPU checkpointing next, which is a very interesting thing.
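Erik's pricing argument above (a higher price per GPU hour can still cost less when only busy time is billed) comes down to simple arithmetic. The sketch below works it through with made-up numbers: the prices, fleet size, and 40% utilization are illustrative assumptions, not Modal's or any cloud's real figures, and the function names are hypothetical.

```python
def reserved_cost(gpus: int, hours: float, price_per_gpu_hour: float) -> float:
    """Always-on fleet: every provisioned GPU hour is billed, busy or idle."""
    return gpus * hours * price_per_gpu_hour


def usage_billed_cost(busy_gpu_hours: float, price_per_gpu_hour: float) -> float:
    """Scale-to-demand fleet: only GPU hours that did work are billed."""
    return busy_gpu_hours * price_per_gpu_hour


# Illustrative assumptions only, not real prices or measurements.
GPUS = 10
HOURS = 24.0
UTILIZATION = 0.40       # inside the 30-50% range Alessio quotes
RESERVED_PRICE = 2.00    # dollars per GPU hour (made up)
USAGE_PRICE = 3.00       # a 50% premium per GPU hour (made up)

busy_hours = GPUS * HOURS * UTILIZATION
reserved = reserved_cost(GPUS, HOURS, RESERVED_PRICE)
usage = usage_billed_cost(busy_hours, USAGE_PRICE)

# Usage-based billing wins whenever utilization is below this ratio.
break_even_utilization = RESERVED_PRICE / USAGE_PRICE

print(f"reserved: ${reserved:.2f}, usage-billed: ${usage:.2f}")
print(f"break-even utilization: {break_even_utilization:.0%}")
```

Under these assumed numbers the usage-billed fleet is cheaper despite the higher unit price, and it stays cheaper for any utilization below the ratio of the two prices.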
So I think with inference, that's where serverless really shines, because you can push the frontier of latency versus utilization quite substantially, which ends up being either a latency advantage or a cost advantage or both, right? On training, it's arguably less of an advantage doing serverless, frankly, because you can just spin up a bunch of machines and train as much as you can on each machine. For that area, we've seen arguably less usage for Modal, but there are always some interesting use cases. We do have a couple of customers, like Ramp, for instance, who do fine-tuning with Modal, and one of the patterns they have is very bursty fine-tuning where they fine-tune 100 models in parallel. And that's a separate thing that Modal does really well, right? We can start up 100 containers very quickly and run a fine-tuning training job on each one of them that only runs for, I don't know, 10, 20 minutes. And then you can do hyperparameter tuning in that sense, just pick the best model and things like that. So there are interesting training use cases. I think when you get to training very large foundational models, that's a use case we don't support super well, because that's very high IO, you need to have InfiniBand and all these things. Those are things we haven't supported yet, and it might take a while to get to that. So that's probably an area where we're relatively weak. Yeah.
Alessio [00:25:12]: Have you cared at all about lower-level model optimization? There are other cloud providers that do custom kernels to get better performance. Or are you just, given that you're not just an AI compute company?
Yeah.
Erik [00:25:24]: I mean, I think we want to support general workloads, in the sense that we want users to give us a container, essentially, or code. And then we want to run that. So I think we benefit from those things in the sense that we can tell our users to use those things. But I don't know if we want to poke into users' containers and do those things automatically. That's, I think, a little bit tricky to do from the outside, because we want to be able to take arbitrary code and execute it. But certainly we can tell our users to use those things. Yeah.
Swyx [00:25:53]: I may have betrayed my own biases because I don't really think about Modal as for data teams anymore. I think you started, I think you're much more for AI engineers. My favorite anecdote, which I think you know, but I don't know if you directly experienced it: I went to the Vercel AI Accelerator, which you supported. And in the Vercel AI Accelerator, a bunch of startups gave free credits and signups and talks and all that stuff. The only ones that stuck are the ones that actually appealed to engineers. And the top tool used by far was Modal.
Erik [00:26:24]: That's awesome.
Swyx [00:26:25]: For people building AI apps. Yeah.
Erik [00:26:27]: I mean, it might also be a terminology question, like AI versus data, right? Maybe I'm just old and jaded, but I've seen so many different titles. For a while it was, you know, I was a data scientist and a machine learning engineer, and then there were analytics engineers and there was an AI engineer, you know? So in my head, that's to me just data, or engineer, you know. So that's why I've been just calling it data teams.
But of course, AI is such a massive fraction of our workloads.
Swyx [00:26:59]: It's a different Venn diagram of things you do, right? So the stuff that you're talking about where you need InfiniBand for highly parallel training, that's more of the ML engineer, that's more of the research scientist, and less of the AI engineer, who is more sort of trying to work at the application.
Erik [00:27:16]: Yeah. I mean, to be fair, we have a lot of users that are doing stuff that I don't think fits neatly into AI. We have a lot of people using Modal for web scraping. It's kind of nice. You can just fire up a hundred or a thousand containers running Chromium and render a bunch of webpages and it takes, you know, whatever. Or protein folding, I mean, maybe that's, I don't know, but we have a bunch of users doing that. Or in the realm of biotech, sequence alignment. Or a couple of people using Modal to run large mixed-integer programming problems, using Gurobi or things like that. Video processing is another thing that keeps coming up. Let's say you have petabytes of video and you want to just transcode it, you can fire up a lot of containers and just run FFmpeg. So there are those things too. That being said, AI is by far our biggest use case, but again, Modal is kind of general purpose in that sense.
Swyx [00:28:08]: Yeah. Well, maybe I'll stick to the Stable Diffusion thing and then we'll move on to the other use cases for AI that you want to highlight. The other big player in my mind is Replicate. Yeah.
In this era, they're much more, I guess, custom built for that purpose, whereas you're more general purpose. How do you position yourself with them? Are they just for different audiences, or are you competing head-on?
Erik [00:28:29]: I think there's a tiny sliver of the Venn diagram where we're competitive. And then in like 99% of the area we're not competitive. I mean, if you look at front-end engineers, I think that's where they've really found good fit: people who built some cool web app and want some sort of AI capability, and an off-the-shelf model is perfect for them. That's like, use Replicate. That's great. I think where we shine is custom models or custom workflows, running things at very large scale, where you need to care about utilization, care about costs. We have much lower prices because we spend a lot more time optimizing our infrastructure, and that's where we're competitive, right? And you look at some of the use cases, like Suno is a big user, they're running large-scale AI. Oh, we're talking with Mikey.
Swyx [00:29:12]: Oh, that's great. Cool.
Erik [00:29:14]: In a month. Yeah. So, I mean, they're using Modal for production infrastructure. They have their own custom model, custom code and custom weights, for AI-generated music, Suno.AI. Those are the types of use cases that we like, things that are very custom, and those are the things that are very hard to run on Replicate, right? And that's fine. I think they focus on a very different part of the stack in that sense.
Swyx [00:29:35]: And then the other company that I pattern match you to is Modular. I don't know.
Erik [00:29:40]: Because of the names?
Swyx [00:29:41]: No, no. Wow.
No, but yeah, yes, the name is very similar. I think there's something that might be insightful there from a linguistics point of view. Oh no, they have Mojo, the sort of Python SDK. And they have the Modular Inference Engine, which is their cloud stack, their sort of compute inference stack. I don't know if anyone's made that comparison to you before, but I see you evolving a little bit in parallel there.
Erik [00:30:01]: No, I mean, maybe. Yeah. It's not a company I'm super familiar with. I mean, I know the basics, but I guess they're similar in the sense that they have a sort of big-picture vision.
Swyx [00:30:12]: Yes. They also want to build very general purpose. Yeah. So they're marketing themselves as, if you want to do off-the-shelf stuff, go somewhere else. If you want to do custom stuff, we're the best place to do it. Yeah. Yeah. There is some overlap there. There's not overlap in the sense that you are a closed-source platform. People have to host their code on you. That's true. Whereas for them, they're very insistent on not running their own cloud service. They're boxed software. Yeah. They're licensed software.
Erik [00:30:37]: I'm sure their VCs are at some point going to force them to reconsider. No, no.
Swyx [00:30:40]: Chris is very, very insistent and very convincing. So anyway, I would just make that comparison and let people make the links if they want to. But it's an interesting way to see the cloud market develop from my point of view, because I came up in this field thinking cloud is one thing, and I think your vision is something slightly different, and I see the different takes on it.
Erik [00:31:00]: Yeah. And one thing I've written a bit about on my blog too is that I think of us as a second layer of cloud provider, in the sense that I think Snowflake is kind of a good analogy.
Snowflake is infrastructure as a service, right? But they actually run on the major clouds, right? And I mean, you can analyze this very deeply, but one of the things I always thought about is, why does Snowflake win over Redshift? And I think Snowflake, to me, won because, I mean, in the end AWS makes all the money anyway, and Snowflake just had the ability to focus on developer experience, or user experience. And that to me really proved that you can build a cloud provider a layer up from the traditional public clouds. And that layer is also where I would put Modal. We're building a cloud provider, we're a multi-tenant environment that runs the user's code. But we're also building on top of the public cloud. So I think there's a lot of room in that space. I think it's a very interesting direction.
Alessio [00:31:55]: How do you think of that compared to the traditional past history, like, you know, you had AWS, then you had Heroku, then you had Render, Railway.
Erik [00:32:04]: Yeah, I mean, I think those are all great. I think the problem that they all faced was the graduation problem, right? Like Heroku, I mean, there's also a counterfactual future of what would have happened if Salesforce didn't buy them, right? That's a sort of separate thing. But what I think Heroku always struggled with was that eventually companies would get big enough that you couldn't really justify running on Heroku. So they would just go and move it to, you know, AWS in particular. And that's something that keeps me up at night too: what does that graduation risk look like for Modal?
I always think the only way to build a successful infrastructure company in the long run in the cloud today is you have to appeal to the entire spectrum, right? Or at least you have to capture the enterprise market. But the truly good companies capture the whole spectrum, right? I think of companies like, I don't know, Datadog or Mongo or something, that both captured the hobbyists and acquired them, but also have very large enterprise customers. I think that arguably was where, in my opinion, Heroku struggled: how do you maintain the customers as they get more and more advanced? I don't know what the solution is, but that's something I would have thought deeply about if I was at Heroku at that time.
Alessio [00:33:14]: What's the AI graduation problem? Is it, I need to fine-tune the model, I need better economics? Any insights from customer discussions?
Erik [00:33:22]: Yeah, I mean, better economics, certainly. Although I would say, even for people who need thousands of GPUs, just because we can drive utilization so much better, there's actually a cost advantage to staying on Modal. But yeah, I mean, certainly, the fact that VCs love throwing money, or at least used to, at companies who needed to buy GPUs, I think that didn't help the problem. And in training, I think there's less software differentiation. So in training, there are certainly better economics in buying big clusters. But I mean, my hope is it's going to change, right? I think we're still pretty early in the cycle of building AI infrastructure. And I think for a lot of these companies in the long run, except maybe the super big ones like Facebook and Google, they're always going to build their own.
But everyone else, to some extent, I think they're better off buying platforms. And, you know, someone's going to have to build those platforms.
Swyx [00:34:12]: Yeah. Cool. Let's move on to language models and just specifically that workload, just to flesh it out a little bit. You already said that Ramp is fine-tuning 100 models at once on Modal. Closer to home, my favorite example is ErikBot. Maybe you want to tell that story.
Erik [00:34:30]: Yeah. I mean, it was a prototype thing we built for fun, but it's pretty cool. We basically built this thing that hooks up to Slack. It downloads all the Slack history and fine-tunes a model based on a person. And then you can chat with that. So you can, you know, clone yourself and talk to yourself on Slack. I mean, it's a nice demo, and it's fully contained in Modal. There's a Modal app that does everything, right? It integrates with the Slack API, downloads the data, runs the fine-tuning, and then dynamically creates an inference endpoint. And it's all self-contained and, you know, a few hundred lines of code. So I think it kind of demonstrates a lot of the capabilities of Modal.
Alessio [00:35:08]: Yeah. On a more personal side, how close did you feel ErikBot was to you?
Erik [00:35:13]: It definitely captured the language. Yeah. I mean, I don't know, the content, I always feel this way about AI, and it's gotten better. When you look at AI output of text and you glance at it, it's like, yeah, this seems really smart, you know, but then you actually look a little bit deeper. It's like, what does this mean?
Swyx [00:35:32]: What does this person say?
Erik [00:35:33]: It's kind of vacuous, right?
And that's kind of what I felt talking to my clone version. It says things, the grammar is correct, some of the sentences make a lot of sense, but what are you trying to say? There's no content here. I don't know. I mean, I got that feeling also with ChatGPT in the early versions. Right now it's better, but.
Alessio [00:35:51]: That's funny. So I built this thing called smol podcaster to automate a lot of our back office work, so to speak. And it's great at transcripts. It's great at doing chapters. And then I was like, okay, how about you come up with a short summary? And it sounds good, but it's not even in the same ballpark as what we end up writing. Right. And it's hard to see how it's going to get there.
Swyx [00:36:11]: Oh, I have ideas.
Erik [00:36:13]: I'm certain it's going to get there, but I agree with you. Right. And I have the same thing. I don't know if you've read AI-generated books. They just kind of seem funny, right? Something's off, right? You glance at it and it's like, oh, it's kind of cool, looks correct, but then it's very weird when you actually read them.
Swyx [00:36:30]: Yeah. Well, so for what it's worth, I think anyone can join the Modal Slack. Is it open to the public? Yeah, totally.
Erik [00:36:35]: If you go to modal.com, there's a button in the footer.
Swyx [00:36:38]: Yeah. And then you can talk to ErikBot. And then sometimes I really like pinging ErikBot and then you answer afterwards, and you're like, yeah, mostly correct or whatever. Any other broader lessons, you know, just broadening out from the single use case of fine-tuning, what are you seeing people do with fine-tuning or just language models on Modal in general?
Yeah.
Erik [00:36:59]: I mean, I think language models are interesting because so many people get started with APIs, and they're just dominating the space, in particular OpenAI, right? And that's not necessarily a place where we aim to compete. I mean, maybe at some point, but it's just not a core focus for us. And I think, sort of separately, it's a question of whether there's economics in that long term. So we tend to focus more on the areas around it, right? Like fine-tuning. Another use case we have is a bunch of people, Ramp included, doing batch embeddings on Modal. So let's say, actually we're writing a blog post where we take all of Wikipedia and parallelize embeddings in 15 minutes and produce vectors for each article. So those types of use cases, I think Modal suits really well. I think also a lot of custom inference, like yeah, I love that.
Swyx [00:37:43]: Yeah. I think you should give people an idea of the order of magnitude of parallelism, because I think people don't understand how parallel. So I think your classic hello world with Modal is some kind of Fibonacci function, right? Yeah, we have a bunch of different ones. Some recursive function. Yeah.
Erik [00:37:59]: Yeah. I mean, it's pretty easy in Modal to fan out to at least 100 GPUs in a few seconds. And if you give it a couple of minutes, you can fan out to thousands of GPUs. We run at relatively large scale. And yeah, we've run many thousands of GPUs at certain points when we needed big backfills or some customers had very large compute needs.
Swyx [00:38:21]: Yeah. Yeah. And I mean, that's super useful for a number of things. So one of my early interactions with Modal as well was with smol developer, which is my sort of coding agent.
The reason I chose Modal was a number of things. One, I just wanted to try it out. I just had an excuse to try it. Akshat offered to onboard me personally. But the most interesting thing was that you could have that sort of local development experience, as if it was running on my laptop, but then it would seamlessly translate to a cloud service or a cloud-hosted environment. And then it could fan out with concurrency controls. Because, you know, the number of times I hit the GPT-3 API at the time was going to be subject to the rate limit. But I wanted to fan out without worrying about that kind of stuff. With Modal, I can just kind of declare that in my config and that's it. Oh, like a concurrency limit?
Erik [00:39:07]: Yeah. Yeah.
Swyx [00:39:09]: Yeah. There's a lot of control. And that's why it's like, yeah, this is a pretty good use case for writing this kind of LLM application code inside of this environment that just understands fan-out and rate limiting natively. You don't actually have an exposed queue system, but you have it under the hood, you know, that kind of stuff. Totally.
Erik [00:39:28]: It's a self-provisioning cloud.
Swyx [00:39:30]: So the last part of Modal I wanted to touch on, and obviously feel free, I know you're working on new features, was the sandbox that was introduced last year. And this is something that I think was inspired by Code Interpreter. You can tell me the longer history behind that.
Erik [00:39:45]: Yeah. We originally built it for the use case where there was a bunch of customers who looked into code generation applications, and then they came to us and asked us, is there a safe way to execute code? And yeah, we spent a lot of time on container security. We used gVisor, for instance, which is a Google product that provides pretty strong isolation of code.
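The fan-out with a concurrency limit that swyx describes above can be sketched in plain Python with `asyncio`. This is not how Modal implements it, and it is not Modal's API: `call_api` is a hypothetical stand-in for a rate-limited LLM request, and the limit of 4 is an arbitrary assumption.

```python
import asyncio
from typing import List, Tuple


async def call_api(prompt: str, limiter: asyncio.Semaphore, stats: dict) -> str:
    """Hypothetical stand-in for a rate-limited LLM API call."""
    async with limiter:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # simulate network latency
        stats["in_flight"] -= 1
        return prompt.upper()


async def fan_out(prompts: List[str], max_concurrency: int) -> Tuple[List[str], int]:
    limiter = asyncio.Semaphore(max_concurrency)
    stats = {"in_flight": 0, "peak": 0}
    # Launch every task at once; the semaphore caps how many run together.
    results = await asyncio.gather(*(call_api(p, limiter, stats) for p in prompts))
    return list(results), stats["peak"]


def run_demo() -> Tuple[List[str], int]:
    prompts = [f"task {i}" for i in range(20)]
    return asyncio.run(fan_out(prompts, max_concurrency=4))


if __name__ == "__main__":
    results, peak = run_demo()
    print(len(results), "results, peak concurrency", peak)
```

The caller declares only the limit; queuing of the remaining tasks happens inside the semaphore, which mirrors the "no exposed queue system, but it's there under the hood" point in the conversation.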
So we built a product where you can basically like run arbitrary code inside a container and monitor its output or like get it back in a safe way. I mean, over time it's like evolved into more of like, I think the long-term direction is actually I think more interesting, which is that I think Modal as a platform where like I think the core like container infrastructure we offer could actually be like, you know, unbundled from like the client SDK and offered to like other, you know, like we're talking to a couple of like other companies that want to run, you know, through their packages, like run, execute jobs on Modal, like kind of programmatically. So that's actually the direction like Sandbox is going. It's like turning into more like a platform for platforms is kind of what I've been thinking about it as.

Swyx [00:40:45]: Oh boy. Platform. That's the old Kubernetes line.

Erik [00:40:48]: Yeah. Yeah. Yeah. But it's like, you know, like having that ability to like programmatically, you know, create containers and execute them, I think, I think is really cool. And I think it opens up a lot of interesting capabilities that are sort of separate from the like core Python SDK in Modal. So I'm really excited about C. It's like one of those features that we kind of released and like, you know, then we kind of look at like what users actually build with it and people are starting to build like kind of crazy things. And then, you know, we double down on some of those things because when we see like, you know, potential new product features and so Sandbox, I think in that sense, it's like kind of in that direction.
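The "run code and get its output back" shape Erik describes can be sketched with the Python standard library. This is an assumption-laden illustration only: a child process with a timeout is not a security boundary like the gVisor-backed containers mentioned in the conversation, and `run_snippet` is a hypothetical helper, not part of any Modal SDK.

```python
import subprocess
import sys


def run_snippet(code: str, timeout_s: float = 5.0) -> dict:
    """Run `code` in a child Python process and collect its output.

    NOTE: this gives no isolation from the host; a real sandbox would
    run the code inside an isolated container instead.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The child is killed when the timeout expires.
        return {"status": "timeout", "stdout": "", "stderr": "", "returncode": None}
    return {
        "status": "ok" if proc.returncode == 0 else "error",
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "returncode": proc.returncode,
    }
```

The caller gets a structured result (status, stdout, stderr, exit code) instead of letting generated code run in its own process, which is the part of the pattern that carries over to a container-backed sandbox.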
We found a lot of like interesting use cases in the direction of like platformized container runner.

Swyx [00:41:27]: Can you be more specific about what you're doubling down on after seeing users in action?

Erik [00:41:32]: I mean, we're working with like some companies that, I mean, without getting into specifics like that, need the ability to take their users' code and then launch containers on Modal. And it's not about security necessarily, like they just want to use Modal as a back end, right? Like they may already provide like Kubernetes as a back end, Lambda as a back end, and now they want to add Modal as a back end, right? And so, you know, they need a way to programmatically define jobs on behalf of their users and execute them. And so, I don't know, that's kind of abstract, but does that make sense? I totally get it.

Swyx [00:42:03]: It's sort of one level of recursion to sort of be the Modal for their customers.

Erik [00:42:09]: Exactly.

Swyx [00:42:10]: Yeah, exactly. And Cloudflare has done this, you know, Kenton Varda from Cloudflare, who's like the tech lead on this thing, called it sort of functions as a service as a service.

Erik [00:42:17]: Yeah, that's exactly right. FaaSaaS.

Swyx [00:42:21]: FaaSaaS. Yeah, like, I mean, like that, I think any base layer, second layer cloud provider like yourself, compute provider like yourself should provide, you know, it's a mark of maturity and success that people just trust you to do that. They'd rather build on top of you than compete with you. The more interesting thing for me is like, what does it mean to serve a computer like an LLM developer, rather than a human developer, right? Like, that's what a sandbox is to me, that you have to redefine Modal to serve a different non-human audience.

Erik [00:42:51]: Yeah. Yeah, and I think there's some really interesting people, you know, building very cool things.

Swyx [00:42:55]: Yeah.
So I don't have an answer, but, you know, I imagine things like, hey, the way you give feedback is different. Maybe you have to like stream errors, log errors differently. I don't really know. Yeah. Obviously, there's like safety considerations. Maybe you have an API to like restrict access to the web. Yeah. I don't think anyone would use it, but it's there if you want it.

Erik [00:43:17]: Yeah.

Swyx [00:43:18]: Yeah. Any other sort of design considerations? I have no idea.

Erik [00:43:21]: With sandboxes?

Swyx [00:43:22]: Yeah. Yeah.

Erik [00:43:24]: Open-ended question here. Yeah. I mean, no, I think, yeah, the network restrictions, I think, make a lot of sense. Yeah. I mean, I think, you know, long-term, like, I think there's a lot of interesting use cases where like the LLM, in itself, can like decide, I want to install these packages and like run this thing. And like, obviously, for a lot of those use cases, like you want to have some sort of control that it doesn't like install malicious stuff and steal your secrets and things like that. But I think that's what's exciting about the sandbox primitive, is like it lets you do that in a relatively safe way.

Alessio [00:43:51]: Do you have any thoughts on the inference wars? A lot of providers are just rushing to the bottom to get the lowest price per million tokens. Some of them, you know, the Sean Randomat, they're just losing money and there's like the physics of it just don't work out for them to make any money on it. How do you think about your pricing and like how much premium you can get and you can kind of command versus using lower prices as kind of like a wedge into getting there, especially once you have model instrumented? What are the tradeoffs and any thoughts on strategies that work?

Erik [00:44:23]: I mean, we focus more on like custom models and custom code. And I think in that space, there's like less competition and I think we can have a pricing markup, right?
Like, you know, people will always compare our prices to like, you know, the GPU power they can get elsewhere. And so how big can that markup be? Like it never can be, you know, we can never charge like 10x more, but we can certainly charge a premium. And like, you know, for that reason, like we can have pretty good margins. The LLM space is like the opposite, like the switching cost of LLMs is zero. If all you're doing is like straight up, like at least like open source, right? Like if all you're doing is like, you know, using some, you know, inference endpoint that serves an open source model and, you know, some other provider comes along and like offers a lower price, you're just going to switch, right? So I don't know, to me that reminds me a lot of like all this like 15 minute delivery wars or like, you know, like Uber versus Lyft, you know, and like maybe going back even further, like I think a lot about like sort of, you know, flip side of this is like, it's actually a positive side, which is like, I thought a lot about like fiber optics boom of like 98, 99, like the other day, or like, you know, and also like the overinvestment in GPU today. Like, like, yeah, like, you know, I don't know, like in the end, like, I don't think VCs will have the return they expected, like, you know, in these things, but guess who's going to benefit, like, you know, is the consumers, like someone's like reaping the value of this. And that's, I think an amazing flip side is that, you know, we should be very grateful, the fact that like VCs want to subsidize these things, which is, you know, like you go back to fiber optics, like there was an extreme, like overinvestment in fiber optics network in like 98. And no one made money who did that. But consumers, you know, got tremendous benefits of all the fiber optics cables that were laid, you know, throughout the country in the decades after. I feel something similar abou

Skisporet
How the world's best long-distance skiers train – with Kasper Stadaas

Feb 8, 2024 68:30

Team Ragde racer Kasper Stadaas has so far this season proven to be the world's very best long-distance skier, and the Oslo native now leads the overall Ski Classics standings as we head into a period that includes Vasaloppet and Birken. In this episode, Kasper talks about, among other things, the transition from traditional cross-country skiing to long-distance racing and how you as a recreational skier can double-pole better, and he shares insight into his daily training and training philosophy. This is a dedicated skier who puts in roller ski sessions of more than 23 Norwegian miles (230 km) in the summer months. The episode is perfect for anyone who wants to perform better in a long-distance race this season, whatever your level or ambitions for the race. Mikael Gunnulfsen is once again on the podcast, which is published by Swix and Skisporet.no. The host is Håvard Rønning.

Skisporet
Secrets from the World Cup, tips for peaking for Birken, and how to handle nerves before the start

Feb 1, 2024 82:33

Mattis Stenshagen came out of New Year's isolation like a rocket and secured a World Cup spot for the rest of the season. In this podcast he shares his experience and tips on peaking in the weeks leading up to important races, whatever level you are at. He joins the episode together with Team Swix colleague Mikael Gunnulfsen. The host is Håvard Rønning. This episode is perfect for anyone looking for inspiration to ski faster, learn how the pros handle pressure and pre-start nerves, and find out what steps to take to become a better skier this winter. Skisporet is published by Swix and Skisporet.no.

Våre vinnere
Åge Skinstad - Team-smart with an individual start

Jan 26, 2024 66:36

He is team-smart. But he places great value on each person getting their own individual start and their own split-time support. This is an episode about cross-country skiing, but first and foremost about leadership. About the athlete. The expert. The commentator. The sports director. The Toten native. The leader and the human being. Åge Skinstad is the most successful head ever of our national sport, cross-country skiing, and is now the head of the 2025 World Ski Championships in Trondheim. Is it possible to recreate the public celebrations of Trondheim 1997 and Oslo 2011? Where does his unwavering love of skiing come from? How does he lead? What has he learned from his roles as leader and director at Adidas, Swix, NHO and Hapro? Why did he not become as good on the tracks as his friend Bjørn Dæhlie? And why did alcohol become a topic in this episode? Åge Skinstad, everyone!

Skisporet
"Help, I'm about to double-pole my first ski race. What do I do?" – Tips and tricks for those racing this winter

Jan 23, 2024 46:44

What does it take to double-pole a ski race? How should you prepare in the week beforehand? What are the best tips for being perfectly prepared on race day? In this podcast, Team Swix skier Mikael Gunnulfsen shares his best tips ahead of the competition. In the studio we are joined by Erik Gundersen, who plans to ski Marcialonga this weekend; it will be his first race on skis without grip wax. We cover current topics and issues that all recreational skiers can benefit from. The Skisporet podcast is published by Swix and Skisporet.no. The host is Håvard Rønning.

Skisporet
Helene Marie Fossesholm on the challenging winter, the joy of skiing that disappeared, and the road ahead

Jan 11, 2024 56:08

Helene Marie Fossesholm is known as a bubbly girl with a big smile and an infectious good mood. But suddenly the 22-year-old's joy of skiing was gone. She no longer enjoyed getting out for a session, pushing herself in training, or much else she used to love doing. In this episode you get an honest, warm and unique conversation in which the ski star talks about the tough season, what has kept her going, and her thoughts on the road ahead. On the podcast you also hear Team Swix skier Mikael Gunnulfsen and host Håvard Rønning. The podcast is published by Swix and Skisporet.no. The Skisporet podcast brings you closest to what is happening in the world of skiing. Remember to subscribe to the channel to stay up to date on future episodes.

NÅ ER DET ALVOR
#210 - Vegard Breie | Norway's Most Stoked Extreme Sports Photographer

Jan 11, 2024 83:17 Transcription Available

Vegard Breie is a ski bum and photographer from Ål in Hallingdal. For the past decade he has made a living photographing and filming extreme sports athletes, and has worked with brands such as Red Bull, Norrøna, NRK, Swix, UFC, K2, Suunto, Firestone and Mastercard. He has had his photo on the cover of Photography Annual and has twice won Photo of the Year in Freeskier Mag. In addition to skiing, he is an avid climber, downhill cyclist, and general outdoor enthusiast. He has now made two climbing films that he is touring the country with. Check out klatrefilm.no for more info. Thanks to Piteraq and Urkraft for their collaboration on this episode. Keen to contribute to the production of the documentary about Western States? Donate a little via Spleis. Also check out NEDA's Patreon account. There you get access to hundreds of episodes like this one that are not on Spotify and Apple Podcasts.

Skisporet
How to make the most of Christmas – with Mattis Stenshagen and Mikael Gunnulfsen

Dec 21, 2023 59:47

"Christmas is time for family AFTER you have had your ski outing." Two in-form skiers visit the podcast. Mattis Stenshagen dominated everyone last weekend, but still does not get to race the Tour de Ski. Hear how he and teammate Mikael Gunnulfsen react when they guest on the Skisporet podcast. In this episode you get a glimpse of a lesser-known side of cross-country skiing as the two Team Swix skiers talk about the priorities they have to set to break into future World Cup teams. You also get tips for your own ski outings, training, and how to show up perfectly prepared for one of this winter's many recreational ski races. The Skisporet podcast is published by Swix and Skisporet.no. The host is Håvard Rønning.

Skisporet
A kilo of pork ribs for supper, stripping off in the media, and plenty of good training talk for skiers. With Petter and Mårten Skinstad and Mathias Aas Rolid

Dec 8, 2023 60:09

Team Coop Madshus is counting down to the start of the season for the long-distance skiers. Petter Skinstad, Mathias Aas Rolid and Mårten Solen Skinstad visit the studio to talk about everything from training and food to Gran Canaria, and much, much more. The host is Håvard Rønning. The Skisporet podcast is published by Swix and Skisporet.no. Remember to subscribe to the podcast to stay up to date on everything in the world of skiing this winter.

Skisporet
Training, motivation and development for younger skiers - with Kristin Austgulen Fosnæs and Håvard Moseby

Nov 30, 2023 42:14

Kristin Austgulen Fosnæs and Håvard Moseby, both from the Norwegian recruit national team, are the guests on this week's episode of the Skisporet podcast. Get to know the energetic duo as they share their tips and advice for both younger and experienced skiers. The Skisporet podcast is published by Swix and Skisporet.no. The host is Håvard Rønning. Be sure to subscribe to the podcast to catch every episode through the winter.

Skisporet
Revealing unknown drama before the start at Beitostølen - with all of Team Swix

Nov 21, 2023 49:18

Tune in as we sum up the season opener at Beitostølen with Mikael Gunnulfsen, Mattis Stenshagen, Eirik Mysen and Jonas Vika of Team Swix. We also look ahead to upcoming competitions and share tips for those of you heading out on skis yourselves in the coming weeks. The Skisporet podcast is published by Swix and Skisporet.no.

Skisporet
Ski season at last! What you need to know about the season opener at Beitostølen – with Åge Skinstad

Nov 17, 2023 27:53

The wait is over and the cross-country season is finally back on TV screens. In this episode we bring you up to date on the essentials before the national opener at Beitostølen. You will hear Åge Skinstad, Morten Sætha and Håvard Rønning. The podcast is published by Swix and Skisporet.no. Remember to subscribe to the podcast to catch everything happening in the world of skiing this winter.

Skisporet
Cross-country skiing guide for beginners: equipment, training, motivation - with Team Swix

Nov 8, 2023 49:03

How do you find the joy of skiing? For many it starts with a sense of mastery. And how do you master skiing if you have no relationship with it or find it difficult? There are surely many who wish they could find the joy of skiing but do not quite know where to start. That is why we made this episode, with tips on technique, equipment, motivation and how to make progress on skis, regardless of your starting level. The guests are Mattis Stenshagen, Mikael Gunnulfsen and Torstein Dagestad from the cross-country team Team Swix. The host is Håvard Rønning. The podcast is published by Swix and Skisporet.no. Check out Skisporet to find groomed trails near you and Swixsport.com to get everything you need for the ski season. Thanks for listening.

Skisporet
Skisporet podcast: This is how the fluoro ban will work this winter

Oct 30, 2023 49:13

The rules for the winter are the same whether you are skiing Birken or competing in the World Cup: you are not allowed to use fluorinated products under your skis. In this episode we cover "everything" about the upcoming fluoro ban: What does it mean for you as a skier? What happens if you are caught with fluoro on your skis? What should you do now to go fluoro-free? Is fluoro-free ski wax as good as fluoro wax? You get answers to this and much more in this episode of the Skisporet podcast. The guests are Norway's former head of waxing, Stein Olav Snesrud, and Swix's head of research, Christian Gløgård. The host is Håvard Rønning. The podcast is published by Swix and Skisporet.no.

The Seder-Skier Podcast
Sederskier gets humbled by Barr Trail

Oct 30, 2023 50:46

We discuss our trip down to Colorado Springs to cover the state XC meet, which included a little scoping out of the Barr Trail, the Pikes Peak Ascent course. Plus, Zak Ketterson switches to Leki poles and Erik Bjornsen launches his own pole company. We talk about OneWay, Swix, Leki and the USSPC differences, and the ski pole invention we don't think will really change anything. Snow has fallen in Leadville, so it's time to get out and log some K's. Keep on striving. Keep on skiing.

Skisporet
Season premiere: The perfect skier, how they crushed the national team in testing, and what annoys Team Swix

Oct 18, 2023 50:28

Morale is high and they have trained well for several months. Team Swix is counting down to Beitostølen and the duels that could define the whole season for the newly formed team. They have had good answers from training and tests, and we can tell that confidence is sky-high, for better and for worse, in this group. Mikael Gunnulfsen, Mattis Stenshagen, Eirik Mysen and Jonas Vika are the guests for the season premiere of the Skisporet podcast, as we meet the quick-witted gang at a training camp in Eastern Norway. This episode is guaranteed to get you in the mood for skiing, just a few weeks before a new season kicks off. The Skisporet podcast is published by Swix and Skisporet.no. The host is Håvard Rønning. Happy listening.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Want to help define the AI Engineer stack? >800 folks have weighed in on the top tools, communities and builders for the first State of AI Engineering survey, which we will present for the first time at next week's AI Engineer Summit. Join us online! This post had robust discussion on HN and Twitter. In October 2022, Robust Intelligence hosted an internal hackathon to play around with LLMs which led to the creation of two of the most important AI Engineering tools: LangChain

Skisporet
Behind the scenes with Team Swix: Gunnulfsen, Stenshagen, Mysen and Vika on the launch of their new ski team

May 26, 2023 32:59

Mikael Gunnulfsen, Mattis Stenshagen, Eirik Mysen and Jonas Vika have just been introduced as the new stars of Team Swix. In this episode we dive deep into the creation of this new cross-country team, with the skiers themselves sharing their hopes and expectations for the project: forming their own team. Get a closer look at the skiers' lives and careers, good stories from life as a skier, and personal experiences of training and pre-season preparation. The Skisporet podcast is published by Swix. The host is Håvard Rønning.

Skisporet
William Poromaa on gaming duels with Klæbo, summer training, and life after World Championship success

May 16, 2023 29:53

William Poromaa has been a hot name both on and off the ski tracks this winter. The Swedish rising star delivered fantastic results at the World Championships and crowned the championship by becoming one of the youngest medalists ever in the 50 km. At the same time, the 23-year-old also stole headlines for things outside of sport. William Poromaa has established himself as one of the hottest names in international cross-country skiing. In this podcast you get to know "Sweden's savior", who loves everything from gaming to hard training. You get a unique look into William Poromaa's life as he visits us at Swix in Lillehammer. The episode is published by Swix together with Skisporet.no. William has a partnership with Swix for poles and clothing. The host of the Skisporet podcast is Håvard Rønning.

Skisporet
Tips for Birken preparations with Andrew Musgrave and Swix Racing Service

Mar 13, 2023 37:29

What should you do in the days before Birken? Should you take it easy and save energy, or should you train as normal to keep your body going? And then there is the matter of skis. It looks like Birken weather will consist of snow and precipitation. What challenges does this bring, and what will it take for you to get your skis right this year? In this episode, skier Andrew Musgrave and World Cup wax technician Morten Sætha share their final tips and advice ahead of this year's Birken. You will not want to miss this if you plan to ski the 54 kilometers between Rena and Lillehammer. This podcast is published by Swix together with Skisporet.no. On Swixsport.com you will find updated waxing tips every day until the start.

Skisporet
Inside the relay drama in Planica: This was the big Norwegian day

Mar 2, 2023 34:07

In this episode of the Skisporet podcast we take you inside the World Championships circus in Planica. You get up close to all the drama before, during and after Norway's fantastic team victory in Thursday's relay. You will hear from the athletes themselves, experts such as Johan Olsson, coaches and many more in this special episode of the podcast. The podcast is published by Swix together with Skisporet.no.

Skisporet
Jørgen Graabak and Jens Lurås Oftebro on training, success and the World Championships

Feb 22, 2023 28:48

World Championship-ready Jørgen Graabak and Jens Lurås Oftebro join the Skisporet podcast to talk about the fairy-tale success of the Norwegian Nordic combined national team. On both the men's and women's side, Norway has the biggest gold favorites in every event at the World Championships in Planica. How did it come to this? What does it take to be the best in Nordic combined? How do Graabak, Oftebro and the others on the national team train? This episode gives you good insight into the lives of two of our greatest winter sports athletes. The podcast is published by Swix, together with Skisporet.no. The host is Håvard Rønning. Let's fight for the first gender-equal Olympics in history. Did you know that women's Nordic combined athletes are not allowed to compete in the Olympics? We cannot accept that and need your help to put pressure on the IOC. Show your support by signing the campaign below: https://www.swixsport.com/no/kampanjer/noexception/

Skisporet
Strength training for skiers with Anders Aukland ++ Lots of training tips from the ski legend

Feb 15, 2023 53:51

Few have more experience of elite sport than Anders Aukland, who joins the podcast to share his training tips for recreational and competitive skiers. Anders is in his final season as an elite skier and has right now stepped up his efforts to perform as well as possible at Vasaloppet in March. In this episode you get the detailed plan for how the 50-year-old peaks his form in the final weeks, and we can reveal that there are a couple of eye-catching details in his training. Increased focus on strength training is an important reason Anders Aukland has maintained his level through his 40s. Hear his training tips for recreational skiers; among other things, you will be served a complete strength training program for skiers who want to double-pole better. We touch on several topics with Anders Aukland. In this episode of the Skisporet podcast you will hear more about, among other things: Effective training: What should you prioritize in a hectic week, and how do you get the most out of your training hours? Double-poling machine / SkiErg: This is how Anders Aukland trains, and these are his tips for getting more out of the machine. The transition to skis without grip wax: When can you line up on skis without grip wax, and when should you stick with grip wax? If in doubt, Anders has a clear test you must try. On the podcast you also hear Lars Myrseth Kurås and Håvard Rønning, who both work at Swix. The podcast is published in collaboration with Skisporet.no, which has now been relaunched with a new look and several new features. Feel free to check these out at www.skisporet.no. Happy listening, and happy training.

Skisporet
Prep your skis yourself: How to get super skis for competitions

Feb 2, 2023 32:46

Join us as we take a deep dive into what it takes to get the best possible skis for this winter's long outings or competitions. In this episode you learn what is important to think about when prepping your skis yourself. What pitfalls can you fall into, and what should you avoid when it comes to race prep? What should you do right now to be best prepared for big races like Vasaloppet or Birken? And what does it take to avoid last-minute stress in the days before a race? You get the answers here. In this podcast episode you hear Henrik Johnsen, Jan Olav Bjørn Gjermundshaug and Håvard Rønning, who all work at Swix. Henrik preps thousands of pairs of skis every winter for recreational and elite skiers, while Jan Olav develops the products you use for ski prep. Here is a chance to learn a great deal from two of the most knowledgeable people on the subject. Topics we cover: stone grinding of skis, adjusting grip zones, how to saturate skis, liquid gliders, the products you must have, where to find waxing tips for Birken, and much, much more. The Skisporet podcast is published by Swix in collaboration with Skisporet.no. Remember that you can find everything you need for this winter's ski races on Swixsport.com.

Skisporet
Tiril and Lotta Udnes Weng on the duels, training, and how they became among the world's best skiers

Jan 13, 2023 35:23

The twin sisters have taken big steps as skiers this winter, both establishing themselves at the very top of the world. Tiril leads the overall World Cup, and they both took their first World Cup wins during this year's Tour de Ski, where one of the highlights was when the two identical twins ended up side by side on the finishing straight in a duel for first place. In this episode you get to know the two bundles of energy better. Do they have an explanation for their immediate success? Do they actually have any differences? And what training tips do they have for recreational skiers out there? The Skisporet podcast is published by Swix in collaboration with Skisporet.no. The host is Håvard Rønning. On Swixsport.com you will find everything you need to be perfectly prepared for this winter's ski outings and races. Tiril and Lotta Udnes Weng have a partnership with Swix in that both use the Swix Triac 4.0 Aero ski pole.

Skisporet
Petter Northug and Petter Skinstad on training, long-distance racing and comeback: Hear the ski legend's scary plans for the winter

Nov 29, 2022 41:04

It is a pleasure to welcome Petter Northug and Petter Skinstad back to the Skisporet podcast this winter as well, to talk about training, motivation and their push toward the Ski Classics season. In this episode you get a unique look at the campaign of the greatest skier of all time, Petter Northug. How has he trained, what motivates the ski legend, and what results can we expect from the 36-year-old this winter? As a training partner he has his good friend Petter Skinstad. Among other things, the two spent a whole month together on Gran Canaria training and building a base for the season. Now winter is approaching, and you can feel the excitement over which of the two will be better prepared when the starting gun for Ski Classics goes off in December. Besides hearing stories from the lives of Petter and Petter, you will learn a lot of useful things about training and preparing for long-distance races. The podcast is published by Swix in collaboration with Skisporet.no. Host: Håvard Rønning

Skisporet
Gyda is barred from competing in the Olympics because she is a woman. Hear about her fight to get the IOC to reverse course | With Åge Skinstad and Gyda Westvold Hansen

Nov 19, 2022 23:11

We are only one event away from the first gender-equal Olympic Games in history. In June 2022, the International Olympic Committee decided that women will only be allowed to compete in 15 of 16 disciplines at the 2026 Winter Olympics. Women's Nordic combined athletes are denied participation, and we cannot accept that. Led by Gyda Westvold Hansen, this has been strongly highlighted during this weekend's season opener at Beitostølen. And the demonstrations will continue through the winter. In this episode you get to know the world's best Nordic combined athlete, Gyda Westvold Hansen, and her fight to compete in the Olympics. The newly appointed World Championships chief Åge Skinstad is also on the podcast, which is published by Swix. Host: Håvard Rønning. The goal is to get the IOC to change its decision and include women's Nordic combined in the 2026 Olympic program. Show your support by signing the petition: https://www.swixsport.com/no/kampanjer/noexception/

Skisporet
Biathlete Elisabeth (19) was told she would spend her life in a wheelchair. Then she proved everyone wrong. Hear the whole story here.

Nov 10, 2022 29:56

Biathlete Elisabeth Hartz Braathen had her life turned upside down when she suddenly woke up one morning paralyzed from the waist down. The doctors' prognosis was clear: you will never walk again. Elisabeth would not accept that. In this episode you hear the incredible story of how Elisabeth learned to walk again, how her family reacted, and what she now thinks about the coming winter, when she can do what she loves most of all: skiing.

The Skisporet podcast keeps you up to date on what is happening in the world of skiing. With us you get close to the biggest names and learn something new about training, ski prep or where to find the best ski trails. Remember to subscribe to the podcast to catch every episode this winter. This podcast is published by Swix.

Skisporet
Training for the ski season with Hans Christer Holund and Petter Skinstad | Advice, tips and training program

Oct 21, 2022 75:52

Hans Christer Holund and Petter Skinstad join the Skisporet podcast to share their best training tips for recreational and keen skiers who want to ski fast this winter. In this episode you get tips for ski-specific sessions for autumn and early winter that will help you line up as well trained as possible for Birken, Vasaloppet or other citizen races. You will also be presented with a training program for skiers, put together by Holund and Skinstad. By listening to this episode you will pick up several tips that can be useful whatever your current level. Here are some of the topics: Autumn training for recreational skiers: what you should do now. Strength training for skiers. How to find the right intensity for intervals. Incline on the treadmill when running intervals. Why you should not train too hard. Double-poling machine: tips, tricks and sessions. The roller ski training you should do in the autumn. Why the easy long session matters. Why the hard long session matters even more. The first weeks on snow: how you should train. Double-poling training. Training program for skiers. In short, this episode answers many of your questions about training for skiers. The host is Håvard Rønning. The podcast is published by Swix. Both Holund and Skinstad are Swix ambassadors, as they both train and compete with the Swix Triac 4.0 Aero ski pole. Petter Skinstad also wears Swix clothing. On Swixsport.com you will find everything you need for the ski season. This winter's new products have just arrived in stock: https://www.swixsport.com/

Skisporet
Season premiere: Frida Karlsson and Maja Dahlqvist on life outside the national team

Skisporet

Play Episode Listen Later Oct 6, 2022 28:01


Frida Karlsson and Maja Dahlqvist shocked the entire skiing world when they turned down a place on the Swedish national team. Now the two are well on their way toward their first season with their own setup. What has worked well, and what do they miss most about being on a national team? In this season premiere of Skisporet podcast, Frida Karlsson and Maja Dahlqvist talk about becoming the world's best skiers on their own. Hear Maja Dahlqvist talk about her record season in 21/22, and about the wild Olympic party where she stole Norwegian waxing secrets. Frida Karlsson does not think she will escape Therese Johaug in the future. This is life outside the national team. Skiing is changing. Hear what the two Swedes think about the debate over private teams and national teams. What does Maja think about challenging the long-distance skiers in the future? Why Frida Karlsson has moved. The training tip: a session you can do with a friend. The road to the season's goal: World Championship gold. Skisporet podcast is produced by Swix and hosted by Håvard Rønning. Throughout the winter we will make sure you get as close as possible to what is happening in the world of skiing. Remember to subscribe to the channel and leave a review to catch all the episodes this winter. You can read more about the podcast at Swixsport.com.

Skisporet
Summer training: How to get better at running long distances in terrain – with ultrarunner Tobias Dahl Fenre and Birken winner Kristin Størmer Steira

Skisporet

Play Episode Listen Later Jun 23, 2022 68:06


What is it that enables some people to run a 100 km ultra race? How do these people train, and what tips can you pick up to get more out of your running this summer? In this summer special of the Skisporet podcast, we have invited Birken winner Kristin Størmer Steira and ultrarunner Tobias Dahl Fenre. Kristin is known to most Norwegians through her remarkable skiing career. The 41-year-old, who now works at Swix / Brav, leads a busy family life but still managed to crush everyone at Birkebeinerrennet in June. Learn her secret for doing the right sessions when you have limited time to train. In recent years Tobias has established himself as one of the country's very best ultrarunners. Earlier this year he won, among other races, EcoTrail 50k. In this episode you learn how he trains to last the longest distances, what the body needs before long races, and how he ended up going from a retired skier to a quality runner. This episode is made for everyone, no matter how much you already run, what your level of ambition is, or what background you have in endurance sports. The host of the podcast is Håvard Rønning. Skisporet podcast is published in a collaboration between Swix and Skisporet, both of which are owned by Brav. This is a summer special. We will be back with a brand-new season on cross-country skiing and ski training in the autumn. Have a good summer, and happy training! Also check out: listen to last year's summer special with Didrik Tønseth as guest to learn even more about summer training.

Build a Business Success Secrets
Peak Performance with FIS World Cup Speed Skier Jacob Perkins | Ep. 321

Build a Business Success Secrets

Play Episode Listen Later May 30, 2022 62:39


Jacob and I talk about what it's like to go over 100 mph (160 kph) on two skis, how he trains, what he eats and how he maintains his discipline to train as a speed skier while working a full-time job. About Jacob Perkins: Jacob Perkins recently placed in the 2022 FIS World Cup. He is sponsored by ski manufacturer Atomic and ski wax and tuning manufacturer Swix. He relies heavily on his background in manufacturing and engineering for innovation in the sport of speed skiing, where not only the athlete but also the equipment can be the deciding factor between winning and losing. By day Jacob is a Manufacturing Engineer for Amatrol Inc., a global leader in technical education and training for industry, community colleges, and technical colleges. Jacob played Division 1 tennis for Wright State University and Southern Illinois University Edwardsville (SIUe) and competed on the ITF and UTR Pro Tennis circuits. He is an active member of the community and helps with Habitat for Humanity, Optimist Club, IUS tennis, 1SI, Y1SI, and Metro Manufacturing Alliance. EDGE's Weekly Newsletter: Join over 17,000 others and sign up to receive bonus content. It's free. EPISODE LINKS: Jacob Perkins PODCAST INFO: Apple Podcasts: EDGE on Apple Podcasts Spotify: EDGE on Spotify RSS Feed: EDGE's RSS Feed Website: EDGE Podcast SUPPORT & CONNECT: Please support this podcast by checking out our sponsors: Mad River Botanicals, 100% certified organic CBD products. The product is controlled from seed to end product by its owners. Use code EDGE22 to get 10% off all your orders. *We respect your privacy and hate spam. We will not sell your information to others. Jacob received his Master's in Industrial Systems Engineering from Wright State University and his undergraduate degree in Mechanical Engineering from Wright State University.
He is a certified ISO 9001 internal auditor and Purdue MEP Six Sigma Black Belt.

Skisporet
Season finale: Linn Svahn is back! Hear about her extreme training volumes to get back to the top

Skisporet

Play Episode Listen Later Apr 13, 2022 31:12


More than five months have passed since Linn Svahn dropped the shock announcement on Skisporet podcast that she would miss the entire Olympic season due to injury and shoulder surgery. How has the winter been for the Swedish ski star? What is the status of the injury now? Will we see her on the start line again next winter? Linn Svahn is back on Skisporet to answer everything we are wondering about her situation, and much more! In this episode you will also hear about the 22-year-old's wild extreme training this winter: 100 km ski trips without poles! These are some of the topics in this episode: What Linn Svahn thinks of the Swedish women's fantastic season. Linn Svahn on Therese Johaug: "Hoped she would continue." Next season: Linn has said that she will win everything at the upcoming World Championships. Is she still sticking to that plan? How the long-distance skiers have helped Linn improve. This is the last episode of Skisporet this winter season. The podcast is a collaboration between Swix and Skisporet.no. The host of the podcast is Håvard Rønning. Like the episode? Subscribe to the podcast to catch the best stories from the world of winter sports. Swix sponsors Linn Svahn with the Swix Triac 4.0 Aero poles and off-season apparel.

Skisporet
Ellen put everything aside to become a skier as an adult: "It's an odd life"

Skisporet

Play Episode Listen Later Apr 7, 2022 22:41


Ellen Søhol Lie was in the middle of her studies, had a job and pictured herself buying an apartment and starting "adult life", but abruptly made a U-turn: she put all her plans on hold to commit full-time to skiing, with the goal of making the Australian Olympic team. For four years she has been all in, living what she calls an "odd" elite-athlete life and doing everything to become a better skier! You can hear the whole story in this episode of Skisporet podcast. The podcast is a collaboration between Swix and Skisporet.no. The host of the podcast is Håvard Rønning. Like the episode? Subscribe to the podcast to catch the best stories from the world of winter sports. Swix sponsors Ellen Søhol Lie with apparel, gloves and the Swix Triac 4.0 Aero poles.