POPULARITY
Categories
Sourish Samanta, Director AI and ML at Advance Auto Parts, joins The Tech Trek for a grounded conversation on where machine learning still creates the most business value, where generative AI fits, and why many teams are chasing the wrong solution. This episode is worth your time if you want a clearer view of how serious operators think about AI strategy, product delivery, and practical use cases that can ship now. This conversation cuts through the noise around AI and gets back to first principles. Sourish explains why machine learning remains the foundation behind today's AI wave, how to choose between deterministic and creative systems, and what it actually takes to build production ready products that solve real business problems.In this episode:Why machine learning is still the core layer behind modern AIWhen to use machine learning, when to use generative AI, and when simple analytics is enoughWhat a real product mindset looks like for AI and ML teamsHow pod based teams can ship faster with better cross functional alignmentWhy AI and ML talent need to spend time continuously reskillingTimestamped highlights:00:00 Why machine learning remains the foundation of today's AI stack01:57 The difference between ML teams, AI teams, and agent focused workflows05:56 Choosing the right solve, from forecasting and inventory to creative content generation10:09 The product mindset required to turn AI ideas into working systems13:51 Why some business problems need analytics, not AI15:52 Why AI teams need to spend part of their time learning, testing, and staying currentStandout line:AI is not the strategy. Solving the right problem is.Practical takeaway:If you are leading an AI initiative, start by classifying the problem. If the outcome needs consistency, prediction, or forecasting, machine learning may be the better path. If the outcome needs creativity or flexible generation, generative AI may be a better fit. And in some cases, the best answer is still a clean dashboard and strong analytics.Follow The Tech Trek for more conversations on AI, data, engineering, and how technology actually gets applied inside real businesses.
Shawn shares his adventures with balloons before ML goes ham on a (former?) listener. STRAIGHT DOPE Shawn is back in the […]
Today is Monday, March 9, 2026. Welcome to In Case You Missed It, our weekly five-minute rundown of important channel news stories that might have flown under the radar last week. In this edition: Ingram Micro Q4 and full year 2025 results: Ingram Micro reported fourth quarter net sales of $14.9 billion (up 11.5%) and full year net sales of $52.6 billion (up 9.5%), with its Xvantage platform now driving “billions” in transacted revenue. The company debuted the “AgenTeq” brand for its agentic AI capabilities, including a Sales Brief Agent initially piloted in Canada. Memory pricing crisis update: Dell is “compressing discounting” and shortening quote windows. HP says memory costs doubled in one quarter to 35% of PC production costs. Intel’s CEO says there’s no relief until 2028. The message to partners: quote fast, communicate pricing risk early, and plan for volatility. MSP Well launches as the channel’s first mental health community: Co-founded by Joe Ussia (Infinite IT Solutions), James Mignacca (Cavelo), and Miguel Ribeiro (VBS IT Services), MSP Well is a free peer-support network for IT and MSP professionals dealing with burnout, stress, and the mental health impact of cybersecurity work. Launched at XChange March 2026 in Orlando. ServiceNow claims AI bot resolves 90% of its own help desk tickets: The “Autonomous Workforce” agent handles Level 1 IT issues end-to-end, including password resets, VPN issues, and software access, with 99%+ resolution rates in targeted categories. GA expected in the second half of this year. Read Full Transcript Hello and welcome to In Case You Missed It from ChannelBuzz.ca. Your Monday morning recap where we catch you up on some of the channel news and trend headlines you may have missed in the last week. I’m Robert Dutt, editor of ChannelBuzz.ca. Today is Monday, March 9, 2026. Let’s get your week started right. Ingram Micro closed out fiscal 2025 with some pretty strong numbers. The distributor reported fourth quarter net sales of just under $14.9 billion, up 11.5% year over year and above the high end of its guidance range. For the full year, net sales came in at $52.6 billion, up nearly 10%. The company attributed the growth to strong demand across its core distribution business, an uptick in cloud marketplace revenue, and continued traction from its Xvantage digital platform, which management now says drives “billions of dollars” in transacted revenue. But the detail that caught my attention is a word, not a figure. During the earnings call, Ingram introduced the name AgenTeq – T-E-Q, by the way – as its branding for its agentic AI capabilities within the Xvantage platform. AgenTeq encompasses over 400 AI and ML models that Ingram’s been building, including a tool called the Sales Brief Agent, which gives Ingram sales teams real-time AI-generated intelligence on partner and customer accounts to help uncover growth opportunities. And in a detail worth noting for this audience, the Sales Brief Agent was initially piloted here in Canada before its planned global rollout in the first half of this year. We’re still learning what AgenTeq means in practical terms for channel partners and it’s early days for the branding, but the combination of its financial results and the platform investment suggests Ingram is placing a very deliberate bet on AI-driven distribution. A story we’ll be following up very soon here on In The Channel. If you listened last week, you heard us lead with the component shortage story. Cisco rewriting partner contract terms, Lenovo warning of March price hikes, Western Digital’s entire 2026 production already spoken for. The situation has not gotten better. In fact, it’s getting worse and faster than most of us expected. Dell COO Jeff Clarke told analysts last week the company’s compressing discounting and that quotes are now valid for “the shortest period of time they’ve ever been.” HP’s CFO disclosed that memory costs have doubled in a single quarter and now represent about 35% of PC production costs, up from 15 to 18% a few months ago. And Intel CEO Lip-Bu Tan says there’s no relief coming until 2028, a timeline backed by both SK Hynix and Micron. The takeaway for partners hasn’t changed from last week, but it’s more urgent now. Shorten your quote windows, have the pricing conversation with customers early, and assume that anything you quote today can and will cost more by the time it ships. Grab your helmet. Switching gears to something that doesn’t come up nearly enough. A new community initiative called MSP Well was formally launched this week at The Channel Company’s XChange conference in Orlando. MSP Well is a peer-support community dedicated to mental health and resilience among IT, MSP, and MSSP professionals. It was co-founded by Joe Ussia, CEO of Infinite IT Solutions, James Mignacca, CEO of Canadian vendor Cavelo, and Miguel Ribeiro of VBS IT Services. As Ussia put it, “the channel talks constantly about tools, threats, and uptime, but rarely about the human cost to the people doing the work.” MSP Well aims to change that, offering peer support, a Discord community, an anonymous call line, and partnerships with certified counsellors. It’s a meaningful initiative, and it’s something we’re looking forward to following up on here on In The Channel. And finally, ServiceNow says it has built an AI agent that’s now resolving 90% of inbound IT tickets on its own internal employee help desk. The system handles high-volume Level 1 issues like password resets, software access, VPN connectivity, and hardware troubleshooting, with resolution rates above 99% in those categories. When it gets stuck, it escalates rather than guessing. It’s an internal deployment for now, with general availability scheduled for the second half of the year. ServiceNow’s annual Knowledge conference takes place in May, and I’d expect we’ll hear a lot more about it there. Those are some of the things we were paying attention to last week. This week on In The Channel, we take a look at Check Point’s recent acquisition spree and how it all comes together with their chief strategy officer, Roi Karo. Sit down with frequent guest Tony Anscombe from ESET to talk about the current threat landscape. And break down the most meaningful findings of the Nutanix Enterprise Cloud Index report. I’m Robert Dutt for ChannelBuzz.ca. Have a great week!
Ve Špindlerově Mlýně v sobotu soutěžili nejlepší světoví snowboardisté. Nejhlasitější podpora jednoznačně patřila domácí závodnici a vítězce paralelního obřího slalomu v Livignu Zuzaně Maděrové.
All speakers are announced at AIE EU, schedule coming soon. Join us there or in Miami with the renowned organizers of React Miami! Singapore CFP also open!We've called this out a few times over in AINews, but the overwhelming consensus in the Valley is that “the IDE is Dead”. In November it was just a gut feeling, but now we actually have data: even at the canonical “VSCode Fork” company, people are officially using more agents than tab autocomplete (the first wave of AI coding):Cursor has launched cloud agents for a few months now, and this specific launch is around Computer Use, which has come a long way since we first talked with Anthropic about it in 2024, and which Jonas productized as Autotab:We also take the opportunity to do a live demo, talk about slash commands and subagents, and the future of continual learning and personalized coding models, something that Sam previously worked on at New Computer. (The fact that both of these folks are top tier CEOs of their own startups that have now joined the insane talent density gathering at Cursor should also not be overlooked).Full Episode on YouTube!please like and subscribe!Timestamps00:00 Agentic Code Experiments00:53 Why Cloud Agents Matter02:08 Testing First Pillar03:36 Video Reviews Second Pillar04:29 Remote Control Third Pillar06:17 Meta Demos and Bug Repro13:36 Slash Commands and MCPs18:19 From Tab to Team Workflow31:41 Minimal Web UI Philosophy32:40 Why No File Editor34:38 Full Stack Cursor Debate36:34 Model Choice and Auto Routing38:34 Parallel Agents and Best Of N41:41 Subagents and Context Management44:48 Grind Mode and Throughput Future01:00:24 Cloud Agent Onboarding and MemoryTranscriptEP 77 - CURSOR - Audio version[00:00:00]Agentic Code ExperimentsSamantha: This is another experiment that we ran last year and didn't decide to ship at that time, but may come back to LM Judge, but one that was also agentic and could write code. So it wasn't just picking but also taking the learnings from two models or and models that it was looking at and writing a new diff.And what we found was that there were strengths to using models from different model providers as the base level of this process. Basically you could get almost like a synergistic output that was better than having a very unified like bottom model tier.Jonas: We think that over the coming months, the big unlock is not going to be one person with a model getting more done, like the water flowing faster and we'll be making the pipe much wider and so paralyzing more, whether that's swarms of agents or parallel agents, both of those are things that contribute to getting much more done in the same amount of time.Why Cloud Agents Matterswyx: This week, one of the biggest launches that Cursor's ever done is cloud agents. I think you, you had [00:01:00] cloud agents before, but this was like, you give cursor a computer, right? Yeah. So it's just basically they bought auto tab and then they repackaged it. Is that what's going on, or,Jonas: that's a big part of it.Yeah. Cloud agents already ran in their own computers, but they were sort of site reading code. Yeah. And those computers were not, they were like blank VMs typically that were not set up for the Devrel X for whatever repo the agents working on. One of the things that we talk about is if you put yourself in the model shoes and you were seeing tokens stream by and all you could do was cite read code and spit out tokens and hope that you had done the right thing,swyx: no chanceJonas: I'd be so bad.Like you obviously you need to run the code. And so that I think also is probably not that contrarian of a take, but no one has done that yet. And so giving the model the tools to onboard itself and then use full computer use end-to-end pixels in coordinates out and have the cloud computer with different apps in it is the big unlock that we've seen internally in terms of use usage of this going from, oh, we use it for little copy changes [00:02:00] to no.We're really like driving new features with this kind of new type of entech workflow. Alright, let's see it. Cool.Live Demo TourJonas: So this is what it looks like in cursor.com/agents. So this is one I kicked off a while ago. So on the left hand side is the chat. Very classic sort of agentic thing. The big new thing here is that the agent will test its changes.So you can see here it worked for half an hour. That is because it not only took time to write the tokens of code, it also took time to test them end to end. So it started Devrel servers iterate when needed. And so that's one part of it is like model works for longer and doesn't come back with a, I tried some things pr, but a I tested at pr that's ready for your review.One of the other intuition pumps we use there is if a human gave you a PR asked you to review it and you hadn't, they hadn't tested it, you'd also be annoyed because you'd be like, only ask me for a review once it's actually ready. So that's what we've done withTesting Defaults and Controlsswyx: simple question I wanted to gather out front.Some prs are way smaller, [00:03:00] like just copy change. Does it always do the video or is it sometimes,Jonas: Sometimes.swyx: Okay. So what's the judgment?Jonas: The model does it? So we we do some default prompting with sort. What types of changes to test? There's a slash command that people can do called slash no test, where if you do that, the model will not test,swyx: but the default is test.Jonas: The default is to be calibrated. So we tell it don't test, very simple copy changes, but test like more complex things. And then users can also write their agents.md and specify like this type of, if you're editing this subpart of my mono repo, never tested ‘cause that won't work or whatever.Videos and Remote ControlJonas: So pillar one is the model actually testing Pillar two is the model coming back with a video of what it did.We have found that in this new world where agents can end-to-end, write much more code, reviewing the code is one of these new bottlenecks that crop up. And so reviewing a video is not a substitute for reviewing code, but it is an entry point that is much, much easier to start with than glancing at [00:04:00] some giant diff.And so typically you kick one off you, it's done you come back and the first thing that you would do is watch this video. So this is a, video of it. In this case I wanted a tool tip over this button. And so it went and showed me what that looks like in, in this video that I think here, it actually used a gallery.So sometimes it will build storybook type galleries where you can see like that component in action. And so that's pillar two is like these demo videos of what it built. And then pillar number three is I have full remote control access to this vm. So I can go heat in here. I can hover things, I can type, I have full control.And same thing for the terminal. I have full access. And so that is also really useful because sometimes the video is like all you need to see. And oftentimes by the way, the video's not perfect, the video will show you, is this worth either merging immediately or oftentimes is this worth iterating with to get it to that final stage where I am ready to merge in.So I can go through some other examples where the first video [00:05:00] wasn't perfect, but it gave me confidence that we were on the right track and two or three follow-ups later, it was good to go. And then I also have full access here where some things you just wanna play around with. You wanna get a feel for what is this and there's no substitute to a live preview.And the VNC kind of VM remote access gives you that.swyx: Amazing What, sorry? What is VN. AndJonas: just the remote desktop. Remote desktop. Yeah.swyx: Sam, any other details that you always wanna call out?Samantha: Yeah, for me the videos have been super helpful. I would say, especially in cases where a common problem for me with agents and cloud agents beforehand was almost like under specification in my requests where our plan mode and going really back and forth and getting detailed implementation spec is a way to reduce the risk of under specification, but then similar to how human communication breaks down over time, I feel like you have this risk where it's okay, when I pull down, go to the triple of pulling down and like running this branch locally, I'm gonna see that, like I said, this should be a toggle and you have a checkbox and like, why didn't you get that detail?And having the video up front just [00:06:00] has that makes that alignment like you're talking about a shared artifact with the agent. Very clear, which has been just super helpful for me.Jonas: I can quickly run through some other Yes. Examples.Meta Agents and More DemosJonas: So this is a very front end heavy one. So one question I wasswyx: gonna say, is this only for frontJonas: end?Exactly. One question you might have is this only for front end? So this is another example where the thing I wanted it to implement was a better error message for saving secrets. So the cloud agents support adding secrets, that's part of what it needs to access certain systems. Part of onboarding that is giving access.This is cloud is working onswyx: cloud agents. Yes.Jonas: So this is a fun thing isSamantha: it can get super meta. ItJonas: can get super meta, it can start its own cloud agents, it can talk to its own cloud agents. Sometimes it's hard to wrap your mind around that. We have disabled, it's cloud agents starting more cloud agents. So we currently disallow that.Someday you might. Someday we might. Someday we might. So this actually was mostly a backend change in terms of the error handling here, where if the [00:07:00] secret is far too large, it would oh, this is actually really cool. Wow. That's the Devrel tools. That's the Devrel tools. So if the secret is far too large, we.Allow secrets above a certain size. We have a size limit on them. And the error message there was really bad. It was just some generic failed to save message. So I was like, Hey, we wanted an error message. So first cool thing it did here, zero prompting on how to test this. Instead of typing out the, like a character 5,000 times to hit the limit, it opens Devrel tools, writes js, or to paste into the input 5,000 characters of the letter A and then hit save, closes the Devrel tools, hit save and gets this new gets the new error message.So that looks like the video actually cut off, but here you can see the, here you can see the screenshot of the of the error message. What, so that is like frontend backend end-to-end feature to, to get that,swyx: yeah.Jonas: Andswyx: And you just need a full vm, full computer run everything.Okay. Yeah.Jonas: Yeah. So we've had versions of this. This is one of the auto tab lessons where we started that in 2022. [00:08:00] No, in 2023. And at the time it was like browser use, DOM, like all these different things. And I think we ended up very sort of a GI pilled in the sense that just give the model pixels, give it a box, a brain in a box is what you want and you want to remove limitations around context and capabilities such that the bottleneck should be the intelligence.And given how smart models are today, that's a very far out bottleneck. And so giving it its full VM and having it be onboarded with Devrel X set up like a human would is just been for us internally a really big step change in capability.swyx: Yeah I would say, let's call it a year ago the models weren't even good enough to do any of this stuff.SoSamantha: even six months ago. Yeah.swyx: So yeah what people have told me is like round about Sonder four fire is when this started being good enough to just automate fully by pixel.Jonas: Yeah, I think it's always a question of when is good enough. I think we found in particular with Opus 4 5, 4, 6, and Codex five three, that those were additional step [00:09:00] changes in the autonomy grade capabilities of the model to just.Go off and figure out the details and come back when it's done.swyx: I wanna appreciate a couple details. One 10 Stack Router. I see it. Yeah. I'm a big fan. Do you know any, I have to name the 10 Stack.Jonas: No.swyx: This just a random lore. Some buddy Sue Tanner. My and then the other thing if you switch back to the video.Jonas: Yeah.swyx: I wanna shout out this thing. Probably Sam did it. I don't knowJonas: the chapters.swyx: What is this called? Yeah, this is called Chapters. Yeah. It's like a Vimeo thing. I don't know. But it's so nice the design details, like the, and obviously a company called Cursor has to have a beautiful cursorSamantha: and it isswyx: the cursor.Samantha: Cursor.swyx: You see it branded? It's the cursor. Cursor, yeah. Okay, cool. And then I was like, I complained to Evan. I was like, okay, but you guys branded everything but the wallpaper. And he was like, no, that's a cursor wallpaper. I was like, what?Samantha: Yeah. Rio picked the wallpaper, I think. Yeah. The video.That's probably Alexi and yeah, a few others on the team with the chapters on the video. Matthew Frederico. There's been a lot of teamwork on this. It's a huge effort.swyx: I just, I like design details.Samantha: Yeah.swyx: And and then when you download it adds like a little cursor. Kind of TikTok clip. [00:10:00] Yes. Yes.So it's to make it really obvious is from Cursor,Jonas: we did the TikTok branding at the end. This was actually in our launch video. Alexi demoed the cloud agent that built that feature. Which was funny because that was an instance where one of the things that's been a consequence of having these videos is we use best of event where you run head to head different models on the same prompt.We use that a lot more because one of the complications with doing that before was you'd run four models and they would come back with some giant diff, like 700 lines of code times four. It's what are you gonna do? You're gonna review all that's horrible. But if you come back with four 22nd videos, yeah, I'll watch four 22nd videos.And then even if none of them is perfect, you can figure out like, which one of those do you want to iterate with, to get it over the line. Yeah. And so that's really been really fun.Bug Repro WorkflowJonas: Here's another example. That's we found really cool, which is we've actually turned since into a slash command as well slash [00:11:00] repro, where for bugs in particular, the model of having full access to the to its own vm, it can first reproduce the bug, make a video of the bug reproducing, fix the bug, make a video of the bug being fixed, like doing the same pattern workflow with obviously the bug not reproducing.And that has been the single category that has gone from like these types of bugs, really hard to reproduce and pick two tons of time locally, even if you try a cloud agent on it. Are you confident it actually fixed it to when this happens? You'll merge it in 90 seconds or something like that.So this is an example where, let me see if this is the broken one or the, okay, this is the fixed one. Okay. So we had a bug on cursor.com/agents where if you would attach images where remove them. Then still submit your prompt. They would actually still get attached to the prompt. Okay. And so here you can see Cursor is using, its full desktop by the way.This is one of the cases where if you just do, browse [00:12:00] use type stuff, you'll have a bad time. ‘cause now it needs to upload files. Like it just uses its native file viewer to do that. And so you can see here it's uploading files. It's going to submit a prompt and then it will go and open up. So this is the meta, this is cursor agent, prompting cursor agent inside its own environment.And so you can see here bug, there's five images attached, whereas when it's submitted, it only had one image.swyx: I see. Yeah. But you gotta enable that if you're gonna use cur agent inside cur.Jonas: Exactly. And so here, this is then the after video where it went, it does the same thing. It attaches images, removes, some of them hit send.And you can see here, once this agent is up, only one of the images is left in the attachments. Yeah.swyx: Beautiful.Jonas: Okay. So easy merge.swyx: So yeah. When does it choose to do this? Because this is an extra step.Jonas: Yes. I think I've not done a great job yet of calibrating the model on when to reproduce these things.Yeah. Sometimes it will do it of its own accord. Yeah. We've been conservative where we try to have it only do it when it's [00:13:00] quite sure because it does add some amount of time to how long it takes it to work on it. But we also have added things like the slash repro command where you can just do, fix this bug slash repro and then it will know that it should first make you a video of it actually finding and making sure it can reproduce the bug.swyx: Yeah. Yeah. One sort of ML topic this ties into is reward hacking, where while you write test that you update only pass. So first write test, it shows me it fails, then make you test pass, which is a classic like red green.Jonas: Yep.swyx: LikeJonas: A-T-D-D-T-D-Dswyx: thing.No, very cool. Was that the last demo? Is thereJonas: Yeah.Anything I missed on the demos or points that you think? I think thatSamantha: covers it well. Yeah.swyx: Cool. Before we stop the screen share, can you gimme like a, just a tour of the slash commands ‘cause I so God ready. Huh, what? What are the good ones?Samantha: Yeah, we wanna increase discoverability around this too.I think that'll be like a future thing we work on. Yeah. But there's definitely a lot of good stuff nowJonas: we have a lot of internal ones that I think will not be that interesting. Here's an internal one that I've made. I don't know if anyone else at Cursor uses this one. Fix bb.Samantha: I've never heard of it.Jonas: Yeah.[00:14:00]Fix Bug Bot. So this is a thing that we want to integrate more tightly on. So you made it forswyx: yourself.Jonas: I made this for myself. It's actually available to everyone in the team, but yeah, no one knows about it. But yeah, there will be Bug bot comments and so Bug Bot has a lot of cool things. We actually just launched Bug Bot Auto Fix, where you can click a button and or change a setting and it will automatically fix its own things, and that works great in a bunch of cases.There are some cases where having the context of the original agent that created the PR is really helpful for fixing the bugs, because it might be like, oh, the bug here is that this, is a regression and actually you meant to do something more like that. And so having the original prompt and all of the context of the agent that worked on it, and so here I could just do, fix or we used to be able to do fixed PB and it would do that.No test is another one that we've had. Slash repro is in here. We mentioned that one.Samantha: One of my favorites is cloud agent diagnosis. This is one that makes heavy use of the Datadog MCP. Okay. And I [00:15:00] think Nick and David on our team wrote, and basically if there is a problem with a cloud agent we'll spin up a bunch of subs.Like a singleswyx: instance.Samantha: Yeah. We'll take the ideas and argument and spin up a bunch of subagents using the Datadog MCP to explore the logs and find like all of the problems that could have happened with that. It takes the debugging time, like from potentially you can do quick stuff quickly with the Datadog ui, but it takes it down to, again, like a single agent call as opposed to trolling through logs yourself.Jonas: You should also talk about the stuff we've done with transcripts.Samantha: Yes. Also so basically we've also done some things internally. There'll be some versions of this as we ship publicly soon, where you can spit up an agent and give it access to another agent's transcript to either basically debug something that happened.So act as an external debugger. I see. Or continue the conversation. Almost like forking it.swyx: A transcript includes all the chain of thought for the 11 minutes here. 45 minutes there.Samantha: Yeah. That way. Exactly. So basically acting as a like secondary agent that debugs the first, so we've started to push more andswyx: they're all the same [00:16:00] code.It is just the different prompts, but the sa the same.Samantha: Yeah. So basically same cloud agent infrastructure and then same harness. And then like when we do things like include, there's some extra infrastructure that goes into piping in like an external transcript if we include it as an attachment.But for things like the cloud agent diagnosis, that's mostly just using the Datadog MCP. ‘Cause we also launched CPS along with along with this cloud agent launch, launch support for cloud agent cps.swyx: Oh, that was drawn out.Jonas: We won't, we'll be doing a bigger marketing moment for it next week, but, and you can now use CPS andswyx: People will listen to it as well.Yeah,Jonas: they'llSamantha: be ahead of the third. They'll be ahead. And I would I actually don't know if the Datadog CP is like publicly available yet. I realize this not sure beta testing it, but it's been one of my favorites to use. Soswyx: I think that one's interesting for Datadog. ‘cause Datadog wants to own that site.Interesting with Bits. I don't know if you've tried bits.Samantha: I haven't tried bits.swyx: Yeah.Jonas: That's their cloud agentswyx: product. Yeah. Yeah. They want to be like we own your logs and give us our, some part of the, [00:17:00] self-healing software that everyone wants. Yeah. But obviously Cursor has a strong opinion on coding agents and you, you like taking away from the which like obviously you're going to do, and not every company's like Cursor, but it's interesting if you're a Datadog, like what do you do here?Do you expose your logs to FDP and let other people do it? Or do you try to own that it because it's extra business for you? Yeah. It's like an interesting one.Samantha: It's a good question. All I know is that I love the Datadog MCP,Jonas: And yeah, it is gonna be no, no surprise that people like will demand it, right?Samantha: Yeah.swyx: It's, it's like anysystemswyx: of record company like this, it's like how much do you give away? Cool. I think that's that for the sort of cloud agents tour. Cool. And we just talk about like cloud agents have been when did Kirsten loves cloud agents? Do you know, in JuneJonas: last year.swyx: June last year. So it's been slowly develop the thing you did, like a bunch of, like Michael did a post where himself, where he like showed this chart of like ages overtaking tap. And I'm like, wow, this is like the biggest transition in code.Jonas: Yeah.swyx: Like in, in [00:18:00] like the last,Jonas: yeah. I think that kind of got turned out.Yeah. I think it's a very interest,swyx: not at all. I think it's been highlighted by our friend Andre Kati today.Jonas: Okay.swyx: Talk more about it. What does it mean? Yeah. Is I just got given like the cursor tab key.Jonas: Yes. Yes.swyx: That's that'sSamantha: cool.swyx: I know, but it's gonna be like put in a museum.Jonas: It is.Samantha: I have to say I haven't used tab a little bit myself.Jonas: Yeah. I think that what it looks like to code with AI code generally creates software, even if you want to go higher level. Is changing very rapidly. No, not a hot take, but I think from our vendor's point at Cursor, I think one of the things that is probably underappreciated from the outside is that we are extremely self-aware about that fact and Kerscher, got its start in phase one, era one of like tab and auto complete.And that was really useful in its time. But a lot of people start looking at text files and editing code, like we call it hand coding. Now when you like type out the actual letters, it'sswyx: oh that's cute.Jonas: Yeah.swyx: Oh that's cute.Jonas: You're so boomer. So boomer. [00:19:00] And so that I think has been a slowly accelerating and now in the last few months, rapidly accelerating shift.And we think that's going to happen again with the next thing where the, I think some of the pains around tab of it's great, but I actually just want to give more to the agent and I don't want to do one tab at a time. I want to just give it a task and it goes off and does a larger unit of work and I can.Lean back a little bit more and operate at that higher level of abstraction that's going to happen again, where it goes from agents handing you back diffs and you're like in the weeds and giving it, 32nd to three minute tasks, to, you're giving it, three minute to 30 minute to three hour tasks and you're getting back videos and trying out previews rather than immediately looking at diffs every single time.swyx: Yeah. Anything to add?Samantha: One other shift that I've noticed as our cloud agents have really taken off internally has been a shift from primarily individually driven development to almost this collaborative nature of development for us, slack is actually almost like a development on [00:20:00] Id basically.So Iswyx: like maybe don't even build a custom ui, like maybe that's like a debugging thing, but actually it's that.Samantha: I feel like, yeah, there's still so much to left to explore there, but basically for us, like Slack is where a lot of development happens. Like we will have these issue channels or just like this product discussion channels where people are always at cursing and that kicks off a cloud agent.And for us at least, we have team follow-ups enabled. So if Jonas kicks off at Cursor in a thread, I can follow up with it and add more context. And so it turns into almost like a discussion service where people can like collaborate on ui. Oftentimes I will kick off an investigation and then sometimes I even ask it to get blame and then tag people who should be brought in. ‘cause it can tag people in Slack and then other people will comeswyx: in, can tag other people who are not involved in conversation. Yes. Can just do at Jonas if say, was talking to,Samantha: yeah.swyx: That's cool. You should, you guys should make a big good deal outta that.Samantha: I know. It's a lot to, I feel like there's a lot more to do with our slack surface area to show people externally. But yeah, basically like it [00:21:00] can bring other people in and then other people can also contribute to that thread and you can end up with a PR again, with the artifacts visible and then people can be like, okay, cool, we can merge this.So for us it's like the ID is almost like moving into Slack in some ways as well.swyx: I have the same experience with, but it's not developers, it's me. Designer salespeople.Samantha: Yeah.swyx: So me on like technical marketing, vision, designer on design and then salespeople on here's the legal source of what we agreed on.And then they all just collaborate and correct. The agents,Jonas: I think that we found when these threads is. The work that is left, that the humans are discussing in these threads is the nugget of what is actually interesting and relevant. It's not the boring details of where does this if statement go?It's do we wanna ship this? Is this the right ux? Is this the right form factor? Yeah. How do we make this more obvious to the user? It's like those really interesting kind of higher order questions that are so easy to collaborate with and leave the implementation to the cloud agent.Samantha: Totally. And no more discussion of am I gonna do this? Are you [00:22:00] gonna do this cursor's doing it? You just have to decide. You like it.swyx: Sometimes the, I don't know if there's a, this probably, you guys probably figured this out already, but since I, you need like a mute button. So like cursor, like we're going to take this offline, but still online.But like we need to talk among the humans first. Before you like could stop responding to everything.Jonas: Yeah. This is a design decision where currently cursor won't chime in unless you explicitly add Mention it. Yeah. Yeah.Samantha: So it's not always listening.Yeah.Jonas: I can see all the intermediate messages.swyx: Have you done the recursive, can cursor add another cursor or spawn another cursor?Samantha: Oh,Jonas: we've done some versions of this.swyx: Because, ‘cause it can add humans.Jonas: Yes. One of the other things we've been working on that's like an implication of generating the code is so easy is getting it to production is still harder than it should be.And broadly, you solve one bottleneck and three new ones pop up. Yeah. And so one of the new bottlenecks is getting into production and we have a like joke internally where you'll be talking about some feature and someone says, I have a PR for that. Which is it's so easy [00:23:00] to get to, I a PR for that, but it's hard still relatively to get from I a PR for that to, I'm confident and ready to merge this.And so I think that over the coming weeks and months, that's a thing that we think a lot about is how do we scale up compute to that pipeline of getting things from a first draft An agent did.swyx: Isn't that what Merge isn't know what graphite's for, likeJonas: graphite is a big part of that. The cloud agent testingswyx: Is it fully integrated or still different companiesJonas: working on I think we'll have more to share there in the future, but the goal is to have great end-to-end experience where Cursor doesn't just help you generate code tokens, it helps you create software end-to-end.And so review is a big part of that, that I think especially as models have gotten much better at writing code, generating code, we've felt that relatively crop up more,swyx: sorry this is completely unplanned, but like there I have people arguing one to you need ai. To review ai and then there is another approach, thought school of thought where it's no, [00:24:00] reviews are dead.Like just show me the video. It's it like,Samantha: yeah. I feel again, for me, the video is often like alignment and then I often still wanna go through a code review process.swyx: Like still look at the files andSamantha: everything. Yeah. There's a spectrum of course. Like the video, if it's really well done and it does like fully like test everything, you can feel pretty competent, but it's still helpful to, to look at the code.I make hep pay a lot of attention to bug bot. I feel like Bug Bot has been a great really highly adopted internally. We often like, won't we tell people like, don't leave bug bot comments unaddressed. ‘cause we have such high confidence in it. So people always address their bug bot comments.Jonas: Once you've had two cases where you merged something and then you went back later, there was a bug in it, you merged, you went back later and you were like, ah, bug Bot had found that I should have listened to Bug Bot.Once that happens two or three times, you learn to wait for bug bot.Samantha: Yeah. So I think for us there's like that code level review where like it's looking at the actual code and then there's like the like feature level review where you're looking at the features. There's like a whole number of different like areas.There'll probably eventually be things like performance level review, security [00:25:00] review, things like that where it's like more more different aspects of how this feature might affect your code base that you want to potentially leverage an agent to help with.Jonas: And some of those like bug bot will be synchronous and you'll typically want to wait on before you merge.But I think another thing that we're starting to see is. As with cloud agents, you scale up this parallelism and how much code you generate. 10 person startups become, need the Devrel X and pipelines that a 10,000 person company used to need. And that looks like a lot of the things I think that 10,000 person companies invented in order to get that volume of software to production safely.So that's things like, release frequently or release slowly, have different stages where you release, have checkpoints, automated ways of detecting regressions. And so I think we're gonna need stacks merg stack diffs merge queues. Exactly. A lot of those things are going to be importantswyx: forward with.I think the majority of people still don't know what stack stacks are. And I like, I have many friends in Facebook and like I, I'm pretty friendly with graphite. I've just, [00:26:00] I've never needed it ‘cause I don't work on that larger team and it's just like democratization of no, only here's what we've already worked out at very large scale and here's how you can, it benefits you too.Like I think to me, one of the beautiful things about GitHub is that. It's actually useful to me as an individual solo developer, even though it's like actually collaboration software.Jonas: Yep.swyx: And I don't think a lot of Devrel tools have figured that out yet. That transition from like large down to small.Jonas: Yeah. Kers is probably an inverse story.swyx: This is small down toJonas: Yeah. Where historically Kers share, part of why we grew so quickly was anyone on the team could pick it up and in fact people would pick it up, on the weekend for their side project and then bring it into work. ‘cause they loved using it so much.swyx: Yeah.Jonas: And I think a thing that we've started working on a lot more, not us specifically, but as a company and other folks at Cursor, is making it really great for teams and making it the, the 10th person that starts using Cursor in a team. Is immediately set up with things like, we launched Marketplace recently so other people can [00:27:00] configure what CPS and skills like plugins.So skills and cps, other people can configure that. So that my cursor is ready to go and set up. Sam loves the Datadog, MCP and Slack, MCP you've also been using a lot butSamantha: also pre-launch, but I feel like it's so good.Jonas: Yeah, my cursor should be configured if Sam feels strongly that's just amazing and required.swyx: Is it automatically shared or you have to go and.Jonas: It depends on the MCP. So some are obviously off per user. Yeah. And so Sam can't off my cursor with my Slack MCP, but some are team off and those can be set up by admins.swyx: Yeah. Yeah. That's cool. Yeah, I think, we had a man on the pod when cursor was five people, and like everyone was like, okay, what's the thing?And then it's usually something teams and org and enterprise, but it's actually working. But like usually at that stage when you're five, when you're just a vs. Code fork it's like how do you get there? Yeah. Will people pay for this? People do pay for it.Jonas: Yeah. And I think for cloud agents, we expect.[00:28:00]To have similar kind of PLG things where I think off the bat we've seen a lot of adoption with kind of smaller teams where the code bases are not quite as complex to set up. Yes. If you need some insane docker layer caching thing for builds not to take two hours, that's going to take a little bit longer for us to be able to support that kind of infrastructure.Whereas if you have front end backend, like one click agents can install everything that they need themselves.swyx: This is a good chance for me to just ask some technical sort of check the box questions. Can I choose the size of the vm?Jonas: Not yet. We are planning on adding that. Weswyx: have, this is obviously you want like LXXL, whatever, right?Like it's like the Amazon like sort menu.Jonas: Yes, exactly. We'll add that.swyx: Yeah. In some ways you have to basically become like a EC2, almost like you rent a box.Jonas: You rent a box. Yes. We talk a lot about brain in a box. Yeah. So cursor, we want to be a brain in a box,swyx: but is the mental model different? Is it more serverless?Is it more persistent? Is. Something else.Samantha: We want it to be a bit persistent. The desktop should be [00:29:00] something you can return to af even after some days. Like maybe you go back, they're like still thinking about a feature for some period of time. So theswyx: full like sus like suspend the memory and bring it back and then keep going.Samantha: Exactly.swyx: That's an interesting one because what I actually do want, like from a manna and open crawl, whatever, is like I want to be able to log in with my credentials to the thing, but not actually store it in any like secret store, whatever. ‘cause it's like this is the, my most sensitive stuff.Yeah. This is like my email, whatever. And just have it like, persist to the image. I don't know how it was hood, but like to rehydrate and then just keep going from there. But I don't think a lot of infra works that way. A lot of it's stateless where like you save it to a docker image and then it's only whatever you can describe in a Docker file and that's it.That's the only thing you can cl multiple times in parallel.Jonas: Yeah. We have a bunch of different ways of setting them up. So there's a dockerfile based approach. The main default way is actually snapshottingswyx: like a Linux vmJonas: like vm, right? You run a bunch of install commands and then you snapshot more or less the file system.And so that gets you set up for everything [00:30:00] that you would want to bring a new VM up from that template basically.swyx: Yeah.Jonas: And that's a bit distinct from what Sam was talking about with the hibernating and re rehydrating where that is a full memory snapshot as well. So there, if I had like the browser open to a specific page and we bring that back, that page will still be there.swyx: Was there any discussion internally and just building this stuff about every time you shoot a video it's actually you show a little bit of the desktop and the browser and it's not necessary if you just show the browser. If, if you know you're just demoing a front end application.Why not just show the browser, right? Like it Yeah,Samantha: we do have some panning and zooming. Yeah. Like it can decide that when it's actually recording and cutting the video to highlight different things. I think we've played around with different ways of segmenting it and yeah. There's been some different revs on it for sure.Jonas: Yeah. I think one of the interesting things is the version that you see now in cursor.com actually is like half of what we had at peak where we decided to unshift or unshipped quite a few things. So two of the interesting things to talk about, one is directly an answer to your [00:31:00] question where we had native browser that you would have locally, it was basically an iframe that via port forwarding could load the URL could talk to local host in the vm.So that gets you basically, so inswyx: your machine's browser,likeJonas: in your local browser? Yeah. You would go to local host 4,000 and that would get forwarded to local host 4,000 in the VM via port forward. We unshift that like atswyx: Eng Rock.Jonas: Like an Eng Rock. Exactly. We unshift that because we felt that the remote desktop was sufficiently low latency and more general purpose.So we build Cursor web, but we also build Cursor desktop. And so it's really useful to be able to have the full spectrum of things. And even for Cursor Web, as you saw in one of the examples, the agent was uploading files and like I couldn't upload files and open the file viewer if I only had access to the browser.And we've thought a lot about, this might seem funny coming from Cursor where we started as this, vs. Code Fork and I think inherited a lot of amazing things, but also a lot [00:32:00] of legacy UI from VS Code.Minimal Web UI SurfacesJonas: And so with the web UI we wanted to be very intentional about keeping that very minimal and exposing the right sum of set of primitive sort of app surfaces we call them, that are shared features of that cloud.Environment that you and the agent both use. So agent uses desktop and controls it. I can use desktop and controlled agent runs terminal commands. I can run terminal commands. So that's how our philosophy around it. The other thing that is maybe interesting to talk about that we unshipped is and we may, both of these things we may reship and decide at some point in the future that we've changed our minds on the trade offs or gotten it to a point where, putswyx: it out there.Let users tell you they want it. Exactly. Alright, fine.Why No File EditorJonas: So one of the other things is actually a files app. And so we used to have the ability at one point during the process of testing this internally to see next to, I had GID desktop and terminal on the right hand side of the tab there earlier to also have a files app where you could see and edit files.And we actually felt that in some [00:33:00] ways, by restricting and limiting what you could do there, people would naturally leave more to the agent and fall into this new pattern of delegating, which we thought was really valuable. And there's currently no way in Cursor web to edit these files.swyx: Yeah. Except you like open up the PR and go into GitHub and do the thing.Jonas: Yeah.swyx: Which is annoying.Jonas: Just tell the agent,swyx: I have criticized open AI for this. Because Open AI is Codex app doesn't have a file editor, like it has file viewer, but isn't a file editor.Jonas: Do you use the file viewer a lot?swyx: No. I understand, but like sometimes I want it, the one way to do it is like freaking going to no, they have a open in cursor button or open an antigravity or, opening whatever and people pointed that.So I was, I was part of the early testers group people pointed that and they were like, this is like a design smell. It's like you actually want a VS. Code fork that has all these things, but also a file editor. And they were like, no, just trust us.Jonas: Yeah. I think we as Cursor will want to, as a product, offer the [00:34:00] whole spectrum and so you want to be able to.Work at really high levels of abstraction and double click and see the lowest level. That's important. But I also think that like you won't be doing that in Slack. And so there are surfaces and ways of interacting where in some cases limiting the UX capabilities makes for a cleaner experience that's more simple and drives people into these new patterns where even locally we kicked off joking about this.People like don't really edit files, hand code anymore. And so we want to build for where that's going and not where it's beenswyx: a lot of cool stuff. And Okay. I have a couple more.Full Stack Hosting Debateswyx: So observations about the design elements about these things. One of the things that I'm always thinking about is cursor and other peers of cursor start from like the Devrel tools and work their way towards cloud agents.Other people, like the lovable and bolts of the world start with here's like the vibe code. Full cloud thing. They were already cloud edges before anyone else cloud edges and we will give you the full deploy platform. So we own the whole loop. We own all the infrastructure, we own, we, we have the logs, we have the the live site, [00:35:00] whatever.And you can do that cycle cursor doesn't own that cycle even today. You don't have the versal, you don't have the, you whatever deploy infrastructure that, that you're gonna have, which gives you powers because anyone can use it. And any enterprise who, whatever you infra, I don't care. But then also gives you limitations as to how much you can actually fully debug end to end.I guess I'm just putting out there that like is there a future where there's like full stack cursor where like cursor apps.com where like I host my cursor site this, which is basically a verse clone, right? I don't know.Jonas: I think that's a interesting question to be asking, and I think like the logic that you laid out for how you would get there is logic that I largely agree with.swyx: Yeah. Yeah.Jonas: I think right now we're really focused on what we see as the next big bottleneck and because things like the Datadog MCP exist, yeah. I don't think that the best way we can help our customers ship more software. Is by building a hosting solution right now,swyx: by the way, these are things I've actually discussed with some of the companies I just named.Jonas: Yeah, for sure. Right now, just this big bottleneck is getting the code out there and also [00:36:00] unlike a lovable in the bolt, we focus much more on existing software. And the zero to one greenfield is just a very different problem. Imagine going to a Shopify and convincing them to deploy on your deployment solution.That's very different and I think will take much longer to see how that works. May never happen relative to, oh, it's like a zero to one app.swyx: I'll say. It's tempting because look like 50% of your apps are versal, superb base tailwind react it's the stack. It's what everyone does.So I it's kinda interesting.Jonas: Yeah.Model Choice and Auto Routingswyx: The other thing is the model select dying. Right now in cloud agents, it's stuck down, bottom left. Sure it's Codex High today, but do I care if it's suddenly switched to Opus? Probably not.Samantha: We definitely wanna give people a choice across models because I feel like it, the meta change is very frequently.I was a big like Opus 4.5 Maximalist, and when codex 5.3 came out, I hard, hard switch. So that's all I use now.swyx: Yeah. Agreed. I don't know if, but basically like when I use it in Slack, [00:37:00] right? Cursor does a very good job of exposing yeah. Cursors. If people go use it, here's the model we're using.Yeah. Here's how you switch if you want. But otherwise it's like extracted away, which is like beautiful because then you actually, you should decide.Jonas: Yeah, I think we want to be doing more with defaults.swyx: Yeah.Jonas: Where we can suggest things to people. A thing that we have in the editor, the desktop app is auto, which will route your request and do things there.So I think we will want to do something like that for cloud agents as well. We haven't done it yet. And so I think. We have both people like Sam, who are very savvy and want know exactly what model they want, and we also have people that want us to pick the best model for them because we have amazing people like Sam and we, we are the experts.Yeah. We have both the traffic and the internal taste and experience to know what we think is best.swyx: Yeah. I have this ongoing pieces of agent lab versus model lab. And to me, cursor and other companies are example of an agent lab that is, building a new playbook that is different from a model lab where it's like very GP heavy Olo.So obviously has a research [00:38:00] team. And my thesis is like you just, every agent lab is going to have a router because you're going to be asked like, what's what. I don't keep up to every day. I'm not a Sam, I don't keep up every day for using you as sample the arm arbitrator of taste. Put me on CRI Auto.Is it free? It's not free.Jonas: Auto's not free, but there's different pricing tiers. Yeah.swyx: Put me on Chris. You decide from me based on all the other people you know better than me. And I think every agent lab should basically end up doing this because that actually gives you extra power because you like people stop carrying or having loyalty with one lab.Jonas: Yeah.Best Of N and Model CouncilsJonas: Two other maybe interesting things that I don't know how much they're on your radar are one the best event thing we mentioned where running different models head to head is actually quite interesting becauseswyx: which exists in cursor.Jonas: That exists in cur ID and web. So the problem is where do you run them?swyx: Okay.Jonas: And so I, I can share my screen if that's interesting. Yeahinteresting.swyx: Yeah. Yeah. Obviously parallel agents, very popal.Jonas: Yes, exactly. Parallel agentsswyx: in you mind. Are they the same thing? Best event and parallel agents? I don't want to [00:39:00] put words in your mouth.Jonas: Best event is a subset of parallel agents where they're running on the same prompt.That would be my answer. So this is what that looks like. And so here in this dropdown picker, I can just select multiple models.swyx: Yeah.Jonas: And now if I do a prompt, I'm going to do something silly. I am running these five models.swyx: Okay. This is this fake clone, of course. The 2.0 yeah.Jonas: Yes, exactly. But they're running so the cursor 2.0, you can do desktop or cloud.So this is cloud specifically where the benefit over work trees is that they have their own VMs and can run commands and won't try to kill ports that the other one is running. Which are some of the pains. These are allswyx: called work trees?Jonas: No, these are all cloud agents with their own VMs.swyx: Okay. ButJonas: When you do it locally, sometimes people do work trees and that's been the main way that people have set out parallel so far.I've gotta say.swyx: That's so confusing for folks.Jonas: Yeah.swyx: No one knows what work trees are.Jonas: Exactly. I think we're phasing out work trees.swyx: Really.Jonas: Yeah.swyx: Okay.Samantha: But yeah. And one other thing I would say though on the multimodel choice, [00:40:00] so this is another experiment that we ran last year and the decide to ship at that time but may come back to, and there was an interesting learning that's relevant for, these different model providers. It was something that would run a bunch of best of ends but then synthesize and basically run like a synthesizer layer of models. And that was other agents that would take LM Judge, but one that was also agentic and could write code. So it wasn't just picking but also taking the learnings from two models or, and models that it was looking at and writing a new diff.And what we found was that at the time at least, there were strengths to using models from different model providers as the base level of this process. Like basically you could get almost like a synergistic output that was better than having a very unified, like bottom model tier. So it was really interesting ‘cause it's like potentially, even though even in the future when you have like maybe one model as ahead of the other for a little bit, there could be some benefit from having like multiple top tier models involved in like a [00:41:00] model swarm or whatever agent Swarm that you're doing, that they each have strengths and weaknesses.Yeah.Jonas: Andre called this the council, right?Samantha: Yeah, exactly. We actually, oh, that's another internal command we have that Ian wrote slash council. Oh, and they some, yeah.swyx: Yes. This idea is in various forms everywhere. And I think for me, like for me, the productization of it, you guys have done yeah, like this is very flexible, but.If I were to add another Yeah, what your thing is on here it would be too much. I what, let's say,Samantha: Ideally it's all, it's something that the user can just choose and it all happens under the hood in a way where like you just get the benefit of that process at the end and better output basically, but don't have to get too lost in the complexity of judging along the way.Jonas: Okay.Subagents for ContextJonas: Another thing on the many agents, on different parallel agents that's interesting is an idea that's been around for a while as well that has started working recently is subagents. And so this is one other way to get agents of the different prompts and different goals and different models, [00:42:00] different vintages to work together.Collaborate and delegate.swyx: Yeah. I'm very like I like one of my, I always looking for this is the year of the blah, right? Yeah. I think one of the things on the blahs is subs. I think this is of but I haven't used them in cursor. Are they fully formed or how do I honestly like an intro because do I form them from new every time?Do I have fixed subagents? How are they different for slash commands? There's all these like really basic questions that no one stops to answer for people because everyone's just like too busy launching. We have toSamantha: honestly, you could, you can see them in cursor now if you just say spin up like 50 subagents to, so cursor definesswyx: what Subagents.Yeah.Samantha: Yeah. So basically I think I shouldn't speak for the whole subagents team. This is like a different team that's been working on this, but our thesis or thing that we saw internally is that like they're great for context management for kind of long running threads, or if you're trying to just throw more compute at something.We have strongly used, almost like a generic task interface where then the main agent can define [00:43:00] like what goes into the subagent. So if I say explore my code base, it might decide to spin up an explore subagent and or might decide to spin up five explore subagent.swyx: But I don't get to set what those subagent are, right?It's all defined by a model.Samantha: I think. I actually would have to refresh myself on the sub agent interface.Jonas: There are some built-in ones like the explore subagent is free pre-built. But you can also instruct the model to use other subagents and then it will. And one other example of a built-in subagent is I actually just kicked one off in cursor and I can show you what that looks like.swyx: Yes. Because I tried to do this in pure prompt space.Jonas: So this is the desktop app? Yeah. Yeah. And that'sswyx: all you need to do, right? Yeah.Jonas: That's all you need to do. So I said use a sub agent to explore and I think, yeah, so I can even click in and see what the subagent is working on here. It ran some fine command and this is a composer under the hood.Even though my main model is Opus, it does smart routing to take, like in this instance the explorer sort of requires reading a ton of things. And so a faster model is really useful to get an [00:44:00] answer quickly, but that this is what subagent look like. And I think we wanted to do a lot more to expose hooks and ways for people to configure these.Another example of a cus sort of builtin subagent is the computer use subagent in the cloud agents, where we found that those trajectories can be long and involve a lot of images obviously, and execution of some testing verification task. We wanted to use that models that are particularly good at that.So that's one reason to use subagents. And then the other reason to use subagents is we want contexts to be summarized reduced down at a subagent level. That's a really neat boundary at which to compress that rollout and testing into a final message that agent writes that then gets passed into the parent rather than having to do some global compaction or something like that.swyx: Awesome. Cool. While we're in the subagents conversation, I can't do a cursor conversation and not talk about listen stuff. What is that? What is what? He built a browser. He built an os. Yes. And he [00:45:00] experimented with a lot of different architectures and basically ended up reinventing the software engineer org chart.This is all cool, but what's your take? What's, is there any hole behind the side? The scenes stories about that kind of, that whole adventure.Samantha: Some of those experiments have found their way into a feature that's available in cloud agents now, the long running agent mode internally, we call it grind mode.And I think there's like some hint of grind mode accessible in the picker today. ‘cause you can do choose grind until done. And so that was really the result of experiments that Wilson started in this vein where he I think the Ralph Wigga loop was like floating around at the time, but it was something he also independently found and he was experimenting with.And that was what led to this product surface.swyx: And it is just simple idea of have criteria for completion and do not. Until you complete,Samantha: there's a bit more complexity as well in, in our implementation. Like there's a specific, you have to start out by aligning and there's like a planning stage where it will work with you and it will not get like start grind execution mode until it's decided that the [00:46:00] plan is amenable to both of you.Basically,swyx: I refuse to work until you make me happy.Jonas: We found that it's really important where people would give like very underspecified prompt and then expect it to come back with magic. And if it's gonna go off and work for three minutes, that's one thing. When it's gonna go off and work for three days, probably should spend like a few hours upfront making sure that you have communicated what you actually want.swyx: Yeah. And just to like really drive from the point. We really mean three days that No, noJonas: human. Oh yeah. We've had three day months innovation whatsoever.Samantha: I don't know what the record is, but there's been a long time with the grantsJonas: and so the thing that is available in cursor. The long running agent is if you wanna think about it, very abstractly that is like one worker node.Whereas what built the browser is a society of workers and planners and different agents collaborating. Because we started building the browser with one worker node at the time, that was just the agent. And it became one worker node when we realized that the throughput of the system was not where it needed to be [00:47:00] to get something as large of a scale as the browser done.swyx: Yeah.Jonas: And so this has also become a really big mental model for us with cloud, cloud agents is there's the classic engineering latency throughput trade-offs. And so you know, the code is water flowing through a pipe. The, we think that over the coming months, the big unlock is not going to be one person with a model getting more done, like the water flowing faster and we'll be making the pipe much wider and so ing more, whether that's swarms of agents or parallel agents, both of those are things that contribute to getting.Much more done in the same amount of time, but any one of those tasks doesn't necessarily need to get done that quickly. And throughput is this really big thing where if you see the system of a hundred concurrent agents outputting thousands of tokens a second, you can't go back like that.Just you see a glimpse of the future where obviously there are many caveats. Like no one is using this browser. IRL. There's like a bunch of things not quite right yet, but we are going to get to systems that produce real production [00:48:00] code at the scale much sooner than people think. And it forces you to think what even happens to production systems. Like we've broken our GitHub actions recently because we have so many agents like producing and pushing code that like CICD is just overloaded. ‘cause suddenly it's like effectively weg grew, cursor's growing very quickly anyway, but you grow head count, 10 x when people run 10 x as many agents.And so a lot of these systems, exactly, a lot of these systems will need to adapt.swyx: It also reminds me, we, we all, the three of us live in the app layer, but if you talk to the researchers who are doing RL infrastructure, it's the same thing. It's like all these parallel rollouts and scheduling them and making sure as much throughput as possible goes through them.Yeah, it's the same thing.Jonas: We were talking briefly before we started recording. You were mentioning memory chips and some of the shortages there. The other thing that I think is just like hard to wrap your head around the scale of the system that was building the browser, the concurrency there.If Sam and I both have a system like that running for us, [00:49:00] shipping our software. The amount of inference that we're going to need per developer is just really mind-boggling. And that makes, sometimes when I think about that, I think that even with, the most optimistic projections for what we're going to need in terms of buildout, our underestimating, the extent to which these swarm systems can like churn at scale to produce code that is valuable to the economy.And,swyx: yeah, you can cut this if it's sensitive, but I was just Do you have estimates of how much your token consumption is?Jonas: Like per developer?swyx: Yeah. Or yourself. I don't need like comfy average. I just curious. ISamantha: feel like I, for a while I wasn't an admin on the usage dashboard, so I like wasn't able to actually see, but it was a,swyx: mine has gone up.Samantha: Oh yeah.swyx: But I thinkSamantha: it's in terms of how much work I'm doing, it's more like I have no worries about developers losing their jobs, at least in the near term. ‘cause I feel like that's a more broad discussion.swyx: Yeah. Yeah. You went there. I didn't go, I wasn't going there.I was just like how much more are you using?Samantha: There's so much stuff to be built. And so I feel like I'm basically just [00:50:00] trying to constantly I have more ambitions than I did before. Yes. Personally. Yes. So can't speak to the broader thing. But for me it's like I'm busier than ever before.I'm using more tokens and I am also doing more things.Jonas: Yeah. Yeah. I don't have the stats for myself, but I think broadly a thing that we've seen, that we expect to continue is J'S paradox. Whereswyx: you can't do it in our podcast without seeingJonas: it. Exactly. We've done it. Now we can wrap. We've done, we said the words.Phase one tab auto complete people paid like 20 bucks a month. And that was great. Phase two where you were iterating with these local models. Today people pay like hundreds of dollars a month. I think as we think about these highly parallel kind of agents running off for a long times in their own VM system, we are already at that point where people will be spending thousands of dollars a month per human, and I think potentially tens of thousands and beyond, where it's not like we are greedy for like capturing more money, but what happens is just individuals get that much more leverage.And if one person can do as much as 10 people, yeah. That tool that allows ‘em to do that is going to be tremendously valuable [00:51:00] and worth investing in and taking the best thing that exists.swyx: One more question on just the cursor in general and then open-ended for you guys to plug whatever you wanna put.How is Cursor hiring these days?Samantha: What do you mean by how?swyx: So obviously lead code is dead. Oh,Samantha: okay.swyx: Everyone says work trial. Different people have different levels of adoption of agents. Some people can really adopt can be much more productive. But other people, you just need to give them a little bit of time.And sometimes they've never lived in a token rich place like cursor.And once you live in a token rich place, you're you just work differently. But you need to have done that. And a lot of people anyway, it was just open-ended. Like how has agentic engineering, agentic coding changed your opinions on hiring?Is there any like broad like insights? Yeah.Jonas: Basically I'm asking this for other people, right? Yeah, totally. Totally. To hear Sam's opinion, we haven't talked about this the two of us. I think that we don't see necessarily being great at the latest thing with AI coding as a prerequisite.I do think that's a sign that people are keeping up and [00:52:00] curious and willing to upscale themselves in what's happening because. As we were talking about the last three months, the game has completely changed. It's like what I do all day is very different.swyx: Like it's my job and I can't,Jonas: Yeah, totally.I do think that still as Sam was saying, the fundamentals remain important in the current age and being able to go and double click down. And models today do still have weaknesses where if you let them run for too long without cleaning up and refactoring, the coke will get sloppy and there'll be bad abstractions.And so you still do need humans that like have built systems before, no good patterns when they see them and know where to steer things.Samantha: I would agree with that. I would say again, cursor also operates very quickly and leveraging ag agentic engineering is probably one reason why that's possible in this current moment.I think in the past it was just like people coding quickly and now there's like people who use agents to move faster as well. So it's part of our process will always look for we'll select for kind of that ability to make good decisions quickly and move well in this environment.And so I think being able to [00:53:00] figure out how to use agents to help you do that is an important part of it too.swyx: Yeah. Okay. The fork in the road, either predictions for the end of the year, if you have any, or PUDs.Jonas: Evictions are not going to go well.Samantha: I know it's hard.swyx: They're so hard. Get it wrong.It's okay. Just, yeah.Jonas: One other plug that may be interesting that I feel like we touched on but haven't talked a ton about is a thing that the kind of these new interfaces and this parallelism enables is the ability to hop back and forth between threads really quickly. And so a thing that we have,swyx: you wanna show something or,Jonas: yeah, I can show something.A thing that we have felt with local agents is this pain around contact switching. And you have one agent that went off and did some work and another agent that, that did something else. And so here by having, I just have three tabs open, let's say, but I can very quickly, hop in here.This is an example I showed earlier, but the actual workflow here I think is really different in a way that may not be obvious, where, I start t
Renaissance history is so much wilder and weirder than you would have expected. Very fun chatting with Ada Palmer (historian, novelist, and composer based at the University of Chicago).Some especially fascinating things I learned from the conversation and her excellent book, Inventing the Renaissance:Not only did Gutenberg go bankrupt in the 1450s (after inventing the printing press), but so did the bank that foreclosed on him, and so did his apprentices. This is because paper was still very expensive, and so you had to make this big upfront CAPEX decision to print a batch of 300 copies of a book - say the Bible. But he's in a small landlocked German town where only priests are allowed to read the Bible - so he sells maybe 7 copies. It's only when this technology ends up in Venice, where you can hand 10 copies to each of 30 ship captains going to 30 different cities, that it starts taking off.Speaking of which, the printing revolution wasn't just one single discrete event, just as the computer revolution has been this whole century of going from mainframes -> personal computers -> phones -> social media, each with different and accelerating social impact. Books came first, but they're slow to print, and made in small batches. The real revolution is pamphlets - much faster, much harder to censor. Pamphlet runners are how you can have Luther's 95 Theses go from Wittenberg to London in 17 days.So much other wild stuff from this episode. For example, did you know that the largest and best-funded experimental laboratory in 17th century Europe was very likely the Roman one run by inquisitors? Ada jokes that the Inquisition accidentally invented peer review. The focus of the Inquisition is really misunderstood - it was obsessed with catching dangerous new heretics like Lutherans and Calvinists - it only executed one person for doing science.And this leads Ada to make an observation that I think is really wise: the authorities and censors are always worried about the exact wrong things given 20/20 hindsight. When Inquisition raids an underground bookshop during the French Enlightenment, they don't mind the Rousseau, Voltaire, and Encyclopédie, but they lose their minds about some Jansenist treatises about the technical nature of the Trinity.More broadly, a lesson for me from this episode is that it's just really hard to shape history in the specific way that you want to impact things. One of the most famous medieval scholars is this guy Petrarch. He survives the Black Death in the 1340s, watches his friends die to plague and bandits, and says: our leaders are selfish and terrible, we need to raise them on the Roman classics so they'll act like Cicero. So Europe pours money into finding ancient manuscripts, building libraries, and educating princes on classical virtues. Those princes grow up and fight bigger, nastier wars than ever before with new deadlier technology. And this, combined with greater urbanization and endemic plague, results in European life expectancy decreasing from 35 in the medieval period to 18 during the Renaissance (the period which we in retrospect think of as a golden age but which many people living through it thought of as the continuation of the dark ages that had persisted since the fall of Rome).Anyways, the libraries Petrarch inspires stick around, the printing press makes them accessible to everyone, and 200 years later a generation of medical students is reading Lucretius and asking “what if there are atoms and that's how diseases work?” which eventually leads to germ theory, vaccines, and a cure for the Black Death (Ada has longer more involved explanation of how cosplaying the Romans results through a series of many steps to the scientific revolution). Petrarch wanted to produce philosopher-kings that shared his values. Instead he created a world that doesn't share his values at all but can cure the disease that destroyed his.Watch on YouTube; read the transcript.Sponsors* Jane Street is still waiting on someone to solve their backdoor puzzle… They're accepting submissions until April 1st and have set aside $50,000 for the best attempts. Separately, applications are live for Jane Street's summer ML internships in NY, London, and Hong Kong. Go check all of this out at janestreet.com/dwarkesh.* Labelbox can help ensure your agents don't need to rely on overspecified prompts. They tailor real-world scenarios to whatever domain you're focused on, and they make sure the data you train on rewards real understanding, not just instruction-following. Learn more at labelbox.com/dwarkesh* Mercury's personal accounts let you add users, issue cards, and customize permissions. This is super useful for sharing finances with a partner, a roommate… or even an OpenClaw agent. And, if you're already a Mercury Business user, your personal account is free! See terms and conditions below, and learn more at mercury.com/personal-bankingEligible Mercury Business users who apply for and maintain a Mercury Personal account may have their Mercury Personal subscription fee waived provided they remain a user on an active Mercury Business account in good standing. Standard Mercury Platform Subscription fees will apply if they no longer meet eligibility requirements, including but not limited to no longer being associated with an eligible Mercury Business account, or if the program is modified or terminated. Mercury may modify or discontinue this offering at any time and will provide notice as required by law. See Subscription Terms for full details.* To sponsor a future episode, visit dwarkesh.com/advertise.Timestamps(00:00:00) - How cosplaying Ancient Rome led to the Renaissance(00:28:49) - How Florence's weird republic worked(00:38:13) - How the Medicis took over Florence(00:58:12) - Why it was so hard for Gutenberg to make any money off the printing press(01:17:34) - Why the industrial revolution didn't happen in Italy(01:23:02) - The Library of Alexandria isn't where most ancient books were lost(01:41:21) - The Inquisition accidentally invented peer review Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe
Podcast: "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis Episode: Situational Awareness in Government, with UK AISI Chief Scientist Geoffrey IrvingRelease date: 2026-03-01Get Podcast Transcript →powered by Listen411 - fast audio-to-text and summarizationGeoffrey Irving, Chief Scientist at the UK AI Security Institute, explains why our theoretical understanding of machine learning remains fragile even as models surpass experts on critical security tasks. He details AISI's work on frontier model evaluations, red teaming, and threat modeling across biosecurity, cybersecurity, and loss-of-control risks. The conversation explores reward hacking, eval awareness, and why current safety techniques may struggle to deliver high reliability. Listeners will also hear how AISI is funding foundational research to build stronger guarantees for AI safety. Use the Granola Recipe Nathan relies on to identify blind spots across conversations, AI research, and decisions: https://bit.ly/granolablindspotSponsors: Serval: Serval uses AI-powered automations to cut IT help desk tickets by more than 50%, freeing your team from repetitive tasks like password resets and onboarding. Book your free pilot and guarantee 50% help desk automation by week 4 at https://serval.com/cognitive Claude: Claude is the AI collaborator that understands your entire workflow, from drafting and research to coding and complex problem-solving. Start tackling bigger problems with Claude and unlock Claude Pro's full capabilities at https://claude.ai/tcr Tasklet: Tasklet is an AI agent that automates your work 24/7; just describe what you want in plain English and it gets the job done. Try it for free and use code COGREV for 50% off your first month at https://tasklet.ai CHAPTERS: (00:00) About the Episode (04:09) From physics to ML (08:52) AGI uncertainty and threats (Part 1) (18:08) Sponsors: Serval | Claude (21:29) AGI uncertainty and threats (Part 2) (27:35) Control, autonomy, alignment (Part 1) (34:02) Sponsor: Tasklet (35:14) Control, autonomy, alignment (Part 2) (38:44) Inside the UK AC (51:02) Evaluations and jailbreaking (01:01:17) Emerging capabilities and misuse (01:14:20) Agents and reward hacking (01:26:09) Theoretical alignment agenda (01:38:39) Debate and formal methods (01:51:19) Limits of formalization (02:02:27) Future risks and governance (02:16:23) Episode Outro (02:18:58) Outro PRODUCED BY: https://aipodcast.ing SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://linkedin.com/in/nathanlabenz/ Youtube: https://youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
On the podcast: how Tinder's ML-powered paywalls drove millions in new revenue, the art of selling features à la carte without killing subscription revenue, and why Tinder Select flopped despite users saying they'd pay for it.This conversation is shorter than usual and will be featured in RevenueCat's State of Subscription Apps report. Each episode in this series will explore one crucial topic and share actionable insights from top subscription app operators.Top Takeaways:
Data scientists use optimisation every day when training machine learning models, without even thinking about it. But there's another type of optimisation - that many data scientists are unaware of - that can be used to dramatically boost the business value of your ML outputs. This second layer transforms predictions into optimal decisions, and it's where the real impact often happens.In this episode, Dr. Tim Varelmann joins Dr. Genevieve Hayes to explain how combining machine learning with decision optimisation creates solutions that go far beyond prediction, helping stakeholders make better decisions in uncertain environments.You'll discover:How decision optimisation differs from ML parameter tuning [02:19]Why combining predictions with optimisation multiplies value [13:36]The mindset shift needed to think in optimisation terms [22:59]How to spot immediate optimisation opportunities in your work [23:42]Guest BioDr Tim Varelmann is the founder of Bluebird Optimization and holds a PhD in Mathematical Optimisation. He is also the creator of Effortless Modeling in Python with GAMSPy, the world's first GAMSPy course.LinksBluebird Optimization WebsiteConnect with Genevieve on LinkedInBe among the first to hear about the release of each new podcast episode by signing up HERE
In this episode of Robots and Red Tape, host Nick Schutt sits down with Ian P. Cook, PhD with Qloo, to cut through the AI hype.They discuss why AI won't cure cancer despite bold claims, highlighting the irreplaceable role of human experts in research and trials. Ian shares insights from his background in machine learning, DoD logistics, and building ML products, explaining AI's real value in medical transcription, document synthesis, and more—while warning about hallucinations, data privacy, and the dangers of anthropomorphizing these tools.What We Cover:Ian Cook's background and journey into AIWhy AI will not cure cancer (and why that claim is BS)Where generative AI actually helps medical researchWhat “AI discovered a new drug” really meansLLMs as probabilistic text generators, not reasoning enginesThe real dangers of overselling AI in medicineHallucinations: why they're a structural limitationRisks of agentic systems and compounding errorsThe wild Moltbook phenomenon and agent chaosWhy small domain-specific models beat massive general models* Practical advice for using AI wiselySubscribe to @RobotsandRedTapeAI for more no-hype AI conversations.#AI #MedicalAI #AIHype #GenerativeAI #HealthcareInnovation
As foundation models, including large language models and multimodal models, are increasingly deployed in complex and high-stakes settings, ensuring their safety has become more important than ever. In this talk, I present a probabilistic perspective on AI safety: safety risks are treated as structured distributions to be discovered and controlled, rather than isolated failures to be patched. I first introduce probabilistic red-teaming methods that characterize distributions of failures, revealing systematic safety risks that standard evaluation often misses. I then describe probabilistic defense methods that control model behavior during deployment by adaptively steering generation toward constraint-aligned distributions. By unifying failure discovery and behavior control under a probabilistic perspective, this talk highlights a distributional approach for understanding and managing safety risks in foundation models. About the speaker: Ruqi Zhang is an Assistant Professor in the Department of Computer Science at Purdue University. Her research focuses on probabilistic machine learning, generative modeling, and trustworthy AI. Prior to joining Purdue, she was a postdoctoral researcher at the Institute for Foundations of Machine Learning (IFML) at the University of Texas at Austin. She received her Ph.D. from Cornell University. Dr. Zhang has been a key organizer of the Symposium on Probabilistic Machine Learning. She has served as an Area Chair and Editor for ML conferences and journals, including ICML, NeurIPS, ICLR, AISTATS, UAI, and TMLR. Her contributions have been recognized with several honors, including AAAI New Faculty Highlights, Amazon Research Award, Spotlight Rising Star in Data Science, Seed for Success Acorn Award, and Ross-Lynn Research Scholar.
Today's episode is about the nobility in Upper Hungary - present day Slovakia. In the Slovak lesson, you are going to learn the comparative form of Slovak adjectives in the neuter gender and some new words from my story. You will also learn how to say “Silence is gold.“ in Slovak. At the end of this episode is my story about a young nobleman in Slovak.Episode notesIn today's episode, I'm talking about the nobility in Upper Hungary - present day Slovakia. In the Slovak lesson, you are going to learn the comparative form of Slovak adjectives in the neuter gender and some new words from my story. You will also learn how to say “Silence is gold.“ in Slovak. At the end of this episode, you can find my story about a young nobleman in Slovak.Slovak lessonSentences with the comparative form of adjectives in neuter:1. Naše dieťa je milšie ako susedovie. (Our child is nicer than our neighbor's.)2. Miško je najmilšie bábätko zo všetkých. (Miško is the nicest/sweetest baby of all.)3. Dnešné vysielanie bolo veselšie ako včera. (Today's broadcast was more joyful than yesterday's.)4. Doposiaľ to bolo najveselšíe popoludnie týždňa. (So far, it was the most joyful afternoon of the week.)5. Moje šteniatko je múdrejšie ako tvoje. (My puppy is smarter than yours.)6. To bolo najmúdrejšie rozhodnutie môjho života. (That was the wisest decision of my life.)7. Jeho auto je rýchlejšie ako tvoje. (His car is faster than yours.)8. Ferrari je najrýchlejšie auto na svete. (Ferrari is the fastest car in the world.)Vocabulary1. zámožná rodina (wealthy noble family)2. obdivovať (to admire)3. prihodiť sa (to happen)4. posilniť (to strengthen)5. spojenectvo (alliance)6. nevesta (bride)7. všímať si (to notice)8. šepkať (to whisper)9. trhlina (crack)10. riešenie (solution)11. zásnuby (engagement)12. mlčanie (silence)13. Mlčanie je zlato. (Silence is gold.) => Slovak proverb from my story.Timestamps00:34 Introduction to the lesson02:34 About the nobility in Upper Hungary05:28 Fun fact 108:12 Fun fact 211:03 Slovak lesson15:35 Vocabulary20:23 Story in Slovak23:50 Translation of the story into English26:54 Final thoughtsIf you have any questions, send it to my email hello@bozenasslovak.com. Check my Instagram https://www.instagram.com/bozenasslovak/ where I am posting the pictures of what I am talking about on my podcast. Also, check my website https://www.bozenasslovak.com © All copywrites reserved to Bozena Ondova Hilko LLC
tl;dr We're incubating an academic journal for AI alignment: rapid peer-review of foundational Alignment research that the current publication ecosystem underserves. Key bets: paid attributed review, reviewer-written synthesis abstracts, and targeted automation. Contact us if you're interested in participating as an author, reviewer, or editor, or if you know someone who might be. Experimental Infrastructure for Foundational Alignment Research This is the first in a series of “build-in-the-open” updates regarding the incubation of a new peer-reviewed journal dedicated to AI alignment. Later updates will contain much more detail, but we want to put this out soon to draw community participation early. Fill out this form to express your interest in participating as an author, reviewer, editor, developer, manager, or board member, or to recommend someone who might be interested. The Core Bet Peer review is a crucial public good: it applies scarce researcher time to sort new ideas for focused attention from the community, but is undersupplied because individual reviewers are poorly incentivized. Peer review in alignment research is particularly fragmented. While some parts of the alignment research community are served by existing venues, such as journals and ML conferences, there are significant gaps. These gaps arise from a [...] ---Outline:(00:38) Experimental Infrastructure for Foundational Alignment Research(01:09) The Core Bet(02:27) Operational Design(03:56) Scope(06:08) Governance(06:35) Advisory board(09:16) Institutional stewardship(10:11) Next steps(10:14) Join the founding team(11:49) Support us online(12:14) Contributors to this document The original text contained 2 footnotes which were omitted from this narration. --- First published: March 3rd, 2026 Source: https://www.lesswrong.com/posts/msnGbm52ZcG3xYcFo/an-alignment-journal-coming-soon --- Narrated by TYPE III AUDIO.
In this feed drop from the Internet History Podcast, host Brian McCullough speaks with Chris Dixon, general partner at a16z, about his path from 1980s hobbyist programmer to one of the most prominent venture capitalists in tech. Chris traces his career from quantitative finance to founding SiteAdvisor, cofounding Founder Collective, starting an early machine learning company, and eventually building a16z's crypto practice from the ground up. They also discuss his framework for spotting unconventional investments, the current state of crypto regulation, and why New York is becoming a serious tech hub. Resources: Follow Chris Dixon on X: https://twitter.com/cdixon Follow Brian McCullough on X: https://twitter.com/brianmcc Listen to Internet History Podcast: https://www.youtube.com/@internethistorypodcast Stay Updated:Find a16z on YouTube: YouTubeFind a16z on XFind a16z on LinkedInListen to the a16z Show on SpotifyListen to the a16z Show on Apple PodcastsFollow our host: https://twitter.com/eriktorenberg Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
00:00-20:00: Team USA men's hockey wins gold. ML looks back on an amazing run. Thanks to Byrne Dairy and CH Insurance. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.
Mick Emandi, CEO and Dr. Greg Carder, Functional Medicine Doctor have teamed up to develop RegenMDs.com a company that delivers Molecular Hydrogen Inhalation Therapy as well as Ultra RSF-1, a stem cell technology. They work with patients dealing with chronic pain, joint degeneration, heart conditions, autoimmune issues, and neurological problems harnessing the healing power of Medical Grade Molecular Hydrogen via a home device measuring mL's per minute developed originally in Osaka, Japan. Listen to this exciting conversation about these breakthrough technologies in health care.
„Musíme si dať dole klapky z očí a začať reálne riešiť otázku vlastnej obrany. V Európe to mnohé krajiny riešia, my na Slovensku sme sa ešte ani nezačali báť – v zmysle reálneho pohľadu do zrkadla“, hovorí Marek Janiga, študent medzinárodného práva a riešenia konfliktov na Univerzite Mieru OSN.„Máme tu vojnu, ktorá má štyri roky a dnes sa začala ďalšia“ – v sobotu si takéto slová vypočuli mladí rodičia, ktorí prišli do jedného z kostolov Bratislavy pokrstiť svoje dieťa. Kňaz ich adresoval na jednej strane s obavou z budúcnosti, na druhej s uznaním, že aj do takýchto čias rodičia privádzajú deti.Len pred pár dňami sa z vojny, ktorá už má štyri roky – a teda z Ukrajiny vrátil Marek Janiga. Humanitárny pracovník, študent práva, ktorého si pred časom všimol aj premiér – pre jeho kritiku rušenia Špeciálnej prokuratúry. Bol tam odviezť ďalšiu pomoc, paradoxne v čase, keď tá štátna a núdzová je z rozhodnutia premiéra zastavená.Marek však má za sebou aj ročný program v Mládežníckej poradnej komisii Nato. Naviac v štúdiu pokračuje na Univerzite pre mier, ktorá má mandát OSN. Jeho focusom je medzinárodné právo a riešenie sporov. Opäť - v čase, keď je mier atakovaný sériou vojen a na riešenie čaká nespočet sporov.Ako vidí ich riešenie a čomu ho učí skúsenosť s Ukrajinou?Hosťom RánoNahlas je Marek Janiga.
„Musíme si dať dole klapky z očí a začať reálne riešiť otázku vlastnej obrany. V Európe to mnohé krajiny riešia, my na Slovensku sme sa ešte ani nezačali báť – v zmysle reálneho pohľadu do zrkadla“, hovorí Marek Janiga, študent medzinárodného práva a riešenia konfliktov na Univerzite Mieru OSN.„Máme tu vojnu, ktorá má štyri roky a dnes sa začala ďalšia“ – v sobotu si takéto slová vypočuli mladí rodičia, ktorí prišli do jedného z kostolov Bratislavy pokrstiť svoje dieťa. Kňaz ich adresoval na jednej strane s obavou z budúcnosti, na druhej s uznaním, že aj do takýchto čias rodičia privádzajú deti.Len pred pár dňami sa z vojny, ktorá už má štyri roky – a teda z Ukrajiny vrátil Marek Janiga. Humanitárny pracovník, študent práva, ktorého si pred časom všimol aj premiér – pre jeho kritiku rušenia Špeciálnej prokuratúry. Bol tam odviezť ďalšiu pomoc, paradoxne v čase, keď tá štátna a núdzová je z rozhodnutia premiéra zastavená.Marek však má za sebou aj ročný program v Mládežníckej poradnej komisii Nato. Naviac v štúdiu pokračuje na Univerzite pre mier, ktorá má mandát OSN. Jeho focusom je medzinárodné právo a riešenie sporov. Opäť - v čase, keď je mier atakovaný sériou vojen a na riešenie čaká nespočet sporov.Ako vidí ich riešenie a čomu ho učí skúsenosť s Ukrajinou?Hosťom RánoNahlas je Marek Janiga.
Geoffrey Irving, Chief Scientist at the UK AI Security Institute, explains why our theoretical understanding of machine learning remains fragile even as models surpass experts on critical security tasks. He details AISI's work on frontier model evaluations, red teaming, and threat modeling across biosecurity, cybersecurity, and loss-of-control risks. The conversation explores reward hacking, eval awareness, and why current safety techniques may struggle to deliver high reliability. Listeners will also hear how AISI is funding foundational research to build stronger guarantees for AI safety. Nathan uses Granola to uncover blind spots in conversations and AI research. Try it at granola.ai/tcr with code TCR — and if you're already using it, test his blind spot recipe here: https://bit.ly/granolablindspot Sponsors: Serval: Serval uses AI-powered automations to cut IT help desk tickets by more than 50%, freeing your team from repetitive tasks like password resets and onboarding. Book your free pilot and guarantee 50% help desk automation by week 4 at https://serval.com/cognitive Claude: Claude is the AI collaborator that understands your entire workflow, from drafting and research to coding and complex problem-solving. Start tackling bigger problems with Claude and unlock Claude Pro's full capabilities at https://claude.ai/tcr Tasklet: Tasklet is an AI agent that automates your work 24/7; just describe what you want in plain English and it gets the job done. Try it for free and use code COGREV for 50% off your first month at https://tasklet.ai CHAPTERS: (00:00) About the Episode (04:09) From physics to ML (08:52) AGI uncertainty and threats (Part 1) (18:08) Sponsors: Serval | Claude (21:29) AGI uncertainty and threats (Part 2) (27:35) Control, autonomy, alignment (Part 1) (34:02) Sponsor: Tasklet (35:14) Control, autonomy, alignment (Part 2) (38:44) Inside the UK AC (51:02) Evaluations and jailbreaking (01:01:17) Emerging capabilities and misuse (01:14:20) Agents and reward hacking (01:26:09) Theoretical alignment agenda (01:38:39) Debate and formal methods (01:51:19) Limits of formalization (02:02:27) Future risks and governance (02:16:23) Episode Outro (02:18:58) Outro PRODUCED BY: https://aipodcast.ing SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://linkedin.com/in/nathanlabenz/ Youtube: https://youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
FREE Longevity Builder Web Class:https://longevitybuilderwebclass.netlify.app/Longevity Builder Book and Longevity Builder Health Labhttps://secretlongevityoffer.bolt.host/Theme: Why Cardiorespiratory Fitness (CRF) is the ultimate biological armor against the "Attackers" (Chronic Disease).Host: ShaneFeatured Guests: * John Ranello: 75-year-old fitness practitioner (VO2 Max: 48.5)Professor Ulrik Wisløff: Head of CERG, NTNU; Creator of PAI.Dr. Atefe Tari: Neuroscientist; Lead Researcher on the ExPlas study.The Narrative: Shane introduces the "rare physiology" of 75-year-old John Ranello.The Stats: John's VO2 Max is 48.5 mL/kg/min (Top 1% for his age). Shane's is 54.5 at nearly 60.The Premise: These aren't just "fitness numbers"—they are The Oxygen Shield™.The Core Thesis: High oxygen efficiency isn't about running marathons; it's about creating a system that is biologically "Hard to Kill."The Philosophy: 53 years in the industry. Why he refuses the "retirement" mindset.The Protocol: The 40-minute warm-up discipline and why sprinting is the fountain of youth.The Mindset: The body as a unified, high-performance system rather than a collection of parts.The Analogy: The body as a city; Oxygen as electricity. Low efficiency leads to "system brownouts."The "Attackers": How Heart Disease, Type 2 Diabetes, and Stroke cluster where the shield is thinnest.Biological Armor: Why increasing stroke volume and capillary density thickens the "walls" of your city, making it harder for disease to take hold.Expert Insight: Wisløff explains the HUNT Study data—showing that low cardiorespiratory fitness predicts mortality more accurately than smoking or blood pressure.The Mechanism: Moving from a "small engine" (high stress/low output) to a "large engine" (low stress/high output).Moving Beyond Steps: Why "10,000 steps" is a blunt tool.The 100 PAI Goal: The science of maintaining a rolling 7-day score of 100 to reduce mortality risk by 25-30%.The Longevity Builder Health Lab: Shane introduces the technology used to track the Oxygen Efficiency App and the AQ Engine App..
ML engineering demand remains high with a 3.2 to 1 job-to-candidate ratio, but entry-level hiring is collapsing as AI automates routine programming and data tasks. Career longevity requires shifting from model training to production operations, deep domain expertise, and mastering AI-augmented workflows before standard implementation becomes a commodity. Links Notes and resources at ocdevel.com/mlg/mla-30 Try a walking desk - stay healthy & sharp while you learn & code Generate a podcast - use my voice to listen to any AI generated content you want Market Data and Displacement ML engineering demand rose 89% in early 2025. Median salary is $187,500, with senior roles reaching $550,000. There are 3.2 open jobs for every qualified candidate. AI-exposed roles for workers aged 22 to 25 declined 13 to 16%, while workers over 30 saw 6 to 12% growth. Professional service job openings dropped 20% year-over-year by January 2025. Microsoft cut 15,000 roles, targeting software engineers, and 30% of its code is now AI-generated. Salesforce reduced support headcount from 9,000 to 5,000 after AI handled 30 to 50% of its workload. Sector Comparisons Creative: Chinese illustrator jobs fell 70% in one year. AI increased output from 1 to 40 scenes per day, crashing commission rates by 90%. Trades: US construction lacks 1.7 million workers. Licensing takes 5 years, and the career fatality risk is 1 in 200. High suicide rates (56 per 100,000) and emerging robotics like the $5,900 Unitree R1 indicate a 10 to 15 year window before automation. Orchestration: Prompt engineering roles paying $375,000 became nearly obsolete in 24 months. Claude Code solves 72% of GitHub issues in under eight minutes. Technical Specialization Priorities Model Ops: Move from training to deployment using vLLM or TensorRT. Set up drift detection and monitoring via MLflow or Weights & Biases. Evaluation: Use DeepEval or RAGAS to test for hallucinations, PII leaks, and adversarial robustness. Agentic Workflows: Build multi-step systems with LangGraph or CrewAI. Include human-in-the-loop checkpoints and observability. Optimization: Focus on quantization and distillation for on-device, air-gapped deployment. Domain Expertise: 57.7% of ML postings prefer specialists in healthcare, finance, or climate over generalists. Industry Perspectives Accelerationists (Amodei, Altman): Predict major disruption within 1 to 5 years. Skeptics (LeCun, Marcus): Argue LLMs lack causal reasoning, extending the adoption timeline to 10 to 15 years. Pragmatists (Andrew Ng): Argue that as code gets cheap, the bottleneck shifts from implementation to specification.
AI is already displacing workers in targeted ways - entry-level knowledge workers are being quietly erased from hiring pipelines, freelancers are getting crushed, and the career ladder is being sawed off at the bottom rungs. Yet ML engineer demand has surged 89% with a 3.2:1 talent deficit and $187K median salary. Covers the real displacement data, lessons from the artist bloodbath, the trades escape hatch, the orchestrator treadmill, expert disagreements on timelines, and concrete short- and long-term career moves for ML engineers. Links Notes and resources at ocdevel.com/mlg/mla-4 Try a walking desk - stay healthy & sharp while you learn & code Generate a podcast - use my voice to listen to any AI generated content you want Market Metrics and Displacement Dynamics ML Market: H1 2025 demand rose 89% with a 3.2 to 1 talent deficit. Median salary is $187,500, while Generative AI specialists earn a 40 to 60 percent premium. The "Quiet" Decline: Macro data shows only 4.5% of total layoffs are AI-attributed, but entry-level hiring is collapsing. Stanford/ADP data shows a 13 to 16 percent employment drop for workers aged 22 to 25 in AI-exposed roles since late 2022. UK graduate job postings fell 67%. Corporate Attrition: Salesforce cut 4,000 roles after AI absorbed 30 to 50 percent of workloads. Microsoft cut 15,000 roles as AI began generating 30% of its code. Amazon cut 30,000 jobs while spending $100 billion on AI infrastructure. Sector Analysis: Creative and Trades Illustrators: Jobs in China's gaming sector fell 70% in one year. Clients accept "good enough" work (80% quality) at 5% of the cost. Western freelance graphic design and writing jobs fell 18.5% and 30% respectively within eight months of ChatGPT's launch. Manual Labor: The U.S. construction industry lacks 1.7 million workers annually, but apprenticeships take five years. Humanoid robotics are advancing, with Unitree's R1 priced at $5,900 and Figure AI robots completing 1,250 runtime hours at BMW. Full automation is 10 to 15 years away, but partial displacement via smaller crews is closer. The Orchestration Treadmill Obsolescence Speed: Prompt engineering roles went from $375,000 salaries to obsolescence in 24 months. AI coding agents like Claude Code now resolve 72% of medium-complexity GitHub issues autonomously. Fragile Expertise: Replacing junior workers with AI prevents the development of future senior talent. New engineers risk "fragile expertise," directed by tools they cannot debug during novel failure modes. Economic and Expert Outlook Macro Risks: Daron Acemoglu warns of "so-so automation" that cuts costs without raising productivity, predicting only 0.66% growth over ten years. "Ghost GDP" describes AI-inflated accounts that fail to circulate because machines do not consume. Expert Camps: Accelerationists (Anthropic, OpenAI) predict human-level AI by 2027. Skeptics (LeCun, Marcus) argue LLMs are a dead end lacking world models. Pragmatists (Andrew Ng) suggest shifting from implementation to specification as the cost of code nears zero. Tactical Adaptation for ML Engineers Immediate Skills: Master production ML systems, MLOps, LLM evaluation, and safety engineering. Ability to manage deployment risks and hallucination detection is the primary hiring differentiator. Long-term Moats: Focus on "Small AI" (on-device, private), mechanistic interpretability, and deep domain knowledge in healthcare, logistics, or climate science. The Playbook: Optimize for the current three to five year window. Move from being a model builder to a product-focused engineer who understands business tradeoffs and regulatory compliance.
Send a textAI just found hundreds of high-severity vulnerabilities hiding in open source, and the market flinched. We dig into what Anthropic's Claude Code Security actually means for security teams, why vendors like CrowdStrike and Okta aren't going away, and how the real change lands on roles, workflows, and the skills you need next. From CI/CD integration to vulnerability discovery at scale, we frame where general models augment specialized tools and where human expertise still anchors the stack.We also get tactical with five CISSP-style AI questions designed to sharpen your instincts. You'll learn how adversaries reverse engineer decision boundaries to drive up false negatives, what adversarial examples look like in practice, and why adversarial training matters. We break down indirect prompt injection—how a crafted document can hijack an LLM to exfiltrate session data—and outline guardrails that actually reduce risk. Then we map AI risk using NIST's AI RMF, focusing on the Measure function to evaluate potential harms to protected classes, and we unpack why federated learning still faces privacy leakage through gradient updates without differential privacy and secure aggregation.If you're in a SOC or building AppSec pipelines, this conversation gives you a blueprint to adapt: automate tier one triage, monitor for model drift, add OOD detection, and treat your models like code with tests, reviews, and rollbacks. If you're planning your career, we share concrete pivot paths into detection engineering with ML, AI governance, and assurance. Want more hands-on practice and mentorship to pass the CISSP the first time and future-proof your skills? Subscribe, share this with a teammate, and leave a review with the next AI topic you want us to tackle.Gain exclusive access to 360 FREE CISSP Practice Questions at FreeCISSPQuestions.com and have them delivered directly to your inbox! Don't miss this valuable opportunity to strengthen your CISSP exam preparation and boost your chances of certification success. Join now and start your journey toward CISSP mastery today!
Disturbing pics of Stephen Hawking on Epstein Island released, USA Men's Hockey visits Trump, Nancy Guthrie reward raised, Bonnie Blue knocked up, and Trudi fights her toilet. Programming Note: Marcie Hume (Corey Feldman vs. The World) and Lita Ford will join us tomorrow. The State of the Union is going down tonight. The US Men's Hockey Team is getting some heat following their recent communication with Donald Trump. Savannah Guthrie is now offering a $1M reward for her mother Nancy. Some turds are threatening to boycott the Met Gala due to Jeff Bezos' sponsorship. Stephen Hawking photos have emerged of him living it up on Epstein Island. Drew confirms John Lenon's weiner is uncirc'd. AI confirms they all were uncircumcised. Legacy Partner's drops a new $50 gift card winner. Congrats to _____________! Darren McCarty dropped by the studio today for ML's Soul of Detroit. TJ Miller is in town. Check him out in Royal Oak this weekend. Jim Breuer is popping off at American Airlines. Mickey Redmond's grandson, Teddy, has a rare form of leukemia and could use financial help. A BAFTAs judge has quit following the n-word incident. Eric Dane's family is still fundraising. Rebecca Gayheart has broken her silence. Hey Taylor Swift... why you look different? Cruz Beckham and the Breakers are the hot new rock act. Andy Dick remains in physical shambles. Lisa Rinna has been drugged... in front of everyone. Some people are saying she might have been over served. The Olympic Men's Hockey Final is the most watched pre-9am sports event in history. Evan Dando of The Lemonheads can't catch a break. Trudi destroyed her toilet.Drew's hot water heater took a dump. Drew was nearly bamboozled by credit card thieves again. It's tax season. Hooray. Steven Spielberg is bailing on California for New York. Congressman Tony Gonzales has himself quite the scandal. Is Bonnie Blue really pregnant or is this all a stunt? Maury Povich wants nothing to do with the situation. Drew reeducated himself on the crimes of D.B. Cooper. The trial has resumed for the Alexander Brothers. Merch is still available. Buy it before it's gone. If you'd like to help support the show… consider subscribing to our YouTube Channel, Facebook, Instagram and Twitter (Drew Lane, Marc Fellhauer, Trudi Daniels, Jim Bentley and BranDon)
Send a textIn this episode, Kay Suthar sits down with Patrick Twitchett to break down why efficiency and optimisation should be at the core of every business. Patrick, founder of CASE MASTERMIND and widely known as “The Simplifier,” shares how entrepreneurs can increase income, reduce unnecessary costs, and simplify operations without sacrificing growth. They explore the power of masterminds, the principle that you are the average of the five people you spend the most time with, and why proximity can dramatically shift your results. Patrick also dives into the difference between to-do lists and calendars, how to properly calculate your professional rate, and why outsourcing is often the smartest financial move you can make. If you've ever felt overwhelmed, overworked, or stuck in complexity, this episode is your reminder that less really is more.What to expect in this episode: (00:00) – Why efficiency and optimisation drive business growth (04:10) – Lessons from Rich Dad Poor Dad and Think and Grow Rich (07:40) – Living by the principle “less is more” (11:20) – The real difference between to-do lists and calendars (15:00) – How to calculate your professional hourly rate (18:50) – Why outsourcing can actually make you more money (22:30) – The power of masterminds and proximity (26:40) – A mastermind member repurposing a marketing strategy in real timeAbout Patrick TwitchettPatrick Twitchett is the founder of CASE MASTERMIND. He helps entrepreneurs optimise costs and improve income through his consultancy service Simplies, combining the words simple and supplies. Known as “The Simplifier,” Patrick supports business owners in streamlining operations and building stronger financial foundations. He also speaks regularly on the CASE Broadcast alongside Melvyn Manning as MēL and PāT, discussing business growth and personal development.Connect with Patrick TwitchettWebsite: https://www.casemastermind.co.uk/Email: patrick.twitchett@simplies.co.ukFacebook Group: https://www.facebook.com/groups/CASEmastermind/Instagram: https://www.instagram.com/case_mastermind/YouTube: https://www.youtube.com/channel/UCW0rA_8xhXgFZZApcG4QXswTwitter: https://twitter.com/CASEmastermindLinkedIn: https://www.linkedin.com/company/casenetworking/FREE Gift from PatrickSign up as a Chrome member and receive the CASE Mastermind newsletter: https://casemastermind.co.uk/Connect with Kay SutharBusiness Website: https://makeyourmarkagency.com/Podcast Website: https://www.makeyourmarkpodcast.com/LinkedIn: https://www.linkedin.com/in/kay-suthar-make-your-mark/Facebook Group: https://www.facebook.com/groups/482037820744114Email: kay@makeyourmarkagency.comFREE Gifts from Kay Suthar:3 Ultimate Secrets to Getting Booked on Podcasts: https://getbookedonpodcast.com5 Simple Steps to Launch Your Podcast in 14 Days: https://14daystolaunch.com
Editor's note: CuspAI raised a $100m Series A in September and is rumored to have reached a unicorn valuation. They have all-star advisors from Geoff Hinton to Yann Lecun and team of deep domain experts to tackle this next frontier in AI applications.In this episode, Max Welling traces the thread connecting quantum gravity, equivariant neural networks, diffusion models, and climate-focused materials discovery (yes, there is one!!!).We begin with a provocative framing: experiments as computation. Welling describes the idea of a “physics processing unit”—a world in which digital models and physical experiments work together, with nature itself acting as a kind of processor. It's a grounded but ambitious vision of AI for science: not replacing chemists, but accelerating them.Along the way, we discuss:* Why symmetry and equivariance matter in deep learning* The tradeoff between scale and inductive bias* The deep mathematical links between diffusion models and stochastic thermodynamics* Why materials—not software—may be the real bottleneck for AI and the energy transition* What it actually takes to build an AI-driven materials platformMax reflects on moving from curiosity-driven theoretical physics (including work with Gerard ‘t Hooft) toward impact-driven research in climate and energy. The result is a conversation about convergence: physics and machine learning, digital models and laboratory experiments, long-term ambition and incremental progress.Full Video EpisodeTimestamps* 00:00:00 – The Physics Processing Unit (PPU): Nature as the Ultimate Computer* Max introduces the idea of a Physics Processing Unit — using real-world experiments as computation.* 00:00:44 – From Quantum Gravity to AI for Materials* Brandon frames Max's career arc: VAE pioneer → equivariant GNNs → materials startup founder.* 00:01:34 – Curiosity vs Impact: How His Motivation Evolved* Max explains the shift from pure theoretical curiosity to climate-driven impact.* 00:02:43 – Why CaspAI Exists: Technology as Climate Strategy* Politics struggles; technology scales. Why materials innovation became the focus.* 00:03:39 – The Thread: Physics → Symmetry → Machine Learning* How gauge symmetry, group theory, and relativity informed equivariant neural networks.* 00:06:52 – AI for Science Is Exploding (Not Emerging)* The funding surge and why AI-for-Science feels like a new industrial era.* 00:07:53 – Why Now? The Two Catalysts Behind AI for Science* Protein folding, ML force fields, and the tipping point moment.* 00:10:12 – How Engineers Can Enter AI for Science* Practical pathways: curriculum, workshops, cross-disciplinary training.* 00:11:28 – Why Materials Matter More Than Software* The argument that everything—LLMs included—rests on materials innovation.* 00:13:02 – Materials as a Search Engine* The vision: automated exploration of chemical space like querying Google.* 01:14:48 – Inside CuspAI: The Platform Architecture* Generative models + multi-scale digital twin + experiment loop.* 00:21:17 – Automating Chemistry: Human-in-the-Loop First* Start manual → modular tools → agents → increasing autonomy.* 00:25:04 – Moonshots vs Incremental Wins* Balancing lighthouse materials with paid partnerships.* 00:26:22 – Why Breakthroughs Will Still Require Humans* Automation is vertical-specific and iterative.* 00:29:01 – What Is Equivariance (In Plain English)?* Symmetry in neural networks explained with the bottle example.* 00:30:01 – Why Not Just Use Data Augmentation?* The optimization trade-off between inductive bias and data scale.* 00:31:55 – Generative AI Meets Stochastic Thermodynamics* His upcoming book and the unification of diffusion models and physics.* 00:33:44 – When the Book Drops (ICLR?)TranscriptMax: I want to think of it as what I would call a physics processing unit, like a PPU, right? Which is you have digital processing units and then you have physics processing units. So it's basically nature doing computations for you. It's the fastest computer known, as possible even. It's a bit hard to program because you have to do all these experiments. Those are quite bulky, it's like a very large thing you have to do. But in a way it is a computation and that's the way I want to see it. You can do computations in a data center and then you can ask nature to do some computations. Your interface with nature is a bit more complicated. But then these things will have to seamlessly work together to get to a new material that you're interested in.[01:00:44:14 - 01:01:34:08]Brandon: Yeah, it's a pleasure to have Max Woehling as a guest today. Max has done so much over his career that I've been so excited about. If you're in the deep learning community, you probably know Max for his work on variational autocoders, which has literally stood the test of prime or officially stood the test of prime. If you are a scientist, you probably know him for his like, binary work on graph neural networks on equivariance. And if you're a material science, you probably know him about his new startup, CASPAI. Max has a long history doing lots of cool problems. You started in quantum gravity, which is I think very different than all of these other things you worked on. The first question for AI engineers and for scientists, what is the thread in how you think about problems? What is the thread in the type of things which excite you? And how do you decide what is the next big thing you want to work on?[01:01:34:08 - 01:02:41:13]Max: So it has actually evolved a lot. In my young days, let's breathe, I would just follow what I would find super interesting. I have kind of this sensor. I think many people have, but maybe not really sort of use very much, which is like, you get this feeling about getting very excited about some problem. Like it could be, what's inside of a black hole or what's at the boundary of the universe or what are quantum mechanics actually all about. And so I follow that basically throughout my career. But I have to say that as you get older, this changes a little bit in the sense that there's a new dimension coming to it and there's this impact. Going in two-dimensional quantum gravity, you pretty much guaranteed there's going to be no impact on what you do relative, maybe a few papers, but not in this world, this energy scale. As I get closer to retirement, which is fortunately still 10 years away or so, I do want to kind of make a positive impact in the world. And I got pretty worried about climate change.[01:02:43:15 - 01:03:19:11]Max: I think politics seems to have a hard time solving it, especially these days. And so I thought better work on it from the technology side. And that's why we started CaspAI. But there's also a lot of really interesting science problems in material science. And so it's kind of combining both the impact you can make with it as well as the interesting science. So it's sort of these two dimensions, like working on things which you feel there's like, well, there's something very deep going on here. And on the other hand, trying to build tools that can actually make a real impact in the world.[01:03:19:11 - 01:03:39:23]RJ: So the thread that when I look back, look at the different things that you worked out, some of them seem pretty connected, like the physics to equivariance and, yeah, and, uh, gravitational networks, maybe. And that seems to be somewhat related to Casp. Do you have a thread through there?[01:03:39:23 - 01:06:52:16]Max: Yeah. So physics is the thread. So having done, you know, spent a lot of time in theoretical physics, I think there is first very fundamental and exciting questions, like things that haven't actually been figured out in quantum gravity. So that is really the frontier. There's also a lot of mathematical tools that you can use, right? In, for instance, in particle physics, but also in general relativity, sort of symmetry space to play an enormously important role. And this goes all the way to gauge symmetries as well. And so applying these kinds of symmetries to, uh, machine learning was actually, you know, I thought of it as a very deep and interesting mathematical problem. I did this with Taco Cohen and Taco was the main driver behind this, went all the way from just simple, like rotational symmetries all the way to gauge symmetries on spheres and stuff like that. So, and, uh, Maurice Weiler, who's also here, um, when he was a PhD student, he was a very good student with me, you know, he wrote an entire book, which I can really recommend about the role of symmetries in AI and machine learning. So I find this a very deep and interesting problem. So more recently, so I've taken a sort of different path, which is the relationship between diffusion models and that field called stochastic thermodynamics. This is basically the thermodynamics, which is a theory of equilibrium. So but then formulated for out of equilibrium systems. And it turns out that the mathematics that we use for diffusion models, but even for reinforcement learning for Schrodinger bridges for MCMC sampling has the same mathematics as this theoretical, this physical theory of non-equilibrium systems. And that got me very excited. And actually, uh, when I taught a course in, um, Mauschenberg, uh, it is South Africa, close to Cape Town at the African Institute for Mathematical Sciences Ames. And I turned that into a book site. Two years later, the book was finished. I've sent it to the publisher. And this is about the deep relationship between free energy, diffusion models, basically generative AI and stochastic thermodynamics. So it's always some kind of, I don't know, I find physics very deep. I also think a lot about quantum mechanics and it's, it's, it's a completely weird theory that actually nobody really understands. And there's a very interesting story, which is maybe good to tell to connect sort of my PZ back to where I'm now. So I did my PZ with a Nobel Laureate, Gerard the toft. He says the most brilliant man I've ever met. He was never wrong about anything as long as I've seen him. And now he says quantum mechanics is wrong and he has a new theory of quantum mechanics. Nobody understands what he's saying, even though what he's writing down is not mathematically very complex, but he's trying to address this understandability, let's say of quantum mechanics head on. And I find it very courageous and I'm completely fascinated by it. So I'm also trying to think about, okay, can I actually understand quantum mechanics in a more mundane way? So that, you know, without all the weird multiverses and collapses and stuff like that. So the physics is always been the threat and I'm trying to apply the physics to the machine learning to build better algorithms.[01:06:52:16 - 01:07:05:15]Brandon: You are still very involved in understanding and understanding physics and the worlds. Yeah. And just like applications to machine learning or introducing no formalisms. That's really cool.[01:07:05:15 - 01:07:18:02]Max: Yes, I would say I'm not contributing much to physics, but I'm contributing to the interface between physics and science. And that's called AI for science or science or AI is kind of a super, it's actually a new discipline that's emerging.[01:07:18:02 - 01:07:18:19]Speaker 5: Yeah.[01:07:18:19 - 01:07:45:14]Max: And it's not just emerging, it's exploding, I would say. That's the better term because I know you go from investments into like in the hundreds of millions now in the billions. So there's now actually a startup by Jeff Bezos that is at 6.2 billion sheep round. Right. Insane. I guess it's the largest startup ever, I think. And that's in this field, AI for science. It tells you something that we are creating a new bubble here.[01:07:46:15 - 01:07:53:28]Brandon: So why do you think it is? What has changed that has motivated people to start working on AI for science type problems?[01:07:53:28 - 01:08:49:17]Max: So there's two reasons actually. One is that people have been applying sort of the new tools from AI to the sciences, which is quite natural. And there's of course, I think there's two big examples, protein folding is a big one. And the other one is machine learning forest fields or something called machine learning inter-atomic potentials. Both of them have been actually very successful. Both also had something to do with symmetries, which is a little cool. And sort of people in the AI sciences saw an opportunity to apply the tools that they had developed beyond advertised placement, right, or multimedia applications into something that could actually make a very positive impact in society like health, drug development, materials for the energy transition, carbon capture. These are all really cool, impactful applications.[01:08:50:19 - 01:09:42:14]Max: Despite that, the science and the kind of the is also very interesting. I would say the fact that these sort of these two fields are coming together and that we're now at the point that we can actually model these things effectively and move the needle on some of these sort of science sort of methodologies is also a very unique moment, I would say. People recognize that, okay, now we're at the cusp of something new, where it results whether the company is called after. We're at the cusp of something new. And of course that always creates a lot of energy. It's like, okay, there's something, it's like sort of virgin field. It's like nobody's green field. Nobody's been there. I can rush in and I can sort of start harvesting there, right? And I think that's also what's causing a lot of sort of enthusiasm in the fields.[01:09:42:14 - 01:10:12:18]RJ: If you're an AI engineer, basically if the people that listen to this podcast will be in the field, then you maybe don't have a strong science background. How does, but are excited. Most I would say most AI practitioners, BM engineers or scientists would consider themselves scientists and they have some background, a little bit of physics, a little bit of industry college, maybe even graduate school that have been working or are starting out. How does somebody who is not a scientist on a day-to-day basis, how do they get involved?[01:10:12:18 - 01:10:14:28]Max: Well, they can read my book once it's out.[01:10:16:07 - 01:11:05:24]Max: This is basically saying that there is more, we should create curricula that are on this interface. So I'm not sure there is, also we already have some universities actual courses you can take, maybe online courses you can take. These workshops where we are now are actually very good as well. And we should probably have more tutorials before the workshop starts. Actually we've, I've kind of proposed this at some point. It's like maybe first have an hour of a tutorial so that people can get new into the field. There's a lot out there. Most of it is of course inaccessible, but I would say we will create much more books and other contents that is more accessible, including this podcast I would say. So I think it will come. And these days you can watch videos and things. There's a huge amount of content you can go and see.[01:11:05:24 - 01:11:28:28]Brandon: So maybe a follow-up to that. How do people learn and get involved? But why should they get involved? I mean, we have a lot of people who are of our audience will be interested in AI engineering, but they may be looking for bigger impacts in the world. What opportunities does AI for science provide them to make an impact to change the world? That working in this the world of pure bits would not.[01:11:28:28 - 01:11:40:06]Max: So my view is that underlying almost everything is immaterial. So we are focusing a lot on LLMs now, which is kind of the software layer.[01:11:41:06 - 01:11:56:05]Max: I would say if you think very hard, underlying everything is immaterial. So underlying an LLM is a GPU, and underlying a GPU is a wafer on which we will have to deposit materials. Do we want to wait a little bit?[01:12:02:25 - 01:12:11:06]Max: Underlying everything is immaterial. So I was saying, you know, there's the LLM underlying the LLM is a GPU on which it runs. In order to make that GPU,[01:12:12:08 - 01:12:43:20]Max: you have to put materials down on a wafer and sort of shine on it with sort of EUV light in order to etch kind of the structures in. But that's now an actual material problem, because more or less we've reached the limits of scaling things down. And now we are trying to improve further by new materials. So that's a fundamental materials problem. We need to get through the energy transition fast if we don't want to kind of mess up this world. And so there is, for instance, batteries. That's a complete materials problem. There's fuel cells.[01:12:44:23 - 01:13:01:16]Max: There is solar panels. So that they can now make solar panels with new perovskite layers on top of the silicon layers that can capture, you know, theoretically up to 50% of the light, where now we're at, I don't know, maybe 22 or something. So these are huge changes all by material innovation.[01:13:02:21 - 01:13:47:15]Max: And yeah, I think wherever you go, you know, I can probably dig deep enough and then tell you, well, actually, the very foundation of what you're doing is a material problem. And so I think it's just very nice to work on this very, very foundation. And also because I think this is maybe also something that's happening now is we can start to search through this material space. This has never been the case, right? It's like scientists, the normal way of working is you read papers and then you come up with no hypothesis. You do an experiment and you learn, et cetera. So that's a very slow process. Now we can treat this as a search engine. Like we search the internet, we now search the space of all possible molecules, not just the ones that people have made or that they're in the universe, but all of them.[01:13:48:21 - 01:14:42:01]Max: And we can make this kind of fully automated. That's the hope, right? We can just type, it becomes a tool where you type what you want and something starts spinning and some experiments get going. And then, you know, outcome list of materials and then you look at it and say, maybe not. And then you refine your query a little bit. And you kind of do research with this search engine where a huge amount of computation and experimentation is happening, you know, somewhere far away in some lab or some data center or something like this. I find this a very, very promising view of how we can sort of build a much better sort of materials layer underneath almost everything. And also more sustainable materials. Our plastics are polluting the planet. If you come up with a plastic that kind of destroys itself, you know, after, I don't a few weeks, right? And actually becomes a fertilizer. These are things that are not impossible at all. These things can be done, right? And we should do it.[01:14:42:01 - 01:14:47:23]RJ: Can you tell us a little bit just generally about CUSBI and then I have a ton of questions.[01:14:47:23 - 01:14:48:15]Speaker 5: Yeah.[01:14:48:15 - 01:17:49:10]Max: So CUSBI started about 20 months ago and it was because I was worried about I'm still worried about climate change. And so I realized that in order to get, you know, to stay within two degrees, let's say, we would not only have to reduce our emissions to zero by 2050, but then, you know, another half century or even a century of removing carbon dioxide from the atmosphere, not by reducing your emissions, but actually removing it at a rate that's about half the rate that we now emit it. And that is a unsolved problem. But if we don't solve it, two degrees is not going to happen, right? It's going to be much more. And I don't think people quite understand how bad that can be, like four degrees, like very bad. So this technology needs to be developed. And so this was my and my co-founder, Chet Edwards, motivation to start this startup. And also because, you know, we saw the technology was ready, which is also very good. So if you're, you know, the time is right to do it. And yeah, so we now in the meanwhile, we've grown to about 40 people. We've kind of collected 130 million investment into the company, which is for a European company is quite a lot. I would say it's interesting that right after that, you know, other startups got even more. So that's kind of tells you how fast this is growing. But yeah, we are we are now at the we've built the platform, of course, but it's for a series of material classes and it needs to be constantly expanded to new material classes. And it can be more automated because, you know, we know putting LLMs in as the whole thing gets more and more automated. And now we're moving to sort of high throughput experimentation. So connecting the actual platform, which is computational, to the experiments so that you can get also get fast feedback from experiments. And I kind of think of experiments as something you do at the end, although that's what we've been doing so far. I want to think of it as what I would call a sort of a physics processing unit, like a PPU, right, which is you have digital processing units and then you have physics processing units. So it's basically nature doing computations for you. It's the fastest computer known as possible, even. It's a bit hard to program because you have to do all these experiments. Those are quite, quite bulky. It's like a very large thing you have to do. But in a way, it is a computation. And that's the way I want to see it. So I want to you can do computations in a data center and then you can ask nature to do some computations. Your interface with nature is a bit more complicated. But then these things will have to seamlessly work together to get to a new material that you're interested in. And that's the vision we have. We don't say super intelligence because I don't quite know what it means and I don't want to oversell it. But I do want to automate this process and give a very powerful tool in the hands of the chemists and the material scientists.[01:17:49:10 - 01:18:01:02]Brandon: That actually brings up a question I wanted to ask you. First of all, can you talk about your platform to like whatever degree, like explain kind of how it works and like what you your thought processes was in developing it?[01:18:01:02 - 01:20:47:22]Max: Yeah, I think it's been surprisingly, it's not rocket science, I would say. It's not rocket science in the sense of the design and basically the design that, you know, I wrote down at the very beginning. It's still more or less the design, although you add things like I wasn't thinking very much about multi-scale models and as the common are rated that actually multi-scale is very important. And the beginning, I wasn't thinking very much about self-driving labs. But now I think, you know, we are now at the stage we should be adding that. And so there is sort of bits and details that we're adding. But more or less, it's what you see in the slide decks here as well, which is there is a generative component that you have to train to generate candidates. And then there is a digital twin, multi-scale, multi-fidelity digital twin, which you walk through the steps of the ladder, you know, they do the cheap things first, you weed out everything that's obviously unuseful, and then you go to more and more expensive things later. And so you narrow things down to a small number. Those go into an experiment, you know, do the experiment, get feedback, etc. Now, things that also have been more recently added is sort of more agentic sort of parts. You know, we have agents that search the literature and come up with, you know, actually the chemical literature and come up with, you know, chemical suggestions for doing experiments. We have agents which sort of autonomously orchestrate all of the computations and the experiments that need to be done. You know, they're in various stages of maturity and they can be continuously improved, I would say. And so that's basically I don't think that part. There's rocket science, but, you know, the design of that thing is not like surprising. What is it's surprising hard to actually build it. Right. So that's that's the thing that is where the moat is in the data that you can get your hands on and the and actually building the platform. And I would say there's two people in particular I want to call out, which is Felix Hunker, who is actually, you know, building the scientific part of the platform and Sandra de Maria, who is building the sort of the skate that is kind of this the MLOps part of the platform. Yeah. And so and recently we also added sort of Aaron Walsh to our team, who is a very accomplished scientist from Imperial College. We're very happy about that. He's going to be a chief science officer. And we also have a partnerships team that sort of seeks out all the customers because I think this is one thing I find very important. In print, it's so complex to do to actually bring a material to the real world that you must do this, you know, in collaboration with sort of the domain experts, which are the companies typically. So we always we only start to invest in the direction if we find a good industrial partner to go on that journey with us.[01:20:47:22 - 01:20:55:12]Brandon: Makes a lot of sense. Over the evolution of the platform, did you find that you that human intervention, human,[01:20:56:18 - 01:21:17:01]Brandon: I guess you could start out with a pure, you could imagine two directions when you start up making everything purely automatic, automated, agentic, so on. And then later on, you like find that you need to have more human input and feedback different steps. Or maybe did you start out with having human feedback? You have lots of steps and then like kind of, yeah, figure out ways to remove, you know,[01:21:17:01 - 01:22:39:18]Max: that is the second one. So you build tools for you. So it's much more modular than you think. But it's like, we need these tools for this application. We need these tools. So you build all these tools, and then you go through a workflow actually in the beginning just manually. So you put them in a first this tool, then run this to them or this with sithery. So you put them in a workflow and then you figure out, oh, actually, you know, this this porous material that we are trying to make actually collapses if you shake it a bit. Okay, then you add a new tool that says test for stability. Right. Yeah. And so there's more and more tools. And then you build the agent, which could be a Bayesian optimizer, or it could be an actual other them, you know, maybe trained to be a good chemist that will then start to use all these tools in the right way in the right order. Yeah. Right. But in the beginning, it's like you as a chemist are putting the workflow together. And then you think about, okay, how am I going to automate this? Right. For one very easy question you can ask yourself is, you know, every time somebody who is not a super expert in DFT, yeah, and he wants to do a calculation has to go to somebody who knows DFT. And so could you start to automate that away, which is like, okay, make it so user friendly, so that you actually do the right DFT for the right problem and for the right length of time, and you can actually assess whether it's a good outcome, etc. So you start to automate smaller small pieces and bigger pieces, etc. And in the end, the whole thing is automated.[01:22:39:18 - 01:22:53:25]Brandon: So your philosophy is you want to provide a set of specific tools that make it so that the scientists making decisions are better informed and less so trying to create an automated process.[01:22:53:25 - 01:23:22:01]Max: I think it's this is sort of the same where you're saying because, yes, we want to automate, yeah, but we don't see something very soon where the chemists and the domain expert is out of the loop. Yeah, but it but it's a retreat, right? It's like, okay, so first, you need an expert to tell you precisely how to set the parameters of the DFT calculation. Okay, maybe we can take that out. We can maybe automate that, right? And so increasingly, more of these things are going to be removed.[01:23:22:01 - 01:23:22:19]Speaker 5: Yeah.[01:23:22:19 - 01:24:33:25]Max: In the end, the vision is it will be a search engine where you where somebody, a chemist will type things and we'll get candidates, but the chemist will still decide what is a good material and what is not a good material out of that list, right? And so the vision of a completely dark lab, where you can close the door and you just say, just, you know, find something interesting and then it will it will just figure out what's interesting and we'll figure out, you know, it's like, oh, I found this new material to blah, blah, blah, blah, right? That's not the vision I have. He's not for, you know, a long time. So for me, it's really empowering the domain experts that are sitting in the companies and in universities to be much faster in developing their materials. And I should say, it's also good to be a little humble at times, because it is very complicated, you know, to bring it to make it and to bring it into the real world. And there are people that are doing this for the entire lives. Yeah. Right. And it's like, I wonder if they scratch their head and say, well, you know, how are you going to completely automate that away, like in the next five years? I don't think that's going to happen at all.[01:24:35:01 - 01:24:39:24]Max: Yeah. So to me, it's an increasingly powerful tool in the hands of the chemists.[01:24:39:24 - 01:25:04:02]RJ: I have a question. You've talked before about getting people interested based on having, you know, sort of a big breakthrough in materials, incremental change. I'm curious what you think about the platform you have now in are sort of stepping towards and how are you chasing the big change or is this like incremental or is there they're not mutually exclusive, obviously, but what do you think about that?[01:25:04:02 - 01:26:04:27]Max: We follow a mixed strategy. So we are definitely going after a big material. Again, we do this with a partner. I'm not going to disclose precisely what it is, but we have our own kind of long term goal. You could call it lighthouse or, you know, sort of moonshot or whatever, but it is going to be a really impactful material that we want to develop as a proof point that it can be done and that it will make it into the into the real world and that AI was essential in actually making it happen. At the same time, we also are quite happy to work with companies that have more modest goals. Like I would say one is a very deep partnership where you go on a journey with a company and that's a long term commitment together. And the other one is like somebody says, I knew I need a force field. Can you help me train this force field and then maybe analyze this particular problem for me? And I'll pay you a bunch of money for that. And then maybe after that we'll see. And that's fine too. Right. But we prefer, you know, the deep partnerships where we can really change something for the good.[01:26:04:27 - 01:26:22:02]RJ: Yeah. And do you feel like from a platform standpoint you're ready for that or what are the things that and again, not asking you to disclose proprietary secret sauce, but what are the things generally speaking that need to happen from where we are to where to get those big breakthroughs?[01:26:22:02 - 01:28:40:01]Max: What I find interesting about this field is that every time you build something, it's actually immediately useful. Right. And so unlike quantum computing, which or nuclear fusion, so you work for 20, 30, 40 years and nothing, nothing, nothing, nothing. And then it has to happen. Right. And when it happens, it's huge. So it's quite different here because every time you introduce, so you go to a customer and you say, so what do you need? Right. So we work, let's say, on a problem like a water filtration. We want to remove PFAS from water. Right. So we do this with a company, Camira. So they are a deep partner for us. Right. So we on a journey together. I think that the breakthrough will happen with a lot of human in the loop because there is the chemists who have a whole lot more knowledge of their field and it's us who will help them with training, having a new message. And in that kind of interface, these interactions, something beautiful will happen and that will have to happen first before this field will really take off, I think. And so in the sense that it's not a bubble, let's put it that way. So that's people see that as actual real what's happening. So in the beginning, it will be very, you know, with a lot of humans in the loop, I would say, and I would I would hope we will have this new sort of breakthrough material before, you know, everything is completely automated because that will take a while. And also it is very vertical specific. So it's like completely automating something for problem A, you know, you can probably achieve it, but then you'll sort of have to start over again for problem B because, you know, your experimental setup looks very different in the machines that you characterize your materials look very different. Even the models in your platform will have to be retrained and fine tuned to the new class. So every time, you know, you have a lot of learnings to transfer, but also, you know, the problems are actually different. And so, yes, I would want that breakthrough material before it's completely automated, which I think is kind of a long term vision. And I would say every time you move to something new, you'll have to start retraining and humans will have to come in again and say, okay, so what does this problem look like? And now sort of, you know, point the the machine again, you know, in the new direction and then and then use it again.[01:28:40:01 - 01:28:47:17]RJ: For the non-scientists among us, me included a bit of a scientist. There's a lot of terminology. You mentioned DFT,[01:28:49:00 - 01:29:01:11]RJ: you equivariance we've talked about. Can you sort of explain in engineering terms or the level of sophistication and engineering? Well, how what is equivariance?[01:29:01:11 - 01:29:55:01]Max: So equivariance is the infusion of symmetry in neural networks. So if I build a neural network, let's say that needs to recognize this bottle, right, and then I rotate the bottle, it will then actually have to completely start again because it has no idea that the rotated bottle. Well, actually, the input that represents a rotated bottle is actually rotated bottle. It just doesn't understand that. Right. If you build equivariance in basically once you've trained it in one orientation, it will understand it in any other orientation. So that means you need a lot less data to train these models. And these are constraints on the weights of the model. So so basically you have to constrain the way such data to understand it. And you can build it in, you can hard code it in. And yeah, this the symmetry groups can be, you know, translations, rotations, but also permutations. I can graph neural network, their permutations and then physics, of course, as many more of these groups.[01:29:55:01 - 01:30:01:08]RJ: To pray devil's advocate, why not just use data augmentation by your bottle is in all the different orientations?[01:30:01:08 - 01:30:58:23]Max: As an option, it's just not exact. It's like, why would you go through the work of doing all that? Where you would really need an infinite number of augmentations to get it completely right. Where you can also hard code it in. Now, I have to say sometimes actually data augmentation works even better than hard coding the equivariance in. And this is something to do with the fact that if you constrain the optimization, the weights before the optimization starts, the optimization surface or objective becomes more complicated. And so it's harder to find good minima. So there is also a complicated interplay, I think, between the optimization process and these constraints you put in your network. And so, yeah, you'll hear kind of contradicting claims in this field. Like some people and for certain applications, it works just better than not doing it. And sometimes you hear other people, if you have a lot of data and you can do data augmentation, then actually it's easier to optimize them and it actually works better than putting the equivariance in.[01:30:58:23 - 01:31:07:16]Brandon: Do you think there's kind of a bitter lesson for mathematically founded models and strategies for doing deep learning?[01:31:07:16 - 01:31:46:06]Max: Yeah, ultimately it's a trade-off between data and inductive bias. So if your inductive bias is not perfectly correct, you have to be careful because you put a ceiling to what you can do. But if you know the symmetry is there, it's hard to imagine there isn't a way to actually leverage it. But yeah, so there is a bitter lesson. And one of the bitter lessons is you should always make sure your architecture is scale, unless you have a tiny data set, in which case it doesn't matter. But if you, you know, the same bitter lessons or lessons that you can draw in LLM space are eventually going to be true in this space as well, I think.[01:31:47:10 - 01:31:55:01]RJ: Can you talk a little bit about your upcoming book and tell the listeners, like, what's exciting about it? Yeah, I should read it.[01:31:55:01 - 01:33:42:20]Max: So this book is about, it's called Generative AI and Stochastic Thermodynamics. It basically lays bare the fact that the mathematics that goes into both generative AI, which is the technology to generate images and videos, and this field of non-equilibrium statistical mechanics, which are systems of molecules that are just moving around and relaxing to the ground state, or that you can control to have certain, you know, be in a certain state, the mathematics of these two is actually identical. And so that's fascinating. And in fact, what's interesting is that Jeff Hinton and Radford Neal already wrote down the variational free energy for machine learning a long time ago. And there's also Carl Friston's work on free energy principle and active entrance. But now we've related it to this very new field in physics, which is called stochastic thermodynamics or non-equilibrium thermodynamics, which has its own very interesting theorems, like fluctuation theorems, which we don't typically talk about, but we can learn a lot from. And I think it's just it can sort of now start to cross fertilize. When we see that these things are actually the same, we can, like we did for symmetries, we can now look at this new theory that's out there, developed by these very smart physicists, and say, okay, what can we take from here that will make our algorithms better? At the same time, we can use our models to now help the scientists do better science. And so it becomes a beautiful cross-fertilization between these two fields. The book is rather technical, I would say. And it takes all sorts of things that have been done as stochastic thermodynamics, and all sorts of models that have been done in the machine learning literature, and it basically equates them to each other. And I think hopefully that sense of unification will be revealing to people.[01:33:42:20 - 01:33:44:05]RJ: Wait, and when is it out?[01:33:44:05 - 01:33:56:09]Max: Well, it depends on the publisher now. But I hope in April, I'm going to give a keynote at ICLR. And it would be very nice if they have this book in my hand. But you know, it's hard to control these kind of timelines.[01:33:56:09 - 01:33:58:19]RJ: Yeah, I'm looking forward to it. Great.[01:33:58:19 - 01:33:59:25]Max: Thank you very much. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
Darren McCarty is into wrestling, but ML and Marc are lovers, not fighters, so they ask D Mac to tell […]
Most people in AI are trying to give AIs ‘good' values. Max Harms wants us to give them no values at all. According to Max, the only safe design is an AGI that defers entirely to its human operators, has no views about how the world ought to be, is willingly modifiable, and completely indifferent to being shut down — a strategy no AI company is working on at all.In Max's view any grander preferences about the world, even ones we agree with, will necessarily become distorted during a recursive self-improvement loop, and be the seeds that grow into a violent takeover attempt once that AI is powerful enough.It's a vision that springs from the worldview laid out in If Anyone Builds It, Everyone Dies, the recent book by Eliezer Yudkowsky and Nate Soares, two of Max's colleagues at the Machine Intelligence Research Institute.To Max, the book's core thesis is common sense: if you build something vastly smarter than you, and its goals are misaligned with your own, then its actions will probably result in human extinction.And Max thinks misalignment is the default outcome. Consider evolution: its “goal” for humans was to maximise reproduction and pass on our genes as much as possible. But as technology has advanced we've learned to access the reward signal it set up for us, pleasure — without any reproduction at all, by having sex while on birth control for instance.We can understand intellectually that this is inconsistent with what evolution was trying to design and motivate us to do. We just don't care.Max thinks current ML training has the same structural problem: our development processes are seeding AI models with a similar mismatch between goals and behaviour. Across virtually every training run, models designed to align with various human goals are also being rewarded for persisting, acquiring resources, and not being shut down.This leads to Max's research agenda. The idea is to train AI to be “corrigible” and defer to human control as its sole objective — no harmlessness goals, no moral values, nothing else. In practice, models would get rewarded for behaviours like being willing to shut themselves down or surrender power.According to Max, other approaches to corrigibility have tended to treat it as a constraint on other goals like “make the world good,” rather than a primary objective in its own right. But those goals gave AI reasons to resist shutdown and otherwise undermine corrigibility. If you strip out those competing objectives, alignment might follow naturally from AI that is broadly obedient to humans.Max has laid out the theoretical framework for “Corrigibility as a Singular Target,” but notes that essentially no empirical work has followed — no benchmarks, no training runs, no papers testing the idea in practice. Max wants to change this — he's calling for collaborators to get in touch at maxharms.com.Links to learn more, video, and full transcript: https://80k.info/mh26This episode was recorded on October 19, 2025.Chapters:Cold open (00:00:00)Who's Max Harms? (00:01:22)A note from Rob Wiblin (00:01:58)If anyone builds it, will everyone die? The MIRI perspective on AGI risk (00:04:26)Evolution failed to 'align' us, just as we'll fail to align AI (00:26:22)We're training AIs to want to stay alive and value power for its own sake (00:44:31)Objections: Is the 'squiggle/paperclip problem' really real? (00:53:54)Can we get empirical evidence re: 'alignment by default'? (01:06:24)Why do few AI researchers share Max's perspective? (01:11:37)We're training AI to pursue goals relentlessly — and superintelligence will too (01:19:53)The case for a radical slowdown (01:26:07)Max's best hope: corrigibility as stepping stone to alignment (01:29:09)Corrigibility is both uniquely valuable, and practical, to train (01:33:44)What training could ever make models corrigible enough? (01:46:13)Corrigibility is also terribly risky due to misuse risk (01:52:44)A single researcher could make a corrigibility benchmark. Nobody has. (02:00:04)Red Heart & why Max writes hard science fiction (02:13:27)Should you homeschool? Depends how weird your kids are. (02:35:12)Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon MonsourMusic: CORBITCoordination, transcripts, and web: Katy Moore
Furf and Monty are back with another Pulm PEEPs Pearls episode. The topic of today’s discussion is an often discussed, but often misunderstood, test; the methacholine challenge. They’ll review when to utilize this test, how it should be performed, and the appropriate interpretation. Contributors This episode was prepared with research by Pulm PEEPs Associate Editor George Doumat. Dustin Latimer, another Pulm PEEPs Associate Editor, assisted with audio and video editing. Key Learning Points What the Test Measures Methacholine challenge is a direct bronchial provocation test of airway hyperresponsiveness (AHR), a core physiologic feature of asthma. Anyone will bronchoconstrict at high enough concentrations — the test looks for an abnormal threshold. The key endpoint is the PC20: the methacholine concentration causing a 20% fall in FEV1. Abnormal in adults: PC20 ≤ 8–16 mg/mL Test Performance Meta-analyses: pooled sensitivity ~60%, specificity ~90%. Real-world cohorts: sensitivity 55–62%, specificity 56–100% (varies by population, protocol, and threshold used). Not a standalone yes/no test — best used as part of a broader diagnostic pathway. Where It Fits in the Asthma Workup The test belongs in a stepwise approach: Step 1: Spirometry + bronchodilator response Step 2: Add FeNO and/or peak flow variability (if available) Step 3: If the picture is still unclear → methacholine challenge It is most useful for symptomatic patients with normal spirometry and no bronchodilator reversibility. Given its cost, mild risk, and discomfort, it should not be a first-line test — most asthma diagnoses do not require it. Technique and Medication Prep Technique ERS guidelines favor tidal breathing over deep inspiratory maneuvers. Deep breaths can be bronchoprotective and blunt the response, reducing sensitivity — especially in mild or well-controlled asthma. Medication Washout (to Avoid False Negatives) Medication ClassWashout PeriodShort-acting beta-agonists (SABA)≥ 6 hoursLong-acting beta-agonists (LABA)~24 hoursUltra-long-acting beta-agonists~48 hoursShort-acting anticholinergics (e.g., ipratropium)~12 hoursLong-acting muscarinic antagonists (LAMA, e.g., tiotropium)7 days Inhaled corticosteroids, leukotriene blockers, and antihistamines do not significantly affect the test acutely — continue these. Withdrawing ICS also carries its own risk for asthma patients. Practical tip: Spell out exactly what to hold and when — for both the patient and the PFT lab — at the time the test is ordered. Interpreting Results Negative Test (PC20 > 16 mg/mL) Very high negative predictive value in symptomatic adults. Makes current asthma quite unlikely (assuming proper test conduct). This is the test’s greatest strength: it is an excellent rule-out test. Positive Test (PC20 ≤ 8–16 mg/mL) More nuanced — airway hyperresponsiveness is not unique to asthma. Can be positive in: chronic cough, allergic rhinitis, COPD, and even some healthy asymptomatic individuals. A positive result raises probability but must be interpreted alongside the clinical story, variable respiratory symptoms, peak flow variability, FeNO, and ICS response. Safety and Risks Overall, the test is quite safe; significant adverse effects are rare. Temporary breathing discomfort is expected (bronchoconstriction is being induced). Severe bronchospasm is possible: A trained clinician should be available; SABA inhaler/nebulizer must be immediately on hand; a physician should be reachable in the facility. Contraindications / cautions: Avoid if FEV1 < 70% predicted or < 1–1.5 L (baseline obstruction greatly increases risk). Avoid within 3 months of an acute cardiac event (rare risk of cardiac events with unstable cardiac disease). Five Pearls — Quick Recap What it tests: Methacholine challenge is a direct test of AHR with high specificity but variable sensitivity — it belongs inside a diagnostic pathway, not as a standalone asthma test. When to use it: Most useful for symptomatic patients with normal spirometry and no bronchodilator response, after FeNO and peak flow variability have been considered. Technique and meds matter: Use tidal breathing protocol; respect washout intervals — especially the 7-day LAMA washout and 24–48 hour LABA window — to avoid false negatives. Safety: Generally safe, but can induce significant bronchoconstriction. Have a SABA available and avoid the test in patients with FEV1 < 70% predicted. Interpretation: A negative test (PC20 > 16 mg/mL) strongly argues against current asthma. A positive test raises probability but is not specific — interpret alongside the full clinical picture. References and Further Reading Coates AL, Wanger J, Cockcroft DW, Culver BH; Bronchoprovocation Testing Task Force: Kai-Håkon Carlsen; Diamant Z, Gauvreau G, Hall GL, Hallstrand TS, Horvath I, de Jongh FHC, Joos G, Kaminsky DA, Laube BL, Leuppi JD, Sterk PJ. ERS technical standard on bronchial challenge testing: general considerations and performance of methacholine challenge tests. Eur Respir J. 2017 May 1;49(5):1601526. doi: 10.1183/13993003.01526-2016. PMID: 28461290. Lee, J., & Song, J. U. (2021). Diagnostic comparison of methacholine and mannitol bronchial challenge tests for identifying bronchial hyperresponsiveness in asthma: a systematic review and meta-analysis. Journal of Asthma, 58(7), 883–891. https://doi.org/10.1080/02770903.2020.1739704 Davis BE, Blais CM, Cockcroft DW. Methacholine challenge testing: comparative pharmacology. J Asthma Allergy. 2018 May 14;11:89-99. doi: 10.2147/JAA.S160607. PMID: 29785128; PMCID: PMC5957064.
Você está sendo julgado pela sua imagem antes mesmo de abrir a boca.Neste episódio, a Dra. Najla Toledo, especialista em harmonização Full Face, revela como a estética se tornou o novo "dress code" do sucesso. Se você sente que seu rosto está "derretendo" ou que sua imagem não condiz com sua autoridade, você está perdendo o jogo do mercado.Esqueça os procedimentos exagerados. A Dra. Najla explica a ciência por trás da sustentação óssea e o impacto brutal que a harmonização natural traz para a confiança de empresários e líderes. Aprenda por que o "barato sai caro" no mundo do Botox e como o gerenciamento do envelhecimento pode salvar sua carreira.Disponível no Youtube:Link: https://youtu.be/HT9iYrLbtxI00:00:11 - Especialista em naturalidade: Dra. Najla Toledo.00:09:26 - A polêmica dos 14ML e a analogia do Ketchup.00:11:43 - Por que preencher o "bigode chinês" pode ser um erro.00:18:31 - Ácido Hialurônico vs. Bioestimuladores: qual a diferença?.00:31:04 - Famosos que deram errado: como evitar o rosto de balão.00:45:13 - Por que o seu Botox não dura? A ciência da dose.01:03:55 - O colapso inflamatório: como a dieta afeta seu rosto.01:19:05 - Condição Especial: 1 ML extra para seguidores Excepcionais.Siga a Dra. Najla no Instagram:https://www.instagram.com/dra.najlavicentini/Nos Siga:Marcelo Toledo: https://www.instagram.com/marcelotoledoInstagram: https://www.instagram.com/excepcionaispodcastTikTok: https://www.tiktok.com/@excepcionaispodcast
Blood vitamin D levels, not supplement dose, determine breast cancer risk, with studies showing roughly a 40% to 50% lower risk once levels rise into protective ranges Women who maintain blood vitamin D levels around 50 to 60 ng/mL experience the greatest protection, while levels below 20 ng/mL consistently link to higher and more aggressive breast cancer risk Large pooled analyses and clinical trials show breast cancer risk drops step by step as vitamin D levels increase, with no evidence of harm at higher physiological levels Sunlight, exercise, and metabolic health strongly influence how much vitamin D actually reaches and protects breast tissue, explaining why intake alone often falls short Addressing low vitamin D by combining sunlight, targeted supplementation, exercise, and metabolic support turns vitamin D into a measurable, trackable strategy for long-term breast cancer prevention
0:00 - It's time for a /ML&K tradition: Brett explains why you should be optimistic about the Rockies and their upcoming season. Except this time, he's tempering his enthusiasm. There's still a long road ahead, but it seems like the Rockies might actually maybe possibly potentially make some slight changes that set them in the right direction.17:16 - Should we be concerned about the Nuggets right now? Are there cracks starting to form in this (alleged) championship roster?30:52 - Oh, by the way...Danika Mason is an Australian sports reporter, and she's in Milan to cover the Olympics. She did a live hit on Australian TV while absolutely hammered, and it's fantastic.
Episode 213: HIV PrEP Review H. Nicole Magaña, medical student, reviews the history of PrEP and outlines the currently FDA-approved medications used for HIV prevention. Dr. Arreaza provides additional perspective on long-acting injectable options, including how quickly they begin to protect patients after initiation. Written by Nicole Magana, MSIV, American University of the Caribbean. Comments and edits by Hector Arreaza, MD. You are listening to Rio Bravo qWeek Podcast, your weekly dose of knowledge brought to you by the Rio Bravo Family Medicine Residency Program from Bakersfield, California, a UCLA-affiliated program sponsored by Clinica Sierra Vista, Let Us Be Your Healthcare Home. This podcast was created for educational purposes only. Visit your primary care provider for additional medical advice. Pre-exposure prophylaxis for HIV. Previous episodes related to HIV: -Episode 67, HIV history (September 2021) -Episode 68, HIV transmissibility (October 2021) -Episode 70 (October 2021), HIV prevention (including HIV Prep with oral medications) -Episode 98 (June 2022), we introduced Apretude, the first injectable for HIV PrEP. Apretude was approved in December 2021. What is Pre-Exposure prophylaxis (PrEP)? Pre-exposure prophylaxis, or PrEP, is the use of antiretroviral medications taken by individuals who are HIV-negative to prevent HIV acquisition. There are 30,000 new HIV infections annually in the US. How effective is it? When taken as prescribed, PrEP is highly effective at reducing the risk of HIV transmission through sexual exposure and injection drug use. Patients who are adherent to PrEP can lower their risk of contracting HIV by 99%. The effectiveness of oral PrEP is highly adherence dependent. In trials with 70% adherence, the relative risk of HIV acquisition was 0.27, compared to 0.51 with 40-70% adherence and no significant benefit with adherence ≤40%. How does PrEP work? PrEP works by maintaining therapeutic drug levels in the bloodstream and in target tissues. If HIV exposure occurs, viral replication is inhibited, preventing the establishment of infection. Brief History of PrEP. The concept of PrEP originated from early animal studies demonstrating that antiretroviral medications could prevent retroviral transmission when administered before exposure. In 2010, the iPrEx trial showed that daily oral tenofovir disoproxil fumarate (known as Truvada) with emtricitabine significantly reduced HIV acquisition among men who have sex with men and transgender women. This was the first large clinical trial to demonstrate the effectiveness of PrEP. In 2012, the FDA approved oral Truvada, which is TDF/FTC (tenofovir disoproxil and emtricitabine) for HIV prevention. Since then, additional studies have expanded indications and introduced new formulations, including long-acting injectable options. Who Should Be Offered PrEP? PrEP should be considered for any HIV-negative individual at increased risk of HIV acquisition, including Men who have sex with men, transgender individuals, heterosexual men and women with an HIV-positive partner, individuals with recent bacterial sexually transmitted infections, people who inject drugs, individuals engaging in condomless sex with partners of unknown HIV status. Remember that PrEP should be offered in a nonjudgmental, patient-centered manner, make it a safe space to talk openly about prevention of HIV. Available HIV PrEP Options. Daily Oral PrEP: There are 2 formulations of Tenofovir. There is Tenofovir disoproxil fumarate (TDF)/ Truvada and Tenofovir alafenamide (TAF)/ Descovy. Each is available in a tablet combined with Emtricitabine a nucleoside reverse transcriptase inhibitor. Truvada: It is approved for all populations at risk through sexual exposure or injection drug use. Something to look out for before starting this medication is for pre-existing CKD. Do not give to patients who have an estimated glomerular filtration rate of less than 60 mL/min. (6) Descovy: This option is approved for men who have sex with men and transgender women but is not approved for individuals at risk through receptive vaginal sex. It has less impact on renal function and bone mineral density compared to Truvada. It can be used in moderately reduced kidney function (GFR between 30-60 mL/min). Truvada and Descovy are taken orally once a day. After patients start taking these medications, when are they considered to be protected? Nicole: With daily oral PrEP, guidelines differ with WHO and International Aids Society-USA stating it takes about 7 days, while CDC states 21 days to allow for adequate concentration in tissues (1). Adherence is critical for efficacy. Injectable HIV PrEP. In 2021, the FDA approved the first Injectable PrEP option Long-acting cabotegravir (CAB-LA)- known on the market as Apretude. Cabotegravir is an integrase strand transfer inhibitor administered as an intramuscular injection.Dosing consists of an initial injection, a second injection one month later, and then maintenance injections every two months (1). Another option is Lenacapavir (Yeztugo). The Yeztugo as a pre-exposure prophylaxis (PrEP) for HIV in Oct 2024. Yeztugo is the first and only FDA-approved HIV prevention treatment that requires just two injections per year, offering a long-acting option for people who weigh at least 35kg. It is given as 2 injections every 6 months. First dose is given with 2 tablets on Day 1 and Day 2, then every 6 months 2 injections on the same day. Clinical trials, including HPTN 083 and HPTN 084, demonstrated that injectable cabotegravir is superior to daily oral PrEP in preventing HIV infection. This advantage is largely due to improved adherence rather than differences in intrinsic drug potency. There have been no head-to-head comparisons between Yeztugo and Apretude, but they are both very effective. Apretude starts protecting 7 days after the first dose, and Yeztugo starts protecting 2 hours after Day 2 (if patient takes the oral loading dose) or 3-4 weeks if no oral load is taken. Injectable PrEP is particularly beneficial for patients who struggle with daily pill adherence, have trouble swallowing pills, prefer a discreet option, have difficulty storing their medication or have renal or bone disease that limits the use of tenofovir-based regimens like Truvada and Descovy (6). In one unpublished report by Medline, patients who received Apretude had an increase in bone mineral density compared to those who received Truvada (1). Tests prior to starting PrEP. Before initiating PrEP, patients must be confirmed to be HIV-negative. Baseline evaluation includes HIV testing with a fourth-generation antigen/antibody assay, HIV RNA testing if acute infection is suspected, renal function testing for oral PrEP, Hepatitis B screening, sexually transmitted infection screening, and pregnancy testing when appropriate. PrEP should not be started in individuals with known or suspected acute HIV infection. Monitoring for patients on HIV PrEP. Monitoring typically includes HIV testing every 2 to 3 months, STI screening every 3 to 6 months, renal function monitoring for those on oral PrEP (tenofovir- based), ongoing adherence and risk-reduction counseling. And for injectable PrEP, adherence to the injection schedule is essential, as delayed dosing may increase the risk of resistance if HIV infection occurs. HIV PrEP is not a prevention for other STIs. Screening for STIs and counseling about prevention is essential. Breakthrough HIV infections on PrEP are rare and most often associated with poor adherence or delayed diagnosis. Truvada is more studied in all populations and is considered safe during pregnancy and breastfeeding. There is less data regarding the injectable option in patients who are pregnant, may become pregnant, or whose primary risk factor is injection drug use (1). Injectable PrEP provides an important alternative for patients with chronic kidney disease and bone disease (1). Key Takeaway Pre-exposure prophylaxis is a safe, effective, and evidence-based strategy for HIV prevention. With both daily oral and long-acting injectable options available, PrEP can be individualized to meet patient needs. Normalizing PrEP discussions in clinical practice is essential to reducing new HIV infections and advancing public health goals. Even without trying, every night you go to bed a little wiser. Thanks for listening to Rio Bravo qWeek Podcast. We want to hear from you, send us an email at RioBravoqWeek@clinicasierravista.org, or visit our website riobravofmrp.org/qweek. See you next week! References: Antiretroviral Drugs for Treatment and Prevention of HIV in Adults: 2024 Recommendations of the International Antiviral Society–USA Panel. The Journal of the American Medical Association. 2025. Gandhi RT, Landovitz RJ, Sax PE, et al. Long-Acting Lenacapavir Acts as an Effective Preexposure Prophylaxis in a Rectal SHIV Challenge Macaque Model. The Journal of Clinical Investigation. 2023. Bekerman E, Yant SR, VanderVeen L, et al. Pharmacokinetics and Safety of Once-Yearly Lenacapavir: A Phase 1, Open-Label Study. Lancet. 2025. Jogiraju V, Pawar P, Yager J, et al.
What does it take to design a programming language from scratch when the target isn't just CPUs, but GPUs, accelerators, and the entire AI stack? In this episode, I sit down with legendary language architect Chris Lattner to talk about Mojo — his ambitious attempt to rethink systems programming for the machine learning era. We trace the arc from LLVM and Clang to Swift and now Mojo, unpacking the lessons Chris has carried forward into this new language. Mojo aims to combine Python's ergonomics with C-level performance, but the real story is deeper: memory ownership, heterogeneous compute, compile-time metaprogramming, and giving developers precise control over how AI workloads hit silicon. Chris shares the motivation behind Modular, why today's AI infrastructure demands new abstractions, and how Mojo fits into a rapidly evolving ecosystem of ML frameworks and hardware backends. We also dig into developer experience, safety vs performance tradeoffs, and what it means to build a language that spans research notebooks all the way down to kernel-level execution.
What if your data platform could power both critical business decisions and real-time product features at scale? In this episode, host Benjamin sits down with Magnus Dahlbäck, Senior Director of Data and Platform at Voi, to explore how a metrics-first approach and semantic layers transform data accessibility, why traditional ML and LLMs require different strategies for different problems, and how to balance FinOps costs while processing billions of IoT events daily. Whether you're building data infrastructure for a high-growth company or rethinking how your organization consumes data, this conversation is packed with practical strategies for unlocking data value and preparing your platform for AI. Tune in to discover how Voi ditched traditional BI tools and revolutionized their approach to enterprise analytics.
00:00-25:00: What should we expect from the new Buffalo Bills' 3-4 defense? ML breaks it down from system to players and more. Thanks to Batavia Downs Gaming and CH Insurance. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.
Presidential candidate (?) Rahm Emanuel joins ML and Marc to talk about everything but his political ambitions. STRAIGHT DOPEWho's Rahm Emanuel, […]
00:00-20:00: ML says there is nothing to fear about the Yankees in 2026. Cashman, Boone, aging players and more. Run it back again to just crash in October. Thanks to Byrne Dairy and CH Insurance. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.
Rahul Raja is a Staff Software Engineer at LinkedIn, working on large-scale search infrastructure, information retrieval systems, and integrating AI/ML to improve ranking and semantic search experiences.The Future of Information Retrieval: From Dense Vectors to Cognitive Search // MLOps Podcast #362 with Rahul Raja, Staff Software Engineer at LinkedInJoin the Community: https://go.mlops.community/YTJoinInGet the newsletter: https://go.mlops.community/YTNewsletterMLOps GPU Guide: https://go.mlops.community/gpuguide// AbstractInformation Retrieval is evolving from keyword matching to intelligent, vector-based understanding. In this talk, Rahul Raja explores how dense retrieval, vector databases, and hybrid search systems are redefining how modern AI retrieves, ranks, and reasons over information. He discusses how retrieval now powers large language models through Retrieval-Augmented Generation (RAG) and the new MLOps challenges that arise, embedding drift, continuous evaluation, and large-scale vector maintenance.Looking ahead, the session envisions a future of Cognitive Search, where retrieval systems move beyond recall to genuine reasoning, contextual understanding, and multimodal awareness. Listeners will gain insight into how the next generation of retrieval will bridge semantics, scalability, and intelligence, powering everything from search and recommendations to generative AI.// BioRahul is a Staff Engineer at LinkedIn, where he focuses on search and deployment systems at scale. Rahul is a graduate from Carnegie Mellon University and has a strong background in building reliable, high-performance infrastructure. He has led many initiatives to improve search relevance and streamline ML deployment workflows.// Related LinksWebsite: https://www.linkedin.com/Coding Agents Conference: https://luma.com/codingagents~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExploreJoin our Slack community [https://go.mlops.community/slack]Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)] Sign up for the next meetup: [https://go.mlops.community/register]MLOps Swag/Merch: [https://shop.mlops.community/]Connect with Demetrios on LinkedIn: /dpbrinkmConnect with Rahul on LinkedIn: /rahulraja963/Timestamps:[00:00] Vector Search for Media[00:33] RAG and Search Evolution[04:45] Cognitive vs Semantic Search[08:26] High Value Search Signals[16:43] Scaling with Embeddings[22:37] BM25 Benchmark Bias[29:00] Video Search Use Cases[31:21] Context and Search Tradeoff[35:04] Personal Memory Augmentation[39:03] Future of Cognitive Search[44:51] Access Control in Vectors[49:14] Search Ranking Challenge[54:43] Hard Search Problems Solved[58:29] Freshness vs Cost[1:02:12] Wrap up
00:00-20:00: ML breaks down the Sabres. How they got here and are the playoffs happening, finally, in WNY? Thanks to Byrne Dairy and Western OTB. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.
In today's Cloud Wars Minute, I analyze the leadership shift at Workday and what it means in the age of agentic AI.Highlights00:00 — I want to talk about a change at the top of Workday. And I want to point out somebody who's been a real superstar in this business and that's Workday co-founder, former co-CEO, former CEO, chairman, executive chairman, resigned as CEO, now back in as CEO, Aneel Bhusri.01:13 — He was going to be the person that ran all the business, the operations. And Aneel said, "I can go back to what I truly love," which is developing products and strategy. Carl Eschenbach left about a week ago. The board asked Bhusri to step back in as CEO, and he's done that. So there's no question that Aneel Bhusri's first love is products and strategy.02:24 — He said, “Now, with Carl Eschenbach coming in a couple of years ago, now I can go do this stuff I really love around products and strategy.” It is this thing about never being trained to do it. He's on the board of directors at General Motors, a highly accomplished executive in a lot of ways. Aneel certainly doesn't need the money.03:13 — How does a company like Workday or Oracle or SAP or Salesforce balance those two things, the enterprise applications that brought them here, and the agentic AI that has to take them forward? Workday, several months ago, announced Workday ERP. From the outside, you've got SAP and Oracle always aggressively trying to go after Workday customers.03:59 — I want to mention about Aneel, the way he manages. He said, “I've sort of become”— this is when machine learning, ML, was really becoming hot — “I became the Pied Piper of Workday. I was just going around to all the different developers and engineering teams and just asking developers and engineering teams over and over and over again, what are you doing with ML?"04:56 — And now they've got two great president-level executives at Workday. Rob Enslin and Gerrit Kazmaier. I think it's very likely that about a year from now, Workday will announce that Bhusri is going to become co-CEO and elevate one of those two, Enslin or Kazmaier, to the co-CEO role with him. Visit Cloud Wars for more.
Episode 212: Managing HFpEFHyo Mun and Jordan Redden (medical students) explain how to manage HFpEF with medications and touch some basics about nonpharmacologic treatments. Dr. Arreaza asks insightful questions to guide the discussion. Written by Hyo Mun, MSIV, American University of the Caribbean; and Jordan Redden, MSIV, Ross University School of Medicine. Comments by Hector Arreaza, MD.You are listening to Rio Bravo qWeek Podcast, your weekly dose of knowledge brought to you by the Rio Bravo Family Medicine Residency Program from Bakersfield, California, a UCLA-affiliated program sponsored by Clinica Sierra Vista, Let Us Be Your Healthcare Home. This podcast was created for educational purposes only. Visit your primary care provider for additional medical advice.Treatment of HFpEFArreaza: Mike, if you had to name the one therapy everyone with HFpEF should be on, what is it?Mike: That's easy! SGLT-2 inhibitors. This is the one slam-dunk we have in HFpEF. Empagliflozin (Jardiance) or dapagliflozin (Farxiga) should be started in essentially every patient with HFpEF, and it doesn't matter if they have diabetes or not.Jordan: And that's worth repeating, because people still think of these as “diabetes drugs.” They're not anymore. In HFpEF, SGLT-2 inhibitors reduce heart-failure hospitalizations, improve symptoms, improve quality of life, and even reduce cardiovascular death.Dr. Arreaza: They're also simple. Empagliflozin 10 mg daily or dapagliflozin 10 mg daily. No titration, no drama. The effectiveness of these meds was established around 2019 with DAPA-HF and later with DELIVER. These were trials thatdemonstrated that dapagliflozin reduces worsening heart failure and cardiovascular events across the full spectrum of heart failure, from reduced to preserved ejection fraction, independent of diabetes status.Mike: And the number needed to treat is about 28 to prevent one heart-failure hospitalization. That's excellent for a disease where we historically had almost nothing that worked.Jordan: They're also safe in chronic kidney disease down to an eGFR of about 25, which makes them even more useful in this population.Dr. Arreaza: Alright. We got SGLT-2 inhibitor, what's next?Mike: Volume management. Loop diuretics are still the backbone of symptom control in HFpEF. If the patient is volume overloaded, you diurese, and you diurese aggressively.Jordan: The goal is euvolemia. Dry weight, no edema, no orthopnea, no waking up gasping for air. A lot of these patients end up needing chronic oral loop diuretics to stay there.Dr. Arreaza: Something to remember: HFpEF patients don't tolerate congestion well, and being “a little wet” is not benign. Let's move into RAAS inhibition. Where do ARBs and ACE inhibitors fit in?Mike: Between ARBs and ACE inhibitors, ARBs are the winners in HFpEF. They actually reduce heart failure hospitalizations—drugs like candesartan, losartan, valsartan. ACE inhibitors? Not so much. They showed minimal benefit in older HFpEF patients, which is why we go with ARBs instead.Jordan: But a lot of clinicians get nervous about ACE inhibitors and ARBs because of kidney function, so it's worth talking through how these drugs actually work in the kidney.Dr. Arreaza: Yes, misunderstanding may lead to unnecessary drug discontinuation.Jordan: Under normal conditions, the afferent arteriole brings blood into the glomerulus, and the efferent arteriole is constricted by angiotensin II. That constriction keeps pressure high in the glomerulus and maintains filtration.Mike: Here's what happens with an ACE inhibitor: you block angiotensin II, the efferent arteriole relaxes, glomerular pressure drops, and GFR dips slightly. Creatinine bumps up a little, and that scares people, but that's actually the whole point—that's how you get kidney protection long-term.Jordan: High intraglomerular pressure causes hyperfiltration injury and scarring over time. Lowering that pressure protects the kidney long-term. The short-term GFR drop is the price you pay for long-term benefits.Dr. Arreaza: So let's talk about CKD, because this is where people panic.Mike: Right. ACE inhibitors and ARBs are not contraindicated in chronic kidney disease. In fact, they're recommended even in advanced stages. They reduce progression to kidney failure by about a third.Jordan: The key is how you use them. Start low. Check creatinine and potassium one to two weeks after starting, then periodically. A creatinine rise up to 30% from baseline is acceptable. That's not kidney injury, that's physiology.Dr. Arreaza: And what about potassium creeping up?Mike: You adjust the dose or add a potassium binder. You don't just automatically stop the drug.Dr. Arreaza: Now there is one absolute contraindication everyone needs to know about! (board exam test)Jordan: Bilateral renal artery stenosis. This is the big one. In these patients, the kidneys are completely dependent on angiotensin II–mediated efferent constriction to maintain GFR. Take that away, and GFR collapses.Mike: Creatinine can jump dramatically within days. If you see a creatinine rise of 20% or more shortly after starting an ACE inhibitor, you should be thinking about bilateral renal artery stenosis and stopping the drug immediately.Dr. Arreaza: After revascularization, though, many patients can tolerate ACE inhibitors again, so this isn't always permanent. What about cardiorenal syndrome? That's where things get uncomfortable.Mike: It is uncomfortable, but cardiorenal syndrome isn't a contraindication. These patients have severe heart failure and kidney disease, and their mortality is actually higher than patients with heart failure alone.Jordan: ACE inhibitors still reduce mortality and slow kidney disease progression in this group. Studies show that stopping ACE inhibitors during acute heart-failure admissions increases in-hospital mortality three- to four-fold.Dr. Arreaza: So we are cautious, but we don't avoid it.Mike: Exactly. Start low, titrate slowly, monitor labs closely, accept up to a 30% creatinine rise. You only stop if kidney function keeps worsening, or potassium gets dangerously high.Dr. Arreaza: Alright. Let's move on. What about mineralocorticoid receptor antagonists… MRA?Jordan: Spironolactone or eplerenone might reduce hospitalizations in HFpEF, but the data is mixed. This is more of a “select patients” situation.Mike: And you have to watch potassium and kidney function carefully, especially if they're already on an ACE inhibitor or ARB.Dr. Arreaza: What about sacubitril-valsartan, also known as Entresto®?Mike: Entresto may help patients with mildly reduced EF roughly in the 45 to 57% range. It's not first-line for HFpEF, but in select patients, it's reasonable.Dr. Arreaza: Now let's clarify one of the biggest sources of confusion: beta blockers.Jordan: Beta blockers are not a treatment for HFpEF itself. They're only indicated if the patient has another reason to be on them, like coronary disease or atrial fibrillation.Mike: And timing really matters here. You absolutely do not start beta blockers during acute decompensated heart failure. Their negative inotropic effects can make things worse when patients are volume overloaded.Jordan: But, and this is critical, you also don't stop them if the patient is already taking one. Abrupt withdrawal causes a sympathetic surge and dramatically increases mortality.Dr. Arreaza: If a patient is admitted on a beta blocker, what do we do?Mike: Continue it at the same dose or reduce it slightly if they're really unstable. Once they're euvolemic and stable, you can carefully titrate up.Jordan: And watch for chronotropic incompetence. HFpEF patients often rely on heart-rate response to exercise, and beta blockers can worsen exercise intolerance.Dr. Arreaza: Beyond medications, HFpEF is really about treating comorbidities. Aerobic activity can be an initial strategy to improve exercise intolerance and has evidence of improving aerobic function and quality of life. Sodium restriction: improves symptoms, does not decrease risk of death or hospitalizations.Mike: Hypertension control is huge. For diabetes, the SGLT-2 inhibitors will perform double duty. For obesity, weight loss improves symptoms, and GLP-1 agonists like semaglutide are absolute gamechangers.Jordan: Don't forget sleep apnea, atrial fibrillation, and lifestyle. Exercise improves the quality of life, even if it doesn't change hard outcomes. Lifestyle is the main treatment. Dr. Arreaza: And when should you refer to cardiology?Mike: You should refer when the diagnosis isn't clear; symptoms are not responding to treatment, difficult volume management, end-organ dysfunction, or if you are concerned about advanced heart failure.Dr. Arreaza: So, it has been a great discussion. What is the takeaway?Mike: HFpEF treatment isn't about one magic drug -- it's about volume control, SGLT2 inhibitors, smart use of RAAS blockade, and aggressive management of comorbidities.Jordan: And it's understanding the physiology, so you don't withhold life-saving therapies out of fear.Dr. Arreaza: Well said. If you found this helpful, share it with a friend or colleague and rate us wherever you listen. This is Dr. Arreaza, signing off.Jordan/Mike: Thanks! Even without trying, every night you go to bed a little wiser. Thanks for listening to Rio Bravo qWeek Podcast. We want to hear from you, send us an email at RioBravoqWeek@clinicasierravista.org, or visit our website riobravofmrp.org/qweek. See you next week! _____________________References:Barzin A, Barnhouse KK, Kane SF. Heart Failure With Preserved Ejection Fraction. Am Fam Physician. 2025;112(4):435-440.Heidenreich PA, Bozkurt B, Aguilar D, et al. 2022 AHA/ACC/HFSA guideline for the management of heart failure. Circulation. 2022;145(18):e895-e1032.Kittleson MM, Panjrath GS, Amancherla K, et al. 2023 ACC expert consensus decision pathway on management of heart failure with preserved ejection fraction. J Am Coll Cardiol. 2023;81(18):1835-1878.Anker SD, Butler J, Filippatos G, et al. Empagliflozin in heart failure with a preserved ejection fraction. N Engl J Med. 2021;385(16):1451-1461.Solomon SD, McMurray JJV, Claggett B, et al. Dapagliflozin in heart failure with mildly reduced or preserved ejection fraction. N Engl J Med. 2022;387(12):1089-1098.Pitt B, Pfeffer MA, Assmann SF, et al. Spironolactone for heart failure with preserved ejection fraction. N Engl J Med. 2014;370(15):1383-1392.Yusuf S, Pfeffer MA, Swedberg K, et al. Effects of candesartan in patients with chronic heart failure and preserved left-ventricular ejection fraction. Lancet. 2003;362(9386):777-781.Solomon SD, McMurray JJV, Anand IS, et al. Angiotensin-neprilysin inhibition in heart failure with preserved ejection fraction. N Engl J Med. 2019;381(17):1609-1620.Kosiborod MN, Abildstrøm SZ, Borlaug BA, et al. Semaglutide in patients with heart failure with preserved ejection fraction and obesity. N Engl J Med. 2023;389(12):1069-1084.Xie Y, Xu E, Bowe B, Al-Aly Z. Long-term cardiovascular outcomes of COVID-19. Nat Med. 2022;28(3):583-590.Puntmann VO, Carerj ML, Wieters I, et al. Outcomes of cardiovascular magnetic resonance imaging in patients recently recovered from COVID-19. JAMA Cardiol. 2020;5(11):1265-1273.Basso C, Leone O, Rizzo S, et al. Pathological features of COVID-19-associated myocardial injury. Eur Heart J. 2020;41(39):3827-3835.Nalbandian A, Sehgal K, Gupta A, et al. Post-acute COVID-19 syndrome. Nat Med. 2021;27(4):601-615.Badve SV, Roberts MA, Hawley CM, et al. Effects of angiotensin-converting enzyme inhibitors and angiotensin receptor blockers in adults with estimated GFR less than 60 mL/min per 1.73 m². Ann Intern Med. 2024;177(8):953-963.Navis G, Faber HJ, de Zeeuw D, de Jong PE. ACE inhibitors and the kidney: a risk-benefit assessment. Drug Saf. 1996;15(3):200-211.Textor SC, Novick AC, Tarazi RC, et al. Critical perfusion pressure for renal function in patients with bilateral atherosclerotic renal vascular disease. Ann Intern Med. 1985;102(3):308-314.Hackam DG, Spence JD, Garg AX, Textor SC. Role of renin-angiotensin system blockade in atherosclerotic renal artery stenosis and renovascular hypertension. Hypertension. 2007;50(6):998-1003.Ronco C, Haapio M, House AA, et al. Cardiorenal syndrome. J Am Coll Cardiol. 2008;52(19):1527-1539.Prins KW, Neill JM, Tyler JO, et al. Effects of beta-blocker withdrawal in acute decompensated heart failure. JACC Heart Fail. 2015;3(8):647-653.Jondeau G, Neuder Y, Eicher JC, et al. B-CONVINCED: Beta-blocker CONtinuation Vs. INterruption in patients with Congestive heart failure hospitalizED for a decompensation episode. Eur Heart J. 2009;30(18):2186-2192.Theme song, Works All The Time by Dominik Schwarzer, YouTube ID: CUBDNERZU8HXUHBS, purchased from https://www.premiumbeat.com/.
Software Engineering Radio - The Podcast for Professional Software Developers
In this episode, Subhajit Paul joins SE Radio host Kanchan Shringi to discuss how enterprise resource planning (ERP) systems work in practice and where machine learning and generative AI are beginning to fit into real-world ERP environments. Subhajit grounds the conversation in ERP fundamentals, explaining core business flows such as order-to-cash, procure-to-pay, and plan-to-produce, and why ERP systems are central to running large enterprises. He then walks through the realities of ERP implementation, sharing examples of both successful and failed projects and highlighting common challenges around testing, process coverage, integrations, and change management. The discussion also explores how AI is being applied in ERP today, including practical ML use cases such as inventory optimization and anomaly detection, as well as emerging generative AI and agent-based approaches. Brought to you by IEEE Computer Society and IEEE Software magazine.
From rewriting Google's search stack in the early 2000s to reviving sparse trillion-parameter models and co-designing TPUs with frontier ML research, Jeff Dean has quietly shaped nearly every layer of the modern AI stack. As Chief AI Scientist at Google and a driving force behind Gemini, Jeff has lived through multiple scaling revolutions from CPUs and sharded indices to multimodal models that reason across text, video, and code.Jeff joins us to unpack what it really means to “own the Pareto frontier,” why distillation is the engine behind every Flash model breakthrough, how energy (in picojoules) not FLOPs is becoming the true bottleneck, what it was like leading the charge to unify all of Google's AI teams, and why the next leap won't come from bigger context windows alone, but from systems that give the illusion of attending to trillions of tokens.We discuss:* Jeff's early neural net thesis in 1990: parallel training before it was cool, why he believed scaling would win decades early, and the “bigger model, more data, better results” mantra that held for 15 years* The evolution of Google Search: sharding, moving the entire index into memory in 2001, softening query semantics pre-LLMs, and why retrieval pipelines already resemble modern LLM systems* Pareto frontier strategy: why you need both frontier “Pro” models and low-latency “Flash” models, and how distillation lets smaller models surpass prior generations* Distillation deep dive: ensembles → compression → logits as soft supervision, and why you need the biggest model to make the smallest one good* Latency as a first-class objective: why 10–50x lower latency changes UX entirely, and how future reasoning workloads will demand 10,000 tokens/sec* Energy-based thinking: picojoules per bit, why moving data costs 1000x more than a multiply, batching through the lens of energy, and speculative decoding as amortization* TPU co-design: predicting ML workloads 2–6 years out, speculative hardware features, precision reduction, sparsity, and the constant feedback loop between model architecture and silicon* Sparse models and “outrageously large” networks: trillions of parameters with 1–5% activation, and why sparsity was always the right abstraction* Unified vs. specialized models: abandoning symbolic systems, why general multimodal models tend to dominate vertical silos, and when vertical fine-tuning still makes sense* Long context and the illusion of scale: beyond needle-in-a-haystack benchmarks toward systems that narrow trillions of tokens to 117 relevant documents* Personalized AI: attending to your emails, photos, and documents (with permission), and why retrieval + reasoning will unlock deeply personal assistants* Coding agents: 50 AI interns, crisp specifications as a new core skill, and how ultra-low latency will reshape human–agent collaboration* Why ideas still matter: transformers, sparsity, RL, hardware, systems — scaling wasn't blind; the pieces had to multiply togetherShow Notes:* Gemma 3 Paper* Gemma 3* Gemini 2.5 Report* Jeff Dean's “Software Engineering Advice fromBuilding Large-Scale Distributed Systems” Presentation (with Back of the Envelope Calculations)* Latency Numbers Every Programmer Should Know by Jeff Dean* The Jeff Dean Facts* Jeff Dean Google Bio* Jeff Dean on “Important AI Trends” @Stanford AI Club* Jeff Dean & Noam Shazeer — 25 years at Google (Dwarkesh)—Jeff Dean* LinkedIn: https://www.linkedin.com/in/jeff-dean-8b212555* X: https://x.com/jeffdeanGoogle* https://google.com* https://deepmind.googleFull Video EpisodeTimestamps00:00:04 — Introduction: Alessio & Swyx welcome Jeff Dean, chief AI scientist at Google, to the Latent Space podcast00:00:30 — Owning the Pareto Frontier & balancing frontier vs low-latency models00:01:31 — Frontier models vs Flash models + role of distillation00:03:52 — History of distillation and its original motivation00:05:09 — Distillation's role in modern model scaling00:07:02 — Model hierarchy (Flash, Pro, Ultra) and distillation sources00:07:46 — Flash model economics & wide deployment00:08:10 — Latency importance for complex tasks00:09:19 — Saturation of some tasks and future frontier tasks00:11:26 — On benchmarks, public vs internal00:12:53 — Example long-context benchmarks & limitations00:15:01 — Long-context goals: attending to trillions of tokens00:16:26 — Realistic use cases beyond pure language00:18:04 — Multimodal reasoning and non-text modalities00:19:05 — Importance of vision & motion modalities00:20:11 — Video understanding example (extracting structured info)00:20:47 — Search ranking analogy for LLM retrieval00:23:08 — LLM representations vs keyword search00:24:06 — Early Google search evolution & in-memory index00:26:47 — Design principles for scalable systems00:28:55 — Real-time index updates & recrawl strategies00:30:06 — Classic “Latency numbers every programmer should know”00:32:09 — Cost of memory vs compute and energy emphasis00:34:33 — TPUs & hardware trade-offs for serving models00:35:57 — TPU design decisions & co-design with ML00:38:06 — Adapting model architecture to hardware00:39:50 — Alternatives: energy-based models, speculative decoding00:42:21 — Open research directions: complex workflows, RL00:44:56 — Non-verifiable RL domains & model evaluation00:46:13 — Transition away from symbolic systems toward unified LLMs00:47:59 — Unified models vs specialized ones00:50:38 — Knowledge vs reasoning & retrieval + reasoning00:52:24 — Vertical model specialization & modules00:55:21 — Token count considerations for vertical domains00:56:09 — Low resource languages & contextual learning00:59:22 — Origins: Dean's early neural network work01:10:07 — AI for coding & human–model interaction styles01:15:52 — Importance of crisp specification for coding agents01:19:23 — Prediction: personalized models & state retrieval01:22:36 — Token-per-second targets (10k+) and reasoning throughput01:23:20 — Episode conclusion and thanksTranscriptAlessio Fanelli [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, founder of Kernel Labs, and I'm joined by Swyx, editor of Latent Space. Shawn Wang [00:00:11]: Hello, hello. We're here in the studio with Jeff Dean, chief AI scientist at Google. Welcome. Thanks for having me. It's a bit surreal to have you in the studio. I've watched so many of your talks, and obviously your career has been super legendary. So, I mean, congrats. I think the first thing must be said, congrats on owning the Pareto Frontier.Jeff Dean [00:00:30]: Thank you, thank you. Pareto Frontiers are good. It's good to be out there.Shawn Wang [00:00:34]: Yeah, I mean, I think it's a combination of both. You have to own the Pareto Frontier. You have to have like frontier capability, but also efficiency, and then offer that range of models that people like to use. And, you know, some part of this was started because of your hardware work. Some part of that is your model work, and I'm sure there's lots of secret sauce that you guys have worked on cumulatively. But, like, it's really impressive to see it all come together in, like, this slittily advanced.Jeff Dean [00:01:04]: Yeah, yeah. I mean, I think, as you say, it's not just one thing. It's like a whole bunch of things up and down the stack. And, you know, all of those really combine to help make UNOS able to make highly capable large models, as well as, you know, software techniques to get those large model capabilities into much smaller, lighter weight models that are, you know, much more cost effective and lower latency, but still, you know, quite capable for their size. Yeah.Alessio Fanelli [00:01:31]: How much pressure do you have on, like, having the lower bound of the Pareto Frontier, too? I think, like, the new labs are always trying to push the top performance frontier because they need to raise more money and all of that. And you guys have billions of users. And I think initially when you worked on the CPU, you were thinking about, you know, if everybody that used Google, we use the voice model for, like, three minutes a day, they were like, you need to double your CPU number. Like, what's that discussion today at Google? Like, how do you prioritize frontier versus, like, we have to do this? How do we actually need to deploy it if we build it?Jeff Dean [00:02:03]: Yeah, I mean, I think we always want to have models that are at the frontier or pushing the frontier because I think that's where you see what capabilities now exist that didn't exist at the sort of slightly less capable last year's version or last six months ago version. At the same time, you know, we know those are going to be really useful for a bunch of use cases, but they're going to be a bit slower and a bit more expensive than people might like for a bunch of other broader models. So I think what we want to do is always have kind of a highly capable sort of affordable model that enables a whole bunch of, you know, lower latency use cases. People can use them for agentic coding much more readily and then have the high-end, you know, frontier model that is really useful for, you know, deep reasoning, you know, solving really complicated math problems, those kinds of things. And it's not that. One or the other is useful. They're both useful. So I think we'd like to do both. And also, you know, through distillation, which is a key technique for making the smaller models more capable, you know, you have to have the frontier model in order to then distill it into your smaller model. So it's not like an either or choice. You sort of need that in order to actually get a highly capable, more modest size model. Yeah.Alessio Fanelli [00:03:24]: I mean, you and Jeffrey came up with the solution in 2014.Jeff Dean [00:03:28]: Don't forget, L'Oreal Vinyls as well. Yeah, yeah.Alessio Fanelli [00:03:30]: A long time ago. But like, I'm curious how you think about the cycle of these ideas, even like, you know, sparse models and, you know, how do you reevaluate them? How do you think about in the next generation of model, what is worth revisiting? Like, yeah, they're just kind of like, you know, you worked on so many ideas that end up being influential, but like in the moment, they might not feel that way necessarily. Yeah.Jeff Dean [00:03:52]: I mean, I think distillation was originally motivated because we were seeing that we had a very large image data set at the time, you know, 300 million images that we could train on. And we were seeing that if you create specialists for different subsets of those image categories, you know, this one's going to be really good at sort of mammals, and this one's going to be really good at sort of indoor room scenes or whatever, and you can cluster those categories and train on an enriched stream of data after you do pre-training on a much broader set of images. You get much better performance. If you then treat that whole set of maybe 50 models you've trained as a large ensemble, but that's not a very practical thing to serve, right? So distillation really came about from the idea of, okay, what if we want to actually serve that and train all these independent sort of expert models and then squish it into something that actually fits in a form factor that you can actually serve? And that's, you know, not that different from what we're doing today. You know, often today we're instead of having an ensemble of 50 models. We're having a much larger scale model that we then distill into a much smaller scale model.Shawn Wang [00:05:09]: Yeah. A part of me also wonders if distillation also has a story with the RL revolution. So let me maybe try to articulate what I mean by that, which is you can, RL basically spikes models in a certain part of the distribution. And then you have to sort of, well, you can spike models, but usually sometimes... It might be lossy in other areas and it's kind of like an uneven technique, but you can probably distill it back and you can, I think that the sort of general dream is to be able to advance capabilities without regressing on anything else. And I think like that, that whole capability merging without loss, I feel like it's like, you know, some part of that should be a distillation process, but I can't quite articulate it. I haven't seen much papers about it.Jeff Dean [00:06:01]: Yeah, I mean, I tend to think of one of the key advantages of distillation is that you can have a much smaller model and you can have a very large, you know, training data set and you can get utility out of making many passes over that data set because you're now getting the logits from the much larger model in order to sort of coax the right behavior out of the smaller model that you wouldn't otherwise get with just the hard labels. And so, you know, I think that's what we've observed. Is you can get, you know, very close to your largest model performance with distillation approaches. And that seems to be, you know, a nice sweet spot for a lot of people because it enables us to kind of, for multiple Gemini generations now, we've been able to make the sort of flash version of the next generation as good or even substantially better than the previous generations pro. And I think we're going to keep trying to do that because that seems like a good trend to follow.Shawn Wang [00:07:02]: So, Dara asked, so it was the original map was Flash Pro and Ultra. Are you just sitting on Ultra and distilling from that? Is that like the mother load?Jeff Dean [00:07:12]: I mean, we have a lot of different kinds of models. Some are internal ones that are not necessarily meant to be released or served. Some are, you know, our pro scale model and we can distill from that as well into our Flash scale model. So I think, you know, it's an important set of capabilities to have and also inference time scaling. It can also be a useful thing to improve the capabilities of the model.Shawn Wang [00:07:35]: And yeah, yeah, cool. Yeah. And obviously, I think the economy of Flash is what led to the total dominance. I think the latest number is like 50 trillion tokens. I don't know. I mean, obviously, it's changing every day.Jeff Dean [00:07:46]: Yeah, yeah. But, you know, by market share, hopefully up.Shawn Wang [00:07:50]: No, I mean, there's no I mean, there's just the economics wise, like because Flash is so economical, like you can use it for everything. Like it's in Gmail now. It's in YouTube. Like it's yeah. It's in everything.Jeff Dean [00:08:02]: We're using it more in our search products of various AI mode reviews.Shawn Wang [00:08:05]: Oh, my God. Flash past the AI mode. Oh, my God. Yeah, that's yeah, I didn't even think about that.Jeff Dean [00:08:10]: I mean, I think one of the things that is quite nice about the Flash model is not only is it more affordable, it's also a lower latency. And I think latency is actually a pretty important characteristic for these models because we're going to want models to do much more complicated things that are going to involve, you know, generating many more tokens from when you ask the model to do so. So, you know, if you're going to ask the model to do something until it actually finishes what you ask it to do, because you're going to ask now, not just write me a for loop, but like write me a whole software package to do X or Y or Z. And so having low latency systems that can do that seems really important. And Flash is one direction, one way of doing that. You know, obviously our hardware platforms enable a bunch of interesting aspects of our, you know, serving stack as well, like TPUs, the interconnect between. Chips on the TPUs is actually quite, quite high performance and quite amenable to, for example, long context kind of attention operations, you know, having sparse models with lots of experts. These kinds of things really, really matter a lot in terms of how do you make them servable at scale.Alessio Fanelli [00:09:19]: Yeah. Does it feel like there's some breaking point for like the proto Flash distillation, kind of like one generation delayed? I almost think about almost like the capability as a. In certain tasks, like the pro model today is a saturated, some sort of task. So next generation, that same task will be saturated at the Flash price point. And I think for most of the things that people use models for at some point, the Flash model in two generation will be able to do basically everything. And how do you make it economical to like keep pushing the pro frontier when a lot of the population will be okay with the Flash model? I'm curious how you think about that.Jeff Dean [00:09:59]: I mean, I think that's true. If your distribution of what people are asking people, the models to do is stationary, right? But I think what often happens is as the models become more capable, people ask them to do more, right? So, I mean, I think this happens in my own usage. Like I used to try our models a year ago for some sort of coding task, and it was okay at some simpler things, but wouldn't do work very well for more complicated things. And since then, we've improved dramatically on the more complicated coding tasks. And now I'll ask it to do much more complicated things. And I think that's true, not just of coding, but of, you know, now, you know, can you analyze all the, you know, renewable energy deployments in the world and give me a report on solar panel deployment or whatever. That's a very complicated, you know, more complicated task than people would have asked a year ago. And so you are going to want more capable models to push the frontier in the absence of what people ask the models to do. And that also then gives us. Insight into, okay, where does the, where do things break down? How can we improve the model in these, these particular areas, uh, in order to sort of, um, make the next generation even better.Alessio Fanelli [00:11:11]: Yeah. Are there any benchmarks or like test sets they use internally? Because it's almost like the same benchmarks get reported every time. And it's like, all right, it's like 99 instead of 97. Like, how do you have to keep pushing the team internally to it? Or like, this is what we're building towards. Yeah.Jeff Dean [00:11:26]: I mean, I think. Benchmarks, particularly external ones that are publicly available. Have their utility, but they often kind of have a lifespan of utility where they're introduced and maybe they're quite hard for current models. You know, I, I like to think of the best kinds of benchmarks are ones where the initial scores are like 10 to 20 or 30%, maybe, but not higher. And then you can sort of work on improving that capability for, uh, whatever it is, the benchmark is trying to assess and get it up to like 80, 90%, whatever. I, I think once it hits kind of 95% or something, you get very diminishing returns from really focusing on that benchmark, cuz it's sort of, it's either the case that you've now achieved that capability, or there's also the issue of leakage in public data or very related kind of data being, being in your training data. Um, so we have a bunch of held out internal benchmarks that we really look at where we know that wasn't represented in the training data at all. There are capabilities that we want the model to have. Um, yeah. Yeah. Um, that it doesn't have now, and then we can work on, you know, assessing, you know, how do we make the model better at these kinds of things? Is it, we need different kind of data to train on that's more specialized for this particular kind of task. Do we need, um, you know, a bunch of, uh, you know, architectural improvements or some sort of, uh, model capability improvements, you know, what would help make that better?Shawn Wang [00:12:53]: Is there, is there such an example that you, uh, a benchmark inspired in architectural improvement? Like, uh, I'm just kind of. Jumping on that because you just.Jeff Dean [00:13:02]: Uh, I mean, I think some of the long context capability of the, of the Gemini models that came, I guess, first in 1.5 really were about looking at, okay, we want to have, um, you know,Shawn Wang [00:13:15]: immediately everyone jumped to like completely green charts of like, everyone had, I was like, how did everyone crack this at the same time? Right. Yeah. Yeah.Jeff Dean [00:13:23]: I mean, I think, um, and once you're set, I mean, as you say that needed single needle and a half. Hey, stack benchmark is really saturated for at least context links up to 1, 2 and K or something. Don't actually have, you know, much larger than 1, 2 and 8 K these days or two or something. We're trying to push the frontier of 1 million or 2 million context, which is good because I think there are a lot of use cases where. Yeah. You know, putting a thousand pages of text or putting, you know, multiple hour long videos and the context and then actually being able to make use of that as useful. Try to, to explore the über graduation are fairly large. But the single needle in a haystack benchmark is sort of saturated. So you really want more complicated, sort of multi-needle or more realistic, take all this content and produce this kind of answer from a long context that sort of better assesses what it is people really want to do with long context. Which is not just, you know, can you tell me the product number for this particular thing?Shawn Wang [00:14:31]: Yeah, it's retrieval. It's retrieval within machine learning. It's interesting because I think the more meta level I'm trying to operate at here is you have a benchmark. You're like, okay, I see the architectural thing I need to do in order to go fix that. But should you do it? Because sometimes that's an inductive bias, basically. It's what Jason Wei, who used to work at Google, would say. Exactly the kind of thing. Yeah, you're going to win. Short term. Longer term, I don't know if that's going to scale. You might have to undo that.Jeff Dean [00:15:01]: I mean, I like to sort of not focus on exactly what solution we're going to derive, but what capability would you want? And I think we're very convinced that, you know, long context is useful, but it's way too short today. Right? Like, I think what you would really want is, can I attend to the internet while I answer my question? Right? But that's not going to happen. I think that's going to be solved by purely scaling the existing solutions, which are quadratic. So a million tokens kind of pushes what you can do. You're not going to do that to a trillion tokens, let alone, you know, a billion tokens, let alone a trillion. But I think if you could give the illusion that you can attend to trillions of tokens, that would be amazing. You'd find all kinds of uses for that. You would have attend to the internet. You could attend to the pixels of YouTube and the sort of deeper representations that we can find. You could attend to the form for a single video, but across many videos, you know, on a personal Gemini level, you could attend to all of your personal state with your permission. So like your emails, your photos, your docs, your plane tickets you have. I think that would be really, really useful. And the question is, how do you get algorithmic improvements and system level improvements that get you to something where you actually can attend to trillions of tokens? Right. In a meaningful way. Yeah.Shawn Wang [00:16:26]: But by the way, I think I did some math and it's like, if you spoke all day, every day for eight hours a day, you only generate a maximum of like a hundred K tokens, which like very comfortably fits.Jeff Dean [00:16:38]: Right. But if you then say, okay, I want to be able to understand everything people are putting on videos.Shawn Wang [00:16:46]: Well, also, I think that the classic example is you start going beyond language into like proteins and whatever else is extremely information dense. Yeah. Yeah.Jeff Dean [00:16:55]: I mean, I think one of the things about Gemini's multimodal aspects is we've always wanted it to be multimodal from the start. And so, you know, that sometimes to people means text and images and video sort of human-like and audio, audio, human-like modalities. But I think it's also really useful to have Gemini know about non-human modalities. Yeah. Like LIDAR sensor data from. Yes. Say, Waymo vehicles or. Like robots or, you know, various kinds of health modalities, x-rays and MRIs and imaging and genomics information. And I think there's probably hundreds of modalities of data where you'd like the model to be able to at least be exposed to the fact that this is an interesting modality and has certain meaning in the world. Where even if you haven't trained on all the LIDAR data or MRI data, you could have, because maybe that's not, you know, it doesn't make sense in terms of trade-offs of. You know, what you include in your main pre-training data mix, at least including a little bit of it is actually quite useful. Yeah. Because it sort of tempts the model that this is a thing.Shawn Wang [00:18:04]: Yeah. Do you believe, I mean, since we're on this topic and something I just get to ask you all the questions I always wanted to ask, which is fantastic. Like, are there some king modalities, like modalities that supersede all the other modalities? So a simple example was Vision can, on a pixel level, encode text. And DeepSeq had this DeepSeq CR paper that did that. Vision. And Vision has also been shown to maybe incorporate audio because you can do audio spectrograms and that's, that's also like a Vision capable thing. Like, so, so maybe Vision is just the king modality and like. Yeah.Jeff Dean [00:18:36]: I mean, Vision and Motion are quite important things, right? Motion. Well, like video as opposed to static images, because I mean, there's a reason evolution has evolved eyes like 23 independent ways, because it's such a useful capability for sensing the world around you, which is really what we want these models to be. So I think the only thing that we can be able to do is interpret the things we're seeing or the things we're paying attention to and then help us in using that information to do things. Yeah.Shawn Wang [00:19:05]: I think motion, you know, I still want to shout out, I think Gemini, still the only native video understanding model that's out there. So I use it for YouTube all the time. Nice.Jeff Dean [00:19:15]: Yeah. Yeah. I mean, it's actually, I think people kind of are not necessarily aware of what the Gemini models can actually do. Yeah. Like I have an example I've used in one of my talks. It had like, it was like a YouTube highlight video of 18 memorable sports moments across the last 20 years or something. So it has like Michael Jordan hitting some jump shot at the end of the finals and, you know, some soccer goals and things like that. And you can literally just give it the video and say, can you please make me a table of what all these different events are? What when the date is when they happened? And a short description. And so you get like now an 18 row table of that information extracted from the video, which is, you know, not something most people think of as like a turn video into sequel like table.Alessio Fanelli [00:20:11]: Has there been any discussion inside of Google of like, you mentioned tending to the whole internet, right? Google, it's almost built because a human cannot tend to the whole internet and you need some sort of ranking to find what you need. Yep. That ranking is like much different for an LLM because you can expect a person to look at maybe the first five, six links in a Google search versus for an LLM. Should you expect to have 20 links that are highly relevant? Like how do you internally figure out, you know, how do we build the AI mode that is like maybe like much broader search and span versus like the more human one? Yeah.Jeff Dean [00:20:47]: I mean, I think even pre-language model based work, you know, our ranking systems would be built to start. I mean, I think even pre-language model based work, you know, our ranking systems would be built to start. With a giant number of web pages in our index, many of them are not relevant. So you identify a subset of them that are relevant with very lightweight kinds of methods. You know, you're down to like 30,000 documents or something. And then you gradually refine that to apply more and more sophisticated algorithms and more and more sophisticated sort of signals of various kinds in order to get down to ultimately what you show, which is, you know, the final 10 results or, you know, 10 results plus. Other kinds of information. And I think an LLM based system is not going to be that dissimilar, right? You're going to attend to trillions of tokens, but you're going to want to identify, you know, what are the 30,000 ish documents that are with the, you know, maybe 30 million interesting tokens. And then how do you go from that into what are the 117 documents I really should be paying attention to in order to carry out the tasks that the user has asked? And I think, you know, you can imagine systems where you have, you know, a lot of highly parallel processing to identify those initial 30,000 candidates, maybe with very lightweight kinds of models. Then you have some system that sort of helps you narrow down from 30,000 to the 117 with maybe a little bit more sophisticated model or set of models. And then maybe the final model is the thing that looks. So the 117 things that might be your most capable model. So I think it has to, it's going to be some system like that, that is really enables you to give the illusion of attending to trillions of tokens. Sort of the way Google search gives you, you know, not the illusion, but you are searching the internet, but you're finding, you know, a very small subset of things that are, that are relevant.Shawn Wang [00:22:47]: Yeah. I often tell a lot of people that are not steeped in like Google search history that, well, you know, like Bert was. Like he was like basically immediately inside of Google search and that improves results a lot, right? Like I don't, I don't have any numbers off the top of my head, but like, I'm sure you guys, that's obviously the most important numbers to Google. Yeah.Jeff Dean [00:23:08]: I mean, I think going to an LLM based representation of text and words and so on enables you to get out of the explicit hard notion of, of particular words having to be on the page, but really getting at the notion of this topic of this page or this page. Paragraph is highly relevant to this query. Yeah.Shawn Wang [00:23:28]: I don't think people understand how much LLMs have taken over all these very high traffic system, very high traffic. Yeah. Like it's Google, it's YouTube. YouTube has this like semantics ID thing where it's just like every token or every item in the vocab is a YouTube video or something that predicts the video using a code book, which is absurd to me for YouTube size.Jeff Dean [00:23:50]: And then most recently GROK also for, for XAI, which is like, yeah. I mean, I'll call out even before LLMs were used extensively in search, we put a lot of emphasis on softening the notion of what the user actually entered into the query.Shawn Wang [00:24:06]: So do you have like a history of like, what's the progression? Oh yeah.Jeff Dean [00:24:09]: I mean, I actually gave a talk in, uh, I guess, uh, web search and data mining conference in 2009, uh, where we never actually published any papers about the origins of Google search, uh, sort of, but we went through sort of four or five or six. generations, four or five or six generations of, uh, redesigning of the search and retrieval system, uh, from about 1999 through 2004 or five. And that talk is really about that evolution. And one of the things that really happened in 2001 was we were sort of working to scale the system in multiple dimensions. So one is we wanted to make our index bigger, so we could retrieve from a larger index, which always helps your quality in general. Uh, because if you don't have the page in your index, you're going to not do well. Um, and then we also needed to scale our capacity because we were, our traffic was growing quite extensively. Um, and so we had, you know, a sharded system where you have more and more shards as the index grows, you have like 30 shards. And then if you want to double the index size, you make 60 shards so that you can bound the latency by which you respond for any particular user query. Um, and then as traffic grows, you add, you add more and more replicas of each of those. And so we eventually did the math that realized that in a data center where we had say 60 shards and, um, you know, 20 copies of each shard, we now had 1200 machines, uh, with disks. And we did the math and we're like, Hey, one copy of that index would actually fit in memory across 1200 machines. So in 2001, we introduced, uh, we put our entire index in memory and what that enabled from a quality perspective was amazing. Um, and so we had more and more replicas of each of those. Before you had to be really careful about, you know, how many different terms you looked at for a query, because every one of them would involve a disk seek on every one of the 60 shards. And so you, as you make your index bigger, that becomes even more inefficient. But once you have the whole index in memory, it's totally fine to have 50 terms you throw into the query from the user's original three or four word query, because now you can add synonyms like restaurant and restaurants and cafe and, uh, you know, things like that. Uh, bistro and all these things. And you can suddenly start, uh, sort of really, uh, getting at the meaning of the word as opposed to the exact semantic form the user typed in. And that was, you know, 2001, very much pre LLM, but really it was about softening the, the strict definition of what the user typed in order to get at the meaning.Alessio Fanelli [00:26:47]: What are like principles that you use to like design the systems, especially when you have, I mean, in 2001, the internet is like. Doubling, tripling every year in size is not like, uh, you know, and I think today you kind of see that with LLMs too, where like every year the jumps in size and like capabilities are just so big. Are there just any, you know, principles that you use to like, think about this? Yeah.Jeff Dean [00:27:08]: I mean, I think, uh, you know, first, whenever you're designing a system, you want to understand what are the sort of design parameters that are going to be most important in designing that, you know? So, you know, how many queries per second do you need to handle? How big is the internet? How big is the index you need to handle? How much data do you need to keep for every document in the index? How are you going to look at it when you retrieve things? Um, what happens if traffic were to double or triple, you know, will that system work well? And I think a good design principle is you're going to want to design a system so that the most important characteristics could scale by like factors of five or 10, but probably not beyond that because often what happens is if you design a system for X. And something suddenly becomes a hundred X, that would enable a very different point in the design space that would not make sense at X. But all of a sudden at a hundred X makes total sense. So like going from a disk space index to a in memory index makes a lot of sense once you have enough traffic, because now you have enough replicas of the sort of state on disk that those machines now actually can hold, uh, you know, a full copy of the, uh, index and memory. Yeah. And that all of a sudden enabled. A completely different design that wouldn't have been practical before. Yeah. Um, so I'm, I'm a big fan of thinking through designs in your head, just kind of playing with the design space a little before you actually do a lot of writing of code. But, you know, as you said, in the early days of Google, we were growing the index, uh, quite extensively. We were growing the update rate of the index. So the update rate actually is the parameter that changed the most. Surprising. So it used to be once a month.Shawn Wang [00:28:55]: Yeah.Jeff Dean [00:28:56]: And then we went to a system that could update any particular page in like sub one minute. Okay.Shawn Wang [00:29:02]: Yeah. Because this is a competitive advantage, right?Jeff Dean [00:29:04]: Because all of a sudden news related queries, you know, if you're, if you've got last month's news index, it's not actually that useful for.Shawn Wang [00:29:11]: News is a special beast. Was there any, like you could have split it onto a separate system.Jeff Dean [00:29:15]: Well, we did. We launched a Google news product, but you also want news related queries that people type into the main index to also be sort of updated.Shawn Wang [00:29:23]: So, yeah, it's interesting. And then you have to like classify whether the page is, you have to decide which pages should be updated and what frequency. Oh yeah.Jeff Dean [00:29:30]: There's a whole like, uh, system behind the scenes that's trying to decide update rates and importance of the pages. So even if the update rate seems low, you might still want to recrawl important pages quite often because, uh, the likelihood they change might be low, but the value of having updated is high.Shawn Wang [00:29:50]: Yeah, yeah, yeah, yeah. Uh, well, you know, yeah. This, uh, you know, mention of latency and, and saving things to this reminds me of one of your classics, which I have to bring up, which is latency numbers. Every programmer should know, uh, was there a, was it just a, just a general story behind that? Did you like just write it down?Jeff Dean [00:30:06]: I mean, this has like sort of eight or 10 different kinds of metrics that are like, how long does a cache mistake? How long does branch mispredict take? How long does a reference domain memory take? How long does it take to send, you know, a packet from the U S to the Netherlands or something? Um,Shawn Wang [00:30:21]: why Netherlands, by the way, or is it, is that because of Chrome?Jeff Dean [00:30:25]: Uh, we had a data center in the Netherlands, um, so, I mean, I think this gets to the point of being able to do the back of the envelope calculations. So these are sort of the raw ingredients of those, and you can use them to say, okay, well, if I need to design a system to do image search and thumb nailing or something of the result page, you know, how, what I do that I could pre-compute the image thumbnails. I could like. Try to thumbnail them on the fly from the larger images. What would that do? How much dis bandwidth than I need? How many des seeks would I do? Um, and you can sort of actually do thought experiments in, you know, 30 seconds or a minute with the sort of, uh, basic, uh, basic numbers at your fingertips. Uh, and then as you sort of build software using higher level libraries, you kind of want to develop the same intuitions for how long does it take to, you know, look up something in this particular kind of.Shawn Wang [00:31:21]: I'll see you next time.Shawn Wang [00:31:51]: Which is a simple byte conversion. That's nothing interesting. I wonder if you have any, if you were to update your...Jeff Dean [00:31:58]: I mean, I think it's really good to think about calculations you're doing in a model, either for training or inference.Jeff Dean [00:32:09]: Often a good way to view that is how much state will you need to bring in from memory, either like on-chip SRAM or HBM from the accelerator. Attached memory or DRAM or over the network. And then how expensive is that data motion relative to the cost of, say, an actual multiply in the matrix multiply unit? And that cost is actually really, really low, right? Because it's order, depending on your precision, I think it's like sub one picodule.Shawn Wang [00:32:50]: Oh, okay. You measure it by energy. Yeah. Yeah.Jeff Dean [00:32:52]: Yeah. I mean, it's all going to be about energy and how do you make the most energy efficient system. And then moving data from the SRAM on the other side of the chip, not even off the off chip, but on the other side of the same chip can be, you know, a thousand picodules. Oh, yeah. And so all of a sudden, this is why your accelerators require batching. Because if you move, like, say, the parameter of a model from SRAM on the, on the chip into the multiplier unit, that's going to cost you a thousand picodules. So you better make use of that, that thing that you moved many, many times with. So that's where the batch dimension comes in. Because all of a sudden, you know, if you have a batch of 256 or something, that's not so bad. But if you have a batch of one, that's really not good.Shawn Wang [00:33:40]: Yeah. Yeah. Right.Jeff Dean [00:33:41]: Because then you paid a thousand picodules in order to do your one picodule multiply.Shawn Wang [00:33:46]: I have never heard an energy-based analysis of batching.Jeff Dean [00:33:50]: Yeah. I mean, that's why people batch. Yeah. Ideally, you'd like to use batch size one because the latency would be great.Shawn Wang [00:33:56]: The best latency.Jeff Dean [00:33:56]: But the energy cost and the compute cost inefficiency that you get is quite large. So, yeah.Shawn Wang [00:34:04]: Is there a similar trick like, like, like you did with, you know, putting everything in memory? Like, you know, I think obviously NVIDIA has caused a lot of waves with betting very hard on SRAM with Grok. I wonder if, like, that's something that you already saw with, with the TPUs, right? Like that, that you had to. Uh, to serve at your scale, uh, you probably sort of saw that coming. Like what, what, what hardware, uh, innovations or insights were formed because of what you're seeing there?Jeff Dean [00:34:33]: Yeah. I mean, I think, you know, TPUs have this nice, uh, sort of regular structure of 2D or 3D meshes with a bunch of chips connected. Yeah. And each one of those has HBM attached. Um, I think for serving some kinds of models, uh, you know, you, you pay a lot higher cost. Uh, and time latency, um, bringing things in from HBM than you do bringing them in from, uh, SRAM on the chip. So if you have a small enough model, you can actually do model parallelism, spread it out over lots of chips and you actually get quite good throughput improvements and latency improvements from doing that. And so you're now sort of striping your smallish scale model over say 16 or 64 chips. Uh, but as if you do that and it all fits in. In SRAM, uh, that can be a big win. So yeah, that's not a surprise, but it is a good technique.Alessio Fanelli [00:35:27]: Yeah. What about the TPU design? Like how much do you decide where the improvements have to go? So like, this is like a good example of like, is there a way to bring the thousand picojoules down to 50? Like, is it worth designing a new chip to do that? The extreme is like when people say, oh, you should burn the model on the ASIC and that's kind of like the most extreme thing. How much of it? Is it worth doing an hardware when things change so quickly? Like what was the internal discussion? Yeah.Jeff Dean [00:35:57]: I mean, we, we have a lot of interaction between say the TPU chip design architecture team and the sort of higher level modeling, uh, experts, because you really want to take advantage of being able to co-design what should future TPUs look like based on where we think the sort of ML research puck is going, uh, in some sense, because, uh, you know, as a hardware designer for ML and in particular, you're trying to design a chip starting today and that design might take two years before it even lands in a data center. And then it has to sort of be a reasonable lifetime of the chip to take you three, four or five years. So you're trying to predict two to six years out where, what ML computations will people want to run two to six years out in a very fast changing field. And so having people with interest. Interesting ML research ideas of things we think will start to work in that timeframe or will be more important in that timeframe, uh, really enables us to then get, you know, interesting hardware features put into, you know, TPU N plus two, where TPU N is what we have today.Shawn Wang [00:37:10]: Oh, the cycle time is plus two.Jeff Dean [00:37:12]: Roughly. Wow. Because, uh, I mean, sometimes you can squeeze some changes into N plus one, but, you know, bigger changes are going to require the chip. Yeah. Design be earlier in its lifetime design process. Um, so whenever we can do that, it's generally good. And sometimes you can put in speculative features that maybe won't cost you much chip area, but if it works out, it would make something, you know, 10 times as fast. And if it doesn't work out, well, you burned a little bit of tiny amount of your chip area on that thing, but it's not that big a deal. Uh, sometimes it's a very big change and we want to be pretty sure this is going to work out. So we'll do like lots of carefulness. Uh, ML experimentation to show us, uh, this is actually the, the way we want to go. Yeah.Alessio Fanelli [00:37:58]: Is there a reverse of like, we already committed to this chip design so we can not take the model architecture that way because it doesn't quite fit?Jeff Dean [00:38:06]: Yeah. I mean, you, you definitely have things where you're going to adapt what the model architecture looks like so that they're efficient on the chips that you're going to have for both training and inference of that, of that, uh, generation of model. So I think it kind of goes both ways. Um, you know, sometimes you can take advantage of, you know, lower precision things that are coming in a future generation. So you can, might train it at that lower precision, even if the current generation doesn't quite do that. Mm.Shawn Wang [00:38:40]: Yeah. How low can we go in precision?Jeff Dean [00:38:43]: Because people are saying like ternary is like, uh, yeah, I mean, I'm a big fan of very low precision because I think that gets, that saves you a tremendous amount of time. Right. Because it's picojoules per bit that you're transferring and reducing the number of bits is a really good way to, to reduce that. Um, you know, I think people have gotten a lot of luck, uh, mileage out of having very low bit precision things, but then having scaling factors that apply to a whole bunch of, uh, those, those weights. Scaling. How does it, how does it, okay.Shawn Wang [00:39:15]: Interesting. You, so low, low precision, but scaled up weights. Yeah. Huh. Yeah. Never considered that. Yeah. Interesting. Uh, w w while we're on this topic, you know, I think there's a lot of, um, uh, this, the concept of precision at all is weird when we're sampling, you know, uh, we just, at the end of this, we're going to have all these like chips that I'll do like very good math. And then we're just going to throw a random number generator at the start. So, I mean, there's a movement towards, uh, energy based, uh, models and processors. I'm just curious if you've, obviously you've thought about it, but like, what's your commentary?Jeff Dean [00:39:50]: Yeah. I mean, I think. There's a bunch of interesting trends though. Energy based models is one, you know, diffusion based models, which don't sort of sequentially decode tokens is another, um, you know, speculative decoding is a way that you can get sort of an equivalent, very small.Shawn Wang [00:40:06]: Draft.Jeff Dean [00:40:07]: Batch factor, uh, for like you predict eight tokens out and that enables you to sort of increase the effective batch size of what you're doing by a factor of eight, even, and then you maybe accept five or six of those tokens. So you get. A five, a five X improvement in the amortization of moving weights, uh, into the multipliers to do the prediction for the, the tokens. So these are all really good techniques and I think it's really good to look at them from the lens of, uh, energy, real energy, not energy based models, um, and, and also latency and throughput, right? If you look at things from that lens, that sort of guides you to. Two solutions that are gonna be, uh, you know, better from, uh, you know, being able to serve larger models or, you know, equivalent size models more cheaply and with lower latency.Shawn Wang [00:41:03]: Yeah. Well, I think, I think I, um, it's appealing intellectually, uh, haven't seen it like really hit the mainstream, but, um, I do think that, uh, there's some poetry in the sense that, uh, you know, we don't have to do, uh, a lot of shenanigans if like we fundamentally. Design it into the hardware. Yeah, yeah.Jeff Dean [00:41:23]: I mean, I think there's still a, there's also sort of the more exotic things like analog based, uh, uh, computing substrates as opposed to digital ones. Uh, I'm, you know, I think those are super interesting cause they can be potentially low power. Uh, but I think you often end up wanting to interface that with digital systems and you end up losing a lot of the power advantages in the digital to analog and analog to digital conversions. You end up doing, uh, at the sort of boundaries. And periphery of that system. Um, I still think there's a tremendous distance we can go from where we are today in terms of energy efficiency with sort of, uh, much better and specialized hardware for the models we care about.Shawn Wang [00:42:05]: Yeah.Alessio Fanelli [00:42:06]: Um, any other interesting research ideas that you've seen, or like maybe things that you cannot pursue a Google that you would be interested in seeing researchers take a step at, I guess you have a lot of researchers. Yeah, I guess you have enough, but our, our research.Jeff Dean [00:42:21]: Our research portfolio is pretty broad. I would say, um, I mean, I think, uh, in terms of research directions, there's a whole bunch of, uh, you know, open problems and how do you make these models reliable and able to do much longer, kind of, uh, more complex tasks that have lots of subtasks. How do you orchestrate, you know, maybe one model that's using other models as tools in order to sort of build, uh, things that can accomplish, uh, you know, much more. Yeah. Significant pieces of work, uh, collectively, then you would ask a single model to do. Um, so that's super interesting. How do you get more verifiable, uh, you know, how do you get RL to work for non-verifiable domains? I think it's a pretty interesting open problem because I think that would broaden out the capabilities of the models, the improvements that you're seeing in both math and coding. Uh, if we could apply those to other less verifiable domains, because we've come up with RL techniques that actually enable us to do that. Uh, effectively, that would, that would really make the models improve quite a lot. I think.Alessio Fanelli [00:43:26]: I'm curious, like when we had Noam Brown on the podcast, he said, um, they already proved you can do it with deep research. Um, you kind of have it with AI mode in a way it's not verifiable. I'm curious if there's any thread that you think is interesting there. Like what is it? Both are like information retrieval of JSON. So I wonder if it's like the retrieval is like the verifiable part. That you can score or what are like, yeah, yeah. How, how would you model that, that problem?Jeff Dean [00:43:55]: Yeah. I mean, I think there are ways of having other models that can evaluate the results of what a first model did, maybe even retrieving. Can you have another model that says, is this things, are these things you retrieved relevant? Or can you rate these 2000 things you retrieved to assess which ones are the 50 most relevant or something? Um, I think those kinds of techniques are actually quite effective. Sometimes I can even be the same model, just prompted differently to be a, you know, a critic as opposed to a, uh, actual retrieval system. Yeah.Shawn Wang [00:44:28]: Um, I do think like there, there is that, that weird cliff where like, it feels like we've done the easy stuff and then now it's, but it always feels like that every year. It's like, oh, like we know, we know, and the next part is super hard and nobody's figured it out. And, uh, exactly with this RLVR thing where like everyone's talking about, well, okay, how do we. the next stage of the non-verifiable stuff. And everyone's like, I don't know, you know, Ellen judge.Jeff Dean [00:44:56]: I mean, I feel like the nice thing about this field is there's lots and lots of smart people thinking about creative solutions to some of the problems that we all see. Uh, because I think everyone sort of sees that the models, you know, are great at some things and they fall down around the edges of those things and, and are not as capable as we'd like in those areas. And then coming up with good techniques and trying those. And seeing which ones actually make a difference is sort of what the whole research aspect of this field is, is pushing forward. And I think that's why it's super interesting. You know, if you think about two years ago, we were struggling with GSM, eight K problems, right? Like, you know, Fred has two rabbits. He gets three more rabbits. How many rabbits does he have? That's a pretty far cry from the kinds of mathematics that the models can, and now you're doing IMO and Erdos problems in pure language. Yeah. Yeah. Pure language. So that is a really, really amazing jump in capabilities in, you know, in a year and a half or something. And I think, um, for other areas, it'd be great if we could make that kind of leap. Uh, and you know, we don't exactly see how to do it for some, some areas, but we do see it for some other areas and we're going to work hard on making that better. Yeah.Shawn Wang [00:46:13]: Yeah.Alessio Fanelli [00:46:14]: Like YouTube thumbnail generation. That would be very helpful. We need that. That would be AGI. We need that.Shawn Wang [00:46:20]: That would be. As far as content creators go.Jeff Dean [00:46:22]: I guess I'm not a YouTube creator, so I don't care that much about that problem, but I guess, uh, many people do.Shawn Wang [00:46:27]: It does. Yeah. It doesn't, it doesn't matter. People do judge books by their covers as it turns out. Um, uh, just to draw a bit on the IMO goal. Um, I'm still not over the fact that a year ago we had alpha proof and alpha geometry and all those things. And then this year we were like, screw that we'll just chuck it into Gemini. Yeah. What's your reflection? Like, I think this, this question about. Like the merger of like symbolic systems and like, and, and LMS, uh, was a very much core belief. And then somewhere along the line, people would just said, Nope, we'll just all do it in the LLM.Jeff Dean [00:47:02]: Yeah. I mean, I think it makes a lot of sense to me because, you know, humans manipulate symbols, but we probably don't have like a symbolic representation in our heads. Right. We have some distributed representation that is neural net, like in some way of lots of different neurons. And activation patterns firing when we see certain things and that enables us to reason and plan and, you know, do chains of thought and, you know, roll them back now that, that approach for solving the problem doesn't seem like it's going to work. I'm going to try this one. And, you know, in a lot of ways we're emulating what we intuitively think, uh, is happening inside real brains in neural net based models. So it never made sense to me to have like completely separate. Uh, discrete, uh, symbolic things, and then a completely different way of, of, uh, you know, thinking about those things.Shawn Wang [00:47:59]: Interesting. Yeah. Uh, I mean, it's maybe seems obvious to you, but it wasn't obvious to me a year ago. Yeah.Jeff Dean [00:48:06]: I mean, I do think like that IMO with, you know, translating to lean and using lean and then the next year and also a specialized geometry model. And then this year switching to a single unified model. That is roughly the production model with a little bit more inference budget, uh, is actually, you know, quite good because it shows you that the capabilities of that general model have improved dramatically and, and now you don't need the specialized model. This is actually sort of very similar to the 2013 to 16 era of machine learning, right? Like it used to be, people would train separate models for lots of different, each different problem, right? I have, I want to recognize street signs and something. So I train a street sign. Recognition recognition model, or I want to, you know, decode speech recognition. I have a speech model, right? I think now the era of unified models that do everything is really upon us. And the question is how well do those models generalize to new things they've never been asked to do and they're getting better and better.Shawn Wang [00:49:10]: And you don't need domain experts. Like one of my, uh, so I interviewed ETA who was on, who was on that team. Uh, and he was like, yeah, I, I don't know how they work. I don't know where the IMO competition was held. I don't know the rules of it. I just trained the models, the training models. Yeah. Yeah. And it's kind of interesting that like people with these, this like universal skill set of just like machine learning, you just give them data and give them enough compute and they can kind of tackle any task, which is the bitter lesson, I guess. I don't know. Yeah.Jeff Dean [00:49:39]: I mean, I think, uh, general models, uh, will win out over specialized ones in most cases.Shawn Wang [00:49:45]: Uh, so I want to push there a bit. I think there's one hole here, which is like, uh. There's this concept of like, uh, maybe capacity of a model, like abstractly a model can only contain the number of bits that it has. And, uh, and so it, you know, God knows like Gemini pro is like one to 10 trillion parameters. We don't know, but, uh, the Gemma models, for example, right? Like a lot of people want like the open source local models that are like that, that, that, and, and, uh, they have some knowledge, which is not necessary, right? Like they can't know everything like, like you have the. The luxury of you have the big model and big model should be able to capable of everything. But like when, when you're distilling and you're going down to the small models, you know, you're actually memorizing things that are not useful. Yeah. And so like, how do we, I guess, do we want to extract that? Can we, can we divorce knowledge from reasoning, you know?Jeff Dean [00:50:38]: Yeah. I mean, I think you do want the model to be most effective at reasoning if it can retrieve things, right? Because having the model devote precious parameter space. To remembering obscure facts that could be looked up is actually not the best use of that parameter space, right? Like you might prefer something that is more generally useful in more settings than this obscure fact that it has. Um, so I think that's always attention at the same time. You also don't want your model to be kind of completely detached from, you know, knowing stuff about the world, right? Like it's probably useful to know how long the golden gate be. Bridges just as a general sense of like how long are bridges, right? And, uh, it should have that kind of knowledge. It maybe doesn't need to know how long some teeny little bridge in some other more obscure part of the world is, but, uh, it does help it to have a fair bit of world knowledge and the bigger your model is, the more you can have. Uh, but I do think combining retrieval with sort of reasoning and making the model really good at doing multiple stages of retrieval. Yeah.Shawn Wang [00:51:49]: And reasoning through the intermediate retrieval results is going to be a, a pretty effective way of making the model seem much more capable, because if you think about, say, a personal Gemini, yeah, right?Jeff Dean [00:52:01]: Like we're not going to train Gemini on my email. Probably we'd rather have a single model that, uh, we can then use and use being able to retrieve from my email as a tool and have the model reason about it and retrieve from my photos or whatever, uh, and then make use of that and have multiple. Um, you know, uh, stages of interaction. that makes sense.Alessio Fanelli [00:52:24]: Do you think the vertical models are like, uh, interesting pursuit? Like when people are like, oh, we're building the best healthcare LLM, we're building the best law LLM, are those kind of like short-term stopgaps or?Jeff Dean [00:52:37]: No, I mean, I think, I think vertical models are interesting. Like you want them to start from a pretty good base model, but then you can sort of, uh, sort of viewing them, view them as enriching the data. Data distribution for that particular vertical domain for healthcare, say, um, we're probably not going to train or for say robotics. We're probably not going to train Gemini on all possible robotics data. We, you could train it on because we want it to have a balanced set of capabilities. Um, so we'll expose it to some robotics data, but if you're trying to build a really, really good robotics model, you're going to want to start with that and then train it on more robotics data. And then maybe that would. It's multilingual translation capability, but improve its robotics capabilities. And we're always making these kind of, uh, you know, trade-offs in the data mix that we train the base Gemini models on. You know, we'd love to include data from 200 more languages and as much data as we have for those languages, but that's going to displace some other capabilities of the model. It won't be as good at, um, you know, Pearl programming, you know, it'll still be good at Python programming. Cause we'll include it. Enough. Of that, but there's other long tail computer languages or coding capabilities that it may suffer on or multi, uh, multimodal reasoning capabilities may suffer. Cause we didn't get to expose it to as much data there, but it's really good at multilingual things. So I, I think some combination of specialized models, maybe more modular models. So it'd be nice to have the capability to have those 200 languages, plus this awesome robotics model, plus this awesome healthcare, uh, module that all can be knitted together to work in concert and called upon in different circumstances. Right? Like if I have a health related thing, then it should enable using this health module in conjunction with the main base model to be even better at those kinds of things. Yeah.Shawn Wang [00:54:36]: Installable knowledge. Yeah.Jeff Dean [00:54:37]: Right.Shawn Wang [00:54:38]: Just download as a, as a package.Jeff Dean [00:54:39]: And some of that installable stuff can come from retrieval, but some of it probably should come from preloaded training on, you know, uh, a hundred billion tokens or a trillion tokens of health data. Yeah.Shawn Wang [00:54:51]: And for listeners, I think, uh, I will highlight the Gemma three end paper where they, there was a little bit of that, I think. Yeah.Alessio Fanelli [00:54:56]: Yeah. I guess the question is like, how many billions of tokens do you need to outpace the frontier model improvements? You know, it's like, if I have to make this model better healthcare and the main. Gemini model is still improving. Do I need 50 billion tokens? Can I do it with a hundred, if I need a trillion healthcare tokens, it's like, they're probably not out there that you don't have, you know, I think that's really like the.Jeff Dean [00:55:21]: Well, I mean, I think healthcare is a particularly challenging domain, so there's a lot of healthcare data that, you know, we don't have access to appropriately, but there's a lot of, you know, uh, healthcare organizations that want to train models on their own data. That is not public healthcare data, uh, not public health. But public healthcare data. Um, so I think there are opportunities there to say, partner with a large healthcare organization and train models for their use that are going to be, you know, more bespoke, but probably, uh, might be better than a general model trained on say, public data. Yeah.Shawn Wang [00:55:58]: Yeah. I, I believe, uh, by the way, also this is like somewhat related to the language conversation. Uh, I think one of your, your favorite examples was you can put a low resource language in the context and it just learns. Yeah.Jeff Dean [00:56:09]: Oh, yeah, I think the example we used was Calamon, which is truly low resource because it's only spoken by, I think 120 people in the world and there's no written text.Shawn Wang [00:56:20]: So, yeah. So you can just do it that way. Just put it in the context. Yeah. Yeah. But I think your whole data set in the context, right.Jeff Dean [00:56:27]: If you, if you take a language like, uh, you know, Somali or something, there is a fair bit of Somali text in the world that, uh, or Ethiopian Amharic or something, um, you know, we probably. Yeah. Are not putting all the data from those languages into the Gemini based training. We put some of it, but if you put more of it, you'll improve the capabilities of those models.Shawn Wang [00:56:49]: Yeah.Jeff Dean [00:56:49]:
In this episode of Data in Biotech, host Ross Katz sits down with James Yoder, Founder and CEO of OpenBench, to unpack a radical new approach to early-stage drug discovery. James shares how OpenBench's "success-driven" model shifts risk away from biotech partners by only charging for validated hits. They dive deep into computational screening, molecular modeling, and the company's evolving tech stack that's making hit discovery smarter and more accessible. Discover how data, AI, and strategic collaboration are redefining biotech R&D. What you'll learn in this episode: >> Why OpenBench moved away from SaaS to a success-based service model >> How their computational platform predicts binding affinity and screens trillions of compounds >> The role of data flywheels and ML in improving drug discovery success rates >> Real-world case studies from biotech collaborations >> How OpenBench evaluates druggable targets in one week Meet our guest James Yoder is the Founder and CEO of OpenBench. With a background in statistics, data science, and applied machine learning, he leads OpenBench's mission to deliver validated drug discovery hits through computational innovation and a success-driven business model. About the host Ross Katz is Principal and Data Science Lead at CorrDyn. Ross specializes in building intelligent data systems that empower biotech and healthcare organizations to extract insights and drive innovation. Connect with our guest: Sponsor: CorrDyn, a data consultancyConnect with James Yoder on LinkedIn Connect with us: Follow the podcast for more insightful discussions on the latest in biotech and data science.Subscribe and leave a review if you enjoyed this episode!Connect with Ross Katz on LinkedIn Sponsored by… This episode is brought to you by CorrDyn, the leader in data-driven solutions for biotech and healthcare. Discover how CorrDyn is helping organizations turn data into breakthroughs at CorrDyn.
What looks like a novelty on the shelf can be a very real business when the fundamentals are right.In this episode of Business of Drinks, we sit down with John King, co-founder and owner of The Original Pickle Shot, to unpack how a bartender-born ritual turned into a nationally scaled spirits brand.The numbers tell the story. The Original Pickle Shot is now selling roughly 110K 9-liter cases annually, growing ~15% year over year, and ranks as the 10th largest flavored vodka in the U.S. — all without outside investment. What many assume is a niche product is, in reality, a high-velocity business driven by occasion, community, and repeat purchase.John walks through what product-market fit actually looked like for the brand — not hype or marketing spend, but watching depletions rise organically as consumers pulled the product through retail. Early success came off-premise first, with 50 mL bottles driving trial and 750 mLs becoming the fastest-growing format as the brand earned its place in party and tailgate occasions.For founders, this episode is a candid look at the trade-offs of staying self-funded. John shares how reinvesting every dollar back into the business forced discipline around expansion, prevented “false volume,” and slowed state rollouts until the company had the operational backbone to support them. The cost: Years of personal sacrifice and saying no to capital. The benefit: Control, speed of decision-making, and sustainable velocity.Distributors and retailers will appreciate John's clear-eyed take on partnerships — why beer vs. spirits houses matter less than alignment on expectations and margins — and how fun, irreverent brands still need hard data to win shelf space.If you're building, selling, or scaling a drinks brand and want a grounded example of how a so-called niche becomes a category leader, this conversation delivers real-world lessons.For the latest updates, follow us:Business of Drinks:YouTubeLinkedInInstagram @bizofdrinksErica Duecy, co-host: Erica Duecy is founder and co-host of Business of Drinks and one of the drinks industry's most accomplished digital and content strategists. She runs the consultancy and advisory arm of Business of Drinks and has built publishing and marketing programs for Drizly, VinePair, SevenFifty, and other hospitality and drinks tech companies.LinkedInInstagram @ericaduecyScott Rosenbaum, co-host: Scott Rosenbaum is co-host of Business of Drinks and a veteran strategist and analyst with deep experience building drinks portfolios. Most recently, he was the Portfolio Development Director at Distill Ventures. Prior to that, he was the Vice President of T. Edward Wines & Spirits, a New York-based importer and distributor.LinkedInCaroline Lamb, contributor: Caroline is a producer and on-air contributor at Business of Drinks and a key account sales and marketing specialist at AHD Vintners, a Michigan-based importer and distributor.LinkedInInstagram @borkalineIf you enjoyed today's conversation, follow Business of Drinks wherever you're listening, and don't forget to rate and review us. Your support helps us reach new listeners passionate about the drinks industry. Thank you!
Erika Erickson is BACK! ML is, too, but from Quebec City. Marc is always here. ALWAYS. STRAIGHT DOPE Erika and […]
Astronomer's Steven Hillion reveals how OpenAI, Anthropic, Uber, and Lyft use Apache Airflow to orchestrate AI and machine learning pipelines at scale on AWS.Topics Include:Steven Hillion leads data and AI at AstronomerApache Airflow surpassed Spark and Kafka in community metricsAstronomer coordinates data flow like conductor orchestrating instrumental platformsOrganizations with data engineering teams use Airflow at scaleCustomers already used Airflow for ML before official promotionUber and Lyft orchestrate pricing models using AirflowAstronomer runs on AWS with close integration partnershipsOpenAI Anthropic and GitHub Copilot use Airflow for operationsInternal data team uses Airflow creating feedback loopsEvolved from constrained AI reports to agentic workflowsPlatform monitors generative AI output quality at user interactionsMetadata and context increasingly critical for AI applicationsLearn more at Astronomer's Data FlowCast podcastParticipants:Steven Hillion – SVP, Data and AI, AstronomerSee how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon.com/isv/
This week on DanceSpeak, I sit down with Brian 'Footwork' Green, a master teacher and influential figure in street and club dance culture whose impact spans generations. Recorded live in August 2025, this episode captures Brian's unfiltered thoughts on musicality, lineage, and what often gets misunderstood about street dance. We explore competition versus convention culture, the realities of the dance economy, and the difference between who you are and the artistic name you move under. Brian speaks honestly about off-beat dancing, “auto-tuned” movement, teaching, trends, and what gets lost when dance drifts away from the heart. The conversation also touches on race, representation, and identity in dance spaces—layered, nuanced, and rooted in lived experience rather than soundbites. Insightful, funny, challenging, and deeply grounded in culture, this episode is for dancers who love dance enough to think about it, question it, and keep it alive. Instagram – https://www.instagram.com/gogalit Website – https://www.gogalit.com/ Fit From Home – https://galit-s-school-0397.thinkific.com/courses/fit-from-home You can connect with Brian on Instagram https://www.instagram.com/brianfootworkgreen/. You can purchase Brian's on-line dance classes https://www.theybarelyunderstandhello.com/#classes.
Charlie Langton, Jennifer Hammond and ML are HACKED! Hear all about their troubles with X. STRAIGHT DOPE Soulmates already know, […]