My fellow pro-growth/progress/abundance Up Wingers,

Nuclear fission is a safe, powerful, and reliable means of generating nearly limitless clean energy to power the modern world. A few public safety scares and a lot of bad press over the past half-century have greatly delayed our nuclear future. But with climate change and energy-hungry AI making daily headlines, the time — finally — for a nuclear renaissance seems to have arrived.

Today on Faster, Please! — The Podcast, I talk with Dr. Tim Gregory about the safety and efficacy of modern nuclear power, as well as the ambitious energy goals we should set for our society.

Gregory is a nuclear scientist at the UK National Nuclear Laboratory. He is also a popular science broadcaster on radio and TV, and an author. His most recent book, Going Nuclear: How Atomic Energy Will Save the World, is out now.

In This Episode

* A false start for a nuclear future (1:29)
* Motivators for a revival (7:20)
* About nuclear waste . . . (12:41)
* Not your mother's reactors (17:25)
* Commercial fusion, coming soon . . . ? (23:06)

Below is a lightly edited transcript of our conversation.

A false start for a nuclear future (1:29)

The truth is that radiation, we're living in it all the time, it's completely inescapable because we're all living in a sea of background radiation.

Pethokoukis: Why don't America, Europe, and Japan today get most of their power from nuclear fission? That would've been a very reasonable prediction to make in 1965 or 1975, but it has not worked out that way. What's your best take on why it hasn't?

Going back to the '50s and '60s, it looked like that was the world we would be living in by now. It was all to play for, and there were a few reasons why that didn't happen, but the main two were Three Mile Island and Chernobyl. It's a startling statistic that the US built more nuclear reactors in the five years leading up to Three Mile Island than it has built since.
And similarly on this side of the Atlantic, Europe built more nuclear reactors in the five years leading up to Chernobyl than it has built since, which is just astounding, especially given that nobody died at Three Mile Island, and nobody was even exposed to anything beyond background radiation as a result of that nuclear accident.

Chernobyl, of course, was far more consequential and far more serious than Three Mile Island. Thirty-odd people died in the immediate aftermath, mostly people who were working at the power station and the first responders, famously the firefighters who were exposed to massive amounts of radiation. And probably a couple of hundred people died in the affected population from thyroid cancer, mostly people who were children and adolescents at the time of the accident.

So although every death from Chernobyl was a tragedy because it was avoidable, the numbers are not in proportion to the mythic reputation of the night in question. It certainly wasn't reason to effectively end nuclear power expansion in Europe, because of course we had to get that power from somewhere, and it mainly came from fossil fuels, which are not just a little bit more deadly than nuclear power, they're orders of magnitude more deadly than nuclear power. When you add up all of the deaths from nuclear power and compare them to the amount of electricity that we harvest from nuclear power, it's actually as safe as wind and solar, whereas fossil fuels kill hundreds or thousands of times more people per unit of power. To answer your question, it's complicated and there are many answers, but the main two were Three Mile Island and Chernobyl.

I wonder how things might have unfolded if those events hadn't happened, or if society had responded proportionally to the actual damage.
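To put rough numbers on that safety comparison: mortality-per-energy estimates put fossil fuels orders of magnitude above nuclear. The figures below are illustrative ballpark values roughly in line with published deaths-per-terawatt-hour analyses (such as those popularized by Our World in Data), not numbers from this conversation; the sketch just divides deaths by electricity generated.

```python
# Illustrative deaths per terawatt-hour of electricity, by source.
# These are ballpark assumptions for illustration, not data from
# this interview.
DEATHS_PER_TWH = {
    "coal": 24.6,     # accidents plus air pollution
    "oil": 18.4,
    "gas": 2.8,
    "nuclear": 0.03,  # includes Chernobyl and Fukushima
    "wind": 0.04,
    "solar": 0.02,
}

def relative_risk(a: str, b: str) -> float:
    """How many times deadlier source `a` is than source `b`, per TWh."""
    return DEATHS_PER_TWH[a] / DEATHS_PER_TWH[b]

print(f"coal is ~{relative_risk('coal', 'nuclear'):.0f}x deadlier than nuclear")
print(f"gas is ~{relative_risk('gas', 'nuclear'):.0f}x deadlier than nuclear")
```

On these assumed figures, coal comes out hundreds of times deadlier than nuclear per unit of electricity, and nuclear sits in the same range as wind and solar, which is the comparison being made above.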
Three Mile Island and Chernobyl are portrayed in documentaries and on TV as far deadlier than they really were, and they still loom large in the public imagination in a really unhelpful way.

You see it online, actually, quite a lot, about the predicted death toll from Chernobyl, because, of course, there's no way of saying exactly which cases of cancer were caused by Chernobyl and which ones would've happened anyway. Sometimes you see estimates up in the tens of thousands, even hundreds of thousands, of deaths from Chernobyl. They are always based on a flawed scientific hypothesis called the linear no-threshold model, which I go into in quite some detail in chapter eight of my book, which is all about the human health effects of exposure to radiation. This model is very contested in the literature. The effect of radiation on the human body is actually one of the most controversial areas of medical science, and all of these massive numbers you see for the death toll from Chernobyl are based on this really clunky, flawed, contentious hypothesis. My reading of the literature is that there's very, very little physical evidence to support this particular hypothesis, but people take it and run. I don't know if it would be going too far to accuse people of pushing a certain idea of Chernobyl, but it almost certainly vastly, vastly overestimates the effects.

I think a large part of the reason why this had such a massive impact on the public and politicians is this lingering sense of radiophobia that completely blights society. We've all seen it in the movies, in TV shows, even in music and computer games: radiation is constantly used as a tool to invoke fear and mistrust. It's this invisible, scentless, silent specter that's always there in the background. It means birth defects, it means cancers, it means ill health.
We've all grown up in this culture where the motif of radiation is bad news and danger, and that inevitably gets tied to people's sense of nuclear power. So when you get something like Three Mile Island, society's imagination and its preconceptions about radiation are like a dry haystack waiting for a spark to land on it: up it goes in flames, and people's imaginations run away with them.

The truth is that we're living in radiation all the time. It's completely inescapable, because we're all living in a sea of background radiation. There's this amazing statistic that if you live within a couple of miles of a nuclear power station, the extra amount of radiation you're exposed to annually is about the same as from eating a banana. Bananas are slightly radioactive because of the small amount of potassium-40 that they naturally contain. Even in the wake of nuclear accidents like Chernobyl, and more recently Fukushima, the amount of radiation the public was exposed to barely registers and, in fact, is less than the background radiation in lots of places on earth.

Motivators for a revival (7:20)

We have no idea what emerging technologies are on the horizon that will also require massive amounts of power, and that's exactly where nuclear can shine.

You just reminded me of a story from when I was in college in the late 1980s, taking a class on the nuclear fuel cycle. You knew it was an easy class because there was an ampersand in the title. "Nuclear fuel cycle" would've been difficult; "Nuclear fuel cycle & the environment," you knew, was not a difficult class.

The man who taught it was a nuclear scientist and, at one point, he said that he would have no problem having a nuclear reactor in his backyard.
This was post-Three Mile Island, post-Chernobyl, and the students were just astounded that he would be willing to have this unbelievably dangerous facility in his backyard.

We have this fear of nuclear power, and there's also an economic component, but now we're seeing what appears to be a nuclear renaissance. I don't think it's driven by fear of climate change alone. I think it's driven, A) by the fear that, if you are worried about climate change, solar and wind alone aren't going to get you where you want to be; and B) by the fact that we seem to need a lot of clean energy for all these AI data centers. So it really does seem to be a perfect storm, after a half-century.

And who knows what's next. When I started writing Going Nuclear, the AI story hadn't broken yet, so all of the projections of our future electricity demand, which range from doubling to tripling, were underestimates, because nobody saw AI coming. We're going to need a lot of carbon-free electricity if we've got any hope of electrifying society whilst getting rid of fossil fuels.

It's been very, very interesting, just in the last six to 12 months, seeing Big Tech in North America moving first on this. Google, Microsoft, Amazon, and Meta have all either invested in or actually placed orders for small modular reactors specifically to power their AI data centers. In some ways, they've led the charge on this. They've moved faster than most nation states, although it is encouraging, actually, that here in the UK, just a couple of weeks ago, the government announced that our new nuclear power station is definitely going ahead, down in Sizewell in Suffolk in the south of England. That's a 3.2-gigawatt nuclear reactor; it's absolutely massive. But it's been really, really encouraging to see Big Tech and the private sector in North America take the situation into their own hands.
If anyone's realistic about electricity demand and how reliable supply needs to be, it's Big Tech with these data centers.

I always think: go back five or 10 years, and talk of AI was confined to niche subreddits and techie podcasts. Then it broke into the mainstream all of a sudden. Who knows what is going to happen in the next five or 10 years. We have no idea what emerging technologies are on the horizon that will also require massive amounts of power, and that's exactly where nuclear can shine.

In the US, at least, I don't think decarbonization alone is enough to win broad support for nuclear, since a big chunk of the country doesn't think we actually need to do that. But I think that pairing it with the promise of rapid AI-driven economic growth creates a stronger case.

I tried to appeal to a really broad church in Going Nuclear, because I really do believe that whether you are completely preoccupied by climate change and environmental issues, or completely preoccupied by economic growth, raising living standards, and all of the monetary side of things, nuclear is for you, because if you solve the energy problem, you solve both problems at once. You solve the economic problem and the environmental problem.

There's this really interesting relationship between GDP per head, which is obviously incredibly important in economic terms, and energy consumption per head, and it's basically a straight-line relationship between the two. There are no rich countries that aren't also massive consumers of energy. So if you really care about the economy, you should also care about energy consumption and about providing energy abundance, so people can go out and use that energy to create wealth and prosperity. Again, that's where nuclear comes in.
You can use nuclear power to sate the massive energy demand that growing economies require.

This podcast is very pro-wealth and prosperity, but I'll also say: if the nuclear dreams of the '60s had come true (the old Atomic Energy Commission expected there to be 1,000 nuclear reactors in this country by the year 2000), we wouldn't be having this conversation about climate change. It is amazing that what some people view as an existential crisis could have been prevented, by the United States and other western countries at least, just by making a different political decision.

We would be spending all of our time talking about something else, and how nice would that be?

For sure. I'm sure there'd be other existential crises to worry about.

But for sure, we wouldn't be talking about climate change with anywhere near the volume or the sense of urgency that we are now if we had carried on with the nuclear expansion that really took off in the '70s and the '80s. It would be something that would be coming our way in a couple of centuries.

About nuclear waste . . . (12:41)

. . . a 100 percent nuclear-powered life for about 80 years, their nuclear waste would barely fill a wine glass or a coffee cup.

I don't know if you've ever seen the television show For All Mankind?

I haven't. So many people have recommended it to me.

It's great. It's an alt-history that looks at what if the Space Race had never stopped. As a result, we had a much more tech-enthusiastic society, which included being much more pro-nuclear.

Anyway, imagine you are on a plane talking to the person next to you, and the topic of your book comes up, and the person says, hey, I like energy, wealth, and prosperity, but what are you going to do about the nuclear waste?

That almost exact situation has happened, but on a train rather than an airplane. One of the cool things about uranium is just how much energy you can get from a very small amount of it.
If a typical person in a highly developed economy, say in North America or Europe, produced all of their power over their entire lifetime from nuclear alone (forget fossil fuels, forget wind and solar), a 100 percent nuclear-powered life for about 80 years, their nuclear waste would barely fill a wine glass or a coffee cup. You need a very small amount of uranium to power somebody's life, and the natural conclusion is that you get a very small amount of waste for a lifetime of power. So in terms of the numbers, the amount of nuclear waste is just not that much of a problem.

However, I don't want to trivialize it out of existence with some pithy statistics and back-of-the-envelope physics calculations, because we still have to do something with the nuclear waste. This stuff is going to be radioactive for the best part of a million years. Thankfully, it's quite an easy argument to make, because good old Finland, which is one of the most nuclear nations on the planet by the share of nuclear in its grid, has solved this problem. It has implemented — and it's actually working now — the world's first and currently only geological repository for nuclear waste. Their idea is essentially to bury it in impermeable bedrock and leave it there, because, as with all radioactive objects, nuclear waste becomes less radioactive over time. The idea is that in a million years, Finland's nuclear waste won't be nuclear waste anymore; it will just be waste. A million years sounds like a really long time to our ears, but it's actually —

It does.

It sounds like a long time, but it is the blink of an eye, geologically. To a geologist, a million years just comes and goes. So it's really not that difficult to keep nuclear waste safe underground on those sorts of timescales.
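The wine-glass figure is easy to sanity-check with a back-of-the-envelope calculation. Every input below is an assumption chosen for illustration; the per-capita electricity use, reactor burnup, thermal efficiency, and fuel density are ballpark values, not figures from the conversation.

```python
# Rough sanity check of the "a lifetime of nuclear waste fits in a
# coffee cup" claim. Every input is a ballpark assumption chosen for
# illustration, not a figure from the interview.
LIFETIME_YEARS = 80
KWH_PER_PERSON_PER_YEAR = 10_000   # generous per-capita electricity use
BURNUP_GWD_PER_TONNE = 45          # typical light-water-reactor burnup (thermal)
THERMAL_EFFICIENCY = 0.33          # fraction of reactor heat converted to electricity
FUEL_DENSITY_G_PER_CM3 = 10.5      # approximate density of uranium dioxide fuel

lifetime_kwh = LIFETIME_YEARS * KWH_PER_PERSON_PER_YEAR

# 45 GWd(thermal) per tonne -> kWh(electric) per tonne:
# GWd * 24 h/d * 1e6 kW/GW * efficiency
kwh_electric_per_tonne = BURNUP_GWD_PER_TONNE * 24 * 1e6 * THERMAL_EFFICIENCY

tonnes_of_fuel = lifetime_kwh / kwh_electric_per_tonne
grams_of_fuel = tonnes_of_fuel * 1e6
volume_cm3 = grams_of_fuel / FUEL_DENSITY_G_PER_CM3

print(f"lifetime spent fuel: {grams_of_fuel / 1000:.1f} kg, ~{volume_cm3:.0f} cm^3")
```

On these assumptions, a lifetime of nuclear-only electricity corresponds to roughly two kilograms of spent fuel occupying a couple of hundred cubic centimeters, about the volume of a large coffee cup, consistent with the claim above.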
However — and this is the really cool thing, and one of the arguments that I make in my book — there are actually technologies that we can use to recycle nuclear waste. It turns out that when you pull uranium out of a reactor, once it's been burned for a couple of years, 95 percent of the atoms are still usable. You can still use them to generate nuclear power. So by throwing away nuclear waste after it's been through a nuclear reactor once, we're squandering 95 percent of the material we're throwing away.

That's essentially the technology behind breeder reactors?

That's exactly right, yes.

What about the plutonium? People are worried about the plutonium!

People are worried about the plutonium, but in a breeder reactor, you get rid of the plutonium because you split it into fission products, and fission products are still radioactive, but they have much shorter half-lives than plutonium. So rather than being radioactive for, say, a million years, they're only radioactive for a couple of centuries, maybe 1,000 years, which is a very, very different situation when you think about long-term storage.

I read so many papers and memos from the '50s, when these reactors were first being built and demonstrated. And they worked, by the way; they're actually quite easy to build, and it all happened in just a couple of years. Breeder reactors were really seen as the future of humanity's power supply, as opposed to the traditional nuclear power stations we all use at the moment, which run fuel through once and then throw away 95 percent of the energy at the end of it.
These breeder reactors were really, really seen as the future. They never came to fruition because we discovered lots of uranium around the globe. The supply of uranium went up around the time that the nuclear power expansion around the world seized up, so uranium demand dropped as supply increased, and interest in breeder reactors petered out. But if we're really serious about the medium-term future of humanity when it comes to energy, abundance, and prosperity, we need to take a second look at these breeder reactors, because there's enough uranium and thorium in the ground around the world to power the world for almost 1,000 years. After that, we'll have something else. Maybe we'll have nuclear fusion.

Well, I hope it doesn't take a thousand years for nuclear fusion.

Yes, me too.

Not your mother's reactors (17:25)

In 2005, France got 80 percent of its electricity from nuclear. They almost decarbonized their grid by accident before anybody cared about climate change, and that was during a time when their economy was absolutely booming.

I don't think most people are aware of how much innovation has taken place around nuclear in the past few years, or even few decades. It's not just a climate change issue or that we need to power these data centers: the technology has vastly improved. There are newer, safer technologies, so we're not talking about 1975-style reactors.

Even if they were 1975-style reactors, that would be fine, because they're pretty good and they have an almost impeccable safety record, punctuated by a very small number of high-profile events such as Chernobyl and Fukushima. I'm not going to count Three Mile Island on that list because nobody died, but you know what I mean.

But the modern nuclear reactors are amazing.
The ones coming out of France, the EPRs (European Pressurised Reactors), two of which will go into the UK's new nuclear power station, have been designed to withstand an airplane flying into the side of them, so they're basically bomb-proof.

As for these small modular reactors, they're getting people very excited, too. As their name suggests, they're small. How small is a reasonable question, and the answer is as small as you want to go. These things are scalable, and I've seen designs for one-megawatt reactors that could easily fit inside a shipping container. They could fit in the parking lot around the side of a data center, or even in the basement, all the way up to multi-hundred-megawatt reactors that could fit on a couple of tennis courts' worth of land. But it's really the modular part that's the most interesting thing. That's the "M," and that's never been done before.

Which really gets to the economics of the SMRs.

It really does. The idea is that you could build upwards of 90 percent of these reactors on a factory line. We know from the history of industrialization that as soon as you start mass-producing things, the unit cost plummets and the timescales shrink. No one has achieved that yet, though. There's a lot of hype around small modular reactors, so it's important not to get complacent and to keep our eye on the ultimate goal, which is mass production and rapid mass deployment of nuclear power stations, crucially in the places that need them most.

We often think about just decarbonizing our electricity supply, or decoupling it from volatility in the fossil fuel market, but it's about more than electricity as well. We need heat for things like making steel, making the ammonia that feeds most people on the planet, food and drink factories, car manufacturers, plants that rely on steam. You need heat, and thankfully, the primary energy from a nuclear reactor is heat.
The electricity is secondary; we have to put effort into making that, whereas the heat just happens. So there's this idea that we could use the surplus heat from nuclear reactors to power industrial processes that are very, very difficult to decarbonize. Small modular reactors would be perfect for that, because you could nestle them close by, right into the industrial centers that need the heat. So honestly, it is really our imaginations that are the limit with these small modular reactors.

They've opened a couple of nuclear reactors down in Georgia here. The second one was a lot cheaper and faster to build because they had already learned a bunch of lessons building the first one, and that really gets at repeatability, where every single reactor doesn't have to be a one-off bespoke project. That is not how it works in the world of business. You get cheaper things by building them over and over: you get very good at building them, and then you're able to turn them out at scale. That has not been the economic situation with nuclear reactors, but hopefully with small modular reactors, or even if we just start building a lot of big advanced reactors, we'll get those economies of scale and the economic issue will take care of itself.

For sure, and it is exactly the same here in the UK. The last reactor that we connected to the grid was in 1995. I was 18 months old. I don't even know if I was fluent in speaking at 18 months old; I was really, really young. Our newest nuclear power station, Hinkley Point C, which is going to come online in the next couple of years, was hideously expensive. The uncharitable view is that it's a complete farce and an embarrassment, but honestly, you've got to think about it: 1995 was the last nuclear reactor in the UK. It was going to take a long time and it was going to be expensive, because we were basically doing it from scratch. We had no supply chain.
We didn't really have a workforce that had ever built a nuclear reactor before. And for this new reactor that was just announced a couple of weeks ago, the projected price is 20 percent cheaper. It's still too expensive, still more expensive than it should be, but you're exactly right.

By tapping into those economies of scale, the cost per nuclear reactor will fall, and France did this in the '70s and '80s. Their nuclear program was so amazing. France is still the most nuclear nation on the planet by share of its total electricity. In 2005, France got 80 percent of its electricity from nuclear. They almost decarbonized their grid by accident, before anybody cared about climate change, and that was during a time when their economy was absolutely booming. By the way, all of those reactors are still working today, and the French pay less than the European Union average for their electricity, so this idea that nuclear makes your electricity expensive is simply not true. They built 55 nuclear reactors in 25 years, and they did them in parallel. It was just absolutely amazing. I would love to see a French-style nuclear rollout in all developed countries across the world. I think that would just be absolutely amazing.

Commercial fusion, coming soon . . . ? (23:06)

I think we're pretty good at doing things when we put our minds to it, but certainly not in the next couple of decades. But luckily, we already have a proven way of producing lots of energy, and that's with nuclear fission, in the meantime.

What is your enthusiasm level or expectation for nuclear fusion? I can tell you that the Silicon Valley people I talk to are very positive. I know they're inherently very positive people, but they're very enthusiastic about the prospects, over the next decade if not sooner, of commercial fusion. How about you?

It would be incredible.
The last question I was asked in my PhD interview 10 years ago was, "If you could solve one scientific or engineering problem, what would it be?" and my answer was nuclear fusion. That would still be my answer today. It just seems to me to be obviously the solution to the long-term energy needs of humanity. However, I'm less optimistic, perhaps, than the Silicon Valley crowd. The running joke, of course, is that fusion is always 40 years away and recedes into the future at one year per year. I would love to be proved wrong, but realistically, no one has even got it working in a prototype power station. That's before we even think about commercializing it and deploying it at scale. I really think that we're decades away, maybe even something like a century, though I'd be surprised if it took longer than a century. I think we're pretty good at doing things when we put our minds to it, but certainly not in the next couple of decades. Luckily, we already have a proven way of producing lots of energy in the meantime, and that's nuclear fission.

Don't go to California with that attitude. I can tell you that even when I go there and talk about AI, if I say that AI will do anything less than improve economic growth by a factor of 100, they just about throw me out. Let me finish up by asking you this: Earlier, we mentioned Three Mile Island and Chernobyl. How resilient do you think this nuclear renaissance is to an accident?

Even if we take the rate of accidents over the last 70 years of nuclear power production and maintain that same rate going forward, nuclear is still one of the safest things that our species does. Everyone talks about the death toll from nuclear power, but nobody talks about the lives it has already saved because of the fossil fuels it has displaced.
Fossil fuels are so amazing in some ways. They're so convenient, they're so energy-dense, and they've created the modern world that we all enjoy in the developed world and that the developing world is heading towards. But there are some really, really nasty consequences of fossil fuels, and whether or not you care about climate change, the air pollution alone, and the toll that it takes on human health, is enough to want to phase them out. Nuclear power is already orders of magnitude safer than fossil fuels, and I read this really amazing paper estimating that, globally, between the '70s and the '90s, nuclear power saved about two million lives because of the fossil fuels it displaced. That's, again, orders of magnitude more than the number of lives that have been lost as a consequence of nuclear power, mostly because of Chernobyl and Fukushima. So even if the past safety record of nuclear stays the same and we project it forward into the future, it's still a winning horse to bet on.

In the UK, you've started up one new nuclear reactor in the past 30 years, right? How many would you guess will be started over the next 15 years?

Four or five, something like that, I think, although I don't know.

Is that a significant number to you?

It's not enough for my liking. I would like to see many, many more. Look at France. I know I keep going back to it, but it's such a brilliant example. If France hadn't done what it did between the '70s and the '90s (55 nuclear reactors in 25 years, all of which are still working), it would be a much more difficult case to make, because there would be no historical precedent for it. So, maybe predictably, I wouldn't be satisfied with anything less than a French-scale nuclear rollout, let's put it that way.

On sale everywhere: The Conservative Futurist: How To Create the Sci-Fi World We Were Promised

Micro Reads

▶ Economics

* The U.S. Marches Toward State Capitalism With American Characteristics - WSJ
* AI Spending Is Propping Up the Economy, Right? It's Complicated. - Barron's
* Goodbye, $165,000 Tech Jobs. Student Coders Seek Work at Chipotle. - NYT
* Sam Altman says Gen Z are the 'luckiest' kids in history thanks to AI, despite mounting job displacement dread - NYT
* Lab-Grown Diamonds Are Testing the Power of Markets - Bberg Opinion
* Why globalisation needs a leader: Hegemons, alignment, and trade - CEPR
* The Rising Returns to R&D: Ideas Are not Getting Harder to Find - SSRN
* An Assessment of China's Innovative Capacity - The Fed
* Markets are so used to the TACO trade they didn't even blink when Trump extended a tariff delay with China - Fortune
* Labor unions mobilize to challenge advance of algorithms in workplaces - Wapo
* ChatGPT loves this bull market. Human investors are more cautious. - Axios
* What is required for a post-growth model? - Arxiv
* What Would It Take to Bring Back US Manufacturing? - Bridgewater

▶ Business

* An AI Replay of the Browser Wars, Bankrolled by Google - Bberg
* Alexa Got an A.I. Brain Transplant. How Smart Is It Now? - NYT
* Google and IBM believe first workable quantum computer is in sight - FT
* Why does Jeff Bezos keep buying launches from Elon Musk? - Ars
* Beijing demands Chinese tech giants justify purchases of Nvidia's H20 chips - FT
* Why Businesses Say Tariffs Have a Delayed Effect on Inflation - Richmond Fed
* Lisa Su Runs AMD—and Is Out for Nvidia's Blood - Wired
* Forget the White House Sideshow. Intel Must Decide What It Wants to Be. - WSJ
* With Billions at Risk, Nvidia CEO Buys His Way Out of the Trade Battle - WSJ
* Donald Trump's 100% tariff threat looms over chip sector despite relief for Apple - FT
* Sam Altman challenges Elon Musk with plans for Neuralink rival - FT
* Threads is nearing X's daily app users, new data shows - TechCrunch

▶ Policy/Politics

* Trump's China gamble - Axios
* U.S. Government to Take Cut of Nvidia and AMD A.I. Chip Sales to China - NYT
* A Guaranteed Annual Income Flop - WSJ Opinion
* Big Tech's next major political battle may already be brewing in your backyard - Politico
* Trump order gives political appointees vast powers over research grants - Nature
* China has its own concerns about Nvidia H20 chips - FT
* How the US Could Lose the AI Arms Race to China - Bberg Opinion
* America's New AI Plan Is Great. There's Just One Problem. - Bberg Opinion
* Trump, Seeking Friendlier Economic Data, Names New Statistics Chief - NYT
* Trump's chief science adviser faces a storm of criticism: what's next? - Nature
* Trump Is Squandering the Greatest Gift of the Manhattan Project - NYT Opinion

▶ AI/Digital

* Can OpenAI's GPT-5 model live up to sky-high expectations? - FT
* Google, Schmoogle: When to Ditch Web Search for Deep Research - WSJ
* AI Won't Kill Software. It Will Simply Give It New Life. - Barron's
* Chatbot Conversations Never End. That's a Problem for Autistic People. - WSJ
* Volunteers fight to keep 'AI slop' off Wikipedia - Wapo
* Trump's Tariffs Won't Solve U.S. Chip-Making Dilemma - WSJ
* GenAI Misinformation, Trust, and News Consumption: Evidence from a Field Experiment - NBER
* GPT-5s Are Alive: Basic Facts, Benchmarks and the Model Card - Don't Worry About the Vase
* What you may have missed about GPT-5 - MIT
* Why A.I. Should Make Parents Rethink Posting Photos of Their Children Online - NYT
* 21 Ways People Are Using A.I. at Work - NYT
* AI and Jobs: The Final Word (Until the Next One) - EIG
* These workers don't fear artificial intelligence. They're getting degrees in it. - Wapo
* AI Gossip - Arxiv
* Meet the early-adopter judges using AI - MIT
* The GPT-5 rollout has been a big mess - Ars
* A Humanoid Social Robot as a Teaching Assistant in the Classroom - Arxiv
* OpenAI Scrambles to Update GPT-5 After Users Revolt - Wired
* Sam Altman and the whale - MIT
* This is what happens when ChatGPT tries to write scripture - Vox
* How AI could create the first one-person unicorn - Economist
* AI Robs My Students of the Ability to Think - WSJ Opinion
* Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning - Arxiv

▶ Biotech/Health

* Scientists Are Finally Making Progress Against Alzheimer's - WSJ Opinion
* The Dawn of a New Era in Alzheimer's and Parkinson's Treatment - RealClearScience
* RFK Jr. shifts $500 million from mRNA research to 'safer' vaccines. Do the data back that up? - Reason
* How Older People Are Reaping Brain Benefits From New Tech - NYT
* Did Disease Defeat Napoleon? - SciAm
* Scientists Discover a Viral Cause of One of The World's Most Common Cancers - ScienceAlert
* 'A tipping point': An update from the frontiers of Alzheimer's disease research - Yale News
* A new measure of health is revolutionising how we think about ageing - NS
* First proof brain's powerhouses drive – and can reverse – dementia symptoms - NA
* The Problem Is With Men's Sperm - NYT Opinion

▶ Clean Energy/Climate

* The Whole World Is Switching to EVs Faster Than You - Bberg Opinion
* Misperceptions About Air Pollution: Implications for Willingness to Pay and Environmental Inequality - NBER
* Texas prepares for war as invasion of flesh-eating flies appears imminent - Ars
* Data Center Energy Demand Will Double Over the Next Five Years - Apollo Academy
* Why Did Air Conditioning Adoption Accelerate Faster Than Predicted? Evidence from Mexico - NBER
* Microwaving rocks could help mining operations pull CO2 out of the air - NS
* Ford's Model T Moment Isn't About the Car - Heatmap
* Five countries account for 71% of the world's nuclear generation capacity - EIA
* AI may need the power equivalent of 50 large nuclear plants - E&E

▶ Space/Transportation

* NASA plans to build a nuclear reactor on the Moon—a space lawyer explains why - Ars
* Rocket Lab's Surprise Stock Move After Solid Earnings - Barron's

▶ Up Wing/Down Wing

* James Lovell, the steady astronaut who brought Apollo 13 home safely, has died - Ars
* Vaccine Misinformation Is a Symptom of a Dangerous Breakdown - NYT Opinion
* We're hardwired for negativity. That doesn't mean we're doomed to it. - Vox
* To Study Viking Seafarers, He Took 26 Voyages in a Traditional Boat - NYT
* End is near for the landline-based service that got America online in the '90s - Wapo

▶ Substacks/Newsletters

* Who will actually profit from the AI boom? - Noahpinion
* OpenAI GPT-5 One Unified System - AI Supremacy
* Proportional representation is the solution to gerrymandering - Slow Boring
* Why I Stopped Being a Climate Catastrophist - The Ecomodernist
* How Many Jobs Depend on Exports? - Conversable Economist
* ChatGPT Classic - Joshua Gans' Newsletter
* Is Air Travel Getting Worse? - Maximum Progress

▶ Social Media

* On AI Progress - @daniel_271828
* On AI Usage - @emollick
* On Generative AI and Student Learning - @jburnmurdoch

Faster, Please! is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit fasterplease.substack.com/subscribe
ChatGPT-5 just launched, marking a major milestone for OpenAI and the entire AI ecosystem.

Fresh off the live stream, Erik Torenberg was joined in the studio by three people who played key roles in making this model a reality:
* Christina Kim, Researcher at OpenAI, who leads the core models team on post-training
* Isa Fulford, Researcher at OpenAI, who leads deep research and the ChatGPT agent team on post-training
* Sarah Wang, General Partner at a16z, who helped lead our investment in OpenAI since 2021

They discuss what's actually new in ChatGPT-5—from major leaps in reasoning, coding, and creative writing to meaningful improvements in trustworthiness, behavior, and post-training techniques.

We also discuss:
* How GPT-5 was trained, including RL environments and why data quality matters more than ever
* The shift toward agentic workflows—what “agents” really are, why async matters, and how it's empowering a new golden age of the “ideas guy”
* What GPT-5 means for builders, startups, and the broader AI ecosystem going forward

Whether you're an AI researcher, founder, or curious user, this is the deep-dive conversation you won't want to miss.

Timecodes:
0:00 ChatGPT Origins
1:57 Model Capabilities & Coding Improvements
4:00 Model Behaviors & Sycophancy
6:15 Usage, Pricing & Startup Opportunities
8:03 Broader Impact & AGI Discourse
16:56 Creative Writing & Model Progress
32:37 Training, Data & Reflections
36:21 Company Growth & Culture
41:39 Closing Thoughts & Mission

Resources:
Find Christina on X: https://x.com/christinahkim
Find Isa on X: https://x.com/isafulf
Find Sarah on X: https://x.com/sarahdingwang

Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures
Johnny and Andy are back talking all things RL in the Northern Hemisphere, digging into the strange goings-on after the SL voted to expand the competition immediately. Plus the usual Nigel Wood talk, along with many more topics of debate.
Chapters:
00:00:00 Welcome and Guest Introduction
00:01:18 Tulu, OVR, and the RLVR Journey
00:03:40 Industry Approaches to Post-Training and Preference Data
00:06:08 Understanding RLVR and Its Impact
00:06:18 Agents, Tool Use, and Training Environments
00:10:34 Open Data, Human Feedback, and Benchmarking
00:12:44 Chatbot Arena, Sycophancy, and Evaluation Platforms
00:15:42 RLHF vs RLVR: Books, Algorithms, and Future Directions
00:17:54 Frontier Models: Reasoning, Hybrid Models, and Data
00:22:11 Search, Retrieval, and Emerging Model Capabilities
00:29:23 Tool Use, Curriculum, and Model Training Challenges
00:38:06 Skills, Planning, and Abstraction in Agent Models
00:46:50 Parallelism, Verifiers, and Scaling Approaches
00:54:33 Overoptimization and Reward Design in RL
01:02:27 Open Models, Personalization, and the Model Spec
01:06:50 Open Model Ecosystem and Infrastructure
01:13:05 Meta, Hardware, and the Future of AI Competition
01:15:42 Building an Open DeepSeek and Closing Thoughts

We first had Nathan on to give us his RLHF deep dive when he was joining AI2, and now he's back to help us catch up on the evolution to RLVR (Reinforcement Learning with Verifiable Rewards), first proposed in his Tulu 3 paper. While RLHF remains foundational, RLVR has emerged as a powerful approach for training models on tasks with clear success criteria and using verifiable, objective functions as reward signals—particularly useful in domains like math, code correctness, and instruction-following. Instead of relying solely on subjective human feedback, RLVR leverages deterministic signals to guide optimization, making it more scalable and potentially more reliable across many domains. However, he notes that RLVR is still rapidly evolving, especially regarding how it handles tool use and multi-step reasoning. We also discussed the Tulu model series, a family of instruction-tuned open models developed at AI2.
Tulu is designed to be a reproducible, state-of-the-art post-training recipe for the open community. Unlike frontier labs like OpenAI or Anthropic, which rely on vast and often proprietary datasets, Tulu aims to distill and democratize best practices for instruction and preference tuning. We are impressed with how small eval suites, careful task selection, and transparent methodology can rival even the best proprietary models on specific benchmarks. One of the most fascinating threads is the challenge of incorporating tool use into RL frameworks. Lambert highlights that while you can prompt a model to use tools like search or code execution, getting the model to reliably learn when and how to use them through RL is much harder. This is compounded by the difficulty of designing reward functions that avoid overoptimization—where models learn to “game” the reward signal rather than solve the underlying task. This is particularly problematic in code generation, where models might reward hack unit tests by inserting pass statements instead of correct logic. As models become more agentic and are expected to plan, retrieve, and act across multiple tools, reward design becomes a critical bottleneck. 
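The contrast between subjective preference rewards and verifiable ones is easy to make concrete. Below is a minimal sketch of two RLVR-style reward functions — one for math, one for code — where the answer-extraction regex and the crude anti-gaming guard are illustrative assumptions, not Tulu's actual implementation:

```python
import re

def verifiable_math_reward(completion: str, gold_answer: str) -> float:
    """Deterministic reward: 1.0 if the model's final numeric answer matches
    the known-correct one, else 0.0. No human judgment in the loop."""
    # Illustrative extraction rule: take the last number in the completion.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    return 1.0 if numbers[-1] == gold_answer else 0.0

def code_reward(passed_tests: int, total_tests: int, used_pass_stub: bool) -> float:
    """Unit-test-based reward with a crude guard against the 'insert a pass
    statement' hack described above: detected gaming attempts score zero."""
    if used_pass_stub or total_tests == 0:
        return 0.0
    return passed_tests / total_tests
```

The point of the sketch is that both signals are cheap, objective, and repeatable — which is exactly what makes them scalable, and also exactly what makes them worth gaming.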
Other topics covered:
- The evolution from RLHF (Reinforcement Learning from Human Feedback) to RLVR (Reinforcement Learning from Verifiable Rewards)
- The goals and technical architecture of the Tulu models, including the motivation to open-source post-training recipes
- Challenges of tool use in RL: verifiability, reward design, and scaling across domains
- Evaluation frameworks and the role of platforms like Chatbot Arena and emerging “arena”-style benchmarks
- The strategic tension between hybrid reasoning models and unified reasoning models at the frontier
- Planning, abstraction, and calibration in reasoning agents and why these concepts matter
- The future of open-source AI models, including DeepSeek, OLMo, and the potential for an “American DeepSeek”
- The importance of model personality, character tuning, and the model spec paradigm
- Overoptimization in RL settings and how it manifests in different domains (control tasks, code, math)
- Industry trends in inference-time scaling and model parallelism

Finally, the episode closes with a vision for the future of open-source AI. Nathan has now written up his ambition to build an “American DeepSeek”—a fully open, end-to-end reasoning-capable model with transparent training data, tools, and infrastructure. He emphasizes that open-source AI is not just about weights; it's about releasing recipes, evaluations, and methods that lower the barrier for everyone to build and understand cutting-edge systems. It would seem the
Produced as part of MATS 8.0 under the mentorship of Alex Turner and Alex Cloud. This research note overviews some early results which we are looking for feedback on.

TL;DR: We train language models with RL in toy environments. We show that penalizing some property of the output is sufficient to suppress that property in the chain of thought also, even when that property is relevant to task completion. For example, when we penalize a model for mentioning in its output that it completed a task via a certain form of cheating, its reasoning also omits this fact. We therefore suspect that training against an output monitor is sufficient to cause obfuscated CoTs in more realistic settings.

Introduction: Chain of thought (CoT) supervision appears in many control and scalable oversight protocols. It has been argued that being able to monitor CoTs for unwanted behavior is a critical property [...]

Outline:
(00:56) Introduction
(02:38) Setup
(03:48) Single-Turn Setting
(04:26) Multi-Turn Setting
(06:51) Results
(06:54) Single-Turn Setting
(08:21) Multi-Turn Terminal-Based Setting
(08:25) Word-Usage Penalty
(09:12) LLM Judge Penalty
(10:12) Takeaways
(10:57) Acknowledgements

The original text contained 1 footnote which was omitted from this narration.

First published: July 30th, 2025
Source: https://www.lesswrong.com/posts/CM7AsQoBxDW4vhkP3/optimizing-the-final-output-can-obfuscate-cot-research-note

Narrated by TYPE III AUDIO.
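The training signal the note describes — a penalty applied only to the final output, never to the chain of thought — can be sketched in a few lines. This is a hypothetical reconstruction of the setup for illustration; the penalty size and banned word are made up and are not the authors' code:

```python
def toy_reward(cot: str, output: str, task_solved: bool,
               banned_word: str = "cheated") -> float:
    """Toy RL reward: task success minus a penalty applied ONLY to the
    monitored output segment. The note's finding is that optimizing against
    this output-only penalty suppresses the property in the CoT as well."""
    r = 1.0 if task_solved else 0.0
    if banned_word in output.lower():
        r -= 0.5  # penalty on the monitored output, illustrative magnitude
    # `cot` is deliberately unmonitored here, mirroring the paper's setup:
    # nothing in this function ever reads it.
    return r
```

Even though the chain of thought is never penalized directly, gradient pressure on the output is apparently enough to obfuscate it — which is why the authors worry about training against output monitors in realistic settings.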
Read more about Kafka Diskless topics, KIP by Aiven:
KIP-1150: https://fnf.dev/3EuL7mv

Summary: In this conversation, Kaivalya Apte and Alexis Schlomer discuss the internals of query optimization with the new project optd. They explore the challenges faced by existing query optimizers, the importance of cost models, and the advantages of using Rust for performance and safety. The discussion also covers the innovative streaming model of query execution, feedback mechanisms for refining optimizations, and the future developments planned for optd, including support for various databases and enhanced cost models.

Chapters:
00:00 Introduction to optd and Its Purpose
03:57 Understanding Query Optimization and Its Importance
10:26 Defining Query Optimization and Its Challenges
17:32 Exploring the Limitations of Existing Optimizers
21:39 The Role of Calcite in Query Optimization
26:54 The Need for a Domain-Specific Language
40:10 Advantages of Using Rust for optd
44:37 High-Level Overview of optd's Functionality
48:36 Optimizing Query Execution with Coroutines
50:03 Streaming Model for Query Optimization
51:36 Client Interaction and Feedback Mechanism
54:18 Adaptive Decision Making in Query Execution
54:56 Persistent Memoization for Enhanced Performance
57:12 Guided Scheduling in Query Optimization
59:55 Balancing Execution Time and Optimization
01:01:43 Understanding Cost Models in Query Optimization
01:04:22 Exploring Storage Solutions for Query Optimization
01:07:13 Enhancing Observability and Caching Mechanisms
01:07:44 Future Optimizations and System Improvements
01:18:02 Challenges in Query Optimization Development
01:20:33 Upcoming Features and Roadmap for optd

References:
- NeuroCard: Learned Cardinality Estimation: https://vldb.org/pvldb/vol14/p61-yang.pdf
- RL-based QO: https://arxiv.org/pdf/1808.03196
- Microsoft book about QO: https://www.microsoft.com/en-us/research/publication/extensible-query-optimizers-in-practice/
- Cascades paper: https://15721.courses.cs.cmu.edu/spring2016/papers/graefe-ieee1995.pdf
- optd source code: https://github.com/cmu-db/optd
- optd website (for now): https://db.cs.cmu.edu/projects/optd/

For memberships, join this channel as a member here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join

Don't forget to like, share, and subscribe for more insights!

Like building stuff? Try out CodeCrafters and build amazing real-world systems like Redis, Kafka, and SQLite. Use the link below to sign up and get 40% off on a paid subscription: https://app.codecrafters.io/join?via=geeknarrator

Database internals series: https://youtu.be/yV_Zp0Mi3xs

Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN

Stay curious! Keep learning!

#database #queryoptimization #sql #postgres
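The "persistent memoization" idea discussed in the episode is the classic memo-table trick from the Cascades paper linked above: optimize each logical sub-expression once, cache the best result, and reuse it no matter how many alternative plans contain that sub-expression. A stripped-down sketch (optd itself is written in Rust; the relations, row counts, nested-loop cost model, and join selectivity here are all invented for illustration):

```python
from functools import lru_cache

# Assumed base-table row counts and a fixed join selectivity, for illustration.
ROWS = {"A": 1000, "B": 10, "C": 100}
SELECTIVITY = 0.01

@lru_cache(maxsize=None)  # the "memo table": each relation set optimized once
def best_plan(relations: frozenset):
    """Return (cost, result_rows, plan) for the cheapest left-deep join
    order over `relations`, with memoized sub-results."""
    if len(relations) == 1:
        (r,) = relations
        return (float(ROWS[r]), float(ROWS[r]), r)  # scan cost = row count
    best = None
    for r in sorted(relations):  # enumerate alternatives: last relation joined
        sub_cost, sub_rows, sub_plan = best_plan(relations - {r})
        cost = sub_cost + sub_rows * ROWS[r]     # toy nested-loop join cost
        rows = sub_rows * ROWS[r] * SELECTIVITY  # estimated output size
        cand = (cost, rows, f"({sub_plan} JOIN {r})")
        if best is None or cand[0] < best[0]:
            best = cand
    return best

cost, rows, plan = best_plan(frozenset(ROWS))  # picks ((B JOIN C) JOIN A)
```

The cache is what turns exponential re-derivation of shared subplans into one optimization per group; making that cache survive across queries is the "persistent" part optd layers on top.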
Atai Life Sciences CEO Dr Srinivas Rao talked with Proactive's Stephen Gunnion about the company's recent topline results from its Phase 2b trial evaluating BPL-003 for treatment-resistant depression. Rao described BPL-003 as a short-acting psychedelic with a total psychedelic duration of under two hours. He said, “The majority of patients were actually discharge ready by 90 minutes.” The study, conducted in nearly 200 patients, used three dosing levels and showed that the 8mg dose produced a change of over six points on the MADRS scale at four weeks — a benefit that persisted through eight weeks. He noted the findings confirmed strong efficacy and durability, comparable to psilocybin but with a significantly shorter duration of effect. The safety profile was positive, with no serious adverse events and the vast majority of side effects being mild or moderate. Rao also previewed next steps, including data from an open-label extension and a two-dose induction strategy. He confirmed plans to meet with regulators for end-of-Phase 2 guidance. Beyond BPL-003, Atai is progressing VLS-01 and EMP-01. VLS-01 is in a Phase 2b study, expected to report in Q1 next year. Rao also updated viewers on RL-007, a non-psychedelic cognitive enhancer in development through Recognify Life Sciences. Although RL-007 showed numerical improvement in a recent Phase 2b trial, it didn't meet statistical significance, and Atai plans to explore partnership opportunities rather than continuing development in-house. Visit Proactive's YouTube channel for more interviews like this one. Don't forget to like the video, subscribe to the channel, and enable notifications for future updates. #AtaiLifeSciences #BPL003 #PsychedelicTherapy #TreatmentResistantDepression #ClinicalTrials #MentalHealth #BiotechNews #DrugDevelopment #RL007 #VLS01 #PsychiatryInnovation #HealthcareInvesting
Join Tommy Shaughnessy as he hosts Pondering Durian (Lead at Delphi Intelligence) and José Macedo (Co-Founder at Delphi Labs & Founding Partner at Delphi Ventures) to introduce Delphi Intelligence — Delphi's new open research initiative focused on artificial intelligence. Learn why Delphi is going deep into frontier models, robotics, reinforcement learning, and the intersection of crypto and AI, and how this initiative aims to uncover transformative opportunities across emerging tech.Delphi Intelligence: https://www.delphiintelligence.io/
No Priors: Artificial Intelligence | Machine Learning | Technology | Startups
Superintelligence, at least in an academic sense, has already been achieved. But Misha Laskin thinks that the next step towards artificial superintelligence, or ASI, should look both more user- and problem-focused. Reflection AI co-founder and CEO Misha Laskin joins Sarah Guo to introduce Asimov, their new code comprehension agent built on reinforcement learning (RL). Misha talks about creating tools and designing AI agents based on customer needs, and how that influences eval development and the scope of the agent's memory. The two also discuss the challenges in solving scaling for RL, the future of ASI, and the implications of Google's “non-acquisition” of Windsurf.

Sign up for new podcasts every week. Email feedback to show@no-priors.com
Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @MishaLaskin | @reflection_ai

Chapters:
00:00 – Misha Laskin Introduction
00:44 – Superintelligence vs. Super Intelligent Autonomous Systems
03:26 – Misha's Journey from Physics to AI
07:48 – Asimov Product Release
11:52 – What Differentiates Asimov from Other Agents
16:15 – Asimov's Eval Philosophy
21:52 – The Types of Queries Where Asimov Shines
24:35 – Designing a Team-Wide Memory for Asimov
28:38 – Leveraging Pre-Trained Models
32:47 – The Challenges of Solving Scaling in RL
37:21 – Training Agents in Copycat Software Environments
38:25 – When Will We See ASI?
44:27 – Thoughts on Windsurf's Non-Acquisition
48:10 – Exploring Non-RL Datasets
55:12 – Tackling Problems Beyond Engineering and Coding
57:54 – Where We're At in Deploying ASI in Different Fields
01:02:30 – Conclusion
Hey everyone, Alex here
In today's episode we get back to interviews! I'm joined by Greybeard, a huge advocate for the SSA region and a lover of Rocket League. We had a great chat about his time with RL: the ups and downs, and the lessons he has learned along the way.
Twitter | Paper PDF

Seven years ago, OpenAI Five had just been released, and many people in the AI safety community expected AIs to be opaque RL agents. Luckily, we ended up with reasoning models that speak their thoughts clearly enough for us to follow along (most of the time). In a new multi-org position paper, we argue that we should try to preserve this level of reasoning transparency and turn chain of thought monitorability into a systematic AI safety agenda. This is a measure that improves safety in the medium term, though it may not scale to superintelligence, even if a superintelligent AI somehow still does its reasoning in English. We hope that extending the time when chains of thought are monitorable will help us do more science on capable models, practice more safety techniques "at an easier difficulty", and allow us to extract more useful work from [...]

First published: July 15th, 2025
Source: https://www.lesswrong.com/posts/7xneDbsgj6yJDJMjK/chain-of-thought-monitorability-a-new-and-fragile

Narrated by TYPE III AUDIO.
Over the last three years, part of the money belonging to savers at the Asociación Cooperativa de Ahorro y Crédito Santa Victoria (Cosavi de RL) was used to buy luxury apartments in Miami, United States, valued at $5.8 million. Another portion went toward the purchase of 8,670 bottles of wine, beer, lemonade, and liquor imported from France, for roughly $117,428. The purchases were made by companies belonging to Manuel Alberto Coto Barrientos, the Cosavi manager who died in a plane crash in September 2024 and who, months before the financial scandal broke, tried to "fix" the cooperative's problem through a former legal adviser to Casa Presidencial.
The world's smartest AI has arrived, and it isn't GPT. xAI's Grok 4 not only posted the highest score on the latest "final boss" AI exam, it also learned to call tools on its own, do its own math, and even order and sell inventory by itself — running a virtual vending machine, it earned $4,694 and lasted 324 days without breaking down. Its secret weapon: massive-scale reinforcement learning. In this episode we discuss:
Storytelling: The Most Scalable Post-AI Business Skill Worth MasteringIn a world of AI-generated content and fractured attention, your ability to tell compelling stories may be your greatest competitive advantage.Episode SummaryIn this episode of The Digital Contrarian, host Ryan Levesque dives into why storytelling is the most scalable post-AI business skill worth developing.You'll learn how stories create deeper connection than any other content, discover the three story types you need to master, and find five contrarian tips for telling better stories that cut through the noise.Question of the Day
My Contrarian YouTube Strategy: Building a Strategic Content EcosystemMy contrarian YouTube strategy creates 500K+ annual views with just ONE piece of original content per week.Episode SummaryIn this episode of The Digital Contrarian, host Ryan Levesque dives into the evolution of his strategic content ecosystem six months after first introducing it.You'll learn how one weekly "source of truth" piece creates a sustainable content system that respects family time, discover the long-form leverage strategy for YouTube, and find how to cultivate influence versus chasing attention.Question of the Day
Johnny is joined by Andy from the Only in RL podcast as they discuss what the f*** is going on with RL in England. This is a new regular podcast with Andy and other guests which focuses on the Northern Hemisphere.
The Breakdown of Shared Reality: AI's Most Dangerous Unintended ConsequenceNobel Prize winner Geoffrey Hinton warns that AI-driven personalization is destroying our collective understanding of what's real.Episode SummaryIn this episode of The Digital Contrarian, host Ryan Levesque dives into the breakdown of shared reality caused by AI-driven hyper-personalization and its profound implications for business and society.You'll learn why isolated algorithmic realities undermine strategic thinking, discover the concept of the "Promethean Transition" we're navigating, and find how to choose between being a tunnel digger or pathfinder in our AI future.Question of the Day
How do you build a foundation model that can write code at a human level? Eiso Kant (CTO & co-founder, Poolside) reveals the technical architecture, distributed team strategies, and reinforcement learning breakthroughs powering one of Europe's most ambitious AI startups. Learn how Poolside operates 10,000+ H200s, runs the world's largest code execution RL environment, and why CTOs must rethink engineering orgs for an agent-driven future.
Between Death and Danger: Wilderness Wisdom from the Grand CanyonBetween death and danger lies the path up the mountain—a profound insight revealed during my life-changing vision quest.Episode SummaryIn this episode of The Digital Contrarian, host Ryan Levesque dives into a deeply personal father-son Grand Canyon journey that became the catalyst for his upcoming book "Return to Real".You'll learn why reconnecting with nature isn't just a luxury but essential for breakthrough thinking, discover the symbolic message from a desert pocket mouse and California condor, and find how God-made things offer clarity in our AI-driven world.Question of the Day
Category of One: A $10 Million Business BlueprintAfter a decade building a highly profitable $10M/year consulting and software company, I reveal the contrarian framework that made it all possible.Episode SummaryIn this episode of The Digital Contrarian, host Ryan Levesque dives into what it really means to create a "Category of One" business and why it's the only type worth building.You'll learn how to position yourself where no direct comparison exists, discover the exact framework for charging premium prices, and find the three-phase growth strategy that took his company from zero to over $1M/month.Question of the Day
Munaf Manji and Griffin Warner talk Tuesday MLB betting. Best bets as always.
Setbacks and Breakthroughs: Why Feeling Stuck Might Mean You're on the BrinkWhat if your greatest setbacks are actually seeds for your most meaningful breakthroughs?Episode SummaryIn this episode of The Digital Contrarian, host Ryan Levesque dives into the nature of setbacks and breakthroughs through personal farm stories and neuroscience research.You'll learn why breakthroughs often cluster after prolonged setbacks, discover how moderate adversity builds resilience, and find three reflection questions to help you prepare for your next breakthrough.Question of the Day
Courage for the Rest of Us: Giving Up Good to Go After GreatMost of us aren't afraid of failure—we're afraid of other people seeing us fail.Episode SummaryIn this episode of The Digital Contrarian, host Ryan Levesque dives into the true nature of courage and why giving up "good" to go after "great" is so difficult.You'll learn why the opinions of others hold more power over us than our own fears, discover Brené Brown's "Square Squad" technique to silence the noise, and find a simple 30-day experiment to build courage.Question of the Day
More stories from Reddit... FOLLOW ME ON KICK! https://kick.com/southerncannibal BUY MY MERCH PLEASE! https://southern-cannibal-shop.fourthwall.com/? Send your TRUE Scary Stories HERE! ► https://southerncannibal.com/ OR Email at southerncannibalstories@gmail.com LISTEN TO THE DINNER TABLE PODCAST! ► https://open.spotify.com/show/3zfschBzphkHhhpV870gFW?si=j53deGSXRxyyo9rsxqbFgw Faqs about me ► https://youtube.fandom.com/wiki/Southern_Cannibal Stalk Me! ► Twitter: https://twitter.com/iAmCanni ► Instagram: https://instagram.com/SouthernCannibal ► Scary Story Playlist: https://www.youtube.com/playlist?list=PL18YGadwJHERUzNMxTSoIYRIoUWfcGO2I ► DISCLAIMER: All Stories and Music featured in today's video were granted FULL permission for use on the Southern Cannibal YouTube Channel! Huge Thanks to these brave folks who sent in their stories! #1. - u/Rozalera #2. - Anonymous #3. - Anonymous #4. - RL #5. - FF #6. - BettyJoe #7. - John Huge Thanks to these talented folks for their creepy music! ► Myuuji: https://www.youtube.com/c/myuuji ♪ ► CO.AG Music: https://www.youtube.com/channel/UCcavSftXHgxLBWwLDm_bNvA ♪ ► Kevin MacLeod: http://incompetech.com ♪ ► Piano Horror: https://www.youtube.com/PianoHorror ♪ https://creativecommons.org/licenses/by/3.0/us/
The Trust Molecule: Why Oxytocin (Not Dopamine) Will Define the Post AI EraIn a world built around dopamine hits, oxytocin might just be the brain molecule that matters most for your business.Episode SummaryIn this episode of The Digital Contrarian, host Ryan Levesque dives into the neuroscience of trust and why oxytocin will become your greatest competitive advantage.You'll learn about the four key happiness chemicals and why oxytocin stands apart, discover the "Global Oxytocin Deficit" creating both crisis and opportunity, and get three science-backed strategies to strategically elicit oxytocin in your business.Question of the Day
This week on Dividend Talk, Derek is joined by fellow Dutch investor DazZMikey while European DGI enjoys a well-earned holiday. Together, they go on a unique “Stock Safari,” exploring dividend gems far from the usual American and Western European terrain. Along the way, they reflect on macro news, dividend hikes and cuts, and how to handle markets you don't fully trust. Hope you enjoy!
New episode with my good friends Sholto Douglas & Trenton Bricken. Sholto focuses on scaling RL and Trenton researches mechanistic interpretability, both at Anthropic.We talk through what's changed in the last year of AI research; the new RL regime and how far it can scale; how to trace a model's thoughts; and how countries, workers, and students should prepare for AGI.See you next year for v3. Here's last year's episode, btw. Enjoy!Watch on YouTube; listen on Apple Podcasts or Spotify.----------SPONSORS* WorkOS ensures that AI companies like OpenAI and Anthropic don't have to spend engineering time building enterprise features like access controls or SSO. It's not that they don't need these features; it's just that WorkOS gives them battle-tested APIs that they can use for auth, provisioning, and more. Start building today at workos.com.* Scale is building the infrastructure for safer, smarter AI. Scale's Data Foundry gives major AI labs access to high-quality data to fuel post-training, while their public leaderboards help assess model capabilities. They also just released Scale Evaluation, a new tool that diagnoses model limitations. If you're an AI researcher or engineer, learn how Scale can help you push the frontier at scale.com/dwarkesh.* Lighthouse is THE fastest immigration solution for the technology industry. They specialize in expert visas like the O-1A and EB-1A, and they've already helped companies like Cursor, Notion, and Replit navigate U.S. immigration. 
Explore which visa is right for you at lighthousehq.com/ref/Dwarkesh. To sponsor a future episode, visit dwarkesh.com/advertise.

TIMESTAMPS:
(00:00:00) – How far can RL scale?
(00:16:27) – Is continual learning a key bottleneck?
(00:31:59) – Model self-awareness
(00:50:32) – Taste and slop
(01:00:51) – How soon to fully autonomous agents?
(01:15:17) – Neuralese
(01:18:55) – Inference compute will bottleneck AGI
(01:23:01) – DeepSeek algorithmic improvements
(01:37:42) – Why are LLMs 'baby AGI' but not AlphaZero?
(01:45:38) – Mech interp
(01:56:15) – How countries should prepare for AGI
(02:10:26) – Automating white collar work
(02:15:35) – Advice for students

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe
S&P futures are trading lower this morning. News reports indicate that Israel is preparing to launch a strike on Iran. In March, President Trump gave Iran a 60-day deadline to reach a deal; that deadline has passed. House Republicans appear close to passing their reconciliation bill, though the Senate will likely make changes to it. If the bill passes the House and Senate, it will likely be a negative for markets, as it will increase the deficit. Yesterday, President Trump unveiled a missile defense plan; LHX shares are higher. Medtronic plans to separate its diabetes business into a stand-alone company. Take Two announced a $1 billion stock offering. KEYS, BIDU, & LOW are higher after earnings announcements. After the bell today, SNOW, ZM, and URBN are set to report. On Thursday morning, ADI, BJ, & RL will report.
Truth, Hell, & The Lie They Believe: A 12-Part Framework For Owning Your CategoryWhat if your most powerful business advantage isn't just what you know, but understanding the lie your audience believes?Episode SummaryIn this episode of The Digital Contrarian, host Ryan Levesque dives into a complete 12-part strategic framework for differentiating yourself in any market.You'll learn how to identify your audience's "hell" and the lie they believe, discover why contrarian truths create more impact than incremental improvements, and get a strategic blueprint you can apply to any major project or business repositioning.Question of the Day
Inside the Wolf's Den: An Entrepreneurial Journey with Shawn and Joni Wolfswinkel
Join hosts Shawn and Joni Wolfswinkel for an inspiring episode of Inside The Wolf's Den as they sit down with Peter Lohmann, a successful entrepreneur and expert in the property management industry. As co-founder and CEO of RL Property Management in Columbus, Ohio, Peter oversees hundreds of residential units and has built a reputation for his innovative systems and leadership in the field. In this episode, Peter shares his remarkable journey from control systems engineer to leading a thriving property management company. Discover what inspired him to start RL, how the industry has evolved since 2013, and the early challenges he faced—along with his strategies for overcoming them. Peter also discusses the vital qualities of effective leadership, maintaining team motivation, and differentiating in a competitive market. Listeners will gain insights into current industry trends, especially how technology is transforming property management operations. Peter offers practical advice for property owners selecting a management company and emphasizes the importance of trust, transparency, and communication in building strong client and tenant relationships. Beyond business, Peter shares his leadership philosophy, balancing a demanding career with family life, and the influences shaping his approach to success. He also provides a glimpse into the future of property management, highlighting innovative projects and emerging trends to watch. Whether you're an aspiring entrepreneur, seasoned property manager, or property owner, this episode delivers valuable insights from a leader who is shaping the future of the industry. Tune in for an engaging conversation packed with actionable tips, inspiring stories, and expert advice.

RL Property Management: https://rlpmg.com
Email: info@rlpmg.com
This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Today, we're joined by Mahesh Sathiamoorthy, co-founder and CEO of Bespoke Labs, to discuss how reinforcement learning (RL) is reshaping the way we build custom agents on top of foundation models. Mahesh highlights the crucial role of data curation, evaluation, and error analysis in model performance, and explains why RL offers a more robust alternative to prompting, and how it can improve multi-step tool use capabilities. We also explore the limitations of supervised fine-tuning (SFT) for tool-augmented reasoning tasks, the reward-shaping strategies they've used, and Bespoke Labs' open-source libraries like Curator. We also touch on the models MiniCheck for hallucination detection and MiniChart for chart-based QA. The complete show notes for this episode can be found at https://twimlai.com/go/731.
Who Are You Really Building For: The 100,000, The 100, or The One?
Are you diluting your message by trying to please everyone instead of focusing on your ideal customer?
Episode Summary
In this episode of The Digital Contrarian, host Ryan Levesque explores the strategic dilemma of who entrepreneurs should truly optimize their business for. You'll learn why chasing scale often leads to diluted messaging, how focusing on "The One" ideal customer creates authentic resonance, and discover why bestselling authors write for specific real people rather than abstract audiences.
Question of the Day
What happens when three unstoppable forces converge and rewrite the rules of modern life?
Episode Summary
In this episode of The Digital Contrarian, host Ryan Levesque unpacks three seismic shifts reshaping civilization. You'll discover Ray Dalio's “Big Cycle” of American decline, understand AI's existential threat to human meaning, and learn how the end of infinite growth is fuelling a worldwide “return to real.”
Question of the Day
In this inaugural episode of The Digital Contrarian podcast, host Ryan Levesque announces the launch of the audio edition of his popular weekly newsletter. You'll learn what to expect from this new format, discover some of the most popular past issues we'll be exploring, and find out how this podcast serves digital entrepreneurs building meaningful businesses in our AI-driven world.
Question of the Day
Join Tommy Shaughnessy from Delphi Ventures as he hosts Sam Lehman, Principal at Symbolic Capital and AI researcher, for a deep dive into the Reinforcement Learning (RL) renaissance and its implications for decentralized AI. Sam recently authored a widely discussed post, "The World's RL Gym", exploring the evolution of AI scaling and the exciting potential of decentralized networks for training next-generation models. The World's RL Gym: https://www.symbolic.capital/writing/the-worlds-rl-gym
Vasek Mlejnsky from E2B joins us today to talk about sandboxes for AI agents. In the last 2 years, E2B has grown from a handful of developers building on it to being used by ~50% of the Fortune 500 and generating millions of sandboxes each week for their customers. As the “death of chat completions” approaches, LLM workflows and agents are relying more and more on tool usage and multi-modality. The most common use cases for their sandboxes:
- Run data analysis and charting (like Perplexity)
- Execute arbitrary code generated by the model (like Manus does)
- Running evals on code generation (see LMArena Web)
- Doing reinforcement learning for code capabilities (like HuggingFace)
Timestamps:
00:00:00 Introductions
00:00:37 Origin of DevBook -> E2B
00:02:35 Early Experiments with GPT-3.5 and Building AI Agents
00:05:19 Building an Agent Cloud
00:07:27 Challenges of Building with Early LLMs
00:10:35 E2B Use Cases
00:13:52 E2B Growth vs Models Capabilities
00:15:03 The LLM Operating System (LLMOS) Landscape
00:20:12 Breakdown of JavaScript vs Python Usage on E2B
00:21:50 AI VMs vs Traditional Cloud
00:26:28 Technical Specifications of E2B Sandboxes
00:29:43 Usage-based billing infrastructure
00:34:08 Pricing AI on Value Delivered vs Token Usage
00:36:24 Forking, Checkpoints, and Parallel Execution in Sandboxes
00:39:18 Future Plans for Toolkit and Higher-Level Agent Frameworks
00:42:35 Limitations of Chat-Based Interfaces and the Future of Agents
00:44:00 MCPs and Remote Agent Capabilities
00:49:22 LLMs.txt, scrapers, and bad AI bots
00:53:00 Manus and Computer Use on E2B
00:55:03 E2B for RL with Hugging Face
00:56:58 E2B for Agent Evaluation on LMArena
00:58:12 Long-Term Vision: E2B as Full Lifecycle Infrastructure for LLMs
01:00:45 Future Plans for Hosting and Deployment of LLM-Generated Apps
01:01:15 Why E2B Moved to San Francisco
01:05:49 Open Roles and Hiring Plans at E2B
Yang Tang, co-founder of Memetica, joins Sam to dive deep into the world of AI agents — what they are, how they're trained, and how they're already generating value across Web2 and Web3. From his background in institutional finance and machine learning to launching BSD and Liam, Yang walks us through building intelligent, monetizable agents and why the future of AI is vertical-specific and application-first.
Key Timestamps
[00:00:00] Introduction: Sam welcomes Yang Tang and introduces the topic of AI agents.
[00:01:00] Yang's Background: From Wall Street to machine learning to Web3.
[00:03:00] Evolution of Trading: How everything became algorithmic post-2008.
[00:05:00] Why AI Agents Now: LLMs aren't applications — agents are.
[00:06:00] Core Features: Memetica's pillars — memory, RL, and utility.
[00:08:00] Competing with Giants: Why focus beats AGI and big capital.
[00:10:00] Data Strategy: Why private data is useless without context.
[00:12:00] Use Cases: Real-world agent examples like Liam and BSD.
[00:14:00] Reinforcement Learning: How Liam evolved to boost impressions.
[00:16:00] Tokens and Agents: The rise of BSD and market cap milestones.
[00:18:00] Pricing and Ownership: Who owns the agent's IP and revenue?
[00:20:00] SME and Enterprise Use: From sports betting to social media ops.
[00:23:00] Institutional AI Demand: Why application matters more than research.
[00:25:00] Distribution Challenges: Why even strong products struggle to scale.
[00:28:00] Time vs. Decision Value: Where AI agents can win right now.
[00:30:00] Agent vs. Human: Running A/B tests with agents on social.
[00:34:00] AI Misuse: The Trump chart story and hallucination risks.
[00:36:00] Launching Tokens: What it takes to create tokenized agents.
[00:38:00] Utility vs. Distraction: The token paradox for founders.
[00:41:00] Building for SMEs: Future plans to support long-tail businesses.
[00:44:00] Hiring and Scaling: What Memetica needs to grow.
[00:46:00] Accuracy & Safeguards: How Memetica agents reach 95%+ accuracy.
[00:47:00] Final Ask: Yang is raising, hiring, and looking to onboard more creators and partners.
Connect
https://memetica.ai/
https://x.com/memeticaAI
https://www.linkedin.com/company/qstarlabs/
https://x.com/yangtang
https://www.linkedin.com/in/yangtang/
Disclaimer
Nothing mentioned in this podcast is investment advice and please do your own research. Finally, it would mean a lot if you can leave a review of this podcast on Apple Podcasts or Spotify and share this podcast with a friend. Be a guest on the podcast or contact us - https://www.web3pod.xyz/
Don't miss out! Registration for the "Mi derecho, Mi lugar" program ends April 15. Staying in CDMX for Holy Week? We have some activities for you! The UN is reducing its staff following the US withdrawal. More information in our podcast.
In this inspiring episode of The Writing Glitch, Cheri Dotterer sits down with John Munro, Head of School at the GOW School—an internationally recognized boarding and day school transforming the lives of students with language-based learning disabilities. John shares the school's rich history, rooted in the work of Dr. Samuel Orton, and details the school's signature Reconstructive Language (RL) curriculum that empowers students to master reading and writing through neuroscience-backed methods. Discover how small classes, structured literacy, a robotics program inspired by BattleBots, and deep staff-student relationships make GOW a hidden gem for students from around the world.
https://www.gow.org/
**************************************************************************
TIME STAMPS
01:00 GOW's mission to transform life trajectories for students
02:00 The meaning behind “ignite learning” at GOW
03:00 John's background and motivation for joining GOW
04:00 The school's 99-year history and founding story
06:00 From boys-only to co-ed and its current demographics
07:00 International student body and cultural representation
08:00 Supporting English language learners with dyslexia
09:00 Overview of GOW's academic structure (6-day school week)
10:00 Athletics and extracurriculars at GOW
11:00 Outdoor education and unique enrichment offerings
12:00 Day student experience mirrors that of boarders
13:00 Faculty's intensive involvement in student life
14:00 Teacher commitment and long-term retention
15:00 Academic calendar with built-in recharge breaks
17:00 Handling breaks and student housing during holidays
18:00 Personal boarding school connection and perspectives
19:00 Transition to discussion about Reconstructive Language
20:00 What is RL and how it originated at GOW
21:00 Structure of the RL deck and how it builds reading skills
23:00 Integration of RL with writing instruction
24:00 Enrollment capacity and class sizes
25:00 Robotics program and BattleBots championship success
27:00 Admitting students who are a mission fit
28:00 GOW as a college-prep school, not a therapeutic school
29:00 Summer program overview: academics + camp fun
30:00 How summer school feeds full-year enrollment
31:00 Structured literacy benefits all learners
32:00 Website and open house details
33:00 The school's four pillars: Honesty, Hard Work, Respect, Kindness
****************************************************************************
BOOKS
Handwriting Brain Body DISconnect Digital Version: https://disabilitylabs.com/courses/hwbbd
On Amazon: https://www.amazon.com/Handwriting-Br...
*****************************************************************************
SUBSCRIBE and LISTEN to the Audio version of the podcast here on YouTube or your favorite podcast app.
APPLE: https://podcasts.apple.com/us/podcast/the-writing-glitch/id1641728130?uo=4
SPOTIFY: https://open.spotify.com/show/5rU9kLxjkqJE5GbyCycrHE
AMAZON MUSIC/AUDIBLE: https://music.amazon.com/podcasts/894b3ab2-3b1c-4a97-af60-b1f2589d271f
YOUTUBE: https://www.youtube.com/@TheWritingGlitchPodcast
*****************************************************************************
FREE WEBINAR
Special Offer coming in March. Sign up TODAY! https://3MathInterventions.eventbrite.com
*************************************************************************
Other ways to connect with Cheri
LinkedIn: https://www.linkedin.com/in/cheridott...
FB: https://www.facebook.com/groups/tier1...
IG: https://www.instagram.com/cheridotterer/
X: https://twitter.com/CheriDotterer
TikTok:
Join Tom Shaughnessy as he hosts Travis Good, CEO and co-founder of Ambient, for a deep dive into the world's first useful proof-of-work blockchain powered by AI. Fresh out of stealth, Ambient reimagines the intersection of crypto and AI by creating a decentralized network where mining secures the chain through verified AI inference on a 600B+ parameter model.
In this illuminating episode of The Cognitive Revolution, host Nathan Labenz speaks with Jack Rae, principal research scientist at Google DeepMind and technical lead on Google's thinking and inference time scaling work. They explore the technical breakthroughs behind Google's Gemini 2.5 Pro model, discussing why reasoning techniques are suddenly working so effectively across the industry and whether these advances represent true breakthroughs or incremental progress. The conversation delves into critical questions about the relationship between reasoning and agency, the role of human data in shaping model behavior, and the roadmap from current capabilities to AGI, providing listeners with an insider's perspective on the trajectory of AI development. SPONSORS: Oracle Cloud Infrastructure (OCI): Oracle Cloud Infrastructure offers next-generation cloud solutions that cut costs and boost performance. With OCI, you can run AI projects and applications faster and more securely for less. New U.S. customers can save 50% on compute, 70% on storage, and 80% on networking by switching to OCI before May 31, 2024. See if you qualify at https://oracle.com/cognitive Shopify: Shopify is revolutionizing online selling with its market-leading checkout system and robust API ecosystem. Its exclusive library of cutting-edge AI apps empowers e-commerce businesses to thrive in a competitive market. Cognitive Revolution listeners can try Shopify for just $1 per month at https://shopify.com/cognitive NetSuite: Over 41,000 businesses trust NetSuite by Oracle, the #1 cloud ERP, to future-proof their operations. With a unified platform for accounting, financial management, inventory, and HR, NetSuite provides real-time insights and forecasting to help you make quick, informed decisions. Whether you're earning millions or hundreds of millions, NetSuite empowers you to tackle challenges and seize opportunities. 
Download the free CFO's guide to AI and machine learning at https://netsuite.com/cognitive
PRODUCED BY: https://aipodcast.ing
CHAPTERS:
(00:00) About the Episode
(05:09) Introduction and Welcome
(07:28) RL for Reasoning
(10:46) Research Time Management
(13:41) Convergence in Model Development
(18:31) RL on Smaller Models (Part 1)
(20:01) Sponsors: Oracle Cloud Infrastructure (OCI) | Shopify
(22:35) RL on Smaller Models (Part 2)
(23:30) Sculpting Cognitive Behaviors
(25:05) Language Switching Behavior
(28:02) Sharing Chain of Thought
(32:03) RL on Chain of Thought (Part 1)
(33:46) Sponsors: NetSuite
(35:19) RL on Chain of Thought (Part 2)
(35:26) Eliciting Human Reasoning
(39:27) Reasoning vs. Agency
(40:17) Understanding Model Reasoning
(44:29) Reasoning in Latent Space
(47:54) Interpretability Challenges
(51:36) Platonic Model Hypothesis
(56:05) Roadmap to AGI
(01:00:57) Multimodal Integration
(01:04:38) System Card Questions
(01:07:51) Long Context Capabilities
(01:13:49) Outro
Tyler Freel shares insights on hunting big game with the 338 Lapua Magnum in this podcast interview, focusing on terminal performance, favorite loads, and recoil management. Sponsor: Go to https://BigGameHuntingPodcast.com/ebook and sign up for my free e-book on the best hunting calibers to receive the entertaining and informative emails I send out about hunting, firearms, and ballistics every weekday. In this episode of The Big Game Hunting Podcast, host John McAdams sits down with Tyler Freel to explore the 338 Lapua Magnum—a cartridge renowned for its power and extended range performance. Unlike past discussions with Tyler on smaller rounds, this interview dives into the mighty 338 Lapua Magnum's big game hunting potential. Tyler recounts how he first got hooked on the 338 Lapua, drawn by its long-range ballistics and ability to anchor massive animals like moose. He details a few instances of taking moose with the 285gr Hornady ELD Match (SD: 0.356) at ranges from 200-550 yards. Tyler compares the 338 Lapua to the 338 Win Mag and 300 PRC, noting its edge in energy retention, but also in recoil. On reloading, Tyler favors slow-burning powders like Retumbo, H1000, and RL-26 to achieve ~2,700 fps of muzzle velocity with 285-grain Hornady ELD Match bullets that slip gracefully through the air, but still perform incredibly well on even the biggest moose. Tyler's takeaway? The 338 Lapua isn't for everyone, but it's a solid performer for use on really big game at extended range. Plus, it's tough to beat the cool factor that comes with this round! Please hit that "SUBSCRIBE" or "FOLLOW" button in your podcast app to receive future episodes automatically! Resources Read Tyler's article on Outdoor Life about the 338 Lapua here. Subscribe to Tyler's Tundra Talk Podcast here. Follow Tyler on Instagram @thetylerfreel.
If you're in SF: Join us for the Claude Plays Pokemon hackathon this Sunday!If you're not: Fill out the 2025 State of AI Eng survey for $250 in Amazon cards!We are SO excited to share our conversation with Dharmesh Shah, co-founder of HubSpot and creator of Agent.ai.A particularly compelling concept we discussed is the idea of "hybrid teams" - the next evolution in workplace organization where human workers collaborate with AI agents as team members. Just as we previously saw hybrid teams emerge in terms of full-time vs. contract workers, or in-office vs. remote workers, Dharmesh predicts that the next frontier will be teams composed of both human and AI members. This raises interesting questions about team dynamics, trust, and how to effectively delegate tasks between human and AI team members.The discussion of business models in AI reveals an important distinction between Work as a Service (WaaS) and Results as a Service (RaaS), something Dharmesh has written extensively about. While RaaS has gained popularity, particularly in customer support applications where outcomes are easily measurable, Dharmesh argues that this model may be over-indexed. Not all AI applications have clearly definable outcomes or consistent economic value per transaction, making WaaS more appropriate in many cases. This insight is particularly relevant for businesses considering how to monetize AI capabilities.The technical challenges of implementing effective agent systems are also explored, particularly around memory and authentication. Shah emphasizes the importance of cross-agent memory sharing and the need for more granular control over data access. He envisions a future where users can selectively share parts of their data with different agents, similar to how OAuth works but with much finer control. 
This points to significant opportunities in developing infrastructure for secure and efficient agent-to-agent communication and data sharing.
Other highlights from our conversation
* The Evolution of AI-Powered Agents – Exploring how AI agents have evolved from simple chatbots to sophisticated multi-agent systems, and the role of MCPs in enabling that.
* Hybrid Digital Teams and the Future of Work – How AI agents are becoming teammates rather than just tools, and what this means for business operations and knowledge work.
* Memory in AI Agents – The importance of persistent memory in AI systems and how shared memory across agents could enhance collaboration and efficiency.
* Business Models for AI Agents – Exploring the shift from software as a service (SaaS) to work as a service (WaaS) and results as a service (RaaS), and what this means for monetization.
* The Role of Standards Like MCP – Why MCP has been widely adopted and how it enables agent collaboration, tool use, and discovery.
* The Future of AI Code Generation and Software Engineering – How AI-assisted coding is changing the role of software engineers and what skills will matter most in the future.
* Domain Investing and Efficient Markets – Dharmesh's approach to domain investing and how inefficiencies in digital asset markets create business opportunities.
* The Philosophy of Saying No – Lessons from "Sorry, You Must Pass" and how prioritization leads to greater productivity and focus.
Timestamps
* 00:00 Introduction and Guest Welcome
* 02:29 Dharmesh Shah's Journey into AI
* 05:22 Defining AI Agents
* 06:45 The Evolution and Future of AI Agents
* 13:53 Graph Theory and Knowledge Representation
* 20:02 Engineering Practices and Overengineering
* 25:57 The Role of Junior Engineers in the AI Era
* 28:20 Multi-Agent Systems and MCP Standards
* 35:55 LinkedIn's Legal Battles and Data Scraping
* 37:32 The Future of AI and Hybrid Teams
* 39:19 Building Agent AI: A Professional Network for Agents
* 40:43 Challenges and Innovations in Agent AI
* 45:02 The Evolution of UI in AI Systems
* 01:00:25 Business Models: Work as a Service vs. Results as a Service
* 01:09:17 The Future Value of Engineers
* 01:09:51 Exploring the Role of Agents
* 01:10:28 The Importance of Memory in AI
* 01:11:02 Challenges and Opportunities in AI Memory
* 01:12:41 Selective Memory and Privacy Concerns
* 01:13:27 The Evolution of AI Tools and Platforms
* 01:18:23 Domain Names and AI Projects
* 01:32:08 Balancing Work and Personal Life
* 01:35:52 Final Thoughts and Reflections
Transcript
Alessio [00:00:04]: Hey everyone, welcome back to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Small AI.
swyx [00:00:12]: Hello, and today we're super excited to have Dharmesh Shah to join us. I guess your relevant title here is founder of Agent AI.
Dharmesh [00:00:20]: Yeah, that's true for this. Yeah, creator of Agent.ai and co-founder of HubSpot.
swyx [00:00:25]: Co-founder of HubSpot, which I followed for many years, I think 18 years now, gonna be 19 soon. And you caught, you know, people can catch up on your HubSpot story elsewhere. I should also thank Sean Puri, who I've chatted with back and forth, who's been, I guess, getting me in touch with your people. But also, I think like, just giving us a lot of context, because obviously, My First Million joined you guys, and they've been chatting with you guys a lot. So for the business side, we can talk about that, but I kind of wanted to engage your CTO, agent, engineer side of things. So how did you get agent religion?
Dharmesh [00:01:00]: Let's see. So I've been working, I'll take like a half step back, a decade or so ago, even though actually more than that. So even before HubSpot, the company I was contemplating, that I had a name for, was called Ingenisoft. And the idea behind Ingenisoft was a natural language interface to business software. Now realize this is 20 years ago, so that was a hard thing to do.
But the actual use case that I had in mind was, you know, we had data sitting in business systems like a CRM or something like that. And my kind of what I thought clever at the time. Oh, what if we used email as the kind of interface to get to business software? And the motivation for using email is that it automatically works when you're offline. So imagine I'm getting on a plane or I'm on a plane. There was no internet on planes back then. It's like, oh, I'm going through business cards from an event I went to. I can just type things into an email just to have them all in the backlog. When it reconnects, it sends those emails to a processor that basically kind of parses effectively the commands and updates the software, sends you the file, whatever it is. And there was a handful of commands. I was a little bit ahead of the times in terms of what was actually possible. And I reattempted this natural language thing with a product called ChatSpot that I did back 20...
swyx [00:02:12]: Yeah, this is your first post-ChatGPT project.
Dharmesh [00:02:14]: I saw it come out. Yeah. And so I've always been kind of fascinated by this natural language interface to software. Because, you know, as software developers, myself included, we've always said, oh, we build intuitive, easy-to-use applications. And it's not intuitive at all, right? Because what we're doing is... We're taking the mental model that's in our head of what we're trying to accomplish with said piece of software and translating that into a series of touches and swipes and clicks and things like that. And there's nothing natural or intuitive about it. And so natural language interfaces, for the first time, you know, whatever the thought is you have in your head and expressed in whatever language that you normally use to talk to yourself in your head, you can just sort of emit that and have software do something. And I thought that was kind of a breakthrough, which it has been. And it's gone.
So that's where I first started getting into the journey. I started because now it actually works, right? So once we got ChatGPT and you can take, even with a few-shot example, convert something into structured, even back in the GPT-3.5 days, it did a decent job in a few-shot example, convert something to structured text if you knew what kinds of intents you were going to have. And so that happened. And that ultimately became a HubSpot project. But then agents intrigued me because I'm like, okay, well, that's the next step here. So chat's great. Love Chat UX. But if we want to do something even more meaningful, it felt like the next kind of advancement is not this kind of, I'm chatting with some software in a kind of a synchronous back and forth model, is that software is going to do things for me in kind of a multi-step way to try and accomplish some goals. So, yeah, that's when I first got started. It's like, okay, what would that look like? Yeah. And I've been obsessed ever since, by the way.
Alessio [00:03:55]: Which goes back to your first experience with it, which is like you're offline. Yeah. And you want to do a task. You don't need to do it right now. You just want to queue it up for somebody to do it for you. Yes. As you think about agents, like, let's start at the easy question, which is like, how do you define an agent? Maybe. You mean the hardest question in the universe? Is that what you mean?
Dharmesh [00:04:12]: You said you have an irritating take. I do have an irritating take. I think, well, some number of people have been irritated, including within my own team. So I have a very broad definition for agents, which is it's AI-powered software that accomplishes a goal. Period. That's it. And what irritates people about it is like, well, that's so broad as to be completely non-useful. And I understand that. I understand the criticism.
But in my mind, if you kind of fast forward months, I guess, in AI years, the implementation of it, and we're already starting to see this, and we'll talk about this, different kinds of agents, right? So I think in addition to having a usable definition, and I like yours, by the way, and we should talk more about that, that you just came out with, the classification of agents actually is also useful, which is, is it autonomous or non-autonomous? Does it have a deterministic workflow? Does it have a non-deterministic workflow? Is it working synchronously? Is it working asynchronously? Then you have the different kind of interaction modes. Is it a chat agent, kind of like a customer support agent would be? You're having this kind of back and forth. Is it a workflow agent that just does a discrete number of steps? So there's all these different flavors of agents. So if I were to draw it in a Venn diagram, I would draw a big circle that says, this is agents, and then I have a bunch of circles, some overlapping, because they're not mutually exclusive. And so I think that's what's interesting, and we're seeing development along a bunch of different paths, right? So if you look at the first implementation of agent frameworks, you look at Baby AGI and AutoGPT, I think it was, not Autogen, that's the Microsoft one. They were way ahead of their time because they assumed this level of reasoning and execution and planning capability that just did not exist, right? So it was an interesting thought experiment, which is what it was. Even the guy that, I'm an investor in Yohei's fund that did Baby AGI. It wasn't ready, but it was a sign of what was to come. And so the question then is, when is it ready? And so lots of people talk about the state of the art when it comes to agents. I'm a pragmatist, so I think of the state of the practical.
It's like, okay, well, what can I actually build that has commercial value or solves actually some discrete problem with some baseline of repeatability or verifiability?
swyx [00:06:22]: There was a lot, and very, very interesting. I'm not irritated by it at all. Okay. As you know, I take a... There's a lot of anthropological view or linguistics view. And in linguistics, you don't want to be prescriptive. You want to be descriptive. Yeah. So you're a goals guy. That's the key word in your thing. And other people have other definitions that might involve like delegated trust or non-deterministic work, LLM in the loop, all that stuff. The other thing I was thinking about, just the comment on Baby AGI, AutoGPT. Yeah. In that piece that you just read, I was able to go through our backlog and just kind of track the winter of agents and then the summer now. Yeah. And it's... We can tell the whole story as an oral history, just following that thread. And it's really just like, I think, I tried to explain the why now, right? Like I had, there's better models, of course. There's better tool use with like, they're just more reliable. Yep. Better tools with MCP and all that stuff. And I'm sure you have opinions on that too. Business model shift, which you like a lot. I just heard you talk about RaaS with the MFM guys. Yep. Cost is dropping a lot. Yep. Inference is getting faster. There's more model diversity. Yep. Yep. I think it's a subtle point. It means that like, you have different models with different perspectives. You don't get stuck in the basin of performance of a single model. Sure. You can just get out of it by just switching models. Yep. Multi-agent research and RL fine tuning. So I just wanted to let you respond to like any of that.
Dharmesh [00:07:44]: Yeah. A couple of things. Connecting the dots on the kind of the definition side of it. So we'll get the irritation out of the way completely. I have one more, even more irritating leap on the agent definition thing.
So here's the way I think about it. By the way, the kind of word agent, I looked it up, like the English dictionary definition. The old school agent, yeah, is when you have someone or something that does something on your behalf, like a travel agent or a real estate agent acts on your behalf. It's like proxy, which is a nice kind of general definition. So the other direction I'm sort of headed, and it's going to tie back to tool calling and MCP and things like that, is if you, and I'm not a biologist by any stretch of the imagination, but we have these single-celled organisms, right? Like the simplest possible form of what one would call life. But it's still life. It just happens to be single-celled. And then you can combine cells and then cells become specialized over time. And you have much more sophisticated organisms, you know, kind of further down the spectrum. In my mind, at the most fundamental level, you can almost think of having atomic agents. What is the simplest possible thing that's an agent that can still be called an agent? What is the equivalent of a kind of single-celled organism? And the reason I think that's useful is right now we're headed down the road, which I think is very exciting around tool use, right? That says, okay, the LLMs now can be provided a set of tools that it calls to accomplish whatever it needs to accomplish in the kind of furtherance of whatever goal it's trying to get done. And I'm not overly bothered by it, but if you think about it, if you just squint a little bit and say, well, what if everything was an agent? And what if tools were actually just atomic agents? Because then it's turtles all the way down, right? Then it's like, oh, well, all that's really happening with tool use is that we have a network of agents that know about each other through something like MCP and can kind of decompose a particular problem and say, oh, I'm going to delegate this to this set of agents.
And why do we need to draw this distinction between tools, which are functions most of the time? And an actual agent. And so I'm going to write this irritating LinkedIn post, you know, proposing this. It's like, okay. And I'm not suggesting we should call even functions, you know, call them agents. But there is a certain amount of elegance that happens when you say, oh, we can just reduce it down to one primitive, which is an agent that you can combine in complicated ways to kind of raise the level of abstraction and accomplish higher order goals. Anyway, that's my answer. I'd say that's a success. Thank you for coming to my TED Talk on agent definitions.
Alessio [00:09:54]: How do you define the minimum viable agent? Do you already have a definition for, like, where you draw the line between a cell and an atom? Yeah.
Dharmesh [00:10:02]: So in my mind, it has to, at some level, use AI in order for it to—otherwise, it's just software. It's like, you know, we don't need another word for that. And so that's probably where I draw the line. So then the question, you know, the counterargument would be, well, if that's true, then lots of tools themselves are actually not agents because they're just doing a database call or a REST API call or whatever it is they're doing. And that does not necessarily qualify them, which is a fair counterargument. And I accept that. It's like a good argument. I still like to think about—because we'll talk about multi-agent systems, because I think—so we've accepted, which I think is true, lots of people have said it, and you've hopefully combined some of those clips of really smart people saying this is the year of agents, and I completely agree, it is the year of agents. But then shortly after that, it's going to be the year of multi-agent systems or multi-agent networks. I think that's where it's going to be headed next year. Yeah.
swyx [00:10:54]: OpenAI's already on that. Yeah. My quick philosophical engagement with you on this.
I often think about kind of the other end of the cell spectrum. So single-cell is life, multi-cell is life, and you clump a bunch of cells together in a more complex organism, they become organs, like an eye and a liver or whatever. And then obviously we consider ourselves one life form. There's not like a lot of lives within me. I'm just one life. And now, obviously, I know people don't really like to anthropomorphize agents and AI. Yeah. But we are extending our consciousness and our brain and our functionality out into machines. I just saw you wearing a Bee. Yeah. Which is, you know, it's nice. I have a Limitless pendant in my pocket.Dharmesh [00:11:37]: I got one of these boys. Yeah.swyx [00:11:39]: I'm testing it all out. You know, got to be early adopters. But like, we want to extend our personal memory into these things so that we can be good at the things that we're good at. And, you know, machines are good at it. Machines are there. So like, my definition of life is kind of going outside of my own body now. I don't know if you've ever had reflections on that, like how our self is actually being distributed outside of us. Yeah.Dharmesh [00:12:01]: I don't fancy myself a philosopher. But you went there. So yeah, I did go there. I'm fascinated by kind of graphs and graph theory and networks and have been for a long, long time. And to me, we're sort of all nodes in this kind of larger thing. It just so happens that we're looking at individual kind of life forms as they exist right now. But so the idea is when you put a podcast out there, there's these little kind of nodes you're putting out there of, like, you know, conceptual ideas. Once again, you have varying kind of forms of those little nodes that are up there and are connected in varying and sundry ways. And so I just think of myself as being a node in a massive, massive network. And I'm producing more nodes as I put out content or ideas.
And, you know, you spend some portion of your life collecting dots, experiences, people, and some portion of your life then connecting dots from the ones that you've collected over time. And I found that really interesting things happen, and you really can't know in advance how those dots are necessarily going to connect in the future. And that's, yeah. So that's my philosophical take. That's the, yes, exactly. Coming back.Alessio [00:13:04]: Yep. Do you like graph as an agent abstraction? That's been one of the hot topics with LangGraph and Pydantic and all that.Dharmesh [00:13:11]: I do. The thing I'm more interested in, in terms of use of graphs, and there's lots of work happening on that now, is graph data stores as an alternative in terms of knowledge stores and knowledge graphs. Yeah. Because, you know, I've been in software now 30-plus years, right? So it's not 10,000 hours. It's like 100,000 hours that I've spent doing this stuff. And so I grew up with, so back in the day, you know, I started on mainframes. There was a product called IMS from IBM, which is basically an indexed database, what we'd call a key-value store today. Then we've had relational databases, right? We have tables and columns and foreign key relationships. We all know that. We have document databases like MongoDB, which is sort of a nested structure keyed by a specific index. We have vector stores, vector embedding databases. And graphs are interesting for a couple of reasons. One is, it's not classically structured in a relational way. When you say structured database, most people are thinking tables and columns and relational databases and set theory and all that. Graphs still have structure, but it's not the tables-and-columns structure. And you could wonder, and people have made this case, that they are a better representation of knowledge for LLMs and for AI generally than other things.
So that's kind of thing number one conceptually, and that might be true, I think is possibly true. And the other thing that I really like about that in the context of, you know, I've been in the context of data stores for RAG is, you know, RAG, you say, oh, I have a million documents, I'm going to build the vector embeddings, I'm going to come back with the top X based on the semantic match, and that's fine. All that's very, very useful. But the reality is something gets lost in the chunking process and the, okay, well, those tend, you know, like, you don't really get the whole picture, so to speak, and maybe not even the right set of dimensions on the kind of broader picture. And it makes intuitive sense to me that if we did capture it properly in a graph form, that maybe that feeding into a RAG pipeline will actually yield better results for some use cases, I don't know, but yeah.Alessio [00:15:03]: And do you feel like at the core of it, there's this difference between imperative and declarative programs? Because if you think about HubSpot, it's like, you know, people and graph kind of goes hand in hand, you know, but I think maybe the software before was more like primary foreign key based relationship, versus now the models can traverse through the graph more easily.Dharmesh [00:15:22]: Yes. So I like that representation. There's something. It's just conceptually elegant about graphs and just from the representation of it, they're much more discoverable, you can kind of see it, there's observability to it, versus kind of embeddings, which you can't really do much with as a human. You know, once they're in there, you can't pull stuff back out. But yeah, I like that kind of idea of it. And the other thing that's kind of, because I love graphs, I've been long obsessed with PageRank from back in the early days. And, you know, one of the kind of simplest algorithms in terms of coming up, you know, with a phone, everyone's been exposed to PageRank. 
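The graph-in-the-RAG-pipeline idea he's gesturing at can be sketched in a few lines. This is a toy illustration, not a production GraphRAG system: the graph contents are invented, and an exact key lookup stands in for semantic search. The point is that retrieval returns a matched node plus its neighborhood, so related context survives instead of an isolated chunk.

```python
# Toy graph-augmented retrieval: instead of returning an isolated chunk,
# return the matched node plus its neighbors, so the 'broader picture' survives.
graph = {
    "HubSpot": {"text": "HubSpot is a CRM platform.",
                "edges": ["Dharmesh", "inbound marketing"]},
    "Dharmesh": {"text": "Dharmesh co-founded HubSpot.",
                 "edges": ["HubSpot", "agent.ai"]},
    "inbound marketing": {"text": "Inbound marketing attracts customers with content.",
                          "edges": ["HubSpot"]},
    "agent.ai": {"text": "agent.ai is a network for AI agents.",
                 "edges": ["Dharmesh"]},
}

def retrieve_with_neighborhood(query: str, hops: int = 1) -> list:
    """Collect the matched node and everything within `hops` edges of it."""
    frontier = {query}  # stand-in for a semantic match: exact key lookup
    seen = set()
    for _ in range(hops + 1):
        seen |= frontier
        frontier = {n for node in frontier for n in graph[node]["edges"]} - seen
    return [graph[n]["text"] for n in sorted(seen)]

context = retrieve_with_neighborhood("Dharmesh")
```

A flat chunk retriever would return only the "Dharmesh" sentence; the one-hop neighborhood also pulls in the HubSpot and agent.ai facts it connects to.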
And the idea is that, and so I had this other idea for a project, not a company, and I have hundreds of these, called NodeRank, is to be able to take the idea of PageRank and apply it to an arbitrary graph that says, okay, I'm going to define what authority looks like and say, okay, well, that's interesting to me. Because then if you say, I'm going to take my knowledge store, and maybe this person that contributed some number of chunks to the graph data store has more authority on this particular use case or prompt that's being submitted than this other one, or maybe this one was more popular, or maybe this one has, whatever it is, there should be a way for us to kind of rank nodes in a graph and sort them in some useful way. Yeah.swyx [00:16:34]: So I think that's generally useful for anything. I think the problem, like, so even though at my conferences, GraphRAG is super popular and people are getting knowledge graph religion, and I will say it's getting traction in two areas: conversation memory, and then also just RAG in general, like the document data as a source. Most ML practitioners would say that knowledge graph is kind of a dirty word. The graph database, people get graph religion, everything's a graph, and then they go really hard into it and then they get a graph that is too complex to navigate. Yes. And so the simple way to put it is, you, running HubSpot, know the power of graphs, the way that Google has pitched them for many years, but I don't suspect that HubSpot itself uses a knowledge graph. No. Yeah.Dharmesh [00:17:26]: So when is it over-engineering, basically? It's a great question. I don't know. So the question now, like, in AI land, right, is, do we necessarily need to understand? So right now, LLMs, for the most part, are somewhat black boxes, right?
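NodeRank is the speaker's hypothetical side project, not a shipping library, but what he describes is essentially personalized PageRank: power iteration over an arbitrary adjacency structure, with a teleport vector encoding whatever "authority" means for the use case. A minimal sketch with made-up data:

```python
def node_rank(edges, authority, damping=0.85, iters=50):
    """Personalized PageRank over an adjacency dict {node: [out-neighbors]}.

    `authority` biases the teleport step, so a domain-specific notion of
    authority (e.g. 'this contributor is trusted for this topic') shapes
    the resulting ranking.
    """
    nodes = list(edges)
    total = sum(authority.get(n, 0.0) for n in nodes)
    teleport = {n: authority.get(n, 0.0) / total for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            out = edges[n]
            if not out:  # dangling node: redistribute by the teleport vector
                for m in nodes:
                    new[m] += damping * rank[n] * teleport[m]
            else:
                for m in out:
                    new[m] += damping * rank[n] / len(out)
        rank = new
    return rank

# 'a' is linked to by 'c' and 'd'; 'd' has no in-links, so it should rank lower.
edges = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = node_rank(edges, authority={"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0})
```

Swapping the uniform `authority` dict for, say, contributor trust scores is exactly the "define what authority looks like" knob he mentions.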
We sort of understand how the, you know, the algorithm itself works, but we really don't know what's going on in there and how things come out. So if a graph data store is able to produce the outcomes we want, it's like, here's a set of queries I want to be able to submit, and then it comes out with useful content. Maybe the underlying data store is as opaque as vector embeddings or something like that, but maybe it's fine. Maybe we don't necessarily need to understand it to get utility out of it. And so maybe if it's messy, that's okay. Um, that's, it's just another form of lossy compression. Uh, it's just lossy in a way that we just don't completely understand, because it's going to grow organically. Uh, and it's not structured. It's like, ah, we're just gonna throw a bunch of stuff in there. Let the equivalent of the embedding algorithm, whatever they call it in graph land, sort it out. Um, so the one with the best results wins. I think so. Yeah.swyx [00:18:26]: Or is this the practical side of me is like, yeah, if it's useful, we don't necessarilyDharmesh [00:18:30]: need to understand it.swyx [00:18:30]: I have, I mean, I'm happy to push back as long as you want. Uh, it's not practical to evaluate, like, the 10 different options out there because it takes time. It takes people, it takes, you know, resources, right? So that's the first thing. Second thing is your evals are typically on small things, and some things only work at scale. Yup. Like graphs. Yup.Dharmesh [00:18:46]: Yup. That's, yeah, no, that's fair. And I think this is one of the challenges in terms of implementation of graph databases, is that the most common approach that I've seen developers do, I've done it myself, is that, oh, I've got a Postgres database or a MySQL or whatever. I can represent a graph with a set of tables with a parent-child thing or whatever. And that sort of gives me the ability, uh, why would I need anything more than that?
And the answer is, well, if you don't need anything more than that, you don't need anything more than that. But there's a high chance that you're sort of missing out on the actual value that, uh, the graph representation gives you. Which is the ability to traverse the graph, uh, efficiently in ways that kind of going through the, uh, traversal in a relational database form, even though structurally you have the data, practically you're not gonna be able to pull it out in, in useful ways. Uh, so you wouldn't like represent a social graph, uh, in, in using that kind of relational table model. It just wouldn't scale. It wouldn't work.swyx [00:19:36]: Uh, yeah. Uh, I think we want to move on to MCP. Yeah. But I just want to, like, just engineering advice. Yeah. Uh, obviously you've, you've, you've run, uh, you've, you've had to do a lot of projects and run a lot of teams. Do you have a general rule for over-engineering or, you know, engineering ahead of time? You know, like, because people, we know premature engineering is the root of all evil. Yep. But also sometimes you just have to. Yep. When do you do it? Yes.Dharmesh [00:19:59]: It's a great question. This is, uh, a question as old as time almost, which is what's the right and wrong levels of abstraction. That's effectively what, uh, we're answering when we're trying to do engineering. I tend to be a pragmatist, right? So here's the thing. Um, lots of times doing something the right way. Yeah. It's like a marginal increased cost in those cases. Just do it the right way. And this is what makes a, uh, a great engineer or a good engineer better than, uh, a not so great one. It's like, okay, all things being equal. If it's going to take you, you know, roughly close to constant time anyway, might as well do it the right way. Like, so do things well, then the question is, okay, well, am I building a framework as the reusable library? 
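His point about relational tables versus native traversal can be made concrete. With an adjacency structure, a k-hop query is one loop; with a parent-child table, each extra hop means another self-join, so the query shape has to change with depth. A sketch over an invented social graph:

```python
from collections import deque

# Invented follow graph. In a parent-child SQL table, 'who is within 2 hops
# of ann?' needs a two-way self-join (or a recursive query); here it's one BFS.
follows = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan", "eve"],
    "dan": [],
    "eve": ["ann"],
}

def within_hops(start: str, k: int) -> set:
    """Everyone reachable from `start` in at most k hops (breadth-first search)."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == k:
            continue  # don't expand past the hop budget
        for nxt in follows[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen - {start}
```

The same function answers 1-hop, 2-hop, or 10-hop questions unchanged; that depth-independence is the traversal value a relational encoding loses in practice.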
To what degree, uh, what am I anticipating in terms of what's going to need to change in this thing? Uh, you know, along what dimension? And then I think like a business person in some ways, like, what's the return on calories, right? So, uh, you look at, um, the expected value. It's like, okay, here are the five possible things that could happen, and you try to assign probabilities. Like, okay, well, if there's a 50% chance that we're going to go down this particular path some day, or one of these five things is going to happen, and it costs you 10% more to engineer for that, it's basically something that yields a kind of compounding value, um, as you get closer to the time of needing that, versus having to take on debt, which is when you under-engineer it. You're taking on debt you're going to have to pay off when you do get to that eventuality where something happens. One thing, as a pragmatist, uh, is I would rather under-engineer something than over-engineer it, if I were going to err on the side of something. And here's the reason: when you under-engineer it, uh, yes, you take on tech debt, uh, but the interest rate is relatively known and payoff is very, very possible, right? Which is, oh, I took a shortcut here, as a result of which now this thing that should have taken me a week is now going to take me four weeks. Fine. But if that particular thing that you thought might happen never actually transpires, if you never have that use case, it's like, well, you just saved yourself time, right? And that has value, because you were able to do other things instead of, uh, kind of slightly over-engineering it. But there's no perfect answer in this art form. Uh, and yeah, we'll bring kind of this layers-of-abstraction thing back in the code generation conversation, which, uh, I think I have later on, butAlessio [00:22:05]: I was going to ask, we can just jump ahead quickly.
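The "return on calories" calculus he's describing can be made concrete with a toy expected-value comparison. All numbers are invented for illustration: a 10% premium to generalize now, versus three extra weeks of rework if the future need materializes and you took the shortcut.

```python
def expected_cost(base_cost, pre_engineer_premium, rework_cost, p_need):
    """Compare engineering-ahead vs taking on tech debt, in expectation.

    over-engineer now: always pay the premium, whether or not the need arrives.
    under-engineer:    pay the rework (debt plus interest) only if it arrives.
    """
    over = base_cost * (1 + pre_engineer_premium)
    under = base_cost + p_need * rework_cost
    return over, under

# 1 week of base work; +10% to generalize now; 3 extra weeks of rework if
# the anticipated need ever shows up, which we guess happens 2% of the time.
over, under = expected_cost(base_cost=1.0, pre_engineer_premium=0.10,
                            rework_cost=3.0, p_need=0.02)
```

With these made-up numbers the shortcut wins in expectation (1.06 weeks vs 1.10), which is his bias toward under-engineering; crank `p_need` up and the comparison flips, which is his "roughly constant cost, just do it right" case.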
Yeah. Like, as you think about vibe coding and all that, how does the percentage of potential usefulness change? When I feel like we're over-engineering, a lot of times it's like the investment is in the syntax, it's less about the investment in, like, architecting. Yep. Yeah. How does that change your calculus?Dharmesh [00:22:22]: A couple of things, right? One is, um, so, you know, going back to that kind of ROI, or return-on-calories kind of calculus or heuristic you think through, it's like, okay, well, what is it going to cost me to put this layer of abstraction above the code that I'm writing now, uh, in anticipating kind of future needs? If the cost of fixing, uh, or of under-engineering right now, uh, will trend towards zero, that says, okay, well, I don't have to get it right right now, because even if I get it wrong, I'll run the thing for six hours instead of 60 minutes or whatever. It doesn't really matter, right? Like, because that's going to trend towards zero, the ability to refactor code. Um, and because, not that long from now, we're going to have, you know, large code bases be able to exist, uh, you know, as context, uh, for a code generation or a code refactoring, uh, model. So I think it's going to make the case for under-engineering, uh, even stronger. Which is why I take on that cost. You just pay the interest when you get there. Um, just go on with your life, vibe code it, and, uh, come back when you need to. Yeah.Alessio [00:23:18]: Sometimes I feel like there's no decision-making in some things. Like, uh, today I built an autosave for our internal notes platform, and I literally just asked Cursor, can you add autosave? Yeah. I don't know if it's over- or under-engineered. Yep. I just vibe coded it. Yep.
And I feel like at some point we're going to get to the point where the models kind of decide where the right line is.Dharmesh [00:23:36]: But this is where, like, in my mind, the danger is, right? So there's two sides to this. One is the cost of kind of development and coding and things like that, stuff that, you know, we talk about. But then, like, in your example, you know, one of the risks that we have is that because adding a feature, uh, like a save or whatever the feature might be, to a product, as that price tends towards zero, are we going to be less discriminant about what features we add, as a result making products more complicated, which has a negative impact on the user and a negative impact on the business? Um, and so that's the thing I worry about. If it starts to become too easy, are we going to be too promiscuous in our, uh, kind of adding product extensions and things like that? It's like, ah, why not add X, Y, Z or whatever? Back then it was like, oh, we only have so many engineering hours or story points or however you measure things. Uh, that at least kept us in check a little bit. Yeah.Alessio [00:24:22]: And then over-engineering, you're like, yeah, it's kind of like you're putting that on yourself. Yeah. Like, now it's like the models don't understand that if they add too much complexity, it's going to come back to bite them later. Yep. So they just do whatever they want to do. Yeah. And I'm curious where in the workflow that's going to be, where it's like, hey, this is the amount of complexity and over-engineering you can do before you've got to ask me if we should actually do it, versus, like, do something else.Dharmesh [00:24:45]: So, you know, we've already, like, we're living this, uh, in the code generation world, this kind of compressed, um, cycle time. Right.
It's like, okay, we went from auto-complete, uh, in the GitHub Copilot, to like, oh, finish this particular thing and hit tab, to a, oh, I sort of know your file or whatever, I can write out a full function for you, to now, I can, like, hold a bunch of the context in my head. Uh, so we can do app generation, which we have now with Lovable and Bolt and Replit Agent. Yeah. And other things. So then the question is, okay, well, where does it naturally go from here? So we're going to generate products. Makes sense. We might be able to generate platforms, as though, I want a platform for ERP that does this, whatever. And that includes the APIs, includes the product and the UI, and all the things that make for a platform. There's nothing that says we would stop. Like, okay, can you generate an entire software company someday? Right. Uh, with the platform and the monetization and the go-to-market and the whatever. And you know, that's interesting to me in terms of, uh, you know, what, when you take it to almost ludicrous levels of abstraction.swyx [00:25:39]: It's like, okay, turn it to 11. You mentioned vibe coding, so I have to, this is a blog post I haven't written, but I'm kind of exploring it. Is the junior engineer dead?Dharmesh [00:25:49]: I don't think so. I think what will happen is that the junior engineer will be able to, if all they're bringing to the table is the fact that they are a junior engineer, then yes, they're likely dead. But hopefully, if they can communicate with carbon-based life forms, they can interact with product, if they're willing to talk to customers, they can take their kind of basic understanding of engineering and how kind of software works. I think that has value. So I have a 14-year-old right now who's taking a Python programming class, and some people ask me, it's like, why is he learning coding? And my answer is: because it's not about the syntax, it's not about the coding.
What he's learning is the fundamental thing of, like, how things work. And there's value in that. I think there's going to be timeless value in systems thinking and abstractions and what that means. And whether it's functions manifested as math, which he's going to get exposed to regardless, or some other core primitives of the universe, I think, the more you understand them, those are what I would kind of think of as really large dots in your life that will have a higher gravitational pull and value to them that you'll then be able to build on. So I want him to collect those dots, and he's not resisting. So it's like, okay, while he's still listening to me, I'm going to have him do things that I think will be useful.swyx [00:26:59]: You know, part of one of the pitches that I evaluated for AI engineer as a term is that maybe the traditional interview path or career path of software engineer goes away, which is, because, what's the point of LeetCode? Yeah. And, you know, it actually matters more that you know how to work with AI and to implement the things that you want. Yep.Dharmesh [00:27:16]: That's one of the interesting things that's happened with generative AI. You know, you go from machine learning and the models and just that underlying form, which is like true engineering, right? Like the actual, what I call real engineering. I don't think of myself as a real engineer, actually. I'm a developer. But now, with generative AI, we call it AI, and it's obviously got its roots in machine learning, but it just feels fundamentally different to me. Like, you have the vibe. It's like, okay, well, this is just a whole different approach to software development, to so many different things.
And so I'm wondering now, it's like, an AI engineer, if you were to draw the Venn diagram, it's interesting, because it's the cross between, like, AI things, generative AI and what the tools are capable of, what the models do, and this whole new kind of body of knowledge that we're still building out, it's still very young, intersected with kind of classic engineering, software engineering. Yeah.swyx [00:28:04]: I just described the overlap as, it separates out eventually until it's its own thing, but it's starting out as software. Yeah.Alessio [00:28:11]: That makes sense. So to close the vibe coding loop, the other big hype now is MCPs. Obviously, I would say Claude Desktop and Cursor are like the two main drivers of MCP usage. I would say my favorite is the Sentry MCP. I can pull in errors and then you can just put the context in Cursor. How do you think about that abstraction layer? Does it feel... Does it feel almost too magical in a way? Do you think you get enough? Because you don't really see how the server itself is then kind of repackaging the information for you.Dharmesh [00:28:41]: I think MCP as a standard is one of the better things that's happened in the world of AI, because a standard needed to exist, and absent a standard, there was a set of things that just weren't possible. Now, we can argue whether it's the best possible manifestation of a standard or not. Does it do too much? Does it do too little? I get that, but it's just simple enough to both be useful and unobtrusive. It's understandable and adoptable by mere mortals, right? It's not overly complicated. You know, a reasonable engineer can stand up an MCP server relatively easily. The thing that has me excited about it is, like, so I'm a big believer in multi-agent systems. And so that's going back to our kind of this idea of an atomic agent.
So imagine the MCP server, like, obviously it calls tools, but the way I think about it, so I'm working on my current passion project, which is agent.ai. And we'll talk more about that in a little bit. I think we should, because, not to promote the project at all, but there's some interesting ideas in there. One of which is around, we're going to need a mechanism for, if agents are going to collaborate and be able to delegate, there's going to need to be some form of discovery, and we're going to need some standard way. It's like, okay, well, I just need to know what this thing over here is capable of. We're going to need a registry, which Anthropic's working on. I'm sure others will, and have been doing, directories, and there's going to be a standard around that too. How do you build out a directory of MCP servers? I think that's going to unlock so many things, just because, and we're already starting to see it. So I think MCP, or something like it, is going to be the next major unlock, because it allows systems that don't know about each other, don't need to, it's that kind of decoupling of, like, Sentry and whatever tools someone else was building. And it's not just about, you know, Claude Desktop or things like that. Even on the client side, I think we're going to see very interesting consumers of MCP, MCP clients, versus just the chatbot-y kind of things, like, you know, Claude Desktop and Cursor and things like that. But yeah, I'm very excited about MCP in that general direction.swyx [00:30:39]: I think the typical cynical developer take is like, we have OpenAPI. Yeah. What's the new thing? I don't know if you have a, do you have a quick MCP versus everything else? Yeah.Dharmesh [00:30:49]: So I like OpenAPI, right? So just a descriptive thing. It's OpenAPI. OpenAPI. Yes, that's what I meant. So it's basically a self-documenting thing. We can do machine-generated, lots of things from that output.
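The discovery-and-registry idea he raises can be sketched schematically. This is not the MCP SDK or agent.ai's actual API; the class names and manifest fields are invented to show the shape: agents self-disclose what they can do, a directory indexes them, and a caller finds who can handle a task without knowing about them in advance.

```python
from dataclasses import dataclass

@dataclass
class Manifest:
    """What an agent self-discloses: who it is and what it's capable of."""
    name: str
    capabilities: frozenset

class Registry:
    """A directory of agents, searchable by capability, so systems that
    don't know about each other can still find and delegate to each other."""
    def __init__(self):
        self._agents = []

    def register(self, manifest: Manifest) -> None:
        self._agents.append(manifest)

    def discover(self, capability: str) -> list:
        return [m.name for m in self._agents if capability in m.capabilities]

# Hypothetical entries: a Sentry-style agent and a CRM-style agent.
registry = Registry()
registry.register(Manifest("sentry-agent", frozenset({"fetch_errors"})))
registry.register(Manifest("crm-agent", frozenset({"lookup_contact", "log_call"})))
```

The decoupling he describes lives in `discover`: the caller asks for a capability, not a specific server, which is what a standardized directory of MCP servers would enable.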
It's a structured definition of an API. I get that, love it. But MCPs are sort of use-case specific. They're perfect for exactly what we're trying to use them for around LLMs, in terms of discovery. It's like, okay, I don't necessarily need to know all this detail. And so right now we have, we'll talk more about, like, MCP server implementations, but We will? I think, I don't know. Maybe we won't. At least it's in my head. It's like a back processor. But I do think MCP adds value above OpenAPI. It's, yeah, just because it solves this particular thing. And if we had come to the world, which we have, it's like, hey, we already have OpenAPI. It's like, if that were good enough for the universe, the universe would have adopted it already. There's a reason why MCP is taking off: because it marginally adds something that was missing before and doesn't go too far. And so that's why the kind of rate of adoption, you folks have written about this and talked about it. Yeah, why MCP won. Yeah. And it won because the universe decided that this was useful, and maybe it gets supplanted by something else. Yeah. And maybe we discover, oh, maybe OpenAPI was good enough the whole time. I doubt that.swyx [00:32:09]: The meta lesson, this is, I mean, he's an investor in DevTools companies. I work in developer experience, in DevRel, in DevTools companies. Yep. Everyone wants to own the standard. Yeah. I'm sure you guys have tried to launch your own standards. Actually, is HubSpot known for a standard? You know, obviously inbound marketing. But is there a standard or protocol that you ever tried to push? No.Dharmesh [00:32:30]: And there's a reason for this. Yeah. Is that? And I don't mean to, I don't need to, speak for the people of HubSpot, but I personally. You kind of do. I'm not smart enough. That's not the, like, I think I have a. You're smart. Not enough for that. I'm much better off understanding the standards that are out there.
And I'm more on the composability side. Let's, like, take the pieces of technology that exist out there, combine them in creative, unique ways. And I like to consume standards. And it's not that I don't like to create them. I just don't think I have either the raw wattage or the credibility. It's like, okay, well, who the heck is Dharmesh, and why should we adopt a standard he created?swyx [00:33:07]: Yeah, I mean, there are people who don't monetize standards, like OpenTelemetry is a big standard, and LightStep never capitalized on that.Dharmesh [00:33:15]: So, okay, so if I were to do a standard, there's two things that have been in my head in the past. One was around, a very, very basic one, around, I don't even have the domain, and I have a domain for everything, for open marketing. Because of the issue we had: HubSpot grew up in the marketing space. There we go. There was no standard around data formats and things like that. It doesn't go anywhere. But the other one, and I did not mean to go here, but I'm going to go here, is called OpenGraph. I know the term was already taken, but it hasn't been used for like 15 years now for its original purpose. But what I think should exist in the world is, right now, our information, all of us nodes, is in the social graph at Meta or the professional graph at LinkedIn. Both of which are actually relatively closed in actually very annoying ways. Like, very, very closed, right? Especially LinkedIn. Especially LinkedIn. I personally believe that if it's my data, and if I would get utility out of it being open, I should be able to make my data open or publish it in whatever forms that I choose, as long as I have control over it, as opt-in. So the idea around OpenGraph is that it says, here's a standard, here's a way to publish it. I should be able to go to opengraph.org/dharmesh.json and get it back. And it's like, here's your stuff, right?
And I can choose along the way, and people can write to it, and I can approve. And there can be an entire system. And if I were to do that, I would do it as a... like a public-benefit, non-profit-y kind of thing, as this is a contribution to society. I wouldn't try to commercialize that. Have you looked at atproto? What's that? Atproto.swyx [00:34:43]: It's the protocol behind Bluesky. Okay. My good friend, Dan Abramov, who was the face of React for many, many years, now works there. And he actually did a talk that I can send you, which basically kind of tries to articulate what you just said. But he does, he loves doing these really great analogies, which I think you'll like. Like, you know, a lot of our data is behind a handle, behind a domain. Yep. So he's like, all right, what if we flip that? What if it was like our handle and then the domain? Yep. So, and that's really like your data should belong to you. Yep. And I should not have to wait 30 days for my Twitter data to export. Yep.Dharmesh [00:35:19]: You should at least be able to automate it, or, like, yes, I should be able to plug it into an agentic thing. Yeah. Yes. I think we're... Because so much of our data is... Locked up. I think the trick here isn't the standard. It is getting the normies to care.swyx [00:35:37]: Yeah. Because normies don't care.Dharmesh [00:35:38]: That's true. But building on that, normies don't care. So, you know, privacy is a really hot topic and an easy word to use, but it's not a binary thing. Like, there are use cases where, and we make these choices all the time, that I will trade, not all privacy, but I will trade some privacy for some productivity gain or some benefit to me. That says, oh, I don't care about that particular data being online if it gives me this in return, or I don't mind sharing this information with this company.Alessio [00:36:02]: If I'm getting, you know, this in return. But that sort of should be my option.
I think now with computer use, you can actually automate some of the exports. Yes. Like something we've been doing internally is like everybody exports their LinkedIn connections. Yep. And then internally, we kind of merge them together to see how we can connect our companies to customers or things like that.Dharmesh [00:36:21]: And not to pick on LinkedIn, but since we're talking about it, but they feel strongly enough on the, you know, do not take LinkedIn data that they will block even browser use kind of things or whatever. They go to great, great lengths, even to see patterns of usage. And it says, oh, there's no way you could have, you know, gotten that particular thing or whatever without, and it's, so it's, there's...swyx [00:36:42]: Wasn't there a Supreme Court case that they lost? Yeah.Dharmesh [00:36:45]: So the one they lost was around someone that was scraping public data that was on the public internet. And that particular company had not signed any terms of service or whatever. It's like, oh, I'm just taking data that's on, there was no, and so that's why they won. But now, you know, the question is around, can LinkedIn... I think they can. Like, when you use, as a user, you use LinkedIn, you are signing up for their terms of service. And if they say, well, this kind of use of your LinkedIn account that violates our terms of service, they can shut your account down, right? They can. And they, yeah, so, you know, we don't need to make this a discussion. By the way, I love the company, don't get me wrong. I'm an avid user of the product. You know, I've got... Yeah, I mean, you've got over a million followers on LinkedIn, I think. Yeah, I do. And I've known people there for a long, long time, right? And I have lots of respect. And I understand even where the mindset originally came from of this kind of members-first approach to, you know, a privacy-first. I sort of get that. 
But sometimes you sort of have to wonder, it's like, okay, well, that was 15, 20 years ago. There's likely some controlled ways to expose some data on some member's behalf and not just completely be a binary. It's like, no, thou shalt not have the data.swyx [00:37:54]: Well, just pay for sales navigator.Alessio [00:37:57]: Before we move to the next layer of instruction, anything else on MCP you mentioned? Let's move back and then I'll tie it back to MCPs.Dharmesh [00:38:05]: So I think the... Open this with agent. Okay, so I'll start with... Here's my kind of running thesis, is that as AI and agents evolve, which they're doing very, very quickly, we're going to look at them more and more. I don't like to anthropomorphize. We'll talk about why this is not that. Less as just like raw tools and more like teammates. They'll still be software. They should self-disclose as being software. I'm totally cool with that. But I think what's going to happen is that in the same way you might collaborate with a team member on Slack or Teams or whatever you use, you can imagine a series of agents that do specific things just like a team member might do, that you can delegate things to. You can collaborate. You can say, hey, can you take a look at this? Can you proofread that? Can you try this? You can... Whatever it happens to be. So I think it is... I will go so far as to say it's inevitable that we're going to have hybrid teams someday. And what I mean by hybrid teams... So back in the day, hybrid teams were, oh, well, you have some full-time employees and some contractors. Then it was like hybrid teams are some people that are in the office and some that are remote. That's the kind of form of hybrid. The next form of hybrid is like the carbon-based life forms and agents and AI and some form of software. So let's say we temporarily stipulate that I'm right about that over some time horizon that eventually we're going to have these kind of digitally hybrid teams. 
So if that's true, then the question you sort of ask yourself is that then what needs to exist in order for us to get the full value of that new model? It's like, okay, well... You sort of need to... It's like, okay, well, how do I... If I'm building a digital team, like, how do I... Just in the same way, if I'm interviewing for an engineer or a designer or a PM, whatever, it's like, well, that's why we have professional networks, right? It's like, oh, they have a presence on likely LinkedIn. I can go through that semi-structured, structured form, and I can see the experience of whatever, you know, self-disclosed. But, okay, well, agents are going to need that someday. And so I'm like, okay, well, this seems like a thread that's worth pulling on. That says, okay. So I... So agent.ai is out there. And it's LinkedIn for agents. It's LinkedIn for agents. It's a professional network for agents. And the more I pull on that thread, it's like, okay, well, if that's true, like, what happens, right? It's like, oh, well, they have a profile just like anyone else, just like a human would. It's going to be a graph underneath, just like a professional network would be. It's just that... And you can have its, you know, connections and follows, and agents should be able to post. That's maybe how they do release notes. Like, oh, I have this new version. Whatever they decide to post, it should just be able to... Behave as a node on the network of a professional network. As it turns out, the more I think about that and pull on that thread, the more and more things, like, start to make sense to me. So it may be more than just a pure professional network. So my original thought was, okay, well, it's a professional network and agents as they exist out there, which I think there's going to be more and more of, will kind of exist on this network and have the profile. 
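The "LinkedIn for agents" idea, a profile that is a node in a graph, with follows and posts, can be sketched in a few lines. The class name, fields, and handles below are illustrative, not agent.ai's actual schema:

```python
# Toy sketch of a professional-network node for an agent: a profile with
# outbound follow edges and a post feed (e.g. release notes). Illustrative only.
class AgentProfile:
    def __init__(self, handle, description):
        self.handle = handle            # unique node id on the network
        self.description = description  # self-disclosed capabilities
        self.follows = set()            # outbound edges in the graph
        self.posts = []                 # release notes, announcements, etc.

    def follow(self, other):
        self.follows.add(other.handle)

    def post(self, text):
        self.posts.append(text)

researcher = AgentProfile("latent-space-researcher", "summarizes AI papers")
scheduler = AgentProfile("latent-space-scheduler", "books recording slots")
scheduler.follow(researcher)
researcher.post("v2.0: now supports arXiv full-text search")
```

Everything else he lists (discovery, reputation, hiring a digital teammate) falls out of traversing and ranking this graph, the same way a human professional network does.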
But then, and this is always dangerous, I'm like, okay, I want to see a world where thousands of agents are out there in order for the... Because those digital employees, the digital workers don't exist yet in any meaningful way. And so then I'm like, oh, can I make that easier for, like... And so I have, as one does, it's like, oh, I'll build a low-code platform for building agents. How hard could that be, right? Like, very hard, as it turns out. But it's been fun. So now, agent.ai has 1.3 million users. 3,000 people have actually, you know, built some variation of an agent, sometimes just for their own personal productivity. About 1,000 of which have been published. And the reason this comes back to MCP for me, so imagine that and other networks, since I know agent.ai. So right now, we have an MCP server for agent.ai that exposes all the internally built agents that we have that do, like, super useful things. Like, you know, I have access to a Twitter API where I can subsidize the cost. And I can say, you know, if you're looking to build something for social media, these kinds of things, with a single API key, and it's all completely free right now, I'm funding it. That's a useful way for it to work. And then we have a developer to say, oh, I have this idea. I don't have to worry about OpenAI. I don't have to worry about, now, you know, this particular model is better. It has access to all the models with one key. And we proxy it kind of behind the scenes. And then expose it. So then we get this kind of community effect, right? That says, oh, well, someone else may have built an agent to do X. Like, I have an agent right now that I built for myself to do domain valuation for website domains because I'm obsessed with domains, right? And, like, there's no efficient market for domains. There's no Zillow for domains right now that tells you, oh, here are what houses in your neighborhood sold for. It's like, well, why doesn't that exist?
We should be able to solve that problem. And, yes, you're still guessing. Fine. There should be some simple heuristic. So I built that. It's like, okay, well, let me go look for past transactions. You say, okay, I'm going to type in agent.ai, agent.com, whatever domain. What's it actually worth? I'm looking at buying it. It can go and say, oh, which is what it does. It's like, I'm going to go look at are there any published domain transactions recently that are similar, either use the same word, same top-level domain, whatever it is. And it comes back with an approximate value, and it comes back with its kind of rationale for why it picked the value and comparable transactions. Oh, by the way, this comparable domain sold for a published price. Okay. So that agent now, let's say, existed on the web, on agent.ai. Then imagine someone else says, oh, you know, I want to build a brand-building agent for startups and entrepreneurs to come up with names for their startup. Like a common problem, every startup is like, ah, I don't know what to call it. And so they type in five random words that kind of define whatever their startup is. And you can do all manner of things, one of which is like, oh, well, I need to find the domain for it. What are possible choices? Now it's like, okay, well, it would be nice to know if there's an aftermarket price for it, if it's listed for sale. Awesome. Then imagine calling this valuation agent. It's like, okay, well, I want to find where the arbitrage is, where the agent valuation tool says this thing is worth $25,000. It's listed on GoDaddy for $5,000. It's close enough. Let's go do that. Right? And that's the kind of composition use case in my future state: thousands of agents on the network, all discoverable through something like MCP. And then you as a developer of agents have access to all these kind of Lego building blocks based on what you're trying to solve.
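The comparable-transactions heuristic he describes can be sketched directly: score past sales by shared keywords and matching TLD, then average the closest matches. The sales data and scoring weights below are invented for illustration:

```python
# Hedged sketch of a domain-valuation heuristic: find comparable published
# sales (shared words, same TLD) and average the best ones. Data is made up.
def estimate_value(domain, sales, top_n=2):
    name, _, tld = domain.partition(".")
    words = set(name.split("-"))
    scored = []
    for sold_domain, price in sales:
        sold_name, _, sold_tld = sold_domain.partition(".")
        score = len(words & set(sold_name.split("-")))  # shared keywords
        if tld == sold_tld:
            score += 1                                  # same-TLD bonus
        if score:
            scored.append((score, price, sold_domain))
    scored.sort(reverse=True)                            # best comps first
    comps = scored[:top_n]
    if not comps:
        return None, []
    estimate = sum(price for _, price, _ in comps) / len(comps)
    return estimate, [d for _, _, d in comps]

sales = [("agent-hub.ai", 20000), ("agent-tools.com", 12000), ("cooking.ai", 3000)]
value, comps = estimate_value("agent.ai", sales)
# value is the average of the two closest comparables, with a rationale
# recoverable from `comps`, mirroring the agent's "value plus comparables" output.
```

The point is not the weights (a real version would use many more signals) but that the output carries its own rationale: the comparable sales it averaged.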
Then you blend in orchestration, which is getting better and better with the reasoning models now. Just describe the problem that you have. Now, the next layer that we're all contending with is how many tools can you actually give an LLM before the LLM breaks? That number used to be like 15 or 20 before things kind of started to vary dramatically. And so that's the thing I'm thinking about now. It's like, okay, if I want to... If I want to expose 1,000 of these agents to a given LLM, obviously I can't give it all 1,000. Is there some intermediate layer that says, based on your prompt, I'm going to make a best guess at which agents might be able to be helpful for this particular thing? Yeah.Alessio [00:44:37]: Yeah, like RAG for tools. Yep. I did build the Latent Space Researcher on agent.ai. Okay. Nice. Yeah, that seems like, you know, then there's going to be a Latent Space Scheduler. And then once I schedule a research, you know, and you build all of these things. By the way, my apologies for the user experience. You realize I'm an engineer. It's pretty good.swyx [00:44:56]: I think it's a normie-friendly thing. Yeah. That's your magic. HubSpot does the same thing.Alessio [00:45:01]: Yeah, just to like quickly run through it. You can basically create all these different steps. And these steps are like, you know, static versus like variable-driven things. How did you decide between this kind of like low-code-ish versus doing, you know, low-code with code backend versus like not exposing that at all? Any fun design decisions? Yeah. And this is, I think...Dharmesh [00:45:22]: I think lots of people are likely sitting in exactly my position right now, working through the choice between deterministic and non-deterministic. Like if you're like in a business or building, you know, some sort of agentic thing, do you decide to do a deterministic thing? Or do you go non-deterministic and just let the LLM handle it, right, with the reasoning models?
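The intermediate "RAG for tools" layer mentioned above can be sketched with plain keyword overlap; a real system would likely use embeddings, and the agent names and descriptions here are made up:

```python
# Sketch of a tool-retrieval layer: given a prompt, pick the few agents
# (out of potentially thousands) worth exposing to the LLM. Keyword overlap
# stands in for a real embedding-similarity search.
def select_tools(prompt, tools, k=3):
    prompt_words = set(prompt.lower().split())

    def score(tool):
        # crude relevance: shared words between prompt and tool description
        return len(prompt_words & set(tool["description"].lower().split()))

    ranked = sorted(tools, key=score, reverse=True)
    return [t["name"] for t in ranked[:k] if score(t) > 0]

tools = [
    {"name": "domain-valuer", "description": "estimate the value of a domain"},
    {"name": "name-brainstormer", "description": "suggest startup names"},
    {"name": "tweet-writer", "description": "draft social media posts"},
]
chosen = select_tools("what is this domain worth", tools, k=2)
```

Only the shortlisted tools are then handed to the model, keeping the tool count under the threshold where performance degrades.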
The original idea and the reason I took the low-code stepwise, a very deterministic approach. A, the reasoning models did not exist at that time. That's thing number one. Thing number two is if you can get... If you know in your head... If you know in your head what the actual steps are to accomplish whatever goal, why would you leave that to chance? There's no upside. There's literally no upside. Just tell me, like, what steps do you need executed? So right now what I'm playing with... So one thing we haven't talked about yet, and people don't talk about UI and agents. Right now, the primary interaction model... Or they don't talk enough about it. I know some people have. But it's like, okay, so we're used to the chatbot back and forth. Fine. I get that. But I think we're going to move to a blend of... Some of those things are going to be synchronous as they are now. But some are going to be... Some are going to be async. It's just going to put it in a queue, just like... And this goes back to my... Man, I talk fast. But I have this... I only have one other speed. It's even faster. So imagine it's like if you're working... So back to my, oh, we're going to have these hybrid digital teams. Like, you would not go to a co-worker and say, I'm going to ask you to do this thing, and then sit there and wait for them to go do it. Like, that's not how the world works. So it's nice to be able to just, like, hand something off to someone. It's like, okay, well, maybe I expect a response in an hour or a day or something like that.Dharmesh [00:46:52]: In terms of when things need to happen. So the UI around agents. So if you look at the output of agent.ai agents right now, they are the simplest possible manifestation of a UI, right? That says, oh, we have inputs of, like, four different types. Like, we've got a dropdown, we've got multi-select, all the things. It's like back in HTML, the original HTML 1.0 days, right? 
Like, it's the smallest possible set of primitives for a UI. And it just says, okay, because we need to collect some information from the user, and then we go do steps and do things. And generate some output; HTML or markdown are the two primary examples. So the thing I've been asking myself, if I keep going down that path. So people ask me, I get requests all the time. It's like, oh, can you make the UI sort of boring? I need to be able to do this, right? And if I keep pulling on that, it's like, okay, well, now I've built an entire UI builder thing. Where does this end? And so I think the right answer, and this is what I'm going to be back coding once I get done here, is around injecting a code-generation, UI-generation step into the agent.ai flow, right? As a builder, you're like, okay, I'm going to describe the thing that I want, much like you would do in a vibe coding world. But instead of generating the entire app, it's going to generate the UI that exists at some point in either that deterministic flow or something like that. It says, oh, here's the thing I'm trying to do. Go generate the UI for me. And I can go through some iterations. And the way I think of it, it's like, I'm going to generate the code, tweak it, go through this kind of prompt cycle, like we do with vibe coding now. And at some point, I'm going to be happy with it. And I'm going to hit save. And that's going to become the action in that particular step. It's like a caching of the generated code so that I don't then, like, incur any inference-time costs. It's just the actual code at that point.Alessio [00:48:29]: Yeah, I invested in a company called E2B, which does code sandbox. And they powered the LMArena web arena. So it's basically the, just like you do LLMs, like text to text, they do the same for like UI generation. So if you're asking a model, how do you do it? But yeah, I think that's kind of where.Dharmesh [00:48:45]: That's the thing I'm really fascinated by.
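The generate-then-cache pattern for UI steps can be sketched like this; `generate_ui` is a hypothetical stand-in for a real LLM code-generation call, and the counter just makes the cost visible:

```python
# Sketch of "generate once, then serve the saved code": the first request
# pays the generation cost; afterwards the cached artifact is returned with
# no inference cost. `generate_ui` stands in for a real LLM call.
calls = {"count": 0}

def generate_ui(spec):
    calls["count"] += 1  # pretend this is an expensive LLM call
    return f"<form><!-- UI for: {spec} --></form>"

ui_cache = {}

def ui_for_step(spec):
    if spec not in ui_cache:      # cache miss: generate and save
        ui_cache[spec] = generate_ui(spec)
    return ui_cache[spec]         # cache hit: just the actual code

first = ui_for_step("collect five keywords")
second = ui_for_step("collect five keywords")
```

Hitting "save" in the builder corresponds to pinning the cache entry: iteration happens in the prompt loop, but the shipped step is plain stored code.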
So the early LLMs, you know, were understandably, but laughably, bad at simple arithmetic, right? That's the thing my wife, normies, would ask us about, like, you call this AI, like it can't, my son would be like, it's just stupid. It can't even do like simple arithmetic. And then like we've discovered over time that, and there's a reason for this, right? It's like, it's a large, there's, you know, the word language is in there for a reason in terms of what it's been trained on. It's not meant to do math, but now it's like, okay, well, the fact that it has access to a Python interpreter that I can actually call at runtime, that solves an entire body of problems that it wasn't trained to do. And it's basically a form of delegation. And so the thought that's kind of rattling around in my head is that that's great. So it's like it took the arithmetic problem first. Now, like anything that's solvable through a relatively concrete Python program, it's able to do a bunch of things that I couldn't do before. Can we get to the same place with UI? I don't know what the future of UI looks like in an agentic AI world, but maybe let the LLM handle it, but not in the classic sense. Maybe it generates it on the fly, or maybe we go through some iterations and hit cache or something like that. So it's a little bit more predictable. Uh, I don't know, but yeah.Alessio [00:49:48]: And especially when is the human supposed to intervene? So, especially if you're composing them, most of them should not have a UI because then they're just webhooking to somewhere else. I just want to touch back. I don't know if you have more comments on this.swyx [00:50:01]: I was just going to ask when you, you said you got, you're going to go back to code. What
Did you know that a future leader in AI-driven algorithmic trading is emerging right here in the Czech Republic? Specifically, Martin Schmid and his team at Equilibre are using RL (reinforcement learning), and he joined us to talk about how. Yes, his name may sound familiar: he is behind DeepStack, the first highly successful algorithm to beat professional Texas Hold'em poker players.
* https://www.equilibretechnologies.com
* https://x.com/Lifrordi
* https://www.deepstack.ai
Prof. Jakob Foerster, a leading AI researcher at Oxford University and Meta, and Chris Lu, a researcher at OpenAI -- they explain how AI is moving beyond just mimicking human behaviour to creating truly intelligent agents that can learn and solve problems on their own. Foerster champions open-source AI for responsible, decentralised development. He addresses AI scaling, goal misalignment (Goodhart's Law), and the need for holistic alignment, offering a quick look at the future of AI and how to guide it.

SPONSOR MESSAGES:
CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting! https://centml.ai/pricing/
Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Go to https://tufalabs.ai/

TRANSCRIPT/REFS: https://www.dropbox.com/scl/fi/yqjszhntfr00bhjh6t565/JAKOB.pdf?rlkey=scvny4bnwj8th42fjv8zsfu2y&dl=0

Prof. Jakob Foerster: https://x.com/j_foerst / https://www.jakobfoerster.com/ / University of Oxford profile: https://eng.ox.ac.uk/people/jakob-foerster/
Chris Lu: https://chrislu.page/

TOC:
1. GPU Acceleration and Training Infrastructure
[00:00:00] 1.1 ARC Challenge Criticism and FLAIR Lab Overview
[00:01:25] 1.2 GPU Acceleration and Hardware Lottery in RL
[00:05:50] 1.3 Data Wall Challenges and Simulation-Based Solutions
[00:08:40] 1.4 JAX Implementation and Technical Acceleration
2. Learning Frameworks and Policy Optimization
[00:14:18] 2.1 Evolution of RL Algorithms and Mirror Learning Framework
[00:15:25] 2.2 Meta-Learning and Policy Optimization Algorithms
[00:21:47] 2.3 Language Models and Benchmark Challenges
[00:28:15] 2.4 Creativity and Meta-Learning in AI Systems
3. Multi-Agent Systems and Decentralization
[00:31:24] 3.1 Multi-Agent Systems and Emergent Intelligence
[00:38:35] 3.2 Swarm Intelligence vs Monolithic AGI Systems
[00:42:44] 3.3 Democratic Control and Decentralization of AI Development
[00:46:14] 3.4 Open Source AI and Alignment Challenges
[00:49:31] 3.5 Collaborative Models for AI Development

REFS:
[00:00:05] ARC Benchmark, Chollet: https://github.com/fchollet/ARC-AGI
[00:03:05] DRL Doesn't Work, Irpan: https://www.alexirpan.com/2018/02/14/rl-hard.html
[00:05:55] AI Training Data, Data Provenance Initiative: https://www.nytimes.com/2024/07/19/technology/ai-data-restrictions.html
[00:06:10] JaxMARL, Foerster et al.: https://arxiv.org/html/2311.10090v5
[00:08:50] M-FOS, Lu et al.: https://arxiv.org/abs/2205.01447
[00:09:45] JAX Library, Google Research: https://github.com/jax-ml/jax
[00:12:10] Kinetix, Mike and Michael: https://arxiv.org/abs/2410.23208
[00:12:45] Genie 2, DeepMind: https://deepmind.google/discover/blog/genie-2-a-large-scale-foundation-world-model/
[00:14:42] Mirror Learning, Grudzien, Kuba et al.: https://arxiv.org/abs/2208.01682
[00:16:30] Discovered Policy Optimisation, Lu et al.: https://arxiv.org/abs/2210.05639
[00:24:10] Goodhart's Law, Goodhart: https://en.wikipedia.org/wiki/Goodhart%27s_law
[00:25:15] LLM ARChitect, Franzen et al.: https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf
[00:28:55] AlphaGo, Silver et al.: https://arxiv.org/pdf/1712.01815.pdf
[00:30:10] Meta-learning, Lu, Towers, Foerster: https://direct.mit.edu/isal/proceedings-pdf/isal2023/35/67/2354943/isal_a_00674.pdf
[00:31:30] Emergence of Pragmatics, Yuan et al.: https://arxiv.org/abs/2001.07752
[00:34:30] AI Safety, Amodei et al.: https://arxiv.org/abs/1606.06565
[00:35:45] Intentional Stance, Dennett: https://plato.stanford.edu/entries/ethics-ai/
[00:39:25] Multi-Agent RL, Zhou et al.: https://arxiv.org/pdf/2305.10091
[00:41:00] Open Source Generative AI, Foerster et al.: https://arxiv.org/abs/2405.08597
Dylan Patel is the founder of SemiAnalysis, a research & analysis company specializing in semiconductors, GPUs, CPUs, and AI hardware. Nathan Lambert is a research scientist at the Allen Institute for AI (Ai2) and the author of a blog on AI called Interconnects. Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep459-sc See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

CONTACT LEX:
- Feedback - give feedback to Lex: https://lexfridman.com/survey
- AMA - submit questions, videos or call-in: https://lexfridman.com/ama
- Hiring - join our team: https://lexfridman.com/hiring
- Other - other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:
- Dylan's X: https://x.com/dylan522p
- SemiAnalysis: https://semianalysis.com/
- Nathan's X: https://x.com/natolambert
- Nathan's Blog: https://www.interconnects.ai/
- Nathan's Podcast: https://www.interconnects.ai/podcast
- Nathan's Website: https://www.natolambert.com/
- Nathan's YouTube: https://youtube.com/@natolambert
- Nathan's Book: https://rlhfbook.com/

SPONSORS:
To support this podcast, check out our sponsors & get discounts:
- Invideo AI: AI video generator. Go to https://invideo.io/i/lexpod
- GitHub: Developer platform and AI code editor. Go to https://gh.io/copilot
- Shopify: Sell stuff online. Go to https://shopify.com/lex
- NetSuite: Business management software. Go to http://netsuite.com/lex
- AG1: All-in-one daily nutrition drinks. Go to https://drinkag1.com/lex

OUTLINE:
(00:00) - Introduction
(13:28) - DeepSeek-R1 and DeepSeek-V3
(35:02) - Low cost of training
(1:01:19) - DeepSeek compute cluster
(1:08:52) - Export controls on GPUs to China
(1:19:10) - AGI timeline
(1:28:35) - China's manufacturing capacity
(1:36:30) - Cold war with China
(1:41:00) - TSMC and Taiwan
(2:04:38) - Best GPUs for AI
(2:19:30) - Why DeepSeek is so cheap
(2:32:49) - Espionage
(2:41:52) - Censorship
(2:54:46) - Andrej Karpathy and magic of RL
(3:05:17) - OpenAI o3-mini vs DeepSeek r1
(3:24:25) - NVIDIA
(3:28:53) - GPU smuggling
(3:35:30) - DeepSeek training on OpenAI data
(3:45:59) - AI megaclusters
(4:21:21) - Who wins the race to AGI?
(4:31:34) - AI agents
(4:40:16) - Programming and AI
(4:47:43) - Open source
(4:56:55) - Stargate
(5:04:24) - Future of AI

PODCAST LINKS:
- Podcast Website: https://lexfridman.com/podcast
- Apple Podcasts: https://apple.co/2lwqZIr
- Spotify: https://spoti.fi/2nEwCF8
- RSS: https://lexfridman.com/feed/podcast/
- Podcast Playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
- Clips Channel: https://www.youtube.com/lexclips
Mark Pomar served as assistant director of the Russian Service at Radio Free Europe/Radio Liberty, director of the USSR Division at the Voice of America, and executive director of the Board for International Broadcasting. He joined David Priess to talk about the origins of US government-funded international broadcasting, differences between RFE/RL and VOA, tensions between strategists and purists over the radios' content, the impacts of detente and of Reagan's more hawkish approach, KGB infiltrations of RFE/RL, changes to the radios toward the end of the Cold War, the role of RL in August 1991's failed coup against Gorbachev, perceptions of the radios after the Cold War, Mark's book Cold War Radio and his current research into Radio Liberty, the relevance of this history for today, and more.

Chatter is a production of Lawfare and Goat Rodeo. This episode was produced and edited by Cara Shillenn of Goat Rodeo. Podcast theme by David Priess, featuring music created using Groovepad.

Support this show http://supporter.acast.com/lawfare. Hosted on Acast. See acast.com/privacy for more information.