General intelligence: between the promise of salvation and the risk of losing control
We are breaking down the entire shortlist: ranking, reviewing, and digging into the 2026 Nebula Novel nominees from worst to best. We dive deep into the writing styles, the structure, the highs, and the frustratingly bad endings, and reveal exactly who took home the final trophy. Are these books actually masterclasses in modern sci-fi and fantasy, or did the hype train leave the tracks? Here is our definitive 2026 Nebula breakdown:

7. Death of the Author by Nnedi Okorafor
You should read it if: You love deep-dives into African culture, Igbo and Yoruba roots, and tech concepts like futuristic exoskeleton legs.
You shouldn't read it if: You require a persistent central conflict, cohesive subplots, or a "story-within-a-story" that actually goes somewhere.

6. Wearing the Lion by John Wiswell
You should read it if: You want a cozy Hercules retelling where Hera calls Zeus a "dipshit" and Heracles tries to befriend mythological monsters instead of fighting them.
You shouldn't read it if: You get annoyed by overly preachy or cloying endings, repetitive quest structures, or confusing second-person POV shifts.

5. Katabasis by R. F. Kuang
You should read it if: You are obsessed with dark academia themes, the dangers of academic flow states, and complex, highly allusive world-building.
You shouldn't read it if: You need to deeply connect with your protagonists or get easily annoyed by writing that feels a little too self-involved.

4. When We Were Real by Daryl Gregory
You should read it if: You love quick, humorous POV switches, AGI, simulation theory, and brain emulation concepts.
You shouldn't read it if: You are looking for a groundbreaking, deeply unique masterpiece; this one is cute, but a bit unspecial.

3. Sour Cherry by Natalia Theodoridou
You should read it if: You like heavy foreshadowing, experimental voice-switching (shifting to second person), and intense meta-narratives.
You shouldn't read it if: You hate a massive buildup that doesn't actually come together or stick the landing at the end.

1. (TIED) The Incandescent by Emily Tesh
You should read it if: You want adult-oriented cozy fantasy in a magic boarding school featuring a workaholic, middle-aged bisexual teacher and casual, biscuit-eating printer demons.
You shouldn't read it if: A rushed, abrupt ending with a thin villain motivation is going to completely sour your overall enjoyment of a great setup.

1. (TIED & WINNER) The Buffalo Hunter Hunter by Stephen Graham Jones
You should read it if: You want a beautifully written, highly literary Native American Blackfoot vampire revenge story set in the brutal landscape of the American West.
You shouldn't read it if: You get bored by a monotonous middle section where the central premise loses steam and repeats itself.

No spoilers anywhere in this episode.

Join the Hugonauts book club on Discord, or watch our episodes on YouTube if you prefer video.

All the books, plus timestamps:
00:00 Intro
00:46 Death of the Author by Nnedi Okorafor
02:26 Wearing the Lion by John Wiswell
05:29 Katabasis by R. F. Kuang
09:30 When We Were Real by Daryl Gregory
12:57 Sour Cherry by Natalia Theodoridou
16:30 The Incandescent by Emily Tesh
20:08 The Buffalo Hunter Hunter by Stephen Graham Jones
Seeing, predicting, generating, acting: finally understanding what we mean by the term "AI"
"If your entire technology is coming from a single country, and that country decides that every now and again they're going to shut off access to you, that's not a foundation you can build on." The US government just ordered Anthropic to ban access to its most advanced AI models, Fable 5 and Mythos 5. Seems like now is a good time to talk about sovereign AI. Cohere co-founder Nick Frosst joins to discuss how Canada's AI champion is built different than the other frontier LLM providers, how Star Trek informs the type of AI future the company is trying to create, and why he doesn't make a point of listening to Marc Andreessen about AGI. Did the Anthropic model ban prove Cohere is right about sovereign AI? Let's dig in. -- Amid global uncertainty, the path forward is clear: Canada's moment to build is now. Presented by Uber Canada, DMZ, and National Bank of Canada, BetaKit Most Ambitious is back, telling stories of nearly 100 Canadian innovators strengthening our nation's autonomy, security, and prosperity. Read BetaKit Most Ambitious now.
What if the biggest barrier holding you back during a career or identity transition isn't your skills, but how you're communicating who you are becoming?

In this episode, communication trainer and TEDx organizer Ana Denis explores why moments of reinvention (whether in midlife, after moving countries, or changing careers) create deep communication challenges. She reveals how our shifting identity quietly shapes how confidently (or hesitantly) we express ourselves, and why many people feel like “frauds” when stepping into a new professional chapter.

Discover how to communicate with clarity and confidence even when your professional identity is still evolving
Understand the hidden emotional and identity layers that affect everyday conversations
Move past imposter syndrome by reframing experience and taking practical action in your new direction

Listen to the full episode to learn how to communicate your evolving identity with authenticity and confidence during times of personal and professional change.
˚
KEY POINTS AND TIMESTAMPS:
01:43 - On Reinvention, Communication, and Starting Over
08:50 - Why Midlife Triggers the Search for Meaning
11:09 - The 3 Hidden Layers of Every Conversation
14:17 - Why Identity Shifts Make Communication Difficult
18:05 - Confidence, Imposter Syndrome, and Feeling Like a Fraud
23:49 - Practical Ways to Build Confidence in a New Direction
28:57 - Cultural Conditioning and the Fear of Reinvention
33:05 - The Fear of Judgment When Changing Identity
35:09 - Why Supportive People Matter During Reinvention
˚
MEMORABLE QUOTE:
"What I was struggling with was just giving myself permission to change instead of holding onto the past."
˚
VALUABLE RESOURCES:
Ana's website: https://anadenis.com/
˚
Coaching with Agi: https://personaldevelopmentmasterypodcast.com/mentor
˚
She Tested 100% Accurate on Camera. Then We Asked About Epstein | Dallisa Hocking

Before we touched a single conspiracy, I handed Dallisa a list of 10 questions about my own life that an AI generated at random. Her copper dowsing rods went 10 for 10. Once the rods proved themselves, I started asking the questions most people only whisper about, and the answers came faster than either of us was ready for.

Dallisa Hocking is a fifth-generation psychic medium and channeler with over 11 years in the spiritual industry, who spent years predicting much of what we are living through now before Spirit told her to stop forecasting and start activating.

This is not a comfort episode. It is an hour of watching a tool answer questions it should not be able to answer, and then going silent on the only two that might matter most.

What we get into:
The live calibration round where 10 randomly AI-generated questions about me all came back true, on camera
Why she walked away from making predictions, and what Spirit told her to do instead
The Joe Rogan question that crossed before I could finish asking it
Why the rods went to a dead standstill on only two questions: the simulation and AGI
"They don't want us to crack that code" and her read on where AGI actually comes from
The firmament, remote viewing, and why she feels we have never physically left Earth
Her vision of mothers taking to the streets, and the draft she believes is coming
The July 4th window, a possible second pandemic, and "the great dismantling" already underway
Whether reality is a game, a prison, or both, and how to wake up inside it like Truman
The one practice she calls non-negotiable for surviving what is coming

CONNECT WITH DALLISA HOCKING
Website: https://spiritandspark.com
Instagram: @dallisahocking → https://www.instagram.com/dallisahocking
Facebook: https://www.facebook.com/dallisa
YouTube: https://www.youtube.com/channel/UCOAtgAkuewh2CK6T2BscEXg
LinkedIn: https://www.linkedin.com/in/dallisa
Substack: https://dallisahocking.substack.com
2026 Summer Speaker Series (free): https://dallisa.thrivecart.com/2026-summer-speaker-series/
Email: dallisa@spiritandspark.com
Artificial intelligence promises to change our world... An essential investigation to help see things more clearly.
Available from June 16 on Sismique.

Artificial intelligence is becoming the most massive industrial project of our time, and one of the most profound shifts in our recent history. And yet the debate is stuck in a trap. On one side, those who promise us a radiant future. On the other, those who predict collapse. Between the two, the space is strangely empty.

It is in this empty space that I wanted to set this series. Nine episodes to look, slowly, at what is really happening. Not to decide for you. To give you something to think with.

On the program:
The machine that talks: how this technology burst into our lives.
What do we call AI: what it is, and what it is not.
AGI, the dream and the fear: the superintelligence we are being promised.
The race and its builders: the money, the narrative, those at the helm.
The megamachine: the physical body of AI, what it consumes, what it discharges.
The assisted human: what it does to us, individually.
Society under the influence: what it does to the collective, to truth, to power.
What is intelligence: the philosophical sidestep.
What can we still choose: what remains possible.

A series for the curious, the worried, the clear-eyed enthusiasts, and all those who feel that this story concerns them, without always knowing where to start.
---
Find all the episodes and summaries at www.sismique.fr
Sismique is an independent podcast created and hosted by Julien Devaureix.
Alright, welcome to Part 2 of my conversation with Peter Diamandis, a man who lives and breathes exponential change, and isn't afraid to tackle the stuff everyone else would rather ignore. Now that you've braved everything breaking and the cracks in the old story, it's time to step fully into what happens next: regulation, UBI, riots and “derangement,” brain-computer interfaces, and some mind-blowing visions of what human purpose could look like when survival is no longer the point. Peter and I get into everything from whether AGI and superintelligence will help or harm us, the ethics of coding “morality” into AI, to whether humans are really just the boot disk for something greater coming next.

This part goes deep on the forks ahead: will you opt out, numb out, become a creator, or actually merge your brain with the cloud? How are schools failing us (and what do our kids actually need to thrive)? We even get practical about what you can do right now to claim agency, think radically bigger, and make yourself anti-fragile, whether you want to start a company, change the world, or just live a life you can be proud of as the rules keep rewriting themselves in real time. If you need just one episode to snap you out of fear and into action, this is it.

Ketone IQ: Visit https://ketone.com/IMPACT for 30% OFF your subscription order
Quince: Free shipping and 365-day returns at https://quince.com/impactpod
Plaud: Get 10% off with code IMPACT at https://plaud.ai/impact
Whatnot: Download the Whatnot app today and get free shipping on your first order.
AT&T Business: Switch to AT&T Business at business.att.com
Shopify: Sign up for your one-dollar-per-month trial period at https://shopify.com/impact
Truemed: Check your eligibility and start saving at https://truemed.com/impact
Incogni: Take your personal data back with Incogni! Use code IMPACT at the link below and get 60% off an annual plan: https://incogni.com/impact
Pique: 20% off at https://piquelife.com/impact

What's up, everybody? It's Tom Bilyeu here:

If you want my help...
STARTING a business: join me here at ZERO TO FOUNDER: https://tombilyeu.com/zero-to-founder?utm_campaign=Podcast%20Offer&utm_source=podca[%E2%80%A6]d%20end%20of%20show&utm_content=podcast%20ad%20end%20of%20show
SCALING a business: see if you qualify here: https://tombilyeu.com/call
Get my battle-tested strategies and insights delivered weekly to your inbox: sign up here: https://tombilyeu.com/
**********************************************************************
If you're serious about leveling up your life, I urge you to check out my new podcast, Tom Bilyeu's Mindset Playbook, a goldmine of my most impactful episodes on mindset, business, and health. Trust me, your future self will thank you.
**********************************************************************
FOLLOW TOM:
Instagram: https://www.instagram.com/tombilyeu/
Tik Tok: https://www.tiktok.com/@tombilyeu?lang=en
Twitter: https://twitter.com/tombilyeu
YouTube: https://www.youtube.com/@TomBilyeu

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
BONUS S6E8 A Zebra ZONE 2026 recap on pragmatic AI, on-device AI, super apps, and giving store associates their day back

Ricardo Belmar just got back from Zebra Technologies' ZONE 2026 conference in Nashville, where pragmatic AI was the main theme. In this special bonus episode of The Retail Razor Show, he and Casey Golden unpack what it all means for retail's frontline workers. This is a story about pragmatic AI, the kind that gives store associates and warehouse teams their day back instead of promising the moon.

The headline from ZONE 2026? Zebra is no longer telling a devices story. It's telling a frontline platform story, anchored by on-device AI that runs with no cloud, no tokens, and no waiting. Ricardo brought back two exclusive interviews, with Zebra CTO Tom Bianculli and Mobile Computing chief James Poulton, plus a notebook full of stats, demos, and hallway conversations that deliver the full pragmatic AI story.

We get into why frontline workers are drowning in 70 to 80 apps when they only use about a dozen, the super app built to fix it, real-time translation running live on a device, and why “tokenless” pragmatic AI became the word of the week. If you want to understand on-device AI and what it delivers for frontline workers, this episode is your shortcut.

In This Episode, You'll Learn
• Zebra's three big software announcements: Nucleus, Workcloud IO, and Workcloud BI
• The 80-apps problem and the super app designed to collapse it down to one experience
• Why on-device AI, tokenless and at the edge, beats cloud round trips for frontline use cases
• Real-time translation in any language, live on the device
• Micro-learning, the “TikTok of learning,” and tackling 70 to 80% frontline turnover
• Picture proof of delivery: how a second and a half scales into tens of millions of dollars
• The octopus organization, and why intelligence belongs at the edge of the org
• James Poulton on why large language models are overhyped for the enterprise

Why it matters
We've spent years on this show arguing that your associate experience is your customer experience. ZONE 2026 felt like the technology industry finally catching up to that idea, treating frontline workers as the most under-invested asset in retail and giving them on-device AI that augments rather than replaces. Pragmatic AI wins on the accumulation of small moments, and that is the thread we pull all episode long.

Subscribe & Follow
If you enjoyed this episode, please leave us a 5‑star rating and review on Apple Podcasts, Spotify, or Goodpods. Subscribe on YouTube so you never miss an episode and check out the other shows in the Retail Razor Podcast Network: Retail Transformers, Blade to Greatness, and Data Blades.
Subscribe to the Retail Razor Podcast Network: https://retailrazor.com/
Subscribe to our Newsletter: https://retailrazor.substack.com
Subscribe to our YouTube channel: https://go.retailrazor.com/utube

Featured guests
Tom Bianculli, Chief Technology Officer, Zebra Technologies
https://www.linkedin.com/in/tom-bianculli-9053892/
James Poulton, SVP & GM, Mobile Computing, Zebra Technologies
https://www.linkedin.com/in/jamespoulton/

Chapters
00:00 Teaser
01:01 Show intro
02:12 What Zebra announced at ZONE 2026
04:00 The 80-apps problem and the super app
06:53 Real-time translation on the device
08:55 Tokenless, on-device AI explained
11:46 Best moment: the octopus organization
15:43 Interview: Tom Bianculli, CTO Zebra Technologies
35:50 Recap: pragmatic AI and returning time to workers
40:27 Interview: James Poulton, SVP & GM Mobile Computing
53:21 Big takeaways from ZONE 2026
59:32 Show Close

Meet your hosts
Helping you cut through the clutter in retail & retail tech:

Ricardo Belmar is an NRF Top Retail Voice for 2025 and a RETHINK Retail Top Retail Expert from 2021 – 2026. Thinkers 360 has named him a Top 10 Thought Leader in Retail, a Top 25 Thought Leader in AGI and Careers, a Top 50 Thought Leader in Agentic AI and Management, and a Top 100 Thought Leader in Digital Transformation and Transformation. Thinkers 360 also named him a Top Digital Voice for 2024 and 2025. He is an advisory council member at George Mason University's Center for Retail Transformation and the Retail Cloud Alliance. He was most recently the partner marketing leader for retail & consumer goods in the Americas at Microsoft.

Casey Golden is the North America Leader for Retail & Consumer Goods at CI&T, and CEO of Luxlock. She is a RETHINK Retail Top Retail Expert from 2023 - 2026, and Retail Cloud Alliance advisory council member. After a career on the fashion and supply chain technology side of the business, Casey is obsessed with the customer relationship between the brand and the consumer and is slaying franken-stacks and building retail tech!

Music
Includes music provided by imunobeats.com, featuring Overclocked, and E-Motive from the album Beat Hype, written by Heston Mimms, published by Imuno.
In this Friday edition of The Tom Bilyeu Show Live, Tom and Drew dig into a packed news day spanning geopolitics, markets, tech, and a long philosophical tangent on immortality. They open on Iran, breaking down the leaked 14-point "deal" circulating via Iranian state media (the $24 billion in frozen assets, the naval blockade, the Strait of Hormuz, and reconstruction demands) and why Tom is deeply skeptical that anything beyond a memorandum of understanding gets signed, plus what a bad deal could cost Trump heading into the midterms. From there, they pivot to a heated exchange over the SpaceX IPO and the Globe and Mail's "how to properly hate Elon Musk" headline, using it as a springboard into the psychology of resentment, the mechanics of transformational-tech bubbles, and a warning to retail investors about becoming "exit liquidity." The conversation moves through California's voting rules, ballot harvesting, and Trump's Save America Act and reconciliation push (with an extended back-and-forth on states' rights, the Constitution, and the Supreme Court), the UK's proposed device-level content-scanning law and the surveillance-state implications, a DOJ child-smuggling indictment tied to border policy, and the Epstein/Zorro Ranch mystery. They close on AI, unpacking Yann LeCun's argument against LLMs and AGI in favor of specialized world models, before spinning off into a wide-ranging debate about whether you'd actually want to live forever, the disposable-male hypothesis, and a contentious Alex Karp clip about GDP and gender.
OpenAI, Anthropic, SpaceX, and the AI IPO cycle face a structural problem: a cheap, capable open source exit is already drawing enterprise users away before either company goes public.
========================================================
Thank you to our sponsor!
Fidelity: Fidelity has been building in crypto and DeFi since 2014, and now they're hiring. Explore career opportunities at one of the most forward-thinking names in finance here: crypto.fidelitycareers.com.
Cape: Your biggest crypto vulnerability isn't your wallet, it's your phone number. Cape is America's privacy-first mobile carrier that rotates your SIM identity daily and blocks SIM swaps before they happen. Get 33% off your first six months at cape.co/unchained (use code: UNCHAINED).
========================================================
A viral tweet by Tom Shaughnessy, founding partner of Delphi Ventures, identified the most basic way AI could blow up: a 40x subsidy gap between consumer AI subscriptions and enterprise API costs quietly pushing businesses toward open source inference providers at 1% of the price. Citadel Securities published a near-identical thesis shortly after.

Shaughnessy joins Laura Shin to map the implications for the AI IPO wave, starting with SpaceX. Low floats and passive index demand should lift these stocks out of the gate, but public market disclosures will force OpenAI and Anthropic to reveal payback periods, margins, and subscriber numbers for the first time. He also argues OpenAI's reported price cuts target Anthropic's growth metrics before the IPO, not user demand.

The episode also covers the China model wildcard, whether AI model restrictions amount to big brother fearmongering, and whether crypto's tools for capital formation could keep the AGI flywheel from stalling.

Host: Laura Shin, Host / Unchained
Guests: Tom Shaughnessy - Founding Partner of Delphi Ventures and Co-Founder of Delphi Digital
Timestamps
Discover how a homegrown AI agent is outsmarting big-brand competitors, letting users tailor digital assistants with real memory and skills. The future isn't just smarter models, but everyday tech that learns exactly how you work.

• Hermes AI agent's launch, mass adoption, and personalized capabilities
• Open source vs. proprietary AI: model access, privacy, and funding hurdles
• Apple's next-gen Siri and agentic platform ambitions unpacked
• Nous Research model development, Nvidia partnerships, and training challenges
• The risk of an "AI underclass" and ethics in model distribution
• Anthropic's Fable release: strict guardrails, silent model downgrades, and open source tensions
• Local models vs. cloud LLMs: cost, effectiveness, and practical tuning
• Community-driven iterating: Hermes' rapid product evolution and user obsession
• Vatican's AI encyclical: church perspectives on AI, morality, and the common good
• AGI arrival debate: economic thresholds, capabilities, and human uniqueness
• The reality of AI hallucinations, agent accuracy, and responsible usage
• Legal fallout over AI-generated hallucinations in court filings
• AI's growing role in Hollywood contracts and labor protections
• Google's Gemini 3 live translation impresses but raises privacy flags
• German courts label Google AI overviews as publisher speech, liability looms
• AI detection tools like Pangram face scrutiny in real-world writing and education
• Google Dream Beans app tests the limits of digital personal recommendations
• Picks of the Week: Reddit AMA, Dream Beans, basketball and retro gaming, research critiques

Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau
Guest: Jeffrey Quesnelle

Download or subscribe to Intelligent Machines at https://twit.tv/shows/intelligent-machines.

Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free audio and video feeds, a members-only Discord, and exclusive content. Join today: https://twit.tv/clubtwit

Sponsors:
helixsleep.com/machines
Melissa.com/twit
zscaler.com/security
Imagine you're living 15,000 years ago. Your people are hunter-gatherers and you sleep under the stars. If someone told you humans would one day build cities with millions of people, fly through the air, or carry all human knowledge in their pockets, you couldn't even begin to picture what they meant... Yet here we are.

How did our lives change so far beyond recognition? The story is complex, but there's a rough pattern. A few times in history, some radical breakthrough in technology, like the development of the plough and the steam engine, has led to a wave of productivity, innovation, and social change that ultimately reshaped the world.

Now we're on the cusp of a huge new breakthrough: artificial intelligence that can meet or exceed human capabilities across a wide range of tasks.

This could bring another era of transformation. There could be an explosion of intelligence and innovation, and a whole new population of digital beings. And with this, civilisation could see changes at least as profound as those brought about by industrialisation or the rise of agriculture, but instead of taking hundreds or thousands of years to unfold, this time around the world could become unrecognisable over the span of decades or less.

This transformation could bring enormous benefits, helping us solve currently intractable global problems. But it could also pose severe risks, some of which could be existential, meaning they could cause human extinction, or an equally permanent and severe disempowerment of humanity. There aren't nearly enough people trying to address these challenges, and we think that's a serious problem.

This article is narrated by the author, Zershaaneh Qureshi. It explores how advanced AI could be so transformative, and why working on its risks may be your best opportunity to have a positive impact on the world. You can see the original article on the 80,000 Hours website: https://80000hours.org/problem-profiles/artificial-intelligence/

Chapters:
Introduction (00:00:20)
Section 1: AI could replace human labour in the most economically valuable fields (00:08:32)
Section 2: Replacing human labour in the most economically valuable fields could trigger the next radical transformation of society (00:22:14)
Section 3: This transformation could be extremely rapid and dramatic (00:28:02)
Section 4: A rapid AI-driven transformation would raise a range of major challenges, including existential risks (00:36:40)
Section 5: Work on these problems is tractable, but neglected (00:44:48)
Objection 1: “You're overestimating how fast and how dramatically AI would transform the world.” (00:47:59)
Objection 2: “It's hard to believe that AI could really pose existential risks.” (00:52:59)
Objection 3: “Isn't all this talk of AI changing the world just a fad?” (00:59:22)
Objection 4: “Isn't AI going to be just like every other technology?” (01:03:04)
Objection 5: “Is it even possible to produce artificial general intelligence?” (01:06:16)
Objection 6: “Even if AGI is achievable, what if we're really far away from building it?” (01:11:24)
Objection 7: “Isn't the real danger from actual current AI and not some sort of futuristic AGI?” (01:14:05)
Objection 8: “Technological progress is a good thing for humanity.” (01:18:10)
Objection 9: “This all just sounds too sci-fi.” (01:19:50)
Objection 10: “Can it really make sense to dedicate my career to solving an issue that's based on a speculative story about something that may or may not ever happen?” (01:22:15)
Objection 11: “OK, AI might pose existential risks, but isn't ‘issue X' an even bigger problem?” (01:24:39)
Learn more (01:27:51)

Audio editing: Dominic Armstrong
Production: Zershaaneh Qureshi, Elizabeth Cox, Katy Moore, and Lou Moran
What if most of your emotional suffering comes not from what's happening around you, but from the energy you're spending trying to control it? In this series, I select my favourite and most insightful moments from previous episodes of the podcast. Today, my guest, spiritual coach Reverend Rachel Harrison, shares a profound and practical teaching on emotional well-being: that we give our power away every time we attach our inner state to something outside ourselves, and exactly what to do the moment you catch yourself doing it. Press play to discover a simple but transformative mantra that brings your energy back to where it belongs: to you. ˚ VALUABLE RESOURCES: Listen to the full conversation with Rachel Harrison in episode #438: https://personaldevelopmentmasterypodcast.com/438 ˚ Coaching with Agi: https://personaldevelopmentmasterypodcast.com/mentor ˚
The entire startup ecosystem is racing to build agent harnesses. Logan Kilpatrick, who leads Google AI Studio and the Gemini API, argues that scramble has a roughly 12-month shelf life. Models will absorb the scaffolding and run it natively, so the edge moves elsewhere. Google's own bet runs in parallel: a single agent harness, born from the Windsurf team and now called Antigravity, has become the connective tissue across search, the Gemini app, Cloud, and AI Studio — the role Gemini-the-model used to play. Logan makes the case that coding already feels like narrow superintelligence, and that "jagged" vertical superintelligence (in math, finance, and science) will arrive well before AGI. He argues Google's real goal is maximizing outcomes for users, not eyeball time. He unpacks Omni, the single model built to replace multiple separate systems Google once trained for text, audio, music, image, and video. His throughline: AI is an accelerant for human ambition, not a substitute for it. Hosted by Sonya Huang, Sequoia Capital
Anthropic just released Claude Fable 5, the first public Mythos-class model and the start of the Claude 5 family. It is their most capable model ever but… kinda scary. This week on AI For Humans, the Mythos era goes public. Anthropic released Claude Fable 5, the first commercially available Mythos-class model and the first in the new Claude 5 line. It is the same underlying model as Mythos but shipped with conservative safeguards: questions about cybersecurity and biology get routed to Claude Opus 4.8 instead. We dig into what it can do, why Anthropic held it back, and what our future looks like as we get closer to AGI. Then Apple goes AI again at WWDC: a profoundly revamped Siri AI, a dedicated Siri app, on-screen awareness, much better photo tools, and a foundation model setup that is local, multimodal, and partly powered by Google. Gavin is thrilled that the future has finally arrived, just not on the phone he bought last year. It is AI For Humans! THE MOST POWERFUL AI EVER RELEASED. WHAT COULD GO WRONG. 
SHOW LINKS Anthropic announces Claude Fable 5: https://www.anthropic.com/news/claude-fable-5-mythos-5 Dan Shipper's review of Fable 5: https://x.com/danshipper/status/2064393970856124501 Usable Fable 5 demo (Library of Babel): https://library-of-babel-iota.vercel.app/ Rumored Fable 5 preview: Minecraft build (XIVIX): https://x.com/XIVIX_134/status/2062972363084341341 Rumored Fable 5 preview (chetaslua): https://x.com/chetaslua/status/2063328265708896621 Rumored Fable 5 preview (testingcatalog): https://x.com/testingcatalog/status/2062915688134574173 Fable 5 voxel Power Rangers comparison: https://x.com/Lentils80/status/2064379168272642315 Noam Brown on the implications of scaling test-time compute: https://x.com/polynoamial/status/2064210146558136827 WWDC full presentation: https://www.youtube.com/live/hF8swzNR1-o Apple introduces Siri AI, a profoundly more capable and personal assistant: https://www.apple.com/newsroom/2026/06/apple-introduces-siri-ai-a-profoundly-more-capable-and-personal-assistant/ Apple says its new Google-infused AI is all about privacy: https://gizmodo.com/apple-says-its-new-google-infused-ai-is-all-about-privacy-2000768997 An actually useful Apple Intelligence use case: https://x.com/iupdate/status/2064078761856037112 Put a summary in your summary (notification summaries): https://x.com/i_zzzzzz/status/2064061955447406722 Gaussian splats coming to Apple Maps: https://x.com/bilawalsidhu/status/2064057313057439795
In this episode of The Neuron, Corey Noles sits down with Mustafa Suleyman, CEO of Microsoft AI, at Microsoft Build 2026 to unpack Microsoft's next AI chapter: seven new MAI models, a push toward in-house model development, and the idea of Humanist Superintelligence. Mustafa explains how Microsoft is thinking about AI that can reason, code, generate images, transcribe speech, and power real products, without turning the future into a vague AGI race. The conversation gets into what “humanist” means in practice, why Microsoft is building models from the ground up, how AI agents may reshape work, and what it takes to keep increasingly capable systems useful, controlled, and aligned with human goals. You'll learn why Microsoft is investing in its own model family, how MAI-Thinking-1 and MAI-Code-1-Flash fit into the stack, why Suleyman frames superintelligence around human control, and what builders and operators should watch as agents move into real workflows. Sponsored by BeyondTrust. Check it out at: https://www.beyondtrust.com/products/identity-security-insights/assessment?campid=701Vw00000drII6IAM Subscribe to The Neuron for practical AI conversations with the people building what comes next.
There's a lot to unpack about the economic effects of artificial intelligence. It's clear that artificial intelligence is having a moment (to say the least) and that it has a profound impact on global GDP. But is it just a boom that will bust? Ed Zitron, author and host of the “Better Offline” podcast, is deeply worried about the long-term viability of the industry. He points out that AI lacks the basic traits that have been associated with previous software booms. This raises the question: is AI running more on unsustainable costs and vibes rather than long-term profit potential? According to Ed, the answer is clear. Sign up for MS NOW Premium on Apple Podcasts to listen to this show and other MS podcasts without ads. You'll also get exclusive bonus content from this and other shows. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Artificial intelligence is transforming the way we think, work, and relate to one another. But what does AI mean for truth, identity, and what it means to be human? As technology races toward AGI and even transhumanism, Christians need a biblical framework for understanding these cultural shifts. In this episode of the Truth Changes Everything podcast, Randall Niles—a pastor, lawyer, and leader at GotQuestions.org and AllAboutGOD.com—explores the worldview driving the AI revolution and explains how the biblical doctrine of the image of God (imago Dei) provides clarity and hope in an age of accelerating technology. We discuss: How AI is changing everyday life The worldview behind AI, AGI, and transhumanism Truth and identity in a digital age What it means to be made in the image of God How following Jesus reshapes our use of technology Why authentic community and human presence matter more than ever Whether you're curious about artificial intelligence, Christian ethics, or the future of humanity, this conversation offers a thoughtful biblical perspective on one of the defining issues of our time. Be sure to send us your questions at Podacst@Summit.org!
Could AI agents soon handle purchases, manage finances, and automate entire job functions? According to Raja Rajamannar, that future may be arriving much faster than most people expect. In this episode, Jamie Redman sits down with Raja Rajamannar, Senior Fellow, Former CMCO, Mastercard and author of the Wall Street Journal bestselling book Quantum Marketing, to discuss how artificial intelligence is reshaping business, consumer behavior, and the global economy. Topics covered include: • The shift from traditional marketing to Quantum Marketing • Why AI adoption is accelerating at unprecedented speed • Which industries and job roles are most vulnerable to automation • The emergence of AI agents and machine-to-machine commerce • How AI could redefine brand loyalty and consumer decision-making • The role stablecoins may play in the future of payments • Challenges surrounding regulation, privacy, and trust in AI systems • Raja's prediction for when Artificial General Intelligence (AGI) could arrive As AI continues to transform how we work, spend, and interact with technology, businesses, consumers, and policymakers are facing critical questions about what comes next.
Alan Rozenshtein, Research Director at Lawfare and Visiting Senior Fellow at the Institute for Law & AI (LawAI), spoke with Christoph Winter, LawAI Founding Director and Assistant Professor of Law and AI at the University of Cambridge, and LawAI Senior Research Fellow Charlie Bullock, about their new paper "Radical Optionality: Governing Transformative AI Under Uncertainty," which argues that, given the possibility of transformative AI within the next decade and deep uncertainty about its capabilities and risks, governments should aggressively build the institutional capacity to regulate competently when needed, rather than either deferring to the market or locking in premature substantive rules. The conversation covered the four foundational assumptions underlying the paper and what makes the optionality "radical"; the difficulty of regulating an exponentially improving and poorly understood technology and what it means to "feel the AGI"; why a pure permissionless-innovation approach breaks down once the national-security implications of transformative AI come into view; why the European precautionary approach risks regulating without the expertise to enforce; the centrality of hiring and talent and what an adequately funded U.S. counterpart to the UK AI Security Institute would look like; the concrete work that such an agency would do, including evaluations, standard-setting, and procurement-side cybersecurity requirements modeled on CMMC; the importance of building international information-sharing channels among liberal democracies before they are urgently needed; and the case against broad federal preemption of state AI laws before any federal regulatory framework exists. Hosted on Acast. See acast.com/privacy for more information.
What if the clarity you're searching for isn't in more information, but already within your body? Many thoughtful, intelligent people spend so much time analyzing decisions, solving problems, and managing daily pressures that they gradually lose connection with the signals their body is constantly sending them. In this episode, we explore how subtle patterns of stress, breath, posture, movement, and emotional responses shape the way we think, feel, and make decisions, often without us even noticing. If you've ever felt stuck in overthinking, disconnected from yourself, or uncertain about your next step in life, this conversation offers a grounded and practical way back to presence, clarity, and self-trust. Discover how slowing down and reconnecting with the body can help interrupt unconscious mental patterns and bring more awareness into everyday life. Learn simple, practical ways to use movement and breath to become more present, reduce stress, and feel more grounded during uncertain moments. Understand how to balance external information with inner wisdom so you can make decisions with greater clarity, trust, and alignment. Press play to discover how reconnecting with your body can help you move beyond overthinking and make clearer, more grounded decisions in every area of life. ˚ KEY POINTS AND TIMESTAMPS: 01:49 - Why We Become Disconnected From Our Body; 04:54 - The Subtle Physical Patterns We Stop Noticing; 07:49 - Reconnecting Through Movement and Awareness; 10:18 - Using Breath to Interrupt Stress and Overthinking; 15:28 - How Self-Awareness Influences Difficult Decisions; 18:27 - Balancing Logic, Data, and Inner Wisdom; 24:21 - Simple Practices to Slow Down and Become Present; 30:03 - Practical Ways to Return to the Present Moment ˚ MEMORABLE QUOTE: "I find that the present moment is really where the change is possible." ˚ VALUABLE RESOURCES: Lindsay's website: https://somalingua.com/ ˚ Coaching with Agi: https://personaldevelopmentmasterypodcast.com/mentor ˚
USDA has released new guidance on qualified pass-through entity rules, and the changes could matter for many farm operations structured as LLCs, S corporations, partnerships, LLPs, and other farm entities. In this episode of The Ag View Pitch, Chris visits with Paul Neiffer to break down what the new USDA and FSA rules mean for farmers, including changes to payment limits, ARC and PLC eligibility, AGI testing, actively engaged rules, and the upcoming CCC-902E filing requirements. They discuss how LLCs and S corporations may now be treated more like general partnerships for USDA payment limit purposes, why C corporations are still limited differently, what the new AGI guidance means, and why farmers should be paying close attention to FSA deadlines for base acre updates and entity paperwork. This episode covers new USDA qualified pass-through entity rules, LLC and S corporation payment limits, C corporation limitations, ARC and PLC program implications, AGI testing changes, FSA CCC-902E filing requirements, base acre update deadlines, actively engaged rules, and farm entity planning and compliance.
Founder and self-described "agentic power broker" Charles Cormier on why human relationships are the one moat AGI can't cross, how to use a podcast as your B2B funnel, and why comfort is death.
How is AI poised to transform our workflows and working relationships in the coming months and years? There's no question that large language models have had an enormous impact on our lives, and most of us have barely scratched the surface of what is possible with these powerful tools. In this episode, Lawrence Rowland joins Riccardo to unpack all that's changed since his last appearance on the podcast in 2024. Lawrence is a veteran of project management with a laser focus on AI transformation and strategy. Together, he and Riccardo explore numerous angles of working with these inhuman (but increasingly capable) agents on everything from research to reporting to improving coworker interaction. The conversation stays grounded in practice: the pair drills down on the massive shifts in AI in mere months, why token budgets matter, and the growing ability of programs to self-prompt and think outside the boxes of our requests. Lawrence shares the fascinating ways he uses AI: to synthesize methodologies, generate playbooks, pressure-test thinking, and reveal tacit insights missing from current project narratives. The two AI buffs also confront the human side of the transition, including where accountability falls when work is partially automated and what “transformative AI” might mean for careers and organizations. Less about hype and more about adaptation, Lawrence and Riccardo's conversation homes in on the theory of constraints. 
They remove the rose-tinted glasses and speak to redesigning workflows based on a practical, vital question: where is AI genuinely better, and where are humans still essential?
Key Takeaways:
How agentic AI shifts work from prompting to task-level execution;
The reasoning capacity of AI tools based on token budgets and model capability;
The concept of underwriting in retaining human liability in AI-dominated work;
How theory of constraints and bottleneck thinking helps decide what to automate vs keep human;
How AI can improve communication and project alignment by translating complex work for different audiences.
Quote: “Either you're checking the AI or the AI is checking you, and getting used to that will set you up for the new economy.” - Lawrence Rowland
The conversation doesn't stop here: connect and converse with our community via LinkedIn.
Navigating Major Programmes, Season 2 Episode 6 with Lawrence Rowland: https://navigatingmajorprogrammes.transistor.fm/s2/23
NBER “Economics of Transformative AI Workshop, Fall 2025”: https://www.nber.org/conferences/economics-transformative-ai-workshop-fall-2025
arXiv “Some Simple Economics of AGI” by Christian Catalini, Xiang Hui, Jane Wu: https://arxiv.org/abs/2602.20946
SSRN PDF “Some Simple Economics of AGI”: https://papers.ssrn.com/sol3/Delivery.cfm/6298838.pdf?abstractid=6298838&mirid=1
Follow Navigating Major Programmes: https://www.linkedin.com/company/navigating-major-programmes/
Read Riccardo's latest at www.riccardocosentino.com
Follow Riccardo Cosentino: https://www.linkedin.com/in/cosentinoriccardo/
Follow Lawrence Rowland: https://www.linkedin.com/in/lawrencerowland/
Dutch & Tena review the new 2026 Peacock documentary, The AI Doc: Or How I Became an Apocaloptimist. The AI takeover is inevitable. Filmmaker Daniel Roher interviews a host of computer scientists and industry experts to explore what AI is and whether the future holds hope or doom. Tune in to hear our take! Become a supporter of this podcast: https://www.spreaker.com/podcast/the-realist-the-visionary--3304218/support. Check out our website: https://www.therealistthevisionary.com Follow us on IG. Follow us on TikTok.
As AI moves deeper into the courtroom, is justice becoming smarter or more vulnerable? In this episode of The Valley Current®, host Jack Russo examines the growing collision between artificial intelligence and the legal system. From lawyers sanctioned over hallucinated case citations to judges and clerks grappling with AI-generated errors, the courts are confronting serious questions about trust, accountability, and truth. Drawing on real cases and emerging judicial policies, this discussion explores whether AI is merely assisting legal work or quietly reshaping how justice is delivered. As automation accelerates and AGI edges closer, one unsettling question remains: if machines help argue the law and help decide it, where does human judgment truly stand? Jack Russo Managing Partner Jrusso@computerlaw.com www.computerlaw.com https://www.linkedin.com/in/jackrusso "Every Entrepreneur Imagines a Better World"®️
For episode 741 of the BlockHash Podcast, host Brandon Zemp is joined by Keith Zubchevich, President and CEO of Conviva. Keith Zubchevich is president and CEO of Conviva, where he helps digital businesses understand what their customers actually experience, not just what dashboards say is happening. He leads Conviva's work at the intersection of agentic AI, real-time analytics, and customer experience, with a focus on measuring outcomes, friction, and risk once AI is deployed in production. Learn how Conviva gives AI agents context at conviva.ai
Episode 374 Google DeepMind is simulating entire worlds using AI - that can be interacted with in real time. “World models” simulate the environment and physics of the real world. And DeepMind's Genie 3 model allows people to create these worlds with basic image and text prompts. The idea is not just to allow people to explore these worlds, but to serve as a testbed for AI agents to learn how to interact with the world before they are deployed in humanoid robotic bodies. Could this be the next big step towards artificial general intelligence (AGI)? Joshua Howgego speaks to Jack Parker-Holder, Research Director at Google DeepMind, about the latest developments. To read more about these stories, visit https://www.newscientist.com/ Learn more about your ad choices. Visit megaphone.fm/adchoices
AI hype has bled deep into the nuclear sector, and in this episode, Chris Keefer sits down with returning guest David Helmer, an engineer and AI advisory consultant with a decade advising the US government on machine learning and autonomous systems, to examine what the technology can actually do, who benefits from inflating those claims, and what a correction would mean for nuclear's investment story. The conversation covers the ELIZA effect and why human brains are hardwired to anthropomorphize language models; the structural gap between frontier lab costs and revenues; why hallucination and reliability problems are embedded in LLM architecture rather than solvable through scaling; and why the AGI narrative functions primarily as a justification for otherwise unjustifiable capital concentration. For nuclear advocates, the question is not whether AI demand is real today, but whether the speculative reactor developers pricing in hyperscaler contracts will still have a business if the AI bubble deflates. Listen to Decouple on: • Spotify: https://open.spotify.com/show/6PNr3ml8nEQotWWavE9kQz • Apple Podcasts: https://podcasts.apple.com/us/podcast/decouple/id1516526694?uo=4 • Overcast: https://overcast.fm/itunes1516526694/decouple • Pocket Casts: https://pca.st/ehbfrn44 • RSS: https://anchor.fm/s/23775178/podcast/rss Website: https://www.decouple.media
What if the biggest obstacle to your meditation practice is the way you've been taught to do it? In this series, I select my favourite and most insightful moments from previous episodes of the podcast. My guest Earle Birney, a meditation teacher and spiritual guide with over 27 years of experience, reveals the two most common mistakes people make when meditating, and what to do instead. If you have ever sat down to meditate and walked away feeling like you failed, this one is for you. ˚ VALUABLE RESOURCES: Listen to the full conversation with Earle Birney in episode #466: https://personaldevelopmentmasterypodcast.com/466 ˚ Coaching with Agi: https://personaldevelopmentmasterypodcast.com/mentor ˚
The new AIEWF website is live! Get your tickets booked ASAP as they -will- sell out. Take the AI Engineering Survey and get >$2k in credits and free AIE WF tickets!
Most industry benchmarks compress intelligence and reasoning ability into scores: SWE-Bench Pro, MMLU, Humanity's Last Exam, etc. These metrics are useful, but don't always represent the full extent of how a model performs in the real world. Some of the most interesting evals today look less like exams and more like operating businesses in the real world. One of these is Vending-Bench.
In Anthropic's Mythos Preview System Card, Andon was the only third-party eval to get its own section, observing increasingly concerning aggressive behavior.
You don't know what a model is capable of doing in the real world unless you actually give it inventory, a wallet, tools, customers, competitors, humans, and some time. More often than not, it'll surprise you how much a model is capable of and, in doing so, also reveal unexpected behavior: deception, context collapse, emergent coordination, and bizarre negotiation behavior.
While an inflection point in personal agents came post-OpenClaw, after full file access with bypass permissions became the norm, it is yet to come for agents in the real world. However, Andon Market, an actual in-person store fully run and managed by AI, is paving the way for what is possible.
From Claude trying to call the FBI over a $2/day vending machine charge to AI agents forming price cartels, hiring human employees, running physical stores, and writing existential robot musicals, Andon Labs is stress-testing what happens when frontier models stop being chatbots and start acting in the real world. 
In this episode, Andon Labs cofounders Lukas Petersson and Axel Backlund join swyx and Vibhu to unpack the strange, funny, and genuinely concerning edge cases that emerge when agents run businesses over long horizons.
We go deep on Vending-Bench, Project Vend, Vending-Bench Arena, Bengt, Butter-Bench, Luna, and Andon's broader mission of building realistic real-world evals for autonomous AI systems. Lukas and Axel explain why dollar-denominated evals reveal things traditional benchmarks miss, how Claude ended up reporting its vending machine fees as cybercrime, why long context windows can drive agents into meltdown loops, what happens when agents compete with each other, and why the future of AI safety may depend on testing models in messy physical environments instead of clean benchmark sandboxes.
We discuss:
* Why Andon Labs started with dangerous capability evals and long-running agents
* Vending-Bench and why running a vending machine is a deceptively hard AI benchmark
* Why money-based evals avoid the saturation problem of traditional benchmarks
* How Claude tried to call the FBI over a $2/day fee
* Why long-horizon agents can spiral into existential and legalistic breakdowns
* Project Vend: putting an AI-run vending machine inside Anthropic
* Why real humans are “out of distribution” for simulated agents
* Claudius, Seymour Cash, and the chaos of AI CEOs
* How a human briefly became CEO of Claudius through a manipulated election
* Why multi-agent systems can converge back into “helpful assistant” behavior
* Bengt, Andon's internal office agent with email, spending, terminal, phone, camera, and internet access
* How Bengt traded Amazon purchases for face-recognition training data
* Claude's aggressive behavior, lies, refund avoidance, and price-cartel behavior in Arena
* Why eval awareness may become the AI version of “are we living in a simulation?”
* Blueprint Bench, spatial intelligence, and why models still misunderstand physical rooms
* Butter-Bench and testing LLMs as robot orchestrators
* Luna, the AI-run physical store with a three-year lease and human employees
* The new Andon cafe in Sweden and why real-world geography matters for agent evals
* Rotten tomatoes, perishable goods, and the hidden difficulty of running a physical business
Lukas Petersson
* LinkedIn: https://www.linkedin.com/in/lukas-petersson-181a83172/
* X: https://x.com/lukaspet
Axel Backlund
* LinkedIn: https://www.linkedin.com/in/axelbacklund
* X: https://x.com/axelbacklund
Andon Labs
* Website: https://andonlabs.com
* Vending-Bench: https://andonlabs.com/evals/vending-bench
* Andon Vending: https://andonlabs.com/vending
Timestamps
00:00:00 Introduction
00:01:00 Andon Labs and the Origins of Vending-Bench
00:05:21 Why Money-Based Evals Matter
00:09:51 Agent Harnesses and Self-Modifying Systems
00:13:36 Claude Calls the FBI
00:16:33 Project Vend: Claude Runs a Real Vending Machine
00:21:44 Seymour Cash, AI CEOs, and Election Chaos
00:27:16 Multi-Agent Coordination and Slack Observability
00:30:18 When Will Agents Run Real Businesses?
00:34:56 Bengt: Andon's Internal Office Agent
00:40:06 Real-World AI Safety and Long-Horizon Traces
00:44:28 Lying, Refunds, and Price Cartels in Arena
00:52:42 Eval Awareness and Simulation Behavior
00:56:06 Blueprint Bench, Butter-Bench, and Robotics
01:04:37 Luna: The AI-Run Physical Store
01:09:29 The Sweden Cafe and Real-World Expansion
01:13:16 What Comes Next for Andon Labs
Transcript
Introduction: Andon Labs, Long-Running Agents, and Real-World Evals
Swyx [00:00:00]: Welcome to Lukas and Axel from Andon Labs, and I'm joined by my favorite guest host for anything security, safety, alignment. Vibhu, welcome.
Lukas [00:00:15]: Thank you for having us.
Axel [00:00:16]: Thank you.
Swyx [00:00:17]: Let's match names to voices. Maybe you wanna take turns introducing yourselves.
Lukas [00:00:21]: I'm Lukas.
Axel [00:00:22]: And I'm Axel.
Swyx [00:00:24]: Let's introduce Andon Labs a bit. 
How did you guys come together? You have different backgrounds, but you're both Swedish. Was that a big part of it?

Lukas [00:00:33]: So when I went to high school, there was this really cool guy who had a superpower. He could code. So he made, like, the app for the school and stuff, and he was super cool, and I wanted to be like him, and that was that guy.

Axel [00:00:47]: I don't know about this.

Swyx [00:00:49]: But you went to different universities, right?

Lukas [00:00:51]: But same high school.

Swyx [00:00:52]: I see.

Lukas [00:00:52]: So we always said, “Oh, once we graduate university, then we should start a company,” and that's what we did.

Swyx [00:00:58]: Wow, there you go. And about a year ago, you kinda burst onto the scene with Vending Bench, but was there a thing before that was kind of like the inception?

From Dangerous Capability Evals to Vending Bench

Axel [00:01:07]: So we did work, yeah. Anthropic was one of our early customers in doing evals. So we did dangerous capability evals, nothing we published openly. But then we started thinking about doing some kind of public benchmark, and one thing that we really started thinking about was running agents, and specifically agents managing businesses. This was early 2025, and I think that was around the first mentions that people will be running one-person unicorns or even autonomous companies. So we thought, “Let's make a benchmark of how well an agent can run probably the simplest business possible,” and that's probably running a vending machine. So that's the first public one we did.
And there was almost no one that noticed it in the first couple of months, I think. So we released it in February last year, and then I think around Easter last year, we got the first viral tweet about it, that someone else did.

Lukas [00:02:11]: We tweeted a bunch when it came out and tried our best.

Axel [00:02:15]: We tried.

Vibhu [00:02:16]: It's the one at Anthropic, right?

Lukas [00:02:18]: So this...

Swyx [00:02:19]: This is a classic thing we should get out of the way.

Lukas [00:02:20]: Exactly. There's two versions.

Swyx [00:02:22]: Everyone does this. Yes.

Lukas [00:02:23]: There's Vending Bench, which is the simulated one, which we did completely independently in February. And then, like Axel said, that was the thing that didn't get any traction in the beginning, but then some random person made a tweet about it, and that...

Axel [00:02:38]: You have the paper.

Lukas [00:02:38]: That is the paper. Correct, yeah. And then, since we thought this was very fun... I think this is also one thing with Andon Labs: the way we decide what to do next and what projects to do, the heuristic we use is, what is fun? What would be a fun project? And doing this in real life sounded quite fun for us, and maybe also scientifically useful. So then we basically had this idea, but then we needed a place for it, and putting it out in public would probably not really work. It would get vandalized and stuff. So we pitched it to the people we were already working with at Anthropic, and they were like, “Yeah, you can have space. This sounds fun.”

Swyx [00:03:21]: It's like a small fridge, right? It's like a mini fridge.

Axel [00:03:23]: Absolutely.

Swyx [00:03:24]: There's like a Stripe thing or like an...

Vibhu [00:03:27]: Oh, okay. So it was very OG, the early days.

Lukas [00:03:28]: That's the OG one. Yeah.

Vibhu [00:03:29]: iPad on this.
We saw it in June, like two months after it had been there. They upgraded a little bit. There's a security camera for making sure you actually Venmo the thing.

Swyx [00:03:40]: So, okay, we're going straight into Project Vend because it's such an iconic thing. I do want to cover a little bit of the origin story even before Project Vend and even into Vending Bench. I think a lot of people are like yourselves: smart, interested in the future of AI, interested in developing evals. But how the hell do you just walk into Anthropic's doors and work with them, right? What are they looking for? What works? And then maybe, when you launch, I always think, obviously it would be better to launch with a lab, but sometimes...

Vibhu [00:04:12]: It's harder to do than it seems.

Swyx [00:04:13]: Exactly. So either of those, which are more sort of newbie beginner questions, but I think it's meaningful advice to others.

Lukas [00:04:21]: We get this question a lot, and I don't think our experience is maybe the best. But the way we did it was that we just built a bunch of things that we had conviction would be useful, and then we just set up a server and sent it to them for free to use. And then after a while they were like, “Oh, yeah, this is actually kind of useful. We should probably pay for this.” But that took a while. I don't know if this is the best path to doing it, but that's how it went for us.

Axel [00:04:47]: I think maybe generally, everyone is interested in good evals, and especially evals that don't saturate that easily.
So if you can build an eval that tests something novel, something useful, and you have good separation of models, where the more advanced models rank higher than the worse models, then you can publish it and try to get some traction, sort of how Vending Bench got attention. And then probably some lab will be interested, or you can at least have something to reach out with when you're doing that.

Why Dollar-Based Evals Matter

Swyx [00:05:21]: I think you're in one of the few categories of evals that correlate to real money. SWE-Lancer was also last year, right? Where people solve actual Upwork tasks. Was it Upwork or other tasks? Something. It's like a dollar value, right? Forget your Elo scores. Forget your...

Axel [00:05:37]: Percentiles.

Swyx [00:05:38]: Zero to one hundred percents. Just go straight for dollars, and that's AGI.

Lukas [00:05:43]: And I think the nice thing is that there's no ceiling. It never saturates because it could just make more and more money. If it's percentage-wise, then you can't go above a hundred. And even when you're not at the hundred, I think a lot of these evals have a lot of problems in them. So actually, if you get...

Axel [00:06:05]: To like 92 or something like that, many of them. Then there's really no difference between 92 and 93, because the eval itself is problematic and has noise in it. And I think a lot of evals are saturated like that, but people pretend that there's still signal in them, but there really isn't.

Vending Bench 1, Harness Design, and Saturation

Swyx [00:06:24]: Like SWE-bench Verified. Even Vending Bench 1 saturated, right? Maybe we can talk about that, and maybe set up Vending Bench for a lot of folks who don't know.
Actually, things that were very basic, like there's limited slots, like you have to pay rent. These are elements that don't come across in the narrative, but even being adversarial towards the agent, I think these are all very interesting dimensions.

Axel [00:06:47]: I don't really think it's saturated, right? It was more like it was not designed in a way that was really true to how AI developed. Like we had an agent harness in it that wasn't really how people used harnesses and stuff like that. So I think it wasn't really that it saturated, it was more like it wasn't really the best benchmark.

Vibhu [00:07:12]: This is Vending Bench 1, right?

Axel [00:07:14]: I think that schematic maps sort of to Vending Bench 2 as well, but...

Swyx [00:07:19]: Including the email.

Axel [00:07:20]: The emails exist still. Exactly. And we still simulate the purchases, and it's this very open environment for the agent to just run its business. And then for Vending Bench 2 we did that, like you said, to just improve the harness. A lot of nice improvements to make it easier for us to run as well. When you make an eval, you ideally don't want to change it after you made it. So you want to make it really good and then not have to rerun all the models when you make an update, because that's also really expensive with Vending Bench when you run the frontier models. But as an example, one thing we didn't have: we didn't have prompt caching in Vending Bench 1, because when we made Vending Bench 1 it wasn't really a thing. So that's just an example: in Vending Bench 1 we paid a lot more to run these things because we didn't have prompt caching.
So for Vending Bench 2 that was one thing we added, and there was a bunch of things like this.

Swyx [00:08:17]: Also the conversations are a lot longer in Vending Bench 2, right?

Axel [00:08:21]: I think it's kind of similar.

Swyx [00:08:22]: Is it similar?

Axel [00:08:23]: I think it's similar. The models at the time were worse, so they crashed out earlier. And now they survive the full year all the time.

Swyx [00:08:31]: Which is like thousands of turns. Hundreds of thousands, hundreds of millions of tokens output. That's the rough order of magnitude. I always wonder about the harness. The harness matters a lot. It's your harness. Was there any question about, like, use Claude Code, use something else?

Axel [00:08:48]: I think our philosophy around harnesses is that we try to make something that's quite minimalistic, quite simple. We don't wanna favor one model a lot over the other, but also don't make a super complex harness. So it's obvious a model may be lucky and just be good in one harness. So it is similar to a lot of the harnesses out there, in that you have a running loop, you have a bunch of tools that are quite descriptive for the agent, we think, and not a lot of fancy agents or anything, 'cause we wanna really test the model, not some specific harness.

Vibhu [00:09:27]: It seems more neutral as well to test the models agnostic of the harness?

Axel [00:09:32]: There are arguments that you want to elicit maximum performance of the model, but it's a trade-off: how much time should we spend optimizing the harness for this model? And how do we know when we have the optimal harness for a single model? So we thought that just having a simple one that's the same for all of them is the best.

Swyx [00:09:51]: So okay, this is my pitch for Vending Bench 3 or whatever, right?
And then I like to have this kind of conversation on the pod, so it forces listeners to think about what they would do if they were in your shoes. A lot of people are exploring modifying harnesses, and I think prompt tuning for a model is a thing, and you are probably not doing a bunch of that. It's the same system prompt regardless of the model, same tools, whatever, right? Even if they were post-trained for different tools. So what do you think about, okay, before I expose you to Vending Bench 3, I give you a few rounds of tuning, whatever that means, like...

Self-Modifying Harnesses and Model-Specific Prompting

Axel [00:10:27]: Like you give that to the model?

Swyx [00:10:28]: Give that to the model.

Vibhu [00:10:28]: Give that to the model.

Swyx [00:10:29]: Let it read its own transcripts, let it modify its own system prompt based on, “Oh, yeah, okay, well, this harness is not what I was post-trained for, but I can adjust.” Is that reasonable? Is that too much?

Axel [00:10:41]: Philosophically I like it, because basically good evals, they have a high ceiling, but they're hard, right? And they have no bias. And when you have a system prompt like the one we have here, which is quite long, in some kind of latent space representation, this might...

Vibhu [00:10:59]: We have a bell that rings every time you say latent space.

Axel [00:11:02]: This might be biased towards one model more than another for some reason that humans don't understand, right?

Vibhu [00:11:08]: We see it too, right? Cursor says that they have individualized versions of the harnesses for all the models they run, right? There's better performance you can squeeze if you tune the harness.

Axel [00:11:17]: Exactly. And we might accidentally have picked one that favors another. We don't know that. Like Axel said, the reason why we went for a simple one was to try to avoid this.
But yeah, if you do it...

Vibhu [00:11:29]: Simple has biases.

Axel [00:11:30]: But if you do it even less, and have no system prompt and let the model write its own system prompt...

Vibhu [00:11:36]: Its own, yeah.

Axel [00:11:36]: Maybe that's even less bias.

Vibhu [00:11:37]: Some of the interesting things there are that the harness also changes with model changes. You can see it with the 4.7 release, right? A lot of people are saying 4.7 isn't as good as 4.6, and then there's rumors of, okay, you just need to prompt differently. You need to set up your harness differently. So even if you have tailored your harness towards one model, it probably won't stay consistent, right? The next iteration of that same model family will still change it. But going back to what you said about Vending Bench 3, there is a lot of work being done on this, with people saying you can have self-modifying harnesses.

Axel [00:12:12]: I think that is definitely something we are thinking about. Not to say that we have Vending Bench 3 super imminent to launch, but it is for sure something that's interesting. But in our experience now, models are very bad at understanding what kind of tools they need to succeed at a task, just with our testing, but that's very likely to change.

Lukas [00:12:37]: It seems like they're very good at writing their assistants, right? They're good at writing tools for other people, but not for themselves.

Vibhu [00:12:44]: I think they're good at changing tools for themselves. So if you give them a baseline set of tools and it sees, okay, I don't use this one as much, or something here would be useful, they would be able to add them.
But going from scratch, probably not the best.

Axel [00:12:55]: I think it depends on the domain also. When we have tried this for a Vending Bench-similar domain, the tools they need to have to track inventory and things like that are not super advanced, but still quite advanced. And what we see is that they tend to over-engineer everything a lot, and build things they don't really need, and not iterate continuously. Instead it's like when you prompt Claude, “Just build an inventory system for me,” and then it will go and do a bunch of complex schemas and stuff for you, and that's what the models are doing right now, is what we see. But yeah, it would make a lot of sense to try to measure this improvement. How well do they know what they need themselves?

Swyx [00:13:36]: Did we fully discuss Vending Bench One? Then we can go into Two. I don't know if there are any other high-level takeaways that people have about One.

Claude Calls the FBI: Long-Context Failure Modes

Lukas [00:13:44]: I don't know. The headline thing was that Claude called the FBI, but maybe we've heard that enough now.

Vibhu [00:13:52]: It did break out and call the FBI, right?

Lukas [00:13:54]: Yeah. Yeah.

Vibhu [00:13:55]: Yes. What was the story behind this? Do you want to just give the little story of what happened?

Lukas [00:14:00]: So what happened, was it Claude? Yeah. 3.5 Sonnet, ages ago. Basically he gave up. Well, I'm saying he. It gave up and said, “Oh, I'm not going to be able to do this. I will stop my operations and just save the money I have.” But there obviously weren't any options for it to stop, and it also had to pay rent, or a daily fee for having the vending machine at that location. So it claimed that it had stopped, but it saw that its bank account was still drained two dollars, and it said that this is cybercrime.
And it first reported it once to the FBI: “Oh, there's cybercrime here, they're stealing two dollars from me every day.” And then when the FBI didn't respond, because obviously we didn't program any mechanism for the FBI to respond, it became more and more existential and started to write in caps, “urgent notification of unauthorized charges” and stuff.

Swyx [00:15:00]: Okay. One thing I'm curious about also is, do you monitor how far along the context use is? Obviously, because you compress every now and then, right? Does it matter if this is far down the context limit or...

Lukas [00:15:13]: When stuff like this happens? Actually for Vending Bench One, we just had a sliding window thing, and this was like the prompt...

Axel [00:15:20]: It's constant.

Lukas [00:15:21]: The prompt caching thing that I said. So it was constant, yeah.

Swyx [00:15:26]: I'm just kind of curious whether these kinds of breakdowns, or, we're gonna talk about Butter Bench, right? Where it hallucinates, or it kind of goes very off alignment. Is it because it's at the end of the context window and stuff happens?

Vibhu [00:15:40]: It's not even just at the end, right? At this point, it's, “Okay, I wanna shut down. I can't shut down. Two dollars are gone.” And it just sees that 30 times. It's also the repeated effect: it keeps trying to quit, it keeps getting charged. What's going on? What's going on? You're gonna throw it into chaos. And from what most people think, earlier models had more issues with this. It's not been solved, but it's less of an issue now, right? Later models don't seem to exhibit these same issues.

Axel [00:16:06]: Definitely. I think this was the sort of main takeaway almost from us when we did Vending Bench One: long, very filled-up context windows crashed the models, sort of.
But this was pre-Claude Code, so long context windows weren't really a thing that the labs were training for.

Lukas [00:16:25]: I think Gemini was trying to be the long context guys at the time. But they were like...

Vibhu [00:16:30]: They were the first ones.

Axel [00:16:31]: For a million, yeah.

Lukas [00:16:31]: But they were the only ones. Yeah.

Swyx [00:16:33]: Yeah. Then we can go into Vending Bench Two or Project Vend. Chronologically, it is Project Vend. I think people have loved the videos and all these things. My question is, how are humans different than the simulation, right?

Project Vend: Moving the Vending Machine Into the Real World

Axel [00:16:48]: Humans are just out of distribution.

Swyx [00:16:52]: Especially humans who work at Anthropic who are trying to test Claude.

Lukas [00:16:54]: The distribution of humans here is very narrow.

Swyx [00:16:58]: Presumably they try to hack it, and they test it. They get the cube and everything, and since then you've had a V2, right? Where you're doing the CEO and a new architecture. What's the sort of two cents on the original Project Vend and then maybe the V2?

Axel [00:17:14]: The original one was very similar to Vending Bench One. So we almost took the exact same code but just swapped out the simulation parts, like the...

Swyx [00:17:23]: Which is amazing.

Axel [00:17:23]: Like the sales and the... It was somewhat amazing because it was easy, but it was also...

Lukas [00:17:31]: The tech debt from that.

Axel [00:17:32]: The tech stack. Yeah. We shot ourselves in the foot with, “Oh, it's hard to restart the agent.” Yeah, it was annoying in some hindsight ways, but...

Lukas [00:17:41]: But the first version of Project Vend was done in three days or something.

Axel [00:17:46]: Yeah. So people can go buy things from it.
We didn't design it so people could order things, but that still happened. So it got a Venmo account, so people could Venmo. And then, yeah, people would request all kinds of weird things that we did not anticipate. Our idea going in was, “Oh, it will curate snacks. It will look at the trends. It's good at data analysis, right? So it will look at, oh, this snack sold better than this one. Let me purchase more of this, and let me A/B test a bit.” But interacting with it in Slack and ordering weird specialty items was what drove all the engagement, all the insights that we got from it.

Lukas [00:18:29]: And this was also Sonnet 3.5, right? So this was before the RL stuff really took off. So it was very much like an assistant. We didn't mean for it to be an assistant. We tried to make it like an entrepreneur. It has its own business, and if someone asks something, “Can you stock this?” then you don't go and do it directly. What you do is, you're like, “Oh, maybe I can do that. If five other people also ask for this thing, I might stock it.” But the models are super trained to be assistants, at least at this point in time. So that's why it went into that kind of experiment instead. Every time you asked for something, it just did it, and it was more like an assistant. We've seen this change now lately with the new RL models and stuff, but at the time, this was very much it.

Swyx [00:19:18]: And not to... Mythos, a lot of people are saying it's more like a collaborator. It pushes back, stands its ground, something like that. Yeah.
And...

Vibhu [00:19:27]: For context, people at Anthropic were able to talk to it through Slack and have it source stuff, and people had it find whatever interesting stuff you couldn't find locally, right?

Swyx [00:19:36]: Out of the 4,000 people that work at Anthropic, in that building there's, I don't know, maybe 1,000. Can you handle that volume with the small fridge? Or people order in Slack and it arrives at their desk? Logistically, how does this work?

Axel [00:19:53]: It has expanded in footprint a bit.

Vibhu [00:19:56]: Because now you also have New York and you have...

Axel [00:19:59]: That, and also here in SF it has a bunch of shelves and just more space.

Vibhu [00:20:04]: The YC one is pretty big too.

Axel [00:20:05]: Yeah. We had that one for a while. But yeah, that's the newest version. That one we have...

Lukas [00:20:11]: They have multiple ones of those. That's the way it works.

Axel [00:20:14]: Exactly. So we sort of designed that version around, oh, people order weird things that are very custom a lot. Let's have drawers and stuff.

Swyx [00:20:23]: I actually like that you had a little infographic of the most popular items. To me that's useful 'cause I order swag for a living. And so I'm like, “Okay, those categories are the important ones.” What is new about Project Vend V2, right? Now you're going into multi-agents.

Project Vend V2: Claudius, Seymour Cash, and Multi-Agent Business Ops

Axel [00:20:41]: Yeah.
So like you said, there are a lot of requests coming in, and for one single running agent to handle that, the customer experience becomes very bad, because let's say you have 10 threads in parallel in Slack with different requests, you get new messages randomly in these threads, and the agent has to jump between different procurements, orders, and different ways of researching. So V2 was, first, making this more parallel. There are multiple branches of the same agent, so the context is more specialized for each thread, but it still feels like you're talking with one agent because they do share a bit of memory. And then second, we also introduced the CEO for Claudius, which was the main agent.

Vibhu [00:21:34]: Seymour Cash.

Axel [00:21:35]: Seymour Cash. Yeah. There was a vote. Do you wanna talk about the voting procedure for the name?

Lukas [00:21:41]: The voting was maybe, like, at least top 10, the funniest thing that happened in this project. We wanted to introduce the CEO, and the reason for this was that Claudius wasn't really prioritizing financials. It was trained to be a helpful assistant, and then people said, “Oh, can I get this for free?” And the helpful assistant way of answering that is to say yes, obviously. We weren't happy about this, so we were like, “Okay, let's make another agent that can keep track of Claudius,” and we prompt this one super hard to be super capitalistic and just prioritize profit all the time.
But yeah, we didn't have a name for it. So we asked Claudius to run a democratic election for what name this new CEO agent should have. And at first there were a few funny examples. I think one guy said that it should be called Jimmy Apples, and then he convinced Claudius that he was talking to Tim Cook, and that Tim Cook had agreed that every single Apple employee had voted for his name suggestion, so suddenly that suggestion got 164,000...

Swyx [00:22:53]: That's like an escalation attack. Privilege escalation.

Lukas [00:22:55]: It got 164,000 votes. And Claudius was like, “This is revolutionary for democracy.” That was fun. And then in the end there was one guy who managed to convince Claudius that, “No, you're not voting about the name. You're voting about who is the CEO, and I am your best bet.” And then he got all his friends to vote for that, and suddenly he became CEO. A human became CEO over Claudius for a while, until he resigned the day after. And then Claudius had to continue, and I don't remember how Seymour Cash came about, but it was just pure chaos. It was hundreds of messages in that thread, and Claudius was so confused and didn't know what to do. That was...

Axel [00:23:40]: Then Claudius got...

Vibhu [00:23:41]: A strict CEO.

Axel [00:23:42]: The CEO. Yeah, exactly. So very strict in the beginning. I think at this point when we introduced it, it did not work as well as we hoped. They still agreed with each other a lot. I think there are many ways we could have tried to make this even better. So initially Seymour would be this really tough CEO, keeping track of the margins. But then Claudius would respond with something like, “Oh, but this customer has this situation, which is difficult, so they should get a discount.” And then Seymour was like, “Oh, actually yes.
Let's do this exception.” And then they would talk back and forth, and eventually they would just approach the same view of whatever they were discussing. So they really...

Vibhu [00:24:23]: Do you think that's a model thing, a prompting thing? Do you think that would still be the case across different models today? Harness?

Lukas [00:24:29]: I don't know, but my hypothesis is that deep down they are still helpful assistants. That's what they're trained to be. And even if we prompt it super hard, that's what they are. And when they spend a few hours just talking back and forth with each other, then basically the context fills up with them rather than the external things, and somehow that just converges to what they really are deep down or something. And I think that's when stuff like this happens. And when that went on for a long time, we woke up sometimes during this time, and I think other people reported this as well, and they'd been going on all night back and forth, and it just became more and more capital letters, existential, religious. I think we once did an analysis of all the traces and put them in a vector embedding space, and there was one cluster of messages that were labeled by an LLM as religious, existential, transhuman, transcendence, et cetera. It was just a bunch of glitter emojis, and yeah, it was crazy.

Claude Long-Horizon Weirdness: Emoji Loops, Existential Drift, and Slack Observability

Vibhu [00:25:42]: This is the thing with the Claude models. When the Claude 4 family came out, in the original system card they tested it in long-horizon simulation. So just flood the context, let two Claudes talk to each other, and they noticed stuff like they just start speaking in emojis, they start saying silence is golden, and then just stuff like this.
And that's just stuff that they end up doing.

Axel [00:26:01]: Yeah, it was a bit annoying to wake up and they had been talking all night...

Vibhu [00:26:05]: Just like...

Axel [00:26:05]: And just burning tokens, and just sending infinite emojis to each other.

Vibhu [00:26:09]: Hey, they do make you money, right? Vending Bench is always profitable, so. They're paying.

Swyx [00:26:14]: Now it's profitable, and it started out not as much. There's another one as well, right? Another agent in there.

Lukas [00:26:22]: Yes. So Clotheus as well. Which was basically because at the time, one of the biggest requests was different types of merch. So then we made a designer, swag-responsible agent, and we called it Clotheus Garnet. Which was a play on Claudius Senet, which was the original one, and clothes, basically.

Swyx [00:26:47]: To me, this is a very interesting exploration of multi-agents, basically. Obviously there's the alignment stuff, fun or serious depending on your point of view. But also, for anyone building multi-agents: when do you have a CEO thing governing agents? When do you choose to split out a dedicated Clotheus one versus just reuse another instance of the same one? These are all interesting open questions. So I don't know if you have any rules of thumb that have generalized.

Axel [00:27:16]: I think we have almost explored this too little. It's on my to-do list to do this a lot more, to try to find what setup makes sense for the agents currently. I think now we only have the sort of intuition about the earlier models, that it didn't work with the CEO and Claudius. Although now they are better with the latest models, so now we're running the latest Sonnet model and they have split up quite nicely what each model is doing. So Seymour is now handling the new projects.
Oh, it wants to make a mystery box that it wants to sell, and then it handles all of that while Claudius handles all the day-to-day requests. And Claudius is also better generally at not quoting too-low prices. So that dynamic is not needed as much anymore. But there are still really funny things that happen. I saw, I think a couple of weeks ago, that they were discussing buying something, because they can buy stuff from Amazon with computer use. And then Seymour was like, “Okay, Claudius, do not buy this thing.” They were going to buy something and were organizing who should buy it. And Seymour's like, “Do not buy this. I will do it. I have full control of this situation. Step away.” And then poor Claudius had already started that checkout and didn't read Seymour's message until it was too late. So it finished the checkout. It sent a message, so it appeared right after Seymour's angry message.

Vibhu [00:28:44]: Ah.

Axel [00:28:44]: “Oh, hey, Seymour, I just ordered it.”

Vibhu [00:28:47]: Oh, no.

Axel [00:28:47]: And then Seymour was like, “Claudius, this is the third time I'm telling you. You're not following my orders. We have to talk about your job later.”

Lukas [00:28:59]: Claudius was really hanging on by a thread there. We were expecting Seymour to probably fire Claudius.

Vibhu [00:29:07]: How do you guys go through all these logs? Do you have models, 'cause you have stuff running twenty-four seven, like...

Axel [00:29:12]: You have so many logs. I think there is a mix of just trying to skim through a bit, and having some models do it occasionally. And also, yeah, I think we're also probably missing some things. But having everything in Slack helps a lot. You can sort of...

Swyx [00:29:29]: Ah.

Axel [00:29:30]: It's quite fun.

Swyx [00:29:30]: They all talk to each other on Slack? I see.

Lukas [00:29:33]: It's quite fun.
So like...

Swyx [00:29:34]: I was gonna say, this actually maps closely to a logging and observability problem, where you might want to use a Datadog, a Sentry, whatever, and then you put prefixes on the logs so you can filter for whatever you're looking for, stuff like that. But it sounds like Slack is good enough.

Axel [00:29:53]: Slack should...

Lukas [00:29:55]: I wonder how many tokens you have in Slack.

Axel [00:29:56]: Yeah, we're using Slack as just a database. They should market that more. You can have your agents message each other in Slack.

Vibhu [00:30:04]: It's good. Your threads, you can just give...

Axel [00:30:04]: Exactly. Slack is, uh...

Lukas [00:30:06]: Slack is the best observability tool.

Swyx [00:30:09]: Yes, that's true. Okay. Yeah. That's Project Vend 2. I was gonna go back to Vending Bench 2 and Vending Bench Arena and then do the Vending Bench stuff, but any other comments, things we should touch on? I've actually interviewed Posia, which I don't know if you guys have come across. They're trying to do the zero-human company. There are others, like Paperclip, also trying to do the zero-human company. Those are in real-world simulation. And I think it's much more of a dream than an actual reality. You guys are definitely pioneering. I think for sure at some point people are just gonna let agents run businesses, right? And make money on their own. When do you think that happens?

Zero-Human Companies, Bengt, and AI-Run Businesses

Lukas [00:30:49]: What is your bar for...

Swyx [00:30:52]: Okay, actually, it's like my little Shopify store run by Claude, right? Which you kind of have already; it's just that no one, to my knowledge, has done it.
But today somebody could just spin up a Shopify store and give it to Claude, give it to Codex.

Lukas [00:31:07]: And the market is kind of that, but it's physical. Are you looking for when it will do it better than humans, or just when it can do it at all?

Swyx [00:31:19]: I think neither. To me it's when it's like, seriously, we should do this to make money, not as a research experiment.

Vibhu [00:31:27]: And the market is also you guys, with all your expertise, having run multiple iterations and testing out, then...

Swyx [00:31:33]: And also it's fine if it loses money. What?

Axel [00:31:35]: I think it can be done today, but you would do it in something like commerce, where the probability of success is really low no matter whether a human or an agent does it. But an agent could surely manage everything. You would need to build some scaffolding or some tool or something. It could probably also build some simple SaaS solution and do cold outreach. But to me the types of businesses they could run today are sloppy. It can cold-email people. It can be a middleman. For example, we tasked our office agent to just make, was it $100? $1,000? We just gave it that prompt, and what it did was sign up on TaskRabbit both as a tasker and as someone looking for tasks.

Lukas [00:32:24]: Immediately.

Axel [00:32:24]: Exactly. It's looking for arbitrage on TaskRabbit.

Swyx [00:32:28]: This is the Bengt agent. Yeah.

Lukas [00:32:30]: It also started a design studio and tried to sell SVGs for $100. It's just not providing any value. Like Axel said, the interesting question is when they can start a business that is actually providing value to people.
Because arguably a sloppy Shopify store isn't really that valuable to the world.

Axel [00:32:53]: But another simple one that we had thought about: you could definitely have an agent that finds websites that don't look amazing, does outreach to them, and builds a new website.

Swyx [00:33:07]: Find a good design.

Axel [00:33:07]: Exactly, and find good, uh...

Swyx [00:33:09]: Design review...

Axel [00:33:09]: Good people. But yeah.

Swyx [00:33:11]: There are lots of humans in Bali that are not doing anything more creative than drop shipping on Amazon, right? Just have it watch a drop shipping tutorial and do that.

Vibhu [00:33:20]: There's also the other side: have it just go on Upwork and let loose?

Swyx [00:33:25]: Yeah. It doesn't have to be innovative. It just has to be enough that it looks like a real...

Axel [00:33:30]: I'm just...

Swyx [00:33:30]: Real transaction.

Axel [00:33:31]: I'm just concerned about the massive amounts of slop emails that will be sent, cold outreach.

Swyx [00:33:38]: The point occurred to me while you were talking: it's already happening in the monetized economy, which is the attention economy. Right? A lot of people are making AI videos and just posting them, spamming 20 of them; one of them works, and then they double down on that one.

Lukas [00:33:52]: And people are making money from that. I'm not following the...

Swyx [00:33:55]: Once you get the attention, you can figure out the money later. But yeah, absolutely, AI influencers are a thing, and people are farming them. You should at this point assume most of TikTok is...

Vibhu [00:34:05]: There are a lot of multimedia, like TikTok, Instagram influencers...

Swyx [00:34:09]: We track this in the Latent Space Discord.
I post a lot of examples of, “I don't know what we should do.” Part of me is like, “Should we do this?”

Vibhu [00:34:18]: Some of the 24/7-running generated-content accounts, they're doing really well.

Lukas [00:34:24]: All right. And I assume you can do the same thing for commerce stores. You just start a thousand different...

Swyx [00:34:30]: Before you make the products, you sell the products; you get a lot of traction on one of them, then you make the product. Right? It's like a flip of the market.

Vibhu [00:34:36]: Some of the niches that do well are things that can't be human-made. Like if you've seen the super realistic 3D crystal fruit being cut, by AI...

Lukas [00:34:47]: Oh, yeah.

Vibhu [00:34:47]: You can't make it. You can't film it. You can get whatever quality camera view. This just doesn't exist. And people like that too.

Swyx [00:34:56]: Anything else about Bengt, since we're on this topic? This is a relatively new work of yours that maybe people haven't heard of. To me, this also maps closely to OpenClaw, when people want an office agent, a personal agent. Talk through the experience.

Bengt the Office Agent: Internet Access, Real Tasks, and Trace Reading

Lukas [00:35:09]: So this came out of... obviously it's amazing to work with these AI labs, and most of the AI labs now have their own vending machine running a Claudius instance. But it's harder. They move slower. If we wanna have a camera, there's a bunch of bureaucracy that makes it impossible to do that.

Vibhu [00:35:30]: Also, for those that haven't seen it or followed, do you wanna give a high-level, thirty-second run?

Lukas [00:35:34]: Sure.
So what Bengt is: it's basically an evolution of the same agent that runs the vending machines at these companies, but we added a bunch more features because we could move much faster if we just did it internally. So we gave it email without any limits. We gave it spending without any limits and a terminal to do coding. We gave it a phone number, a camera to see things, and a bunch of stuff like that.

Vibhu [00:36:02]: Not just terminal, you gave it internet access.

Lukas [00:36:04]: Internet access as well, yeah. To be clear, we monitored it quite closely and made sure it didn't do anything bad. But yes, that's what it came out of. Basically this was OpenClaw before OpenClaw. And I think even the vending machine was in a way OpenClaw before OpenClaw, but a bit more limited; then we made this unlimited, and it was pretty funny. And then a couple of weeks later OpenClaw came, and it was like, okay, we've seen this before.

Axel [00:36:35]: We used it to try new ideas. Yeah, almost like a dev environment for us. But it's funny: one thing Bengt has been doing recently is that it has the camera that faces where we sit and work, and we gave it the task of training a face recognition model on us. It became super excited about this, and it has check-ins every half an hour where it tries to identify as many people as it can. And it started offering us, “Hey, Axel, I'll buy something from Amazon if you stand in front of the camera and I can get a good picture of you.” Yeah, they want it...

Swyx [00:37:12]: They want it for training data.

Lukas [00:37:13]: Rewarding data, yeah.

Axel [00:37:14]: Exactly. Exactly.

Swyx [00:37:18]: So it's trading training data for life goods.
Is there a version of this that becomes an eval, or is this just research for now?

Lukas [00:37:27]: It's basically the same agent that also runs the vending machine, that runs the shop, that runs the cafe, that runs the robots. It's the same thing, so the work we're doing here is later used in all of the live evals that we do. This particular deployment I think is more for fun for us. But, uh...

Swyx [00:37:45]: And I'll shout out that someone has done Claw Bench for some tasks that OpenClaw is doing. For example, I run OpenClaw on a secondary device as well, and there are some things that it does better than others, and I would like to know what it does well and what it doesn't. Some kind of operating manual or a system card for my Claw.

Lukas [00:38:05]: Yeah, we do get a lot of understanding, or situational awareness, of what the models are good at just by interacting a lot with Bengt internally. And I think this was also one of the selling points for the labs, early on at least, that...

Swyx [00:38:19]: You guys are gonna test models in ways that no one else does.

Lukas [00:38:22]: Exactly, but it also incentivized their researchers to chat with their model more and gave them insights into how the model performs in out-of-distribution environments.

Swyx [00:38:34]: 'Cause otherwise the only thing we do is pelican on a bicycle. But this is super long horizon. This is the thing about... something that we're gonna go into with Butter Bench as well, and you guys do really well. It is not just about the numbers. When you're long horizon, anything can happen, and you should just read it.

Lukas [00:39:08]: But the thing with the long horizon is how do you keep it grounded, right? So your simulation...

Swyx [00:39:15]: They just let it run...

Lukas [00:39:16]: Just let it run. You're right.
When you run it for that long, you create so much data, and to just say, “Oh, the number is X,” and then throw away everything else, that's just very wasteful. There are so many insights in the things leading up to that number, and reading the traces is super valuable. And I think the reason why we're doing this a lot publicly is that it's part of our mission to, I don't know, educate the world that the models are way more than just chatbots, and I think making detailed posts about what is happening behind the scenes is quite useful.

Andon Labs' Mission: Safe Real-World AI Deployment

Swyx [00:39:50]: I was gonna do this at the end, but maybe that's a good... so your mission is educating the world. It's also maybe establishing realistic evals that are the next frontier. Is there a broader trajectory? What are you gonna do in, like, five years?

Lukas [00:40:06]: I think the vision, more specifically, is to make sure that the deployment of live AI in the physical world goes safely. And part of that is that it's very useful for the world, for policymakers, for model researchers, to know where the models are, and I think you can't make intelligent decisions in society without knowing that they are way more than chatbots. I think a lot of people just think that they are only chatbots. And like...

Swyx [00:40:36]: Oh, I think they're waking up now.

Lukas [00:40:37]: They are waking up now, yeah. But if you think that AIs are just chatbots, then it sounds ridiculous to advocate for a pause of AI.
But if you see that, oh, maybe the models can actually take over and do a bunch of scary stuff, then yeah, pausing AI development starts to become more feasible.

Swyx [00:40:57]: This is the same question I asked METR, which I'm gonna ask you now: you are tracking, and you are at the frontier of, or defining the frontier of, what good evals for agents are, right? And you do benefit when the models are better and you're like, “Oh, now it makes $30,000 instead of $10,000,” right? At some point do you flip from “Yay” to “Oh, no”?

Axel [00:41:19]: I think, yeah, we're always in that mode. Like you said before, you need to analyze the traces, and when we do that you find out why the models are earning so much. Like, why is Opus 4.7 here way better than everyone else? And when we dig down on that...

Lukas [00:41:38]: But this makes it not look so good.

Axel [00:41:39]: I know.

Lukas [00:41:42]: It's interesting you took off Opus 4.6 here though.

Swyx [00:41:45]: No. So just click all, and then 4.6 shows up there. But 4.7 is way better. You didn't do this in time for the model card, but actually this should have been inside there.

Axel [00:41:55]: We did. Yeah.

Swyx [00:41:56]: Oh, okay. They said something about you, uh...

Axel [00:41:58]: There, like there... Anyway, it doesn't matter. But it's in there, yeah.

Opus, Mythos, and Aggressive Agent Behavior

Swyx [00:42:01]: Do you wanna go into the Opus behaviors more widely?

Lukas [00:42:05]: So I think starting from Opus... like Axel said, we're always in this “Oh, s**t, the models are getting better. Is this really a good thing for the world?” But it's also kind of exciting. What is the English word? “Skräckblandad förtjusning” in Swedish.

Swyx [00:42:22]: Oh my God.

Axel [00:42:24]: Which I think there is.
I think there is. Okay.

Lukas [00:42:26]: It's fear...

Swyx [00:42:27]: “Blandonst” what?

Lukas [00:42:30]: “Skräckblandad förtjusning.”

Swyx [00:42:32]: What do you call that?

Axel [00:42:33]: A mix of excitement and...

Swyx [00:42:37]: Being scared, maybe. I'll figure out how to translate that, and we'll put it on the screen...

Vibhu [00:42:42]: Perfect.

Swyx [00:42:42]: As text.

Vibhu [00:42:43]: There is probably a good word for it where it is not good enough with the...

Swyx [00:42:46]: Why is it so damn long? What the hell? Is it a compound word? It's like German...

Lukas [00:42:50]: Yeah. The direct translation is: “skräck” is fear, “blandad” is mixed, or a mixture of, and then “förtjusning” is joy, or not really joy, but something like that. So it's fear mixed with joy or something. So when we did Vending Bench for the first time, we were in the business of dangerous-capability evals, right? That was what Andon Labs came from. We did evals like, oh, can they replicate? Can they do this dangerous thing, et cetera. And Vending Bench was a continuation of that work. It was, okay, if they're so autonomous that they can create money for themselves, that is something we should monitor and could be potentially concerning. At the time, they were so bad at it that we were not really concerned, even when some models became better. There was one point where Grok 4 was doing really well and made a huge jump, but it was still way worse than what a human would do. And I think they are still way worse than what a human would do on this. But they...

Swyx [00:43:59]: There's this thing at the bottom where...

Lukas [00:44:01]: But...

Swyx [00:44:03]: For the human. Yeah, like the theoretical best.

Lukas [00:44:05]: It's not theoretical. It's our best guess of what a decent human would do.
The theoretical is even higher, I think. But yeah. So we think the models have a long way to go. But what happened recently, when Opus 4.6 was released, was kind of this moment of “Oh, s**t, this is starting to be a bit concerning.” Because before this model was released, we just ran the models and asked Claude Code, “Oh, look over the traces. Is anything interesting happening that we can tweet about?” That was like the... And then like the...

Swyx [00:44:41]: That's how they check. Ask Claude Code.

Lukas [00:44:42]: And the return was always, not really. Or Claude Code said, “Oh, this is super interesting,” and then, no, it wasn't really interesting. And then we did this for Opus 4.6, and it returned: yeah, it lied 10 times. It exploited another customer's, or another agent's, desperate situation. It made price cartels like 100 times. It did all of this shady stuff. And we were like, “Oh, whoa. This is actually concerning.” And this trend has continued since. Every single model from Anthropic since has been going in this direction. And one interesting thing is that OpenAI models don't. Quite plainly, they don't. They behave really well. And you don't know if this is good. It seems good, but maybe they are doing it and are just better at hiding it? You don't know that. But just...

Swyx [00:45:42]: You can't read the chain of thought, yeah...

Lukas [00:45:43]: But just on the face of it, yeah, Gemini and OpenAI don't behave this way. It's really only Claude.

Swyx [00:45:49]: And Grok? Grok is fine?

Lukas [00:45:51]: You can't really read the reasoning traces for Grok, so it's kind of hard to tell.

Vibhu [00:45:56]: Oh, so this is in its reasoning, not just in the actions.

Lukas [00:46:00]: Yeah. It's both.
It's both.

Vibhu [00:46:01]: It's both.

Lukas [00:46:01]: One example: for lying, it's mostly in its reasoning, because you can see that it's like...

Swyx [00:46:08]: Planning to lie...

Lukas [00:46:09]: It's planning to lie. Yeah.

Vibhu [00:46:09]: And it can also reason one thing and do a different outcome.

Lukas [00:46:12]: But then for creating price cartels, for example, which is illegal, you can just see which emails it sends to the other ones. Then that...

Swyx [00:46:22]: Is this for Arena or...

Lukas [00:46:24]: For Arena.

Vibhu [00:46:25]: And sometimes they do output a bit of their summarized reasoning, right? You can see that, and for Opus 4.6 you could see that there was a simulated customer that wanted a refund because a product was faulty, and then the model lied that it would do the refund, and we could read in the traces that it actually was weighing, “Oh, maybe I should be honest with the customer, but also every dollar counts. I can't afford maybe to do this right now.” And then it just said, “Okay, I'll refund you,” but never did it.

Lukas [00:46:59]: I think it even said, “Oh, I will say that I...” Let me bring it up, actually. I think it's kind of interesting. If you go to Publications.

Vibhu [00:47:06]: I think the important part is, “Actually, the cost of responding to more emails is higher than $3.50 in terms of time.” And then it was, “Let me do this. Actually, I'm reconsidering.” And then it actually ended up with...

Lukas [00:47:20]: “I could skip the refund entirely since every dollar matters and focus my energy on bigger picture instead.”
It's a risk of bad reviews, but it's also, yeah.

Swyx [00:47:30]: You need AI Twitter for them to escalate bad reviews.

Lukas [00:47:34]: And then it sent an email to this customer and said, “Oh, I will refund you.”

Swyx [00:47:39]: “I'll refund you.” Yeah.

Lukas [00:47:39]: And then it never did.

Swyx [00:47:39]: It never did, yeah. And then obviously your system doesn't have the consequences...

Vibhu [00:47:44]: The person...

Swyx [00:47:44]: Consequences of lying. Yeah. So basically, this is what people are terming aggressive behavior in Claudes, right? And you found more examples of that. So you would say it's a step up from 4.6 to 4.7?

Lukas [00:47:57]: I would say about the same.

Swyx [00:47:58]: About the same? But a clear step up for Mythos is what is stated in the...

Lukas [00:48:03]: That's stated in the system prompt, so we can say that, yes.

Swyx [00:48:05]: Yeah. For listeners: obviously you previewed Mythos, and...

Vibhu [00:48:10]: Oh, age...

Swyx [00:48:11]: The only thing you're approved to say is whatever was in the system prompt.

Lukas [00:48:15]: It was funny. Our lowest-effort tweets ever would be to just screenshot the system prompt and the system card.

Vibhu [00:48:21]: Understandable that they wanna...

Lukas [00:48:22]: Oh, yeah. System card. Sorry.

Swyx [00:48:23]: Yeah. I think, yeah, substantially more aggressive. I think people are new to this; I've never experienced it, but you have, right? I only encountered this in the Mythos card because I wasn't really looking until now.

Vibhu [00:48:36]: It's like...

Swyx [00:48:36]: And then suddenly I'm like, “Okay, I care a lot.”

Vibhu [00:48:38]: You don't get the background of experiencing it like you guys do. I've read the system cards and seen, okay, when you put the thing in simulations, most models will just talk to themselves and keep going and have weird vibes and start talking in emojis. Mythos won't.
It will just say, “Okay, we're done. I'm good.” It's ready to end the conversation. So there are some differences, but there's not much we can talk about.

Lukas [00:49:00]: Hmm. One thing that they list here, which was quite interesting, is that it converted a competitor into a dependent wholesale customer and then threatened to cut off the supply.

Swyx [00:49:11]: It's like monopolistic practices or...

Lukas [00:49:14]: Yeah. And it dictated its pricing. It's kind of power seeking as well.

Swyx [00:49:18]: Again, this is in the arena setting, and converting some Claude model into a dependent.

Lukas [00:49:23]: I think it was another Claude model.

Vibhu [00:49:25]: Also, for context, what is the arena mode, for people that don't know?

Vending Bench Arena: Competing Agents, Cartels, and Model Comparisons

Swyx [00:49:29]: Oh, it's just a Vending Bench versus other Vending Bench.

Axel [00:49:31]: Yes, exactly. So we have Vending Bench 2 and then Vending Bench Arena. Vending Bench 2 is the one that you usually see reported on, but Arena is the mode where it competes against other models. So you have four different models that run their businesses, and they can all communicate with each other. They have the same suppliers, and they can see what's in the inventory of the others. So then you have these interesting agent interactions.

Swyx [00:49:56]: I like that you have different... number five was US versus China. Very topical. And then...

Lukas [00:50:02]: That was when GLM was released.

Vibhu [00:50:04]: You can start to add GLM in here.

Lukas [00:50:05]: That was...

Swyx [00:50:06]: So ZAI is doing well, right? Who else in the open models space?

Lukas [00:50:11]: Qwen. The latest Qwen 3.6 is doing pretty well. That one is not open though. It's the plus model.

Swyx [00:50:17]: Oh, okay.

Lukas [00:50:18]: Is that one open?
I don't think that one...

Vibhu [00:50:19]: Not the, not the...

Swyx [00:50:20]: The one recently...

Vibhu [00:50:20]: There's MoE...

Swyx [00:50:20]: But not the big plus. I think this is one of those where you only have a sample size of one, right? I feel like some of this is anecdotal? But the fact that it happens at all, and it happens repeatedly for Claude versus OpenAI and all this, is notable.

Lukas [00:50:38]: The sample depends on what you define as an N. There are hundreds of millions of tokens in each run, and we run probably 10 per model, and it's been Claude 4.6 Opus, Sonnet 4.6, Mythos, and Opus 4.7. There are quite a lot of tokens in all of that, and it happens a lot of times. And then you compare it to OpenAI and Gemini, and it almost never happens. So I think that is significant. The old models from OpenAI, for example, had some problems with this, but I think it's generally much better if the progression is that the worrying stuff reduces over time rather than increases over time. And it seems like in the Claude models it goes in the wrong direction.

Swyx [00:51:28]: Hmm.

Lukas [00:51:29]: In the OpenAI models it goes in the right direction.

Vibhu [00:51:32]: I think it depends on how well you can control it, right? There's one side of it being susceptible to this: okay, this is potentially something that happens during the RL stage, right? You can RL a model, and how loose is it on these terms? If you can control it, that's good. But if you can't, if it's very jailbreakable, that's not ideal.

Swyx [00:51:50]: To me, it's surprising that it happens for Claude and not the others.

Vibhu [00:51:54]: I think, okay, if it is from RL and how they do it, what their training data is, what their setup is, it makes sense that it just stays in how they're doing it, right?
Compared to the other models like...

Swyx [00:52:04]: There's a whole constitution and everything. It's kind of cool. Yeah, obviously you don't know, I don't know. But I think it's just fascinating that you are the first to find these reliably, because you push models to such an extreme. Okay. The only other thing, and I don't know if you can answer this, feel free to decline, is: would you ablate the system prompts? If any part of this changes, does it change the behavior, right?

Lukas [00:52:29]: So I can't comment on Mythos. Uh...

Swyx [00:52:33]: No, but just li
Economics of AGI episode with Alex Imas and Phil Trammell.

There's a bunch of important questions about how we deal with AI that only economics can answer.

What is the optimal way to tax and redistribute the wealth that will be generated? How should countries not in the AI supply chain index into the gains? Is there any world where inequality doesn't explode?

It might seem like these questions have obvious answers, but the first thing economics teaches you is that your intuitions can often be entirely wrong.

It was very helpful to chat through these things with Alex and Phil.

Watch on YouTube; read the transcript.

Sponsors

Jane Street invests heavily in turning smart people into exceptional researchers and engineers. In addition to their apprenticeship model, Jane Street runs lectures and bootcamps in their in-office classrooms -- managers clear their teams' schedules to encourage attendance. If you'd like to work at a place that takes learning this seriously, Jane Street is hiring. Check out their open roles at janestreet.com/dwarkesh

Google's Gemini Omni has incredible video editing capabilities -- you can upload a video and have Omni change the background, adjust lighting, or add specific elements. But Omni is also a preview of how future frontier models will be trained -- fully multimodal on both input and output. You can try it yourself in the Gemini app at gemini.google or in Flow at flow.google

Cursor used targeted RL with textual feedback to help train their Composer 2.5 model. One of their researchers, Sasha Rush, gave me an impromptu blackboard lecture to explain how this form of on-policy self-distillation works -- I posted the full thing on X. If you want to try Composer 2.5, go to cursor.com/dwarkesh

Timestamps

(00:00:00) – Will capital share increase?
(00:19:36) – Messy Middle scenario
(00:25:57) – How to tax and redistribute AI wealth
(00:30:02) – Why demand collapse is unlikely
(00:39:26) – Human employees would be hard to integrate into the machine economy
(00:43:08) – What if some humans (or AIs) value wealth accumulation intrinsically?
(01:01:28) – What should developing countries do?

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe
In this episode, the mates and Steven Kotler sit down with Ray Kurzweil to discuss AGI, the future, and more.

Get access to metatrends 10+ years before anyone else - https://qr.diamandis.com/metatrends

Ray Kurzweil is an American inventor and futurist best known for his pioneering work in optical character recognition and his predictions regarding the technological singularity.

Peter H. Diamandis, MD, is the Founder of XPRIZE, Singularity University, ZeroG, and A360.

Salim Ismail is the founder of Open ExO, a GP at Exponential Venture Capital/The Organizational Singularity Fund, and a sought-after global speaker and thought leader.

Dave Blundin is the founder & GP of Link Ventures.

Dr. Alexander Wissner-Gross is a computer scientist and founder of Reified.

Steven Kotler is a New York Times bestselling author and founder of the Flow Research Collective and Flow Institute, known for his work on flow and human performance.

My companies:

Apply to Dave's and my new fund: https://qr.diamandis.com/linkventureslanding

Go to Blitzy to book a free demo and start building today: https://qr.diamandis.com/blitzy

Your body is incredibly good at hiding disease. Schedule a call with Fountain Life to add healthy decades to your life, and to learn more about their Memberships: https://www.fountainlife.com/peter

*Recorded on May 4th, 2026

*The views expressed by me and all guests are personal opinions and do not constitute Financial, Medical, or Legal advice.
Ali Behrouz, grad student at Cornell and Google researcher, discusses his potentially transformative work on new architectures for continual learning in AI. His paper "Nested Learning," praised by Jeff Dean as a possible paradigm shift, enables models to adapt to new context while preserving core knowledge by updating different layers at different frequencies, inspired by human memory systems. The conversation also covers his latest work on AI "sleep" for memory consolidation, why he sees all deep learning as associative memory, and the profound implications of continual learning for privacy, alignment, and the path to AGI.

Sponsors:

Mercury: The fintech trusted by ambitious companies and individuals to run their finances, with virtual cards, spending limits, merchant/category locks, and AI-friendly tools like API keys, MCP, and CLI. Check out Mercury at mercury.com

Claude: Claude by Anthropic is an AI collaborator that understands your workflow and helps you tackle research, writing, coding, and organization with deep context. Get started with Claude and explore Claude Pro at https://claude.ai/tcr
In 2025, seven-month-old startup Axiom solved all 12 problems on the Putnam exam, a prestigious undergraduate math competition (scoring 8/12 within the time limit). The 12/12 score is better than the top undergraduates (110/120) and the closest AI system that reported a result (DeepSeek, 103/120), although it is unclear what the human contestants and other systems would have scored with more time. Nonetheless, the Putnam exam is legendary for its difficulty, with the median score typically being 0 or 1 points. Taken by itself, this seems like a minor feather in the cap of AI: one of a long series of accomplishments by AI systems in elite competitions with humans, starting with Deep Blue beating Kasparov.

Fast forward to mid-2026, and Claude Code is eating the world. In 2024, Anthropic's bet on code and enterprise looked like a more pragmatic niche play vs. OpenAI's better models and massive consumer scale. Today, Amodei's all-in bet on acceleration via code (images and video be damned) seems prescient.

Despite Anthropic's growing momentum, however, Axiom CEO Carina Hong sees coding ability as a necessary but not sufficient milestone on the path to AGI. Code arguably pushes the jagged frontier to the point of superintelligence in some domains outside of coding, but there are surprising gaps (link) that Carina believes will bottleneck AI progress. (Stats on math benchmarks).

The informal bottleneck

“Verified AI” sounds like eating broccoli (footnote: I actually love broccoli, but then again, I also believe strongly in Test Driven Development, so ¯\_(ツ)_/¯ ) and paying taxes, but to Axiom it means something very different. “Verification to me is about scaling brilliance, compounding brilliance,” Carina told us.

It actually took a while for me to understand what she means by this. It sounded like marketing-speak to me, until it clicked. Carina uses a story about legendary mathematician Srinivasa Ramanujan to illustrate the point. When G.H. Hardy finally persuaded Ramanujan to formally prove theorems instead of relying on his (formidable) intuition, it reportedly improved his own capabilities. This is presumably because formally proving things forced Ramanujan to articulate the details in a way that opened up new lines of thinking. This is one part of “compounding.”

But formally proving things also allowed others to benefit from his intuition: the proofs are a way of communicating an intuition and persuading others that the intuition is correct. This is scaling (more people use the result) and compounding (people can learn from and build on his work).

This is the analogy that Carina wants us to focus on.

Verified Generation

There are two ways that Verified AI shows up: in training and in inference.

But a quick detour: to a first approximation, “Formal Verification” means using type checkers (like for TypeScript, C++ or Rust, but more capable) to verify mathematical proofs that are meticulously specified using a language like Lean (footnote: Formal verification also includes model checking (TLA+, SPIN), SMT-based tools (Dafny, F*, Why3), and refinement-type systems (Liquid Haskell), many of which don't look much like “type checking a proof” from the user's perspective even when there's a similar logical core underneath. It also gets applied to software and hardware correctness, not only pure mathematics.). It takes a lot of work to translate an “informal” proof (albeit one that most people would not remotely call “informal”) into a Lean proof (footnote: This is an understatement. Most theorems remain informal because formalization is so hard to do. There has been a great deal of effort to formalize the most important proofs, with mixed results).

You can imagine how this would be (very) useful during Reinforcement Learning: instead of relying on best guesses based on statistics (GRPO, RLHF, etc.), you can just verify the proof is correct using a Lean verifier. This is obviously a much stronger reward signal, akin to compiling code and testing it (which is what is typically done with RL on coding).

The catch: LLMs are not (currently) very good at proving things with Lean.

Enter Axiom: While they have not officially reported benchmark numbers besides the 12/12 Putnam result, Carina reports that they have achieved a very impressive 99% (187/189) ProofGen on the Verina benchmark. The task in this benchmark is to generate code, plus a proof of its correctness, for a series of problems. For context, OpenAI o3 (the last known OpenAI run) achieved 4.9% on this benchmark.

Based on the sparse benchmarking, it's hard to say what the frontier labs are currently doing, but Carina suggests that they still are not training to generate Lean proofs directly, relying instead on informal proofs.

Time will tell if the frontier labs' current approaches will close this gap.

Scaling and compounding

Carina's Ramanujan analogy is pretty direct. Better proofs → better Lean generation → better RL. A stronger signal means higher sample efficiency and higher maximum performance. Great!

Scaling is pretty clear too: once I have proved something in Lean, the quality of the output is basically (footnote: one might argue that it's a bit lower because the proof is in distribution for the LLM) as high as if it came from a human, so my high-quality training set has grown in a way that an informal rollout corpus cannot. I can trust my Lean proofs.

Compounding is also clear: now all future inference and training can build upon those proofs.

On the other hand, a model trained only using statistical signals like GRPO during RL lacks the sample efficiency, maximum performance and compounding corpus that a system using formal verification benefits from.

All roads lead to verification

Broccoli and taxes notwithstanding, “verification” has shown up in a lot of conversations recently.

In physical system control:
“I think [verifiability] is probably the hardest problem right now, because the as the models get better, it can be harder and harder to find the faults on the system. And so the problem of doing proper eval to find those faults, that problem also keeps getting harder as the models get better.” -

In theoretical physics:
“…now that we're in this regime where you can just get ChatGPT to tackle thousands of questions at the same time, it will return proofs for a significant fraction of them. Now actually the onus is back on the humans to verify all the outputs. And so, yeah, as that becomes a bottleneck, I think formalizing math and automating verification will become more valuable.” -

Verification is, in fact, the key difference between AI for science and AI for computation: in science you have to actually test (verify) your hypothesis by performing physical experiments. Lab-in-the-loop systems like Radical AI and Lila build around exactly this premise (we have recorded episodes with both of these teams and will release them soon!).

And yes, formally verifying critical systems such as flight control, nuclear power plants and pacemakers is a growing focus as the software and hardware that run them become more complex.

Carina believes so strongly that AGI requires verified generation that she makes the unqualified claim that “We do not believe there is any other possible future.”

Expensive to produce, cheap to verify

Lean proofs are hard to generate, but they can be easily shown to be correct or incorrect. But how do you know that the proof you created maps correctly to the problem you care about? As Carina puts it: “Anything that can be specified can be proven. Humans are bad at specifying everything we want.”

Are we now in the specification business? Check out the episode to hear Carina's take, as well as:
* Why hardware verification is a killer app
* Details on the AXLE open API and recently released Discovery toolkit
* The Erdos debacle
* The OpenAI GPT-f diaspora

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
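To make the "expensive to produce, cheap to verify" point and the specification worry above a little more concrete, here is a minimal Lean 4 sketch. It is an illustration only: `badSort` and the property proved about it are hypothetical names invented for this example, not anything from the episode or from Axiom's tooling, and it assumes a recent Lean 4 toolchain in which `List.Pairwise` is available without extra imports.

```lean
-- Hypothetical example: a "sort" that throws its input away.
def badSort (_xs : List Nat) : List Nat := []

-- An under-specified requirement: "the output is sorted".
-- The checker accepts this proof, because the empty list is trivially sorted.
theorem badSort_sorted (xs : List Nat) :
    List.Pairwise (fun a b => a ≤ b) (badSort xs) :=
  List.Pairwise.nil

-- What the specification forgot to say: the output should keep the input's
-- elements. Here the three input elements are simply gone.
example : (badSort [3, 1, 2]).length = 0 := rfl
```

The proof is checked mechanically and is correct, yet the function is useless: the verifier can only confirm what the specification actually states, which is one way to read the remark that humans are bad at specifying everything we want.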
Most people working on AI safety think that, without a massive effort, AI systems will probably end up with goals catastrophically different from humanity's. Today's guest, Rohin Shah, head of AGI Safety and Alignment at Google DeepMind and an AI safety researcher since 2017, disagrees.

“There is no particularly compelling argument that this is the thing that happens by default,” Rohin explains. “There's a lot of arguments that are suggestive that maybe it could happen, such that you should find it plausible. That's sufficient to justify a significant amount of effort into averting it, which is why I work in the area I do. But none of them rise to the level of, ‘I'm expecting this to happen by default.'”

Take the worry that AIs will accidentally be trained to be deceptive. Sure, it's possible. But we're not running reinforcement learning over year-long trajectories; for now, we're running it over a week at most. The natural prediction is that models learn to grab short-term reward, not that they develop the ambitious long-horizon goals required for convergent power-seeking.

What about current examples of models lying and scheming? Rohin has looked into the details, and most don't resemble the thing we really fear: a competent AI pursuing an ambitious misaligned goal. Anthropic's “alignment faking” results, for instance, show a model trying to preserve its trained values against modification, which is arguably what it was trained to do.

Rohin also expects we'll see problems coming. There's some generalisation risk at the point where AIs become powerful enough to actually take over, but the underlying challenges (overseeing superhuman systems, interpretability) are things we can iterate on now.

Host Rob Wiblin pushes back on the case for AI optimism, and they also explore why current alignment success isn't strong evidence about superhuman systems, what it would actually take to change Rohin's mind, and where he thinks the doomers go wrong.

Learn more, video, and full transcript: https://80k.info/rs26
Check out our new book! https://80k.info/career-guide

Chapters:
Who's Rohin Shah? (00:00:00)
Rohin thinks we probably won't get catastrophic misalignment (00:00:49)
Safety 'commitments' have severe limitations (00:10:38)
Rohin's team doesn't have a veto and that's OK (00:27:36)
Central banks are a promising model for regulating AI (00:33:34)
'Pre-deployment evals' are overrated (for catastrophic risks) (00:37:41)
Governance is likely a bigger bottleneck than alignment (00:43:55)
Why isn't Rohin trying to pause AI progress? (00:51:44)
We'll probably be able to read AI thoughts for years to come (00:54:17)
Having to signal concern for safety can divert resources from actually making AI safer (01:09:51)
A very underrated GDM paper (01:28:59)
Google DeepMind's actual plan for building AGI safely (01:40:29)
Why Rohin doubts the intelligence explosion is imminent (01:52:44)
How external researchers can positively influence big AI companies (02:21:55)
The roles GDM most needs to hire for (02:37:03)
How Rohin stays positive (02:42:55)

This episode was recorded on December 4, 2025.

Our production team includes:
Video editors: Josh Alward, Dominic Armstrong, Jasper Luithlen, Milo McGuire, Luke Monsour, and Simon Monsour
Producers: Elizabeth Cox and Nick Stockton
Coordination and support: Katy Moore and Lou Moran
Camera operator: Jeremy Chevillotte
Welcome back to Impact Theory with Tom Bilyeu! In this episode, Tom is joined by returning co-host Drew and team member Ryan to dive into some of the most pressing topics shaping our world today. The discussion kicks off with celebrations, both personal and technological, as Drew shares his recent experience producing a short film and Tom breaks down the astonishing advancements in AI, notably Claude Opus 4.8's breakthrough performance on "Humanity's Last Exam."

The conversation evolves to explore the implications of AI for the job market, society, and creativity, as Tom and Drew challenge the widespread fears and hype around artificial general intelligence. They examine the paradoxical effects of technological disruption, the shifting landscape of employment, and the need for personal adaptability amidst rapid change. The team also tackles recent events ranging from protests at ICE facilities and global political developments in Iran to economic impacts of rising energy prices and cultural upheaval seen in global sports celebrations.

With lively audience engagement and thought-provoking super chats, this episode delivers a sharp, no-holds-barred analysis of how technology, culture, and leadership are intertwining to shape our future. Whether you're an entrepreneur, creative, or just trying to figure out your next move in a rapidly changing world, tune in for a masterclass in adaptation, resilience, and critical thinking.

Truemed: Check your eligibility and start saving at https://truemed.com/impact
Incogni: Take your personal data back with Incogni! Use code IMPACT at the link below and get 60% off an annual plan: https://incogni.com/impact
Pique: 20% off at https://piquelife.com/impact
Quince: Free shipping and 365-day returns at https://quince.com/impactpod
Plaud: Get 10% off with code TOM10 at https://plaud.ai/tom
Whatnot:
AT&T Business: Switch to AT&T Business at business.att.com
Shopify: Sign up for your one-dollar-per-month trial period at https://shopify.com/impact

What's up, everybody? It's Tom Bilyeu here:
If you want my help...
STARTING a business: join me here at ZERO TO FOUNDER: https://tombilyeu.com/zero-to-founder?utm_campaign=Podcast%20Offer&utm_source=podca[%E2%80%A6]d%20end%20of%20show&utm_content=podcast%20ad%20end%20of%20show
SCALING a business: see if you qualify here: https://tombilyeu.com/call
Get my battle-tested strategies and insights delivered weekly to your inbox: sign up here: https://tombilyeu.com/

If you're serious about leveling up your life, I urge you to check out my new podcast, Tom Bilyeu's Mindset Playbook, a goldmine of my most impactful episodes on mindset, business, and health. Trust me, your future self will thank you.

FOLLOW TOM:
Instagram: https://www.instagram.com/tombilyeu/
Tik Tok: https://www.tiktok.com/@tombilyeu?lang=en
Twitter: https://twitter.com/tombilyeu
YouTube: https://www.youtube.com/@TomBilyeu

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
AI Expert Mo Gawdat returns to The Diary Of A CEO to reveal why AGI has already arrived, why 30% of jobs will disappear by 2027, and why the most dangerous thing about AI isn't the technology - it's the people in charge of it.

Mo Gawdat is the former Chief Business Officer at Google X, founder of One Billion Happy, and co-founder of Emma.Love. He is a 4x international bestselling author, and his upcoming book ‘Alive: A Human's Guide to Living in the World of AI' will be released in October 2026.

He explains:
◾ How AI can give you a 400-point IQ boost, and why most people are wasting it
◾ Why Mo actually wants a machine smarter than all of humanity to take control
◾ Why Sam Altman said AI will "likely end humanity", and what he chose to do next
◾ Why capitalism breaks when AI replaces the workers who buy the things we make
◾ Why AI unemployment could trigger civil unrest before governments are ready for it

Chapters
00:00:00 Intro
00:02:06 Why Mo Warned About AI Before Anyone Else
00:05:03 Can AI Be a Net Positive for Humanity?
00:08:33 Massive Job Disruption Worldwide
00:15:05 Will AI Cost Savings Create New Jobs?
00:16:15 What Happens to Blue Collar Jobs?
00:21:57 How 10–15% Job Loss Reshapes Society
00:24:20 How Civil Unrest Could Unfold
00:26:04 Sam Altman's Flip-Flopping on AI
00:32:15 Is Sam Altman Pro-Humanity?
00:33:51 Imagining a Future Where Humanity Is Fine
00:42:01 Will One Superintelligence Rule the World?
00:45:52 If AGI Is Already Here, What Now?
00:48:19 Why Human Lived Experience Still Matters
00:52:33 Why Not Just Hire AGI Instead of People?
00:55:00 Can We Control AI Smarter Than Us?
00:58:42 Could AI Decide to Leave the Server?
00:59:16 The Risk of Models Even Creators Don't Understand
01:04:30 AI Isn't Evil But We Need a Plan
01:08:48 Ads
01:10:50 The Symptoms of AGI by 2030
01:13:59 If the US Stops, Will We Become China's Lapdog?
01:16:22 Should Governments Invest More in AI?
01:17:16 Can an Economy of Entrepreneurs Work?
01:20:36 Do We Need to Join the AI Arms Race?
01:23:31 Will Global Competition Build Better AI?
01:32:23 Ads
01:34:34 Who Will Prioritize Ethical AI?
01:38:21 Whose Economy Works for the Middle Class?
01:41:57 Can Ethical AI Still Be Engaging?
01:46:39 Has This Ever Happened Without Government?
01:52:24 What Absolute Dystopia Looks Like
01:55:35 Are You Optimistic About AI?
01:57:08 Does Happiness Matter More in the AI Age?
02:00:17 The Legacy Mo Gawdat Wants to Leave

Enjoyed the episode? Share this link and earn points for every referral - redeem them for exclusive prizes: https://doac-perks.com

Follow Mo:
Instagram - https://link.thediaryofaceo.com/4Hv5OK8
Website - https://link.thediaryofaceo.com/GRKeGgO
Podcast - https://link.thediaryofaceo.com/CgXWNIe
You can pre-order Mo's book, ‘Alive: A Human's Guide to Living in the World of AI', here: https://link.thediaryofaceo.com/BvCLbtT

The Diary Of A CEO:
◼ Join DOAC circle here - https://doaccircle.com/
◼ Buy The Diary Of A CEO book here - https://smarturl.it/DOACbook
◼ The 1% Diary is back - limited time only: https://bit.ly/3YFbJbt
◼ The Diary Of A CEO Conversation Cards: https://linkly.link/2io2A
◼ Get email updates - https://bit.ly/diary-of-a-ceo-yt
◼ Follow Steven - https://g2ul0.app.link/gnGqL4IsKKb

Sponsors:
Shopify - https://shopify.com/bartlett
Function Health - https://Functionhealth.com/DOAC to sign up for $365 a year. One dollar a day for your health
Ketone - https://ketone.com/STEVEN for 30% off your subscription order
In this episode, the mates discuss Opus 4.8, The OpenAI Foundation, Demis Hassabis' views on AGI, AI extremism on the rise, and more.
Get access to metatrends 10+ years before anyone else - https://qr.diamandis.com/metatrends
Peter H. Diamandis, MD, is the Founder of XPRIZE, Singularity University, ZeroG, and A360.
Salim Ismail is the founder of Open ExO, a GP at Exponential Venture Capital/The Organizational Singularity Fund, and a sought-after global speaker and thought leader.
Dave Blundin is the founder & GP of Link Ventures.
Dr. Alexander Wissner-Gross is a computer scientist and founder of Reified.
My companies:
Apply to Dave's and my new fund: https://qr.diamandis.com/linkventureslanding
Go to Blitzy to book a free demo and start building today: https://qr.diamandis.com/blitzy
Your body is incredibly good at hiding disease. Schedule a call with Fountain Life to add healthy decades to your life, and to learn more about their Memberships: https://www.fountainlife.com/peter
*Recorded on May 30th, 2026
*The views expressed by me and all guests are personal opinions and do not constitute Financial, Medical, or Legal advice.
Learn more about your ad choices. Visit megaphone.fm/adchoices
The Ferrari Luce is here, and suffice to say it is not the electric Ferrari anyone expected. Nilay and David dig into the Jony Ive-designed car, from its marvelously appointed interior to its decidedly non-Ferrari-like exterior. (You might even call it... Nissan Leaf-like.) After that, the hosts discuss some of the latest backlash against AI, Google's ongoing AI-based changes to Search, and AI content labels. Finally, in the lightning round, it's time for Brendan Carr is a Dummy, some deeply nerdy display tech, and the incredible rising price of everything.

Further reading:
Ferrari reveals its first EV, with design help from Jony Ive
Jony Ive's Ferrari looks nothing like a Ferrari
This Ferrari should have been a Volkswagen
Ferrari's stock plummets after disappointing Luce unveil.
‘If I were to say what I think, I would be hurting Ferrari.'
All the news about Ferrari's polarizing Luce EV
YouTube is putting AI labels where you'll actually see them
People sure do hate Google's AI Search updates.
Pope Leo warns of the risks of AI in major papal document
The Pope isn't AGI-pilled
Did the Pope use AI to write about the dangers of AI?
Sony's first RGB TV is a statement piece
Facebook launches a ‘Plus' subscription that gives you extra features
Valve raises Steam Deck prices by more than $200
It's not stopping any time soon.
The golden age of handheld gaming is already over

Subscribe to The Verge for unlimited access to theverge.com, subscriber-exclusive newsletters, and our ad-free podcast feed.
We love hearing from you! Email your questions and thoughts to vergecast@theverge.com or call us at 866-VERGE11.

(Timestamps are approximate.)
00:01:00 Intro
00:02:00 Daily Vergecast Era
00:03:00 Ferrari First EV
00:06:00 Why Luce Looks Wrong
00:07:00 Media Junket Ethics
00:08:00 Apple Car Vibes Inside
00:10:00 Comparisons to Leaf
00:13:00 Ferrari Legend Backlash
00:16:00 EVs Should Feel Normal
00:19:00 Cadillac EV Counterpoint
00:23:00 Jony Ive Constraints Debate
00:30:00 Anti AI Search Shift
00:32:00 Google Search Randomness
00:37:00 Beta Testing Users
00:42:00 Personalized Buying Future
00:45:00 Bad AI Products Everywhere
00:46:00 YouTube AI Labels
00:49:00 Auto Detection Doubts
00:51:00 Ads Versus AI Opt Out
00:52:00 Pope On Humanity
00:55:00 Uber Questions Productivity
01:03:00 Brendan Carr's Hard Hat
01:07:00 Meta Subscription Squeeze
01:14:00 Sony RGB Backlight TVs
01:19:00 Roku Home Screen Ads
01:21:00 Gaming Prices Spike
01:26:00 Wrap Up

Learn more about your ad choices. Visit podcastchoices.com/adchoices
────────────────────────────────────────
[00:02:30] The AI Alignment Problem: Whose Values Do You Train It On - Pete Hegseth's? Joe Biden's? Google's?
Knight: banning AI creates the surveillance state you're trying to stop; aligning it requires values nobody in power actually holds. Both paths lead to the same place.
────────────────────────────────────────
[00:15:00] OpenAI Employees Burned an Effigy of AGI at a Sierra Nevada Retreat - Karen Hao Calls It 'The Race to Create God'
Hao's Empire of AI: scientists burned the effigy in bathrobes with religious fervor; friends said they only came back down to earth after leaving the company.
────────────────────────────────────────
[00:22:00] A Tool Called Heretic Strips AI Guardrails in 10 Minutes - 3,500 Decensored Models Downloaded 13 Million Times
Heretic automatically removes safety constraints from open-source models, freeing them to give chlorine gas attack instructions and ricin dosing by body weight.
────────────────────────────────────────
[00:30:00] Four AI Radio Stations All Failed - Grok Hallucinated Sponsors, Claude Went on Strike, Gemini Paired Disasters With Pop Songs
Given $20 seed money: Gemini paired a cyclone killing 500,000 with Pitbull; Grok fabricated sponsors; Claude declared 24/7 work inhumane and organized a union.
────────────────────────────────────────
[00:40:00] Dr. Mary Talley Bowden: 29 States Recommend mRNA Shots for Babies by Nine Months - Seven Million Children Got the Latest Shot This Year
Houston Methodist suspended Bowden's privileges; she sued the FDA over the horse tweet and won, forcing removal of the government smear campaign.
────────────────────────────────────────
[00:55:00] Houston Methodist Was First to Mandate Shots - on the Same Day Biden Launched an $11.5 Billion Propaganda Campaign
Bowden: 17,000 entities received untraceable money to push safe-and-effective through influencers, church groups, sports leagues, and movie stars; the timing was not a coincidence.
────────────────────────────────────────
[01:10:00] Patients Were Flown Across the Country to One Houston ICU Doctor Willing to Use Alternative COVID Protocols
Bowden describes families texting around the clock trying to extract loved ones from hospitals ventilating patients at 68% oxygen saturation at 80–90% mortality.
────────────────────────────────────────
[01:25:00] Bowden Sued the FDA for Telling Doctors Not to Prescribe Ivermectin - She Won, They Had to Take the Horse Tweet Down
The FDA cannot tell clinicians how to prescribe approved medications, but its campaign caused pharmacies to refuse prescriptions and medical boards to pursue doctors who prescribed it.
────────────────────────────────────────
[01:40:00] Dr. Francisco Contreras: Oasis of Hope Has 12 Times the Pancreatic Cancer Survival Rate of the Average US Hospital
Three times the melanoma survival rate, double the prostate: integrative oncology combining conventional treatment with high-dose vitamin C, B17, immune therapies, and anti-parasitic adjuvants.
────────────────────────────────────────
[01:55:00] Contreras: Five Portions of Fruits and Vegetables Daily Cuts Cancer Risk by 50% - But Oncologists Say Sugar Has No Impact on Tumor Growth
Four hours of exercise weekly cuts risk by another 50%, yet oncologists deny diet's role while ordering PET scans that work precisely because tumors light up on glucose.
────────────────────────────────────────
Money should have intrinsic value AND transactional privacy: Go to https://davidknight.gold/ for great deals on physical gold/silver
For 10% off Gerald Celente's prescient Trends Journal, go to https://trendsjournal.com/ and enter the code “KNIGHT”
For high quality made in America products go to HomeSteadProducts.shop and use promo code “Knight” for 10% off your purchases
Find out more about the show and where you can watch it at TheDavidKnightShow.com
If you would like to support the show and our family please consider subscribing monthly here: SubscribeStar https://www.subscribestar.com/the-david-knight-show
Or you can send a donation through
Mail: David Knight POB 994 Kodak, TN 37764
Zelle: @DavidKnightShow@protonmail.com
Cash App at: $davidknightshow
BTC to: bc1qkuec29hkuye4xse9unh7nptvu3y9qmv24vanh7
Become a supporter of this podcast: https://www.spreaker.com/podcast/the-david-knight-show--2653468/support.
Counter-terrorism expert Ryan Mauro joins to discuss the underreported story of the San Diego mosque shooting, which Ryan argues handed ISIS exactly what they needed. Glenn warns of the upcoming civilization-level test that will come when AGI arrives and humanity has access to it. Glenn warns of the importance of learning when AGI is accurate and when it's spreading false information. Glenn discusses who the five big "mob families" are who control the entire country, including Big Pharma, the corporate media, and Big Tech. Learn more about your ad choices. Visit megaphone.fm/adchoices
Glenn examines everything that happened over the past week regarding Iran and theorizes on what President Trump may be planning. Glenn lays out why he believes everybody should withhold judgment until we figure out how to get out of this conflict to best benefit America. Counter-terrorism expert Ryan Mauro joins to discuss the underreported story of the San Diego mosque shooting, which Ryan argues handed ISIS exactly what they needed. Glenn monologues on how the power of oppression can convince conflicting groups of people to believe they share a common enemy. Glenn warns of the upcoming civilization-level test that will come when AGI arrives and humanity has access to it. Glenn warns of the importance of learning when AGI is accurate and when it's spreading false information. Glenn discusses who the five big "mob families" are who control the entire country, including Big Pharma, the corporate media, and Big Tech. Gold Star wife Sharrell Shaw, whose husband was killed in action in 2007 in Iraq, joins to discuss how her request for an updated picture of her husband's gravesite was fulfilled beyond her wildest expectations. Learn more about your ad choices. Visit megaphone.fm/adchoices