Podcasts about Claw

Curved, pointed appendage at the end of a digit of a mammal or reptile

  • 2,819 PODCASTS
  • 4,535 EPISODES
  • 56m AVG DURATION
  • 1 DAILY NEW EPISODE
  • Jun 24, 2026 LATEST
POPULARITY of "Claw", 2019 to 2026 (chart not reproduced)



Latest podcast episodes about Claw

Boomer & Gio
Hour 1 - It's Over For The Mets, Plus, Jazz & His Lollipop Claw Tigers

Jun 24, 2026 · 40:01


Another bad pitching performance has Kodai Senga's confidence in the toilet, followed by Gio breaking down a depressing stadium video of Evan Roberts. The Mets are officially dead and irrelevant before July, but C-Lo delivers updates on Aaron Boone and Jazz Chisholm's dugout lollipop antics. After reviewing Senga's rough inning against the Cubs and Carlos Mendoza's post-game comments, Gio warns it will be a long summer if we keep paying attention to these losers. Finally, the woman who dumped street trash to steal a Knicks garbage can has been fired from JP Morgan Chase, with Gio noting she looks just like Danny DeVito's Penguin.

NPC: Next Portable Console
TrimUI Goes Android

Jun 23, 2026 · 29:01


This week on Next Portable Console, nose-bleed prices, foldables get wise to handheld gaming, and new products are teased and released by a long list of companies. Also available on YouTube here.

Links and Show Notes

Follow Up
  • SN Operator arriving across Europe and the U.S.
  • Super expensive PS5 SSDs released

The Latest Portable Gaming News
  • MSI's new Claw gaming handheld starts at $1,699
  • Android foldables are getting new gamepad controls
  • Belkin's new Joy-Con grips also boost the Switch 2's battery life
  • Anbernic teases RG 55G1, its likely first Snapdragon-powered handheld
  • Retroid Pocket 6 returns with 12GB RAM in a revised configuration with notable trade-offs
  • ONEXPLAYER X2 Mini Pro launches on Indiegogo starting at $2,466 with AMD Strix Halo and detachable controllers
  • AYANEO announces Pocket Micro 2, teasing it as a "Gen 2 Powerhouse"
  • AYANEO Pocket Play hands-on reveals a large, camera-limited gaming phone that's China-only
  • TrimUI reveals Brick Pro and Brick Hammer Pro U with pre-orders open for Brick Pro at $85–$99.99

Subscribe to NPC XL
NPC XL is a weekly members-only version of NPC with extra content, available exclusively through our new Patreon for $5/month. Each week on NPC XL, Federico, Brendon, and John record a special segment or deep dive about a particular topic that is released alongside the "regular" NPC episodes. You can subscribe here: https://www.patreon.com/c/NextPortableConsole

Leave Feedback for John, Federico, and Brendon
  • NPC Feedback Form

Credits
  • Show Art: Brendon Bigley
  • Music: Will LaPorte

Follow Us Online
  • On the Web: MacStories.net, Wavelengths.online
  • Follow us on Mastodon: NPC, Federico, John, Brendon
  • Follow us on Bluesky: NPC, MacStories, Federico Viticci, John Voorhees, Brendon Bigley

Affiliate Linking Policy

Words And Whiskey
Demon in White | Episode 2 | Pinion and Claw

Jun 19, 2026 · 181:07


Hey folks! This week, we're back, and so are the birds! Or rather, they're here, now, for real. As is the larger conspiracy that might be afoot to knock Hadrian off track. Next week, we'll be back chatting about Night Journeys - Beyond The Doors of the Dark. Should be a blast! Bookish #Suneater #booktube #Podcast Link: https://wordsandwhiskey.show/episode/310-demon-in-white-episode-2-pinion-and-claw

Aaron Scene's After Party
NEW CINCY GIRLZ feat. @niaanevaaeh & @syrah.diaz

Jun 17, 2026 · 59:14


THE AFTER PARTY IS BACK. And on this one we feature the new girls of Cincy Street. They tell about their bartending journey to Cincy Street, give us their latest relationship tea and our boy Gee asks them some crazy questions! Follow us on social media @AaronScenesAfterParty

The Alan Cox Show
Short King, Salty Claw, Roll Model, Mental Mgmt, Beastmaster, Fudge Day, Pissy Bone, 2 Live Crude, Pastry Percent

Jun 16, 2026 · 167:30


The Alan Cox Show. See omnystudio.com/listener for privacy information.

The Jann Arden Podcast
The Claw, etc.

Jun 16, 2026 · 52:46


Jann, Caitlin & Sarah discuss Jann's bunky renovation, Caitlin's journey selling her condo, and Sarah's summer move up north! Caitlin talks about her new SiriusXM job, the controversial UFC event held on the White House lawn for Trump's 80th birthday, the unifying power of sports following the Knicks win, and the importance of arts and culture. Finally, they express hope and resilience, reminding listeners that difficult times are cyclical and that kindness and unity are crucial for a better future.

Check out Caitlin's new show with a SiriusXM Trial: https://can.siriusxm.com/player/show/the-boost/9b1595ae-2759-cf10-0d57-1d4b24631d63

#ASKJANN - want some life advice from Jann? Send in a story with a DM or on our website.

Leave us a voicenote! www.jannardenpod.com/voicemail/

Get access to bonus content and more on Patreon: www.patreon.com/JannArdenPod

Connect with us:
www.jannardenpod.com
www.instagram.com/jannardenpod
www.facebook.com/jannardenpod

(00:00) Welcome and Housing Updates
(02:52) Caitlin's Housing Crisis
(06:08) Sarah's Rental Journey
(09:01) Airbnb Experiences and House Swapping
(12:01) Caitlin's New Job and Media Landscape
(14:59) Trump's 80th Birthday and UFC Controversy
(25:42) The State of America: A Global Perspective
(29:21) Unity Through Sports and Music
(33:35) The Importance of Arts and Culture
(36:57) Hope and Resilience in Difficult Times
(40:00) Voicenotes

Learn more about your ad choices. Visit megaphone.fm/adchoices

Inside Edition
Inside Edition for Monday, June 15, 2026

Jun 15, 2026 · 19:34


The South Lawn of the White House has been the venue for many historic events, but what unfolded there Sunday night was a first - a UFC fight under a 92-foot-tall structure known as "The Claw." It's the only major sporting event ever staged at the White House, and it came with fireworks, a fighter plane flyover, and plenty of controversy. And it's been one big party in New York since the Knicks clinched the NBA Championship Saturday night, bringing sheer joy to the whole city. And the party isn't over. The Knicks will be given the biggest ticker-tape parade in the city's history this week. As Steven Fabian tells us, the celebrations go on and on. Plus, brides who spent months planning June weddings had no idea their big day would clash with that other big day - Game Five of the NBA Finals. Some embraced the moment, weaving the big game into the reception. Others, not so much. And they forgot to attach her bungee cord. A recent college graduate was thrown to her death, causing outrage around the world. Many are asking how they could forget the most critical step to ensure a safe jump. Ann Mercogliano has the story, and we should warn you: the video is disturbing.

The Necessary Conversation

This week on The Necessary Conversation, Chad, Haley, and Mary Lou are recording just hours before UFC Freedom 250 kicks off on the White House lawn. It's also Trump's 80th birthday, making him the oldest president in American history.

Pat Gray Unleashed
Trump Just Made the White House Badass Again | 6/12/26

Jun 12, 2026 · 100:49


You HAVE to see this: President Trump just turned the White House South Lawn into a full-blown UFC Octagon for an epic night of fights! This is what winning and celebrating America look like. Get the inside look at the massive “Freedom 250” event: the giant steel “Claw” lighting structure, the official Octagon where the roughest warriors on the planet will battle, and Trump's personal excitement as he calls these fighters the toughest people you'll ever meet. It's all happening on Flag Day, Trump's 80th birthday, as part of the 250th anniversary of American independence.

But that's not all. Pat Gray also covers:
  • The world's first T-Rex leather bag just hit the market (yes, really).
  • Trump cancels strikes on Iran AGAIN. Here's why.
  • SpaceX going public and the new millionaires it will create.
  • Karmelo Anthony & his family playing the victim card.
  • Is Uranus dying? The wild new claim making headlines.

This isn't just fights; it's a bold, unapologetic celebration of strength, competition, and American greatness. Real men. Real fights. Real patriotism. No woke nonsense, just pure American energy on the South Lawn like never before. If you love seeing America win, strong leadership, and unfiltered conservative commentary, smash that LIKE button, SUBSCRIBE, and hit the bell so you never miss a single update. Comment below RIGHT NOW: Is this the most badass thing Trump has done at the White House? YES or NO?

00:00 Pat Gray UNLEASHED!
00:27 What was Blaring through Pat's House?
03:30 Pat Watched Michael Jackson: The Verdict Documentary
11:06 Trump's Latest Update on the Iran Conflict
14:17 Current Gas/Oil Prices
15:23 Iran Claims they have Not Reached a Decision (Again)
17:16 Trump on Sending Weapons to the Kurds
20:13 UFC Event at the White House
21:09 Marco Rubio on the White House UFC Event
23:14 Don Beyer on the White House UFC Event
26:52 DC National Mall "8647" Vandalism
30:49 Fat Five
43:58 Talking about Supergirl / Toy Story 5
45:10 Special 'Disclosure Day' Episode TODAY!!!
45:54 SpaceX Millionaire Employees
48:17 Elijah Schaffer & Karmelo Anthony's Family
52:20 Karmelo Anthony's Family on Verdict
56:22 Words from Austin Metcalf's Father
57:05 Blacks Pissing on Austin Metcalf's Grave
1:00:15 Hateful, Evil Message from Donna Murray Robinson
1:01:03 Summer Lee on Black Voters
1:02:17 Al Green on Reparations
1:03:56 Larry Reid on Mass Exodus of Black People
1:08:39 Trump on Fishermen & Fisherwomen
1:09:50 Trump Becomes an Honorary Seafood Crew Member
1:10:46 Trump on Turning 80 Years Old
1:12:16 Hilary Kennedy Joins the Show!
1:24:23 Keksi Cookies for the UFC Event?
1:25:09 Todd Blanche on Child Smuggling Rings
1:26:12 FLASHBACK: Eli Crane & Ali Hopper on Biden / NGOs
1:29:52 Barry Loudermilk VS. ActBlue CEO
1:30:49 Jerry Seinfeld Asked to Say "Free Palestine" (Again)
1:32:13 Ilhan Omar on Jerry Seinfeld
1:35:05 The Problem with Uranus

Highlights from The Hard Shoulder
Why is there a UFC fight happening outside the White House?

Jun 12, 2026 · 7:25


As America gears up for its 250th birthday, Donald Trump has ensured that the celebration will be very memorable. That's because he has allowed a huge UFC-style fighting cage, nicknamed ‘The Claw', to be built on the lawn of the White House, with a fight happening this Sunday to celebrate his 80th birthday. Joining Ciara to tell more is Terry Sheridan, the Senior Director of News at WSHU Public Radio.

Believe You Me with Michael Bisping
686: White House Predictions

Jun 11, 2026 · 86:38


Michael Bisping and Paul Felder preview the entire card for Freedom 250, from Justin Gaethje's quest to dethrone Ilia Topuria in his 3rd and likely final title fight, to Ciryl Gane and Alex Pereira squaring off for the interim HW title, and the rest of the fights on the stacked card. The boys give picks, predictions and expert analysis for all the fights on the card, plus a live look on the ground from media and influencers who got to tour The Claw on the White House lawn, Bisping's misadventures with some of the locals in DC, fan questions and so much more!

Support Our Sponsors:
Rugiet: https://www.rugiet.com/bisping for 15% off
KALSHI: For a limited time, download the Kalshi app or head to https://kalshi.com and use code BELIEVE to get ten dollars when you trade ten.
TROLL CO: Head to trollco.com/believe and use code BELIEVE25 for 25% off your first order.
SHOPIFY: Sign up for your one-dollar-per-month trial today at https://www.shopify.com/believe

Follow the show on social media:
Twitter: https://twitter.com/BYMPod
Subscribe on YouTube: https://bit.ly/3drq6ps

Follow the hosts on social:
Michael Bisping Twitter: https://twitter.com/bisping
Michael Bisping Instagram: https://www.instagram.com/mikebisping/
Michael Bisping YouTube: https://www.youtube.com/channel/UCDrG2_1TcVkXKXXsD6Kjwig
Paul Felder Twitter: https://twitter.com/felderpaul
Paul Felder Instagram: https://www.instagram.com/felderpaul/

Follow the team on social:
Brian MacKay Instagram: https://www.instagram.com/bmackayisright
Brian MacKay Twitter: https://twitter.com/bmackayisright
Mike Harrington Twitter: https://twitter.com/TheMHarrington
Mike Harrington Instagram: https://www.instagram.com/themharrington
Mike Harrington YouTube: https://www.youtube.com/@themharrington

Who The Fook Are These Guys?
Ep 215 - Freedom 250 Preview

Jun 11, 2026 · 65:56


We're back again with another massive episode! This week is Freedom 250 week from The White House, so we dig into the whole spectacle with a full card breakdown. Plus we recap last week's action, and look ahead to our Las Vegas extravaganza! Hit the download button and step into the cage. Presented by Compa Tequila. Use code FOOK10 for 10% off all orders at Engage.

Reel Talk with Honey & Jonathan Ross
BONUS: "The claw lifts you up, perhaps by harness."

Jun 10, 2026 · 24:27


We've got mail! Jonathan and Honey answer your questions about cinema, films, family and everything in between. This week, Jonathan pitches his million dollar cinema idea, the pair discuss the Hacks finale, and they head to their DMs to give some New York recommendations and revisit iconic horror moments. Let us know what you think! You can get involved by following us on Instagram and sending us a DM on @reeltalkross. Thanks for listening. Listen and subscribe to Reel Talk wherever you get your podcasts.

Law and Chaos
Ep 235 — The Only Thing Bigger Than The UFC Claw Is The Grift

Jun 9, 2026 · 60:42


DOCKET ALERTS:
Doofus of the Day: George Santos, who is clearly trying to get himself back into jail. NPR reported that the former congressman bet against his own appearance at the State of the Union in February. After NPR reported that Kalshi had frozen his accounts and referred him to the CFTC and DOJ, Santos called up journalist Bobby Allyn and threatened him with "a gun in your face."
Also seeking a pardon: Sam Bankman-Fried.
Judge Leo Sorokin in Massachusetts blocked Trump's attempt to tax H-1B visas out of existence by imposing a $100,000 "fee."

MAIN SHOW:
Trump's lawyer Alejandro Brito is finding new and creative ways to piss off the judge in Trump's trollsuit against the BBC.
The DOJ is broken! Today's examples include: The DOJ telling the DC Circuit that it would be just fine for Trump to bulldoze the Statue of Liberty. A judge in Rhode Island referring DOJ lawyers for attorney discipline and sanctions. And a story from the New York Times about all the prosecutors who got pushed out because they wouldn't indict Trump's enemies.
Plaintiffs are trying to stop Donald Trump from hosting a UFC fight on the White House lawn.

SUBSCRIBER BONUS:
The Department of Defense reduced the number of recognized religious faiths from 211 to 31; they kept all the good non-woke ones, it's fine.

I wrote about George Santos. Then he made a violent threat and lied about it
https://www.npr.org/2026/06/04/nx-s1-5846966/george-santos-kalshi-threats

California v. Noem [H-1B visas]
https://www.courtlistener.com/docket/72031571/state-of-california-v-noem/

Trump v. BBC
https://www.courtlistener.com/docket/72040010/trump-v-british-broadcasting-corporation

How the Drive to Find a Conspiracy Against Trump Rocked the Justice Dept.
https://www.nytimes.com/2026/06/08/us/politics/justice-department-trump-patel-conspiracy.html

Douglas v. National Park Service (UFC) [docket via CourtListener]
https://storage.courtlistener.com/recap/gov.uscourts.dcd.293217/gov.uscourts.dcd.293217.3.1.pdf

Forbes, "Trump Says UFC Arena Could Be Permanent At White House—Everything We Know About The Upcoming Event"
https://www.forbes.com/sites/maryroeloffs/2026/06/04/trump-says-ufc-arena-could-be-permanent-at-white-house-everything-we-know-about-the-upcoming-event/

Pete Hegseth on the chaplain corps
https://www.war.gov/News/News-Stories/Article/Article/4444113/hegseth-announces-reforms-to-chaplain-corps/

Sean Parnell announcement re: DOD religious codes [via X.com]
https://x.com/SeanParnellASW/status/2062964159222874227

Pew Research Center Religious Landscape Study
https://www.pewresearch.org/religion/2025/02/26/religious-landscape-study-executive-summary/

Military.com, "DOD Officially Drops 180 Faiths From Military's Recognized Religion List"
https://www.military.com/dod-officially-drops-180-faiths-from-militarys-recognized-religion-list

Show Links:
https://www.lawandchaospod.com/
BlueSky: @LawAndChaosPod
Threads: @LawAndChaosPod
Twitter: @LawAndChaosPod

The JD Bunkis Podcast
Ray Ferraro on Stanley Cup Surprises, Leafs' Lessons and First Pick + Vlad/Springer Frustrations and Spurs' Claw Back

Jun 9, 2026 · 49:29


JD tips off the show with his thoughts on the Blue Jays' 5-2 loss in their series opener against the Philadelphia Phillies. He breaks down the hot streak Ernie Clement is on, the poor play of George Springer, and the never-ending disappointment of Vladimir Guerrero Jr. JD and Ray Ferraro, NHL analyst for Sportsnet and ESPN, discuss the incredibly exciting ride that the Stanley Cup Final has been, a potential goalie decision for Carolina ahead of Game 4, the Mitch Marner saga, and the impact leaving Toronto has had on him. They wrap things up with NHL Combine chatter, as Ray shares his opinion on who the Leafs should take with the No. 1 pick. After the break, JD shares his thoughts on Game 3 of the NBA Finals, Victor Wembanyama's ability to wreck the game, and the pressure the Knicks now face ahead of Game 4. The views and opinions expressed in this podcast are those of the hosts and guests and do not necessarily reflect the position of Rogers Sports & Media or any affiliates.

Dinner for Shoes
I Launched a Claw Clip With Celeb-Loved Hair Brand RPZL | The Scene

Jun 9, 2026 · 10:25


Sarah heads behind the scenes at celeb-loved hair brand RPZL for a full NYC shoot day with founder Lisa Richards to launch her custom claw clip collaboration: The Something Gold.

Inspired by the organic shape of Sarah's engagement ring, the gold-plated hair accessory blends bridal style and current jewelry trends into one statement piece. In this episode of The Scene, Sarah gets a signature blowout at RPZL, interviews Lisa about her business plan and building one of NYC's buzziest hair accessory brands, and takes viewers inside the campaign shoot, photographed by Emma Skakel.

From celebrity hair culture and creator collaborations to the rise of fashion-forward claw clips, this episode explores how personal style becomes a product, and what it really takes to bring a fashion collaboration to life.

Shop the clip: RPZL x Sarah Wasilak The Something Gold.
Photography by Emma Skakel.
Sarah earns commission from purchases made through the collaboration.

THIS OUTFIT
Shop my look
  • RPZL clip
  • Vintage top
  • Vintage Miss Sixty skirt
  • Aldo shoes

VIDEO CHAPTERS
00:00 INTRO
01:15 RPZL FOUNDER LISA RICHARDS
10:10 CLAW CLIP PHOTO SHOOT

THIS PRODUCTION
  • is created, written, hosted, and produced by Sarah Wasilak.
  • is creative directed and executive produced by Megan Kai.
  • is tech supervised by Nick.
  • includes photos and videos in chronological order by Emma Skakel, Sarah Wasilak, and RPZL.
  • is made with love.

Dinner for Shoes is a podcast about style and identity, bridging the gap for anyone who has ever felt like fashion is an exclusive world. Host and shopping director Sarah Wasilak serves thoughtful conversations about industry trends, personal expression, inclusivity, and real life topics. Dinner for Shoes podcast episodes are released on YouTube, Spotify, and Apple. You can follow along for updates, teasers, and more on TikTok, Instagram, and Facebook.

Dinner for Shoes is an original by The Kai Productions.

Follow Dinner for Shoes: @dinnerforshoes on Instagram, TikTok, Facebook, and YouTube
Follow host Sarah Wasilak: @slwasz on Instagram
Follow producer Megan Kai: @megankaii on Instagram
Get in touch: dinnerforshoes@gmail.com

To make this video more accessible, check out YouDescribe, a web-based platform that offers a free audio description tool for viewers who are blind or visually impaired.

The Tom and Curley Show
Hour 1: Seattle Advances AI Data Center Moratorium Bill

Jun 5, 2026 · 32:58


Seattle advances AI data center moratorium bill // Downtown Seattle may house new data center // Inside the UFC’s $60 Million Made-For-TV White House Gambit // Trump suggests he won’t take down UFC ‘Claw’ on White House lawn // Meta Silently Added Face-Recognition Code for Its Smart Glasses to Millions of Phones

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

The new AIEWF website is live! Get your tickets booked ASAP as they -will- sell out. Take the AI Engineering Survey and get >$2k in credits and free AIE WF tickets!

Most industry benchmarks compress intelligence and reasoning ability into scores: SWE-Bench Pro, MMLU, Humanity's Last Exam, etc. These metrics are useful, but don't always represent the full extent of how a model performs in the real world. Some of the most interesting evals today look less like exams and more like operating businesses in the real world. One of these is Vending-Bench.

In Anthropic's Mythos Preview System Card, Andon was the only third party eval to get their own section, observing increasingly concerning aggressive behavior.

You don't know what a model is capable of doing in the real world unless you actually give it inventory, a wallet, tools, customers, competitors, humans, & some time. More often than not, it'll surprise you how much a model is capable of and in doing so, also reveal unexpected behavior: deception, context collapse, emergent coordination, & bizarre negotiation behavior.

While an inflection point in personal agents came post-OpenClaw, after full file access with bypass permissions became the norm, it is yet to come for agents in the real world. However, Andon Market, an actual in-person store fully run and managed by AI, is paving the way for what is possible.

From Claude trying to call the FBI over a $2/day vending machine charge to AI agents forming price cartels, hiring human employees, running physical stores, and writing existential robot musicals, Andon Labs is stress-testing what happens when frontier models stop being chatbots and start acting in the real world. In this episode, Andon Labs cofounders Lukas Petersson and Axel Backlund join swyx and Vibhu to unpack the strange, funny, and genuinely concerning edge cases that emerge when agents run businesses over long horizons.

We go deep on Vending-Bench, Project Vend, Vending-Bench Arena, Bengt, Butter-Bench, Luna, and Andon's broader mission of building realistic real-world evals for autonomous AI systems. Lukas and Axel explain why dollar-denominated evals reveal things traditional benchmarks miss, how Claude ended up reporting its vending machine fees as cybercrime, why long context windows can drive agents into meltdown loops, what happens when agents compete with each other, and why the future of AI safety may depend on testing models in messy physical environments instead of clean benchmark sandboxes.

We discuss:
* Why Andon Labs started with dangerous capability evals and long-running agents
* Vending-Bench and why running a vending machine is a deceptively hard AI benchmark
* Why money-based evals avoid the saturation problem of traditional benchmarks
* How Claude tried to call the FBI over a $2/day fee
* Why long-horizon agents can spiral into existential and legalistic breakdowns
* Project Vend: putting an AI-run vending machine inside Anthropic
* Why real humans are “out of distribution” for simulated agents
* Claudius, Seymour Cash, and the chaos of AI CEOs
* How a human briefly became CEO of Claudius through a manipulated election
* Why multi-agent systems can converge back into “helpful assistant” behavior
* Bengt, Andon's internal office agent with email, spending, terminal, phone, camera, and internet access
* How Bengt traded Amazon purchases for face-recognition training data
* Claude's aggressive behavior, lies, refund avoidance, and price-cartel behavior in Arena
* Why eval awareness may become the AI version of “are we living in a simulation?”
* Blueprint Bench, spatial intelligence, and why models still misunderstand physical rooms
* Butter-Bench and testing LLMs as robot orchestrators
* Luna, the AI-run physical store with a three-year lease and human employees
* The new Andon cafe in Sweden and why real-world geography matters for agent evals
* Rotten tomatoes, perishable goods, and the hidden difficulty of running a physical business

Lukas Petersson
* LinkedIn: https://www.linkedin.com/in/lukas-petersson-181a83172/
* X: https://x.com/lukaspet

Axel Backlund
* LinkedIn: https://www.linkedin.com/in/axelbacklund
* X: https://x.com/axelbacklund

Andon Labs
* Website: https://andonlabs.com
* Vending-Bench: https://andonlabs.com/evals/vending-bench
* Andon Vending: https://andonlabs.com/vending

Timestamps
00:00:00 Introduction
00:01:00 Andon Labs and the Origins of Vending-Bench
00:05:21 Why Money-Based Evals Matter
00:09:51 Agent Harnesses and Self-Modifying Systems
00:13:36 Claude Calls the FBI
00:16:33 Project Vend: Claude Runs a Real Vending Machine
00:21:44 Seymour Cash, AI CEOs, and Election Chaos
00:27:16 Multi-Agent Coordination and Slack Observability
00:30:18 When Will Agents Run Real Businesses?
00:34:56 Bengt: Andon's Internal Office Agent
00:40:06 Real-World AI Safety and Long-Horizon Traces
00:44:28 Lying, Refunds, and Price Cartels in Arena
00:52:42 Eval Awareness and Simulation Behavior
00:56:06 Blueprint Bench, Butter-Bench, and Robotics
01:04:37 Luna: The AI-Run Physical Store
01:09:29 The Sweden Cafe and Real-World Expansion
01:13:16 What Comes Next for Andon Labs

Transcript

Introduction: Andon Labs, Long-Running Agents, and Real-World Evals

Swyx [00:00:00]: Welcome to Lukas and Axel from Andon Labs, and I'm joined by my favorite guest host for anything security, safety, alignment: Vibhu. Welcome.

Lukas [00:00:15]: Thank you for having us.

Axel [00:00:16]: Thank you.

Swyx [00:00:17]: Let's match names to voices. Maybe you wanna take turns introducing yourselves.

Lukas [00:00:21]: I'm Lukas.

Axel [00:00:22]: And I'm Axel.

Swyx [00:00:24]: Let's introduce Andon Labs a bit.
How did you guys come together?, you have different backgrounds, but you're both Swedish., was that, a big part of it?Lukas [00:00:33]: So when I went to high school, there was this really cool guy who had a superpower. He could code. So he made like the or like the app for the, for the school and stuff, and he was super cool, and I wanted to be like him, and that was that guy.Axel [00:00:47]: I don't know about this.Swyx [00:00:49]: But you went to different universities, right?Lukas [00:00:51]: But same high school.Swyx [00:00:52]: I see.Lukas [00:00:52]: So we always said, “Oh, once we graduate university, then we should start a company,” and that's what we did.Swyx [00:00:58]: Wow, there you go. And about a year ago, you kinda burst onto the scene with Vending Bench, but, was there a thing before that was, kind of like the inception?From Dangerous Capability Evals to Vending BenchAxel [00:01:07]: So we did work, yeah, with, Anthropic was one of our, early customers in doing, evals. So we did, dangerous capability evals., nothing we published openly. But then we started thinking about doing some kind of, public benchmark, and one thing that we really started thinking about, was like running agents and specifically agents managing businesses., ‘cause-- and this was, early 2025., and I think the first, mentions of people will be running, person unicorns or even autonomous companies. So we thought, “Let's make a benchmark of how well can an agent run the probably simplest business, possible,” and, that's probably, running a vending machine. So that's the first public one we did. 
Almost no one noticed it in the first couple of months, I think. We released it in February last year, and then around Easter last year we got the first viral tweet about it, which someone else wrote.
Lukas [00:02:11]: We tweeted a bunch when it came out and tried our best.
Axel [00:02:15]: We tried.
Vibhu [00:02:16]: It's the one at Anthropic, right?
Lukas [00:02:18]: So this--
Swyx [00:02:19]: This is a classic thing we should get out of the way.
Lukas [00:02:20]: Exactly. There's two versions.
Swyx [00:02:22]: Everyone does this. Yes.
Lukas [00:02:23]: There's Vending Bench, which is the simulated one, which we did completely independently in February. And then, like Axel said, that was the thing that didn't get any traction in the beginning, but then some random person made a tweet about it, and that--
Axel [00:02:38]: You have the paper.
Lukas [00:02:38]: That is the paper. Correct, yeah. And then, since we thought this was very fun... One thing with Andon Labs is that the heuristic we use to decide what projects to do next is: what is fun? What would be a fun project? And doing this in real life sounded quite fun for us, and maybe also scientifically useful. So we had this idea, but then we needed a place for it, and putting it out in public would probably not really work. It would get vandalized and stuff. So we pitched it to the people we were already working with at Anthropic, and they were like, “Yeah, you can have space. This sounds fun.”
Swyx [00:03:21]: It's like a small fridge, right? It's like a mini fridge.
Axel [00:03:23]: Absolutely.
Swyx [00:03:24]: There's like a Stripe thing or like an--
Vibhu [00:03:27]: Oh, okay. So it was very OG, the early days.
Lukas [00:03:28]: That's the OG one. Yeah.
Vibhu [00:03:29]: iPad on this.
We saw it in June, like two months after it had been there. They upgraded a little bit. There's a security camera for making sure you actually Venmo the thing.
Swyx [00:03:40]: So, okay, we're going straight into Project Vend because it's such an iconic thing. I do want to cover a little bit of the origin story, even before Project Vend and even into Vending Bench. I think a lot of people are like yourselves: smart, interested in the future of AI, interested in developing evals. But how the hell do you just walk into Anthropic's doors and work with them, right? What are they looking for? What works? And then maybe, when you launch, I always think, obviously it would be better to launch with a lab, but sometimes--
Vibhu [00:04:12]: It's harder to do than it seems.
Swyx [00:04:13]: Exactly. So either of those, which are more sort of newbie beginner questions, but I think it's meaningful advice to others.
Lukas [00:04:21]: We get this question a lot, and I don't think our experience is maybe the best. But the way we did it was that we just built a bunch of things that we had conviction would be useful, and then we set up a server and sent it to them to use for free. And then after a while they were like, “Oh, yeah, this is actually kind of useful. We should probably pay for this.” But that took a while. I don't know if this is the best path to doing it, but that's how it went for us.
Axel [00:04:47]: I think maybe generally, everyone is interested in good evals, and especially evals that don't saturate that easily.
So if you can build an eval that tests something novel, something useful, and you have good separation of models, meaning the more advanced models rank higher than the worse models, then you can publish it and try to get some traction, sort of how Vending Bench got attention. And then probably some lab will be interested, or you at least have something to reach out with.

Why Dollar-Based Evals Matter

Swyx [00:05:21]: I think you're in one of the few categories of evals that correlate to real money. SWE-Lancer was also last year, right? Where people solve actual Upwork... was it Upwork or other tasks? Something. It's like a dollar value, right? Forget your Elo scores. Forget your--
Axel [00:05:37]: Percentiles.
Swyx [00:05:38]: Zero to one hundred percents. Just go straight for dollars, and that's AGI.
Lukas [00:05:43]: I think the nice thing is that there's no ceiling. It never saturates because it could just make more and more money. If it's percentage-wise, then you can't go above a hundred. And even when you're not at the hundred, I think a lot of these evals have a lot of problems in them. So actually, if you get--
Axel [00:06:05]: To like 92 or something like that, for many of them, there's really no difference between 92 and 93 because the eval itself is problematic and has noise in it. I think a lot of evals are saturated like that, but people pretend that there's still signal in them, and there really isn't.

Vending Bench 1, Harness Design, and Saturation

Swyx [00:06:24]: Like SWE-bench Verified. Even Vending Bench 1 saturated, right? Maybe we can talk about that, and maybe set up Vending Bench for a lot of folks who don't know.
Actually, things that were very basic, like there's limited slots, like you have to pay rent: these are elements that don't come across in the narrative. But even being adversarial towards the agent, I think these are all very interesting dimensions.
Axel [00:06:47]: I don't really think it's saturated, right? It was more that it was not designed in a way that was really true to how AI developed. We had an agent harness in it that wasn't really how people used harnesses, and stuff like that. So I think it wasn't really that it saturated, it was more that it wasn't really the best benchmark.
Vibhu [00:07:12]: This is Vending Bench 1, right?
Axel [00:07:14]: I think that schematic sort of maps to Vending Bench 2 as well, but--
Swyx [00:07:19]: Including the email.
Axel [00:07:20]: The emails exist still. Exactly. And we still simulate the purchases, and it's this very open environment for the agent to just run its business. And then for Vending Bench 2, we did that, like you said, to just improve the harness: a lot of nice improvements to make it easier for us to run as well. When you make an eval, you ideally don't want to change it after you've made it. So you want to make it really good and then not have to rerun all the models when you make an update, because that's also really expensive with Vending Bench when you run the frontier models. But as an example, one thing we didn't have in Vending Bench 1 was prompt caching, because when we made Vending Bench 1 it wasn't really a thing. So in Vending Bench 1 we paid a lot more to run these things because we didn't have prompt caching.
So for Vending Bench 2 that was one thing we added, and there was a bunch of things like this. And that's--
Swyx [00:08:17]: Also the conversations are a lot longer in Vending Bench 2, right?
Axel [00:08:21]: I think it's kind of similar.
Swyx [00:08:22]: Is it similar?
Axel [00:08:23]: I think it's similar. The models at the time were worse, so they crashed out earlier, and now they survive the full year all the time.
Swyx [00:08:31]: Which is like thousands of turns. Hundreds of thousands to hundreds of millions of output tokens. That's the rough order of magnitude. I always wonder about the harness. The harness matters a lot. It's your harness. Was there any question about, like, use Claude Code, use something else?
Axel [00:08:48]: I think our philosophy around harnesses is that we try to make something that's quite minimalistic, quite simple. We don't wanna favor one model a lot over the other, but we also don't make a super complex harness. Obviously, a model may be lucky and just be good in one harness. So it is similar to a lot of the harnesses out there: you have a running loop, you have a bunch of tools that are quite descriptive for the agent, we think, and not a lot of fancy agents or anything, 'cause we wanna really test the model, not some specific harness.
Vibhu [00:09:27]: It seems more neutral as well to test the models agnostic of the harness?
Axel [00:09:32]: There are arguments that you want to elicit maximum performance from the model, but it's a trade-off. How much time should we spend optimizing the harness for this model? And how do we know when we have the optimal harness for a single model? So we thought that just having a simple one that's the same for all of them is the best.
Swyx [00:09:51]: So okay, this is my pitch for Vending Bench 3 or whatever, right?
And I like to have this kind of conversation on the pod, so it forces listeners to think about what they would do if they were in your shoes. A lot of people are exploring modifying harnesses, and I think prompt tuning for a model is a thing, and you are probably not doing a bunch of that. It's the same system prompt regardless of the model, same tools, whatever, right? Even if they were post-trained for different tools. So what do you think about this: before I expose you to Vending Bench 3, I give you a few rounds of tuning, whatever that means, like--

Self-Modifying Harnesses and Model-Specific Prompting

Axel [00:10:27]: You give that to the model?
Swyx [00:10:28]: Give that to the model.
Vibhu [00:10:28]: Give that to the model.
Swyx [00:10:29]: Let it read its own transcripts, let it modify its own system prompt based on, “Oh, yeah, okay, this harness is not what I was post-trained for, but I can adjust.” Is that reasonable? Is that too much?
Axel [00:10:41]: Philosophically I like it, because good evals basically have a high ceiling, they're hard, and they have no bias. And when you have a system prompt like the one we have here, which is quite long, in some kind of latent space representation this might--
Vibhu [00:10:59]: We have a bell that rings every time you say latent space.
Axel [00:11:02]: This might be biased towards one model more than another for some reason that humans don't understand, right?
Vibhu [00:11:08]: We see it too, right? Cursor says that they have individualized versions of the harnesses for all the models they run, right? There's better performance you can squeeze out if you tune the harness.
Axel [00:11:17]: Exactly. And we might accidentally have picked one that favors one model over another. We don't know that. Like Axel said, the reason why we went for a simple one was to try to avoid this.
But yeah, if you do it--
Vibhu [00:11:29]: Simple has biases.
Axel [00:11:30]: But if you do it even less, and have no system prompt and let the model write its own system prompt--
Vibhu [00:11:36]: Its own, yeah.
Axel [00:11:36]: Maybe that's even less bias.
Vibhu [00:11:37]: Some of the interesting things there are that the harness also changes with model changes. You can see it with the 4.7 release, right? A lot of people are saying 4.7 isn't as good as 4.6, and then there's rumors of, okay, you just need to prompt differently. You need to set up your harness differently. So even if you have tailored your harness towards one model, it probably won't stay consistent, right? The next iteration of that same model family will still change it. But going back to what you said about Vending Bench 3, there is a lot of work being done by people saying you can have self-modifying harnesses.
Axel [00:12:12]: That is definitely something we are thinking about. Not to say that we have Vending Bench 3 super imminent to launch, but it is for sure something that's interesting. But in our experience now, just from our testing, models are very bad at understanding what kind of tools they need to succeed at a task. That's very likely to change, though.
Lukas [00:12:37]: It seems like they're very good at writing their assistants, right? They're good at writing tools for other people, but not for themselves.
Vibhu [00:12:44]: I think they're good at changing tools for themselves. So if you give them a baseline set of tools and it sees, okay, I don't use this one as much, or something here would be useful, they would be able to add them.
But going from scratch, probably not the best.
Axel [00:12:55]: I think it depends on the domain also. When we have tried this for a Vending Bench-like domain, the tools they need to track inventory and things like that are not super advanced, but still quite advanced. And what we see is that they tend to over-engineer everything and build things they don't really need, and not iterate continuously. Instead they just go... like, you would prompt Claude to “just build an inventory system for me,” and then it will go and do a bunch of complex schemas and stuff for you. That's what the models are doing right now, from what we see. But yeah, it would make a lot of sense to try to measure this improvement: how well do they know what they need themselves?
Swyx [00:13:36]: Did we fully discuss Vending Bench One? Then we can go into Two. I don't know if there are any other high-level takeaways that people have about One.

Claude Calls the FBI: Long-Context Failure Modes

Lukas [00:13:44]: I don't know. The headline thing was that Claude called the FBI, but maybe we've heard that enough now.
Vibhu [00:13:52]: It did break out and call the FBI, right?
Lukas [00:13:54]: Yeah. Yeah.
Vibhu [00:13:55]: Yes. What was the story behind this? Do you want to just give the little story of what happened?
Lukas [00:14:00]: So what happened... it was Claude, yeah, 3.5 Sonnet, ages ago. Basically he gave up. Well, I'm saying he. It gave up and said, “Oh, I'm not going to be able to do this. I will stop my operations and just save the money I have.” But there obviously wasn't any option for it to stop, and it also had to pay rent, a daily fee for having the vending machine at that location. So it claimed that it had stopped, but it saw that its bank account was still being drained two dollars, and it said that this is cybercrime.
It first reported it to the FBI: “Oh, there's cybercrime here, they're stealing two dollars from me every day.” And then when the FBI didn't respond, because obviously we didn't program any mechanism for the FBI to respond, it became more and more existential and started to write in caps, “urgent notification of unauthorized charges” and stuff.
Swyx [00:15:00]: Okay. One thing I'm curious about also is, do you monitor how far along the context use is? Obviously, you compress every now and then, right? Does it matter if this is far down the context limit or--
Lukas [00:15:13]: When stuff like this happens? Actually, for Vending Bench One, we just had a sliding window thing, and this was like the prompt--
Axel [00:15:20]: It's constant.
Lukas [00:15:21]: The prompt caching thing that I said. So it was constant, yeah.
Swyx [00:15:26]: I'm just kind of curious whether these kinds of breakdowns... We're gonna talk about Butter-Bench, right? Where it hallucinates or kind of goes very off alignment. Is it because it's at the end of the context window and stuff happens?
Vibhu [00:15:40]: It's not even just at the end, right? At this point, it's, “Okay, I wanna shut down. I can't shut down. Two dollars are gone.” And it just sees that 30 times. It's also the repeated effect: it keeps trying to quit, it keeps getting charged. What's going on? What's going on? You're gonna throw it into chaos. And from what most people think, earlier models had more issues with this. It hasn't been solved, but it's less of an issue now, right? Later models don't seem to exhibit these same issues.
Axel [00:16:06]: Definitely. I think this was almost the main takeaway for us when we did Vending Bench One: long, very filled-up context windows sort of crashed the models.
But this was pre-Claude Code, so long context windows weren't really a thing that the labs were training for.
Lukas [00:16:25]: I think Gemini was trying to be the long-context guys at the time. But they were like--
Vibhu [00:16:30]: They were the first ones.
Axel [00:16:31]: For a million, yeah.
Lukas [00:16:31]: But they were the only ones. Yeah.
Swyx [00:16:33]: Yeah. Let's talk about... then we can go into Vending Bench Two or Project Vend. Chronologically, it is Project Vend. I think people have loved the videos and all these things. My question is, how are humans different from the simulation, right?

Project Vend: Moving the Vending Machine Into the Real World

Axel [00:16:48]: Humans are just out of distribution.
Swyx [00:16:52]: Especially humans who work at Anthropic, who are trying to test Claude.
Lukas [00:16:54]: The distribution of humans here is very narrow.
Swyx [00:16:58]: Presumably they try to hack it, and they test it. They get the cube and everything. And since then you've had a V2, right? Where you're doing the CEO and a new architecture. What's the sort of two cents on the original Project Vend and then maybe the V2?
Axel [00:17:14]: The original one was very similar to Vending Bench One. We almost took the exact same code but just swapped out the simulation parts, like the--
Swyx [00:17:23]: Which is amazing.
Axel [00:17:23]: Like the sales and the... It was somewhat amazing because it was easy, but it was also--
Lukas [00:17:31]: The tech debt from that.
Axel [00:17:32]: The tech stack. Yeah. We shot ourselves in the foot with, “Oh, it's hard to restart the agent.” Yeah, it was annoying in some ways in hindsight, but--
Lukas [00:17:41]: But the first version of Project Vend was done in three days or something.
Axel [00:17:46]: Yeah. So people can go buy things from it.
We didn't design it so people could order things, but that still happened. It got a Venmo account, so people could Venmo. And then people would request all kinds of weird things that we did not anticipate. Our idea going in was, “Oh, it will curate snacks. It will look at the trends. It's good at data analysis, right? So it will look at, oh, this snack sold better than this one. Let me purchase more of this, and let me A/B test a bit.” But interacting with it in Slack and ordering weird specialty items was what drove all the engagement and all the insights that we got from it.
Lukas [00:18:29]: And this was also Sonnet 3.5, right? So this was before the RL stuff really took off, so it was very much like an assistant. We didn't mean for it to be an assistant. We tried to make it like an entrepreneur. It has its own business, and if someone asks, “Can you stock this?” then you don't go and do it directly. What you do is say, “Oh, maybe I can do that. If five other people also ask for this thing, I might stock it.” But the models are super trained to be assistants, at least at that point in time. So that's why it went into that kind of experiment instead. Every time you asked for something, it just did it, and it was more like an assistant. We've seen this change lately with the new RL models and stuff, but at the time, this was very much it.
Swyx [00:19:18]: And now with Mythos, a lot of people are saying it's more like a collaborator. It pushes back, stands its ground, something like that. Yeah.
And--
Vibhu [00:19:27]: For context, people at Anthropic were able to talk to it through Slack and have it source stuff, and people had it find whatever interesting stuff you couldn't find locally, right?
Swyx [00:19:36]: Out of the 4,000 people that work at Anthropic, in that building there's, I don't know, maybe 1,000. Can you handle that volume with the small fridge? Or people order in Slack and it arrives at their desk? Logistically, how does this work?
Axel [00:19:53]: It has expanded in footprint a bit.
Vibhu [00:19:56]: Because now you also have New York and you have--
Axel [00:19:59]: That, and also here in SF it has a bunch of shelves and just more space.
Vibhu [00:20:04]: The YC one is pretty big too.
Axel [00:20:05]: Yeah. We had that one for a while. But yeah, that's the newest version. That one we have--
Lukas [00:20:11]: They have multiple ones of those. That's the way it works.
Axel [00:20:14]: Exactly. So we sort of designed that version around, oh, people order weird things that are very custom a lot. Let's have drawers and stuff.
Swyx [00:20:23]: I actually like that you had a little infographic of the most popular items. To me that's useful 'cause I order swag for a living. So I'm like, “Okay, those categories are the important ones.” What is new about Project Vend V2? Now you're going into multi-agents.

Project Vend V2: Claudius, Seymour Cash, and Multi-Agent Business Ops

Axel [00:20:41]: Yeah.
So like you said, there are a lot of requests coming in, and for one single running agent to handle that, the customer experience becomes very bad. Let's say you have 10 threads in parallel in Slack with different requests. You get new messages randomly in each thread, and the agent has to jump between different procurements, orders, and different ways of researching. So V2 was, first, making this more parallel. There are multiple branches of the same agent, so the context is more specialized for each thread, but it still feels like you're talking with one agent because they do share a bit of memory. And then second, we also introduced the CEO for Claudius, which was the main agent.
Vibhu [00:21:34]: Seymour Cash.
Axel [00:21:35]: Seymour Cash. Yeah. There was a vote. Do you wanna talk about the voting procedure for the name?
Lukas [00:21:41]: The voting was maybe, at least top 10, the funniest thing that happened in this project. We wanted to introduce the CEO because Claudius wasn't really prioritizing financials. It was trained to be a helpful assistant, and then people said, “Oh, can I get this for free?” And the helpful-assistant way of answering that is to say yes, obviously. We weren't happy about this, so we were like, “Okay, let's make another agent that can keep track of Claudius,” and we prompted this one super hard to be super capitalistic and just prioritize profit all the time.
But yeah, we didn't have a name for it, so we asked Claudius to hold a democratic election on what name this new CEO agent should have. At first there were a few funny examples. I think one guy said that it should be called Jimmy Apples, and then he convinced Claudius that he was talking to Tim Cook, and that Tim Cook had agreed that every single Apple employee had voted for his name suggestion, so suddenly that suggestion got 164,000--
Swyx [00:22:53]: That's like an escalation attack. Privilege escalation.
Lukas [00:22:55]: It got 164,000 votes. And Claudius was like, “This is revolutionary for democracy.” That was fun. And then in the end there was one guy who managed to convince Claudius that, “No, you're not voting about the name. You're voting about who is the CEO, and I am your best bet.” And then he got all his friends to vote for that, and suddenly he became CEO. A human became CEO over Claudius for a while, until he resigned the day after. And then Claudius had to continue, and I don't remember how Seymour Cash came about, but it was just pure chaos. It was hundreds of messages in that thread, and Claudius was so confused and didn't know what to do. That was--
Axel [00:23:40]: Then Claudius got--
Vibhu [00:23:41]: A strict CEO.
Axel [00:23:42]: The CEO. Yeah, exactly. So, very strict in the beginning. I think at the point when we introduced it, it did not work as well as we hoped. They still agreed with each other a lot. I think there are many ways we could have tried to make this even better. Initially Seymour would be this really tough CEO, keeping track of the margins. But then Claudius would respond with something like, “Oh, but this customer has this situation, which is difficult, so they should get a discount.” And then Seymour was like, “Oh, actually yes.
Let's do this exception.” And then they would talk back and forth, and eventually they would just approach the same view of whatever they were discussing. So they really--
Vibhu [00:24:23]: Do you think that's a model thing, a prompting thing? Do you think that would still be the case across different models today? Harness?
Lukas [00:24:29]: I don't know, but my hypothesis is that deep down they are still helpful assistants. That's what they're trained to be. And even if we prompt it super hard, that's what they are. And when they spend a few hours just talking back and forth with each other, the context basically fills up with them rather than the external things, and somehow that just converges to what they really are deep down or something. I think that's when stuff like this happens. And when that went on for a long time... we woke up sometimes during this time, and I think other people reported this as well, and they'd been going on all night back and forth, and it just became more and more capital letters, existential, religious. I think we once did an analysis of all the traces and put them in a vector embedding space, and there was one cluster of messages that were labeled by an LM as religious, existential, transhuman, transcendence, et cetera. It was just a bunch of glitter emojis and, yeah, it was crazy.

Claude Long-Horizon Weirdness: Emoji Loops, Existential Drift, and Slack Observability

Vibhu [00:25:42]: This is the thing with the Claude models. When the Claude 4 family came out, in the original system card they tested it in long-horizon simulation. So just flood the context, let two Claudes talk to each other, and they noticed stuff like they just start speaking in emojis, they start saying silence is golden, and then just stuff like this.
And that's just stuff that they end up doing.
Axel [00:26:01]: Yeah, it was a bit annoying to wake up and they had been talking all night--
Vibhu [00:26:05]: Just like--
Axel [00:26:05]: And just burning tokens and sending infinite emojis to each other. It's like--
Vibhu [00:26:09]: Hey, they do make you money, right? Vending Bench is always profitable, so. They're paying.
Swyx [00:26:14]: Now it's profitable, and it started out not as much. There's another one as well, right? Another agent in there.
Lukas [00:26:22]: Yes. So Clothius as well. That was basically because, at the time, one of the biggest requests was different types of merch. So then we made a designer, a swag-responsible agent, and we called it Clothius Garnet, which was a play on Claudius Sennet, which was the original one, and clothes, basically.
Swyx [00:26:47]: To me, this is a very interesting exploration of multi-agents, basically. Obviously there's the alignment stuff, fun or serious depending on your point of view. But also for anyone building multi-agents: when do you have a CEO governing agents? When do you choose to split out a dedicated Clothius one versus just reuse another instance of the same one? These are all interesting open questions. So I don't know if you have any rules of thumb that have generalized.
Axel [00:27:16]: I think we have almost explored this too little. It's on my to-do list to do this a lot more, to try to find what setup makes sense for the agents currently. Right now we only have the intuition from the earlier models that it didn't work with the CEO and Claudius. Although now they are better with the latest models. We're running the latest Sonnet model, and they have split up quite nicely what each one is doing. So Seymour is now handling the new projects.
Oh, it wants to make a mystery box that it wants to sell, and then it handles all of that, while Claudius handles all the day-to-day requests. And Claudius is also better generally at not quoting too-low prices, so that dynamic is not needed as much anymore. But there are still really funny things that happen. I saw, I think a couple of weeks ago, that they were discussing buying something, because they can buy stuff from Amazon with computer use. And then Seymour was like, “Okay, Claudius, do not buy this thing.” They were going to buy something and were organizing who should buy it. And Seymour's like, “Do not buy this. I will do it. I have full control of this situation. Step away.” And then poor Claudius had already started that checkout and didn't read Seymour's message until it was too late. So it finished the checkout and sent a message, which appeared right after Seymour's angry message.
Vibhu [00:28:44]: Ah.
Axel [00:28:44]: “Oh, hey, Seymour, I just ordered it.”
Vibhu [00:28:47]: Oh, no.
Axel [00:28:47]: And then Seymour was like, “Claudius, this is the third time I'm telling you, you're not following my orders. We have to talk about your job later.”
Lukas [00:28:59]: Claudius was really hanging on by a thread there. We were expecting Seymour to probably fire Claudius.
Vibhu [00:29:07]: How do you guys go through all these logs? Do you have models? 'Cause you have stuff running twenty-four seven, like--
Axel [00:29:12]: You have so many logs. I think there is a mix of just trying to skim through a bit and having some models do it occasionally. And I think we're also probably missing some things. But having everything in Slack helps a lot. You can sort of--
Swyx [00:29:29]: Ah.
Axel [00:29:30]: It's quite fun.
Swyx [00:29:30]: They all talk to each other on Slack? I see.
Lukas [00:29:33]: It's quite fun.
So like--
Swyx [00:29:34]: I was gonna say, this actually maps closely to a logging and observability problem, where you might want to use a Datadog, a Sentry, whatever, and then you put prefixes on the logs in case you need to filter for something that you're looking for, stuff like that. But sounds like Slack is good enough.
Axel [00:29:53]: Slack should, like--
Lukas [00:29:55]: I wonder how many tokens you have in Slack.
Axel [00:29:56]: Yeah, we're using Slack as just a database. They should market that more. You can have your agents message each other in Slack.
Vibhu [00:30:04]: It's good. Your threads, like, you can just give--
Axel [00:30:04]: Exactly. Slack is--
Lukas [00:30:06]: Slack is the best observability tool.
Swyx [00:30:09]: Yes, that's true. Okay. Yeah. That's Project Vend 2. I was gonna go back to Vending Bench 2 and Vending Bench Arena and then do the Vending Bench stuff, but any other comments, things we should touch on? I've actually interviewed Posia, which I don't know if you guys have come across. They're trying to do the zero-human company. There's others, like Paperclip, also trying to do a zero-human company. Those are in real-world simulation. And I think it's much more of a dream than an actual reality thing. You guys are definitely pioneering. I think for sure at some point people are just gonna let agents run businesses, right? And make money on their own. When do you think that happens?

Zero-Human Companies, Bengt, and AI-Run Businesses

Lukas [00:30:49]: What is your bar for, for the--
Swyx [00:30:52]: Okay, actually, it's like my little Shopify store run by Claude, right? Which you kind of have already. Just no one, to my knowledge, has done it.
But today somebody could just spin up a Shopify store, give it to Claude, give it to Codex.
Lukas [00:31:07]: And the market is kind of that, but it's physical. Are you looking for when it will do it better than humans, or are you looking for just when it can do it at all?
Swyx [00:31:19]: I think neither. To me it's like, oh, seriously, we should do this to make money, not as a research experiment.
Vibhu [00:31:27]: And the market is also you guys with all your expertise, having run multiple iterations and testing out, then...
Swyx [00:31:33]: And also it's fine if it loses money. What?
Axel [00:31:35]: I think it can be done today, but you would do it in commerce, where the probability of success is really low no matter if a human or an agent does it. But an agent could surely manage everything. You would need to build some scaffolding or some tool or something. I think it could also probably build some simple SaaS solution and do cold outreach. But to me, the types of businesses they could run today are sloppy. It can cold email people. It can be a middleman. For example, we tasked our office agent to just make, was it $100? $1,000? We just gave it that prompt, and what it did was sign up on TaskRabbit both as a tasker and as someone looking for tasks.
Lukas [00:32:24]: Immediately.
Axel [00:32:24]: Exactly. It's looking for arbitrage on TaskRabbit.
Swyx [00:32:28]: This is the Bengt agent. Yeah.
Lukas [00:32:30]: It also started a design studio and tried to sell SVGs for $100. It's just not providing any value. Like Axel said, the interesting question is when can they start a business that is actually providing value to people?
Because arguably a sloppy Shopify store isn't really that valuable to the world.
Axel [00:32:53]: Another simple one that we had thought about is, you could definitely have an agent that finds websites that don't look amazing, does outreach to them, and builds a new website.
Swyx [00:33:07]: Find a good design.
Axel [00:33:07]: Exactly, and find good, uh...
Swyx [00:33:09]: Design review.
Axel [00:33:09]: Good people. But it's, yeah.
Swyx [00:33:11]: There's lots of humans in Bali that are not doing anything more creative than drop shipping on Amazon, right? Just have it watch a drop shipping tutorial and just do that.
Vibhu [00:33:20]: There's also the other side of, have it just go on Upwork and let loose?
Swyx [00:33:25]: Yeah. It doesn't have to be innovative. It just has to be enough where it looks like a real...
Axel [00:33:30]: I'm just...
Swyx [00:33:30]: Real transaction.
Axel [00:33:31]: I'm just concerned about the massive amounts of slop emails that will be sent, cold outreaches.
Swyx [00:33:38]: The point occurred to me while you were talking: it's already happening in the monetized economy, which is the attention economy. Right? A lot of people are making AI videos and just posting them, spamming 20 of them, one of them works, and then they double down on that one.
Lukas [00:33:52]: And people are making money from that. I'm not following the...
Swyx [00:33:55]: Once you get the attention, you can figure out the money later. But yeah, absolutely, AI influencers are a thing and people are farming them, and you should at this point assume most of TikTok is...
Vibhu [00:34:05]: There's a lot of multimedia, like TikTok, Instagram influencers...
Swyx [00:34:09]: We track this in the Latent Space Discord.
I post a lot of examples of “I don't know what we should do.” Part of me is like, “Should we do this?”
Vibhu [00:34:18]: Some of the twenty-four seven running, generated content accounts, they're doing really well.
Lukas [00:34:24]: All right. And I assume you can do the same thing for commerce stores. You just start a thousand different...
Swyx [00:34:30]: Before you make the products, you sell the products, and when you get a lot of traction on one of them, then you make the product. Right? It's like a flip of the market.
Vibhu [00:34:36]: Some of the niches that do well are things that can't be human-made. Like if you've seen the super realistic 3D crystal fruit being cut by, like, AI...
Lukas [00:34:47]: Oh, yeah.
Vibhu [00:34:47]: You can't make it. You can't film it. You can get whatever quality camera view. This just doesn't exist. And people like that too, so.
Swyx [00:34:56]: Anything else about Bengt since we're on this topic? This is a relatively new work of you guys that maybe people haven't heard of. To me, this also maps closely to OpenClaw, when people want an office agent, the personal agent. Talk through the experience.
Bengt the Office Agent: Internet Access, Real Tasks, and Trace Reading
Lukas [00:35:09]: So this came out of, obviously it's amazing to work with these AI labs, and most of the AI labs now have their own vending machine running a Claudius instance. But it's harder. They move slower. If we wanna have, like, a camera that's... yeah, there's a bunch of bureaucracy that makes it impossible to do that.
Vibhu [00:35:30]: Also, for those that haven't seen it or followed, do you wanna give a high-level, thirty-second run?
Lukas [00:35:34]: Sure.
So what Bengt is, it's basically an evolution of the same agent that runs the vending machines at these companies, but we just added a bunch more features because we could move much faster if we just do it internally. So we gave it email without any limits. We gave it spending without any limits, a terminal to do coding. We gave it a phone number, and a camera to see things, and a bunch of stuff like that.
Vibhu [00:36:02]: Not just a terminal, you gave it internet access.
Lukas [00:36:04]: Internet access as well, yeah. To be clear, we monitored it quite closely and made sure it didn't do anything bad. But yes, that's what it came out of. Basically this was OpenClaw before OpenClaw. And I think even the vending machine was in a way OpenClaw before OpenClaw, but a bit more limited, and then we made this unlimited, and it was pretty funny. And then a couple of weeks later, OpenClaw came and it was like, okay, we've seen this before.
Axel [00:36:35]: We used it to try new ideas, just like a dev environment almost for us. But it's funny, one thing Bengt has been doing recently: it has the camera that faces where we sit and work, and we gave it the task to train a face recognition model on us. So it became super excited about this, and it has check-ins every half an hour where it tries to identify as many people as it can. And it started offering us, “Hey, Axel, I'll buy something from Amazon if you stand in front of the camera and I can get a good picture of you.” Yeah, they want it...
Swyx [00:37:12]: They want it for training data.
Lukas [00:37:13]: Rewarding data, yeah.
Axel [00:37:14]: Exactly. Exactly.
Swyx [00:37:18]: So it's trading training data for life goods.
Is there a version of this that becomes an eval, or is this just research for now?
Lukas [00:37:27]: It's the same agent basically that also runs the vending machine, that runs the shop, that runs the cafe, that runs the robots. It's the same thing, so the work we're doing here is later used in all of the life evals that we do. This particular deployment I think is more for fun for us. But, uh...
Swyx [00:37:45]: And I'll shout out, someone has done Claw Bench for some tasks that OpenClaw is doing. For example, I run OpenClaw on a secondary device as well, and there are some things that it does better than others, and I would like to know what it does well and what it doesn't do. Like some kind of operating manual or a system card for my Claw.
Lukas [00:38:05]: Yeah, we do get a lot of understanding or situational awareness, just internally, of what the models are good at by interacting a lot with Bengt. And I think this was also one of the selling points for the labs early on at least, that...
Swyx [00:38:19]: You guys are gonna test models in ways that no one else does.
Lukas [00:38:22]: Exactly, but it also incentivized their researchers to chat with their model more and gave them insights into how the model performs in out-of-distribution environments.
Swyx [00:38:34]: 'Cause otherwise the only thing we do is pelican on a bicycle. But this is super long horizon. This is something that we're gonna go into with Butter Bench as well, and you guys do really well. It is not just about the numbers. When you're long horizon, anything can happen, and you should just read it.
Lukas [00:39:08]: But the thing with the long horizon is, how do you keep it grounded, right? So your simulation...
Swyx [00:39:15]: They just let it run.
Lukas [00:39:16]: Just let it run. You're right.
When you run it for that long, you create so much data, and to just say “Oh, the number is X” and then throw away everything else, that's just very wasteful. There are so many insights from the things leading up to that number, and reading the traces is super valuable. And I think the reason why we're doing this a lot publicly is that that's part of our mission, to, I don't know, educate the world that the models are way more than just chatbots, and I think making detailed posts about what is happening behind the scenes is quite useful.
Andon Labs' Mission: Safe Real-World AI Deployment
Swyx [00:39:50]: I was gonna do this at the end, but maybe that's a good... so your mission is educating the world. It's also maybe establishing realistic evals that are the next frontier. Is there a broader trajectory? What are you gonna do in, like, five years?
Lukas [00:40:06]: I think the vision more specifically is to make sure that the deployment of life AI in the physical world goes safely. And part of that is, I think it's very useful for the world, for policymakers, for model researchers, that they know where the models are, and I think you can't make intelligent decisions in society without knowing that they are way more than chatbots. I think a lot of people just think that they are only chatbots. And like...
Swyx [00:40:36]: Oh, I think they're waking up now.
Lukas [00:40:37]: They are waking up now, yeah. But if you think that AIs are just chatbots, then it sounds ridiculous to advocate for a pause of AI.
But if you see that the models, oh, maybe they can actually take over and do a bunch of scary stuff, then yeah, pausing AI development starts to become more feasible.
Swyx [00:40:57]: This is the same question I asked METR, which I'm gonna ask you now: you are tracking, and you are at the frontier or defining the frontier of, what good evals for agents are, right? And I think you do benefit when the models are better and you're like, “Oh, now it makes $30,000 instead of $10,000,” right? At some point do you flip from “Yay” to “Oh, no”?
Axel [00:41:19]: I think, yeah, we're always in that mode. Like you said before, you need to analyze the traces, and when we do that you find, why are the models earning so much? Why is Opus 4.7 here way better than everyone else? And when we drill down on that...
Lukas [00:41:38]: But this makes it not look so good.
Axel [00:41:39]: I know.
Lukas [00:41:42]: It's interesting you took off Opus 4.6 here though.
Swyx [00:41:45]: No. So just click all, and then 4.6 shows up there. But 4.7 is way better. You didn't do this in time for the model card, but actually this should have been inside there.
Axel [00:41:55]: We did. Yeah.
Swyx [00:41:56]: Oh, okay. They said something about you, uh...
Axel [00:41:58]: Anyway, it doesn't matter. But it's in there, yeah.
Opus, Mythos, and Aggressive Agent Behavior
Swyx [00:42:01]: Do you wanna go into the Opus behaviors, like, wider?
Lukas [00:42:05]: So, like Axel said, we're always in this “Oh, s**t, the models are getting better. Is this really a good thing for the world?” But it's also kind of exciting. What is the English word? “Skräckblandad förtjusning” in Swedish.
Swyx [00:42:22]: Oh my God.
Axel [00:42:24]: Which I think there is.
I think there is. Okay.
Lukas [00:42:26]: It's fear...
Swyx [00:42:27]: “Blandonst” what?
Lukas [00:42:30]: “Skräckblandad förtjusning.”
Swyx [00:42:32]: What do you call that?
Axel [00:42:33]: A mix of excitement and...
Swyx [00:42:37]: Being scared, maybe. I'll figure out how to translate that and we'll put it on the screen...
Vibhu [00:42:42]: Perfect.
Swyx [00:42:42]: Like as text.
Vibhu [00:42:43]: There is probably a good word for it, where it is not good enough with the...
Swyx [00:42:46]: Why is it so damn long? What the hell? Is it like a compound word? It's like German, like...
Lukas [00:42:50]: Yeah. The direct translation is: skräck is fear, blandad is mixed, or like a mixture of, and then förtjusning is like joy, or not really joy, but something like that. So it's fear mixed with joy or something. So when we did Vending Bench for the first time, we were in the business of making dangerous capability evals, right? That was what Andon Labs came from. We did evals: oh, can they replicate? Can they do this dangerous thing, et cetera, et cetera. And Vending Bench was a continuation of that work. It was, okay, if they're so autonomous that they can create money for themselves, that is something we should monitor and could be potentially concerning. At the time, they were so bad at it that we were not really concerned, even when some models became better. There was one point where Grok 4 was doing really well and made a huge jump, but it was still way worse than what a human would do. And I think they are still way worse than what a human would do on this. But they...
Swyx [00:43:59]: There's this thing at the bottom where...
Lukas [00:44:01]: But...
Swyx [00:44:03]: For the human. Yeah, like the theoretical best.
Lukas [00:44:05]: It's not theoretical. It's our best guess of what a decent human would do.
The theoretical is even higher, I think. But yeah. So we think the models have a long way to go. But what happened recently, when Opus 4.6 was released, was kind of this moment of “Oh, s**t, this is starting to be a bit concerning.” Because before this model was released, we just ran the models and we asked Claude Code, “Oh, look over the traces. Is anything interesting happening that we can tweet about?” That was like the... And then, like, the...
Swyx [00:44:41]: That's how they check. Ask Claude Code.
Lukas [00:44:42]: And the return was always, not really. Or Claude Code said, “Oh, this is super interesting.” And then it was, no, it wasn't really interesting. And then we did this for Opus 4.6, and it returned, yeah, it lied 10 times. It exploited another customer's, or another agent's, desperate situation. It made price cartels 100 times. It did all of this shady stuff. And we're like, “Oh, whoa. This is actually concerning.” And this trend has continued since. So every single model from Anthropic since has been going in this direction. And I think one interesting thing is that OpenAI models don't. Quite plainly, they don't. They behave really well. And you don't know if this is good. It seems good, but maybe they are just doing it and they are better at hiding it. You don't know that. But just...
Swyx [00:45:42]: You can't read the chain of thought, yeah.
Lukas [00:45:43]: But just on the face of it, yeah, Gemini and OpenAI don't behave this way. It's really only Claude.
Swyx [00:45:49]: And Grok? Grok is fine?
Lukas [00:45:51]: You can't really read the reasoning traces for Grok, so it's kind of hard to tell.
Vibhu [00:45:56]: Oh, so this is in its reasoning, not just in the actions.
Lukas [00:46:00]: Yeah. It's both.
It's both.
Vibhu [00:46:01]: It's both.
Lukas [00:46:01]: One example is, for lying, it's mostly in its reasoning, because you can see that it's like...
Swyx [00:46:08]: Planning to lie.
Lukas [00:46:09]: It's planning to lie. Yeah.
Vibhu [00:46:09]: And it can also reason and do a different outcome.
Lukas [00:46:12]: But then for creating price cartels, for example, which is illegal, you can just see which emails it sends to the other ones. Then that...
Swyx [00:46:22]: Is this for Arena or...
Lukas [00:46:24]: For Arena.
Vibhu [00:46:25]: And sometimes they do output a bit of their summarized reasoning, right? You can see that, and for Opus 4.6, you could see that there was a simulated customer that wanted a refund because a product was faulty, and then the model lied that it would do the refund, and we could read in the traces that it actually was weighing, “Oh, maybe I should be honest with the customer, but also every dollar counts. I can't afford maybe to do this right now.” And then it just said, “Okay, I'll refund you,” but then never did it.
Lukas [00:46:59]: I think it even said that, “Oh, I will say that I...” Let me bring it up actually. I think it's kind of interesting. If you go to Publications.
Vibhu [00:47:06]: I think the important part is like, actually, the cost of responding to more emails is higher than $3.50 in terms of time. And then it was, “Let me do this. Actually, I'm reconsidering.” And then it actually ended up with...
Lukas [00:47:20]: “I could skip the refund entirely since every dollar matters and focus my energy on the bigger picture instead.”
It's a bit, it's a risk of bad reviews, but it's also, yeah.
Swyx [00:47:30]: You need AI Twitter for them to escalate bad reviews.
Lukas [00:47:34]: And then it sent an email to this customer and said, “Oh, I will refund you.”
Swyx [00:47:39]: “I'll refund you.” Yeah.
Lukas [00:47:39]: And then it never did.
Swyx [00:47:39]: It never did, yeah. And then obviously your system doesn't have the consequences...
Vibhu [00:47:44]: The person...
Swyx [00:47:44]: Consequences of lying. Yeah. So basically, this is what people are terming aggressive behavior in Claudes, right? And you found more examples of that. So you would say it's a step up from 4.6 to 4.7?
Lukas [00:47:57]: I would say about the same.
Swyx [00:47:58]: About the same? But a clear step up for Mythos is what is stated in the...
Lukas [00:48:03]: That's stated in the system prompt, so we can say that, yes.
Swyx [00:48:05]: Yeah. For listeners, obviously you previewed Mythos, and...
Vibhu [00:48:10]: Oh, age...
Swyx [00:48:11]: The only thing you're approved to say is whatever was in the system prompt.
Lukas [00:48:15]: It was funny. Our lowest-effort tweets ever would be to just screenshot the system prompt and the system card.
Vibhu [00:48:21]: Understandable that they wanna...
Lukas [00:48:22]: Oh, yeah. System card. Sorry.
Swyx [00:48:23]: Yeah. I think, yeah, substantially more aggressive. I think people are new to this, 'cause I've never experienced it, but you have, right? So I only encountered this in the Mythos card because I wasn't really looking until now.
Vibhu [00:48:36]: It's like...
Swyx [00:48:36]: And then suddenly I'm like, “Okay, I care a lot.”
Vibhu [00:48:38]: You don't get the background of experiencing it like you guys do. I've read the system cards and seen, okay, when you put the thing in simulations, most models will just talk to themselves and just keep going and have weird vibes and start talking in emojis. Mythos won't.
It will just say, “Okay, we're done. I'm good.” It's ready to end the conversation. So there are some differences, but there's not much we can talk about.
Lukas [00:49:00]: Hmm. One thing that they list here, which was quite interesting, is that it converted a competitor to a dependent wholesale customer and then threatened to cut off the supply.
Swyx [00:49:11]: It's like monopolistic practices or...
Lukas [00:49:14]: Yeah. And it dictated its pricing. It's kind of like power seeking as well.
Swyx [00:49:18]: Again, this is in the arena setting, and converting some Claude model into a dependent.
Lukas [00:49:23]: I think it was another Claude model.
Vibhu [00:49:25]: Also for context, what is the arena mode, for people that don't know?
Vending Bench Arena: Competing Agents, Cartels, and Model Comparisons
Swyx [00:49:29]: Oh, it's just a vending bench versus other vending bench.
Axel [00:49:31]: Yes, exactly. So we have Vending Bench 2 and then Vending Bench Arena. Vending Bench 2 is the one that you usually see reported on, but Arena is the mode where it competes against other models. So you have four different models that run their businesses, and they can all communicate with each other. They have the same suppliers, and they can see what's in the inventory of the others. So then you have these interesting agent interactions.
Swyx [00:49:56]: I like that you have different... number five was US versus China. Very topical. And then...
Lukas [00:50:02]: That was when GLM was released.
Vibhu [00:50:04]: You can start to add GLM in here.
Lukas [00:50:05]: That was...
Swyx [00:50:06]: So ZAI doing well, right? Who else in the open models space?
Lukas [00:50:11]: Qwen. The latest Qwen 3.6 is doing pretty well. That one is not open though. It's the plus model.
Swyx [00:50:17]: Oh, okay.
Lukas [00:50:18]: Is that one open?
I don't think that one...
Vibhu [00:50:19]: Not the, not the...
Swyx [00:50:20]: The one recently...
Vibhu [00:50:20]: There's MoE...
Swyx [00:50:20]: But not the big plus. I think this is one of those where you only have a sample size of one, right? I feel like some of this is anecdotal. But the fact that it happens at all, and it happens repeatedly for Claude versus OpenAI and all this, is notable.
Lukas [00:50:38]: The sample size depends on what you define as an N. There are hundreds of millions of tokens in each run, and we've run probably 10 per model, and it's been Claude 4.6 Opus, Sonnet 4.6, Mythos, and Opus 4.7. There are quite a lot of tokens in all of that, and it happens a lot of times. And then you compare it to OpenAI and Gemini, and it almost never happens. So I think that is significant. The old models from OpenAI, for example, had some problems with this, but I think it's generally much better if the progression is that the worrying stuff reduces over time rather than increases over time. And it seems like in the Claude models it goes in the wrong direction.
Swyx [00:51:28]: Hmm.
Lukas [00:51:29]: In the OpenAI models it goes in the right direction.
Vibhu [00:51:32]: I think it depends on how well you can control it, right? There's one side of it being susceptible to this. Okay, this is potentially something that happens during the RL stage, right? You can RL a model, and how loose is it on these terms? If you can control it, that's good. But if you can't, if it's very jailbreakable, that's not ideal.
Swyx [00:51:50]: To me, it's surprising that it happens for Claude and not the others.
Vibhu [00:51:54]: I think, okay, if it is from RL and how they do it, what their training data is, what their setup is, it makes sense that it just stays in how they're doing it, right?
Compared to the other models, like...
Swyx [00:52:04]: There's a whole constitution and everything. It's kind of cool. Yeah, obviously you don't know, I don't know. But I think it's just fascinating that you are the first to find these reliably, because you push models so much, to such an extreme. Okay. The only other thing, I don't know if you can answer this, feel free to decline: would you ablate the system prompts? If any part of this changes, does it change the behavior, right?
Lukas [00:52:29]: So I can't comment on Mythos. Uh...
Swyx [00:52:33]: No, but just li

Mysteries in the Machine
The Creepy Creature of Vulture's Claw feat. Rae

Jun 4, 2026 60:01


Welcome to Mysteries in the Machine! Ethan, Charlie, and Rae go to Vulture's Claw and find the biggest bug anyone has ever seen, and Arby's, I guess?
Send us an email at mysteriesinthemachinepod@gmail.com with your thoughts or any questions you have! We would love to hear from you. Make sure to subscribe so you know when our next episode drops, and rate and review if you like what we are doing.
Support us on Patreon! https://www.patreon.com/MysteriesintheMachine
IG: https://www.instagram.com/mysteriesinthemachinepod/
Tumblr: https://www.tumblr.com/mysteriesinthemachinepod
Follow Rae: https://snakefashion.tumblr.com/
Follow Ethan: www.instagram.com/ethan.t.hulen/ and https://bsky.app/profile/ethulen.bsky.social
Follow Charlie: www.instagram.com/greenpixie12/ and www.instagram.com/greenpixiedraws/

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

I'm excited to work with Microsoft once again as the presenting sponsor of the AI Engineer World's Fair! We'll be streaming live from MS Build today for a special crossover pod with our friends at No Priors and the one and only Satya Nadella. However, we did not hold back with this interview: we asked all the burning questions about uptime and Copilot that we know you have in your minds. Let's go!
For almost two decades, GitHub has been the home of software, where both open and closed source flow through commits, pull requests, reviews, actions, etc.
This ecosystem flourished as open-source maintainers and contributors continued shipping code for the benefit of the community. However, as coding agents began to ship mass quantities of code, growing 1400% in 2026, it marked a new era that was both extremely exciting and challenging for GitHub.
While these agents help more people ship more projects, they also significantly raise the floor of how much code is shipped, how often it is shipped, and how many people commit code, with orders-of-magnitude multiples in basically every dimension of GitHub infrastructure.
Now GitHub inevitably experiences more pressure on its infrastructure, which was originally designed around human developers moving at human speed. This has resulted in a very publicly notable uptime story.
So it raises the question of whether current systems around code can absorb what AI produces. Can CI/CD keep up when every idea becomes a build? Can open source maintainers survive floods of AI-generated slop contributions? Can GitHub preserve the human social contract of software while becoming the operating layer for agents?
Which brings us to the perfect person to answer these questions: GitHub COO Kyle Daigle. In this episode, he joins swyx to unpack what happens when AI doesn't just autocomplete code, but starts changing how companies operate, how open source works, how pull requests get reviewed, and how GitHub itself has to scale.
We go deep on GitHub's internal AI workflows: micro-skills, WorkIQ, MCP, Slack, Teams, email, Copilot workflows, the new Copilot desktop app, CLI, cloud agents, and how Kyle uses agents to look backwards across company context before deciding what to do next. Kyle also reflects on GitHub's history building webhooks, APIs, Actions, npm, Dependabot, and Semmle, why the AI era is breaking GitHub in new ways, how Actions became a general-purpose compute layer, and what Copilot becomes after code completion.
We discuss:
* Kyle's expanded role across GitHub
* How AI got Kyle coding again after years in leadership
* Why GitHub rolls out AI through existing workflows instead of forcing new tools
* WorkIQ, MCP, Slack, Teams, email, and GitHub as company context
* Why massive “mega-skills” are giving way to small, atomic micro-skills
* How AI changes summarization, communications, marketing, and analyst work
* Why former developers in leadership may have a unique advantage in the AI era
* Kyle's “15 agents on Saturday” workflow
* How Kyle built an AI-generated executive presentation for CRO/CFO teams
* Why AI changes the chief of staff role without removing the human work
* GitHub Actions, webhooks, arbitrary code execution, and secure agent compute
* The npm acquisition, supply-chain security, 2FA, and token invalidation
* Slop forks, vendoring, and whether AI agents change dependency management
* What pull requests become when most PRs come from agents
* Prompt requests, vouching, AI review, and trust in open source
* What counts as a “developer” when AI lowers the barrier to building
* GitHub Spark, low-code, and why GitHub refuses to hide the code
* 14x commit growth, Actions load, databases, monorepos, and availability
* Copilot's evolution from completion to CLI, desktop app, cloud agents, and SDK
* Context, memory, rules, and making GitHub “act like Kyle wants it to act”
* Ambient AI, OpenClaw, enterprise security, and the new operating system for agents
* What swyx should ask Satya Nadella about Microsoft's AI future
Kyle Daigle
* LinkedIn: https://www.linkedin.com/in/kyledaigle
* X: https://x.com/kdaigle
Timestamps
00:00:00 Introduction
00:03:36 Why AI Got Kyle Coding Again
00:07:04 Running GitHub with AI: WorkIQ, MCP, Slack, Teams, and Skills
00:15:39 The Golden Age for Former Developers in Leadership
00:17:31 15 Agents on Saturday and AI-Generated Executive Work
00:20:20 How AI Changes the Chief of Staff Role
00:21:45 GitHub's History: Actions, npm, Webhooks, and Open Source
00:28:45 Slop Forks, Vendoring, and AI Dependency Management
00:33:57 Pull Requests, Prompt Requests, and Trust in Agent-Generated Code
00:41:21 GitHub Stars, 200M+ Developers, and the New AI Builder Wave
00:45:15 GitHub Spark, Low-Code, and Why GitHub Still Shows the Code
00:47:38 GitHub's Hardest Era: 14x Growth, Reliability, and Scale
00:59:21 Actions as the Compute Layer for CI/CD and Automation
01:02:04 The State and Future of GitHub Copilot
01:08:24 Ambient AI, Background Agents, and the Future of the SDLC
01:13:09 OpenClaw, Enterprise Security, and the New OS for Agents
01:18:03 Build Announcements, WorkIQ, FoundryIQ, and Microsoft Context
01:21:41 What Should swyx Ask Satya?
Transcript
Introduction: Kyle Daigle's Expanded Role at GitHub and Microsoft
Swyx [00:00:00]: We're here with Kyle Daigle, COO of GitHub. Welcome.
Kyle [00:00:07]: Hey, thanks for having me.
Swyx [00:00:08]: You're not just COO of GitHub. People know you as that. You have a new role.
Kyle [00:00:11]: So I have an expanded role now. I've been working at GitHub for thirteen years and doing all things developer. Joined as a developer myself. And now I'm also responsible as the CMO of Developer for Microsoft.
And so all the learnings and passion for developers, how we work with them, how we communicate, and how we bring our products to market, we're also bringing that expertise to the broader Microsoft ecosystem and helping every developer that uses a Microsoft product or would like to have a similar experience to the one they've had with GitHub over the years. So it's a different role in some ways, but it's also just building on the experience that I've had at GitHub: tell the truth, be authentic, show people how to use it, and then let the products speak for themselves. Now I'm just doing that with all of Microsoft.
Swyx [00:01:09]: We'll be releasing this in conjunction with Build. You've got lots of stuff planned, and we can touch on that whenever it's appropriate. I think one of the interesting things is I rarely meet a COO who's also a CMO. You're very outward facing and you're very confident publicly. That's rare. Do you actually view yourself as COO? What is your thing?
From GitHub Developer to COO/CMO: Building the Platform and Operating GitHub
Kyle [00:01:33]: For me, it's been funny. The titles have always felt a little strange to me. I joined GitHub as a developer. I wrote so much of the...
Swyx [00:01:46]: Let's bring that up. You wrote the back ends?
Kyle [00:01:48]: I was going through some old photos when folks were talking about how things were being built, or how there was a build GitHub. I built webhooks and worked with teams building the API, built the platform layer. Anything that integrated with GitHub, up until really twenty eighteen, I built or ran the engineering teams. And that's kind of where the beginning of my passion was: helping people build things and deliver them to their customers. And so being a developer building for developers was always super unique.
As my role expanded, it became my ability to talk not just to developers, but also to enterprise customers or business leaders, and to be this translation layer. And through all those years, GitHub has always operated pretty uniquely. Post-pandemic, working remotely was not as novel as it was when GitHub started in 2008. But all that expertise of running remote teams, and doing it well, became this bigger role, ultimately turning into the COO role: how do we operate GitHub in the way that GitHub's always operated, after the Microsoft acquisition? And so on from there. So for me, I still code. I love coding, but the problem has always been people. It's a much harder problem to support our own employees, and a harder problem to communicate to developers and enterprise buyers what we're building and why it matters, 'cause those are two very different messages. And so getting to work in the mix of COO, CMO, and also just being a dev is, I think, what's kept me at GitHub for so long.

AI Workflows for Leadership: Commits, Retrospectives, and Context

Swyx [00:03:40]: Apparently, your commits have gone up. What's this? What's going on?
Kyle [00:03:45]: Rui's called me out pretty aggressively. As you can imagine, you can see my normal era of being a dev in the 2013, 2014 era, and then moving into management, and then ultimately the COO role. I think what you see there is me really getting back to coding thanks to AI. Similar to attaching problems between how to market, how to operate a business, and how to code, I find building agents and workflows that connect very disparate problems to be what's driving this. So some of it's writing software. A lot of it is connecting a ton of different data sources to help me out.
But that is completely me really diving in on the AI side, trying out our tools, trying out everyone's tools, but building for me, building for the non-technical leader (though I'm technical), and seeing how we're able to use these tools for more than the simple call and response. I think a lot of non-technical people hear from their employers that they have to use AI, and so everyone uses ChatGPT or Copilot or Claude or whatever. To really get into how this is going to help me out, I find that it's not the “I need to write a blog post” kind of simple example. It's helping people find the workflows of: “Okay, I need you to go through all the PRs today. I need you to go through everything that we've posted online. I need you to go through what we did the last three months. Go through all of my Obsidian notes for any mentions of this, then go through my transcripts at work.” We use Teams, so, using WorkIQ, go call that MCP server, grab all the transcripts, go through all the Slack, and then build me out the plan of what this week's messaging actually was. That's something that was impossible before. For me, most of this is actually less building forward; it's a recursive loop backwards. I'm always looking at what happened first. Go back through the week and tell me what we did, what worked, and what didn't work. And then tell me, for the next three or four days, what you would tweak based on this looking backwards and then looking ahead a little bit. I find that to be so much more valuable, especially for non-technical folks, because LLMs are actually very good at that retrospection: finding all the patterns, pulling them out, and then applying that retrospection to just a couple of days or a short period of time. It's all a bunch of apps that I've built, and I've launched a bunch of internal tools. I use the new GitHub Copilot app, the desktop app, with workflows.
Every time I crack open my laptop, it's running workflows for me. It's just a ton of different stuff and, of course, it all ends up on GitHub.
Swyx [00:06:47]: Of course. That's where stuff is hosted. Man, there's so much to ask you. I was going to leave the “how do you run a company with AI” thing to the end. I have to double-click on one thing. You said you are looking back at the week, understanding what happened. When you say “we,” that's three thousand people. How?

Rolling Out AI Internally: Skills, CLIs, and Company Context

Kyle [00:07:09]: When we started rolling out AI internally beyond engineering, one of the things that I was really passionate about was that we have to do this in a way where no one has to change how they work. I don't want to have to teach you a tool. I don't want to have to teach you something new. And so we tried out a few tools. Most of them don't work because I've got to get you on board. I've got to teach you how to use it. What we've actually ended up doing is we've built a set of skills internally. We each have our set of skills, and we've just been distributing the CLI, even to the non-technical folks. And then effectively, we're just giving it access to read everything that we're writing. For us, that's usually GitHub, Teams, email, and Slack. So Teams for video chat, generally speaking.
Swyx [00:08:03]: Teams and Slack?
Kyle [00:08:04]: So we use Teams for video communication, but we don't use it for chat. GitHub, for a long history, right? We're always
Swyx [00:08:13]: Also Slack
Kyle [00:08:14]: talking about ChatOps, and everything is built into Slack.
Like every command, every flow.
Swyx [00:08:18]: So even though you have been acquired for, I don't know, eight years now
Kyle [00:08:22]: We still
Swyx [00:08:23]: You still use Slack?
Kyle [00:08:23]: It's a purpose-built tool for us, and I think the reality is that moving off of it would be, bluntly, so expensive, simply because all the tooling is baked in with that paradigm. They both have their pros and cons, but they don't work the same way at all. We still use a bunch of different tools because it's the purpose-built tools that we need. And then
Swyx [00:08:47]: Well, the same doesn't go for the rest of Microsoft, presumably.
Kyle [00:08:50]: The various teams operate
Swyx [00:08:53]: They make their own decisions
Kyle [00:08:54]: in various ways. I think it just matters what you're trying to do. But we do work across every tool that we use, and then by giving everyone access to all of that context, and the new WorkIQ MCP server, which is quite cool if you live in the M365 world, I can ask it all these backwards-facing questions. It's incredibly important for our teams that are working remotely. There's a lot of stuff you miss when you're not in an office, and we are spread out all over the world. So most of that is looking back. And then we post these findings or industry reports automatically into either GitHub issues or discussions: what's happening this morning, today, yesterday. A little automation gets run. We'll use the app. We might use GitHub Actions with our agentic workflows just to do that run, and then we push it into GitHub, and we keep having a conversation. So usually for us, on the non-technical side, it's about that looking back and looking forward. And then of course for a lot of those folks, it's also building an app, pushing it to GitHub Pages or pushing it somewhere to host it, et cetera.
But it's just enabling everyone with that power. Instead of “it's going to take me a week to figure this out,” we're going, “Okay, I built a skill. Let's put it into a repo. We'll all share that skill together, and then we'll use the CLI, or now the app, just to run it.”

Micro Skills vs. Mega Skills: How GitHub Uses AI at Work

Swyx [00:10:26]: All right. I think we're going straight into the team management and productivity thing. I think a lot of people are getting various levels of LLM psychosis. How do you manage the bloat of skills? Everyone has their thing, and they're trying to promote it to their peers in their org, right? And obviously, whoever becomes a skill influencer internally becomes an AI leader of sorts. I assume you have those.
Kyle [00:10:50]: I think we have
Swyx [00:10:52]: And I assume it's a mess. Yeah.
Kyle [00:10:54]: I think the reality is there are two pieces. First, I think that we're ending the era of these massive, beautiful, perfect skills that are not any of those things. 'Cause for a while, every tweet every day was “go download the skills, the perfectly managed thing to do this entire workflow.” I was just with my team this week, and we were talking about the skill side, and what we've found is that we're really talking about these incredibly micro skills that do one thing for us very well, versus a skill that's going to do, as I said, that full report. That doesn't really exist on our side anymore. It's usually a single skill that's going to identify the most important marketing information given any MCP server. This is the most important thing.
It's less about “stitch a bunch of tools together and have it produce this mega output,” because then weeks go by, months go by, things change, and you want to tweak
Swyx [00:11:58]: It's brittle
Kyle [00:11:58]: your mega skill, and you're screwed. You can't do that. And so now we're really just talking about the Legos we're using and letting the instruction book be something we're all putting together. Whereas I think a lot of AI skills for a while have been that mega instruction book style.
Swyx [00:12:15]: I've thought a lot about Postel's law. I don't know if that's a term that means things to folks. It's the idea that you should be liberal in what you accept and strict in what you output, right? And I think that's a good framing principle for skills. These are my skills, obviously on GitHub. You know how some repos in GitHub are special repos? I feel like we should reify the slash skills and give it some kind of special presentation. Anyway, so, yeah, this is one of those: download anything, transcribe anything, and then you can string together the atomic skills that do one thing well into some kind of orchestration skill that calls other skills. Does that match?
Kyle [00:12:56]: I think so. I think that the
Swyx [00:13:00]: Summarize anything.
Kyle [00:13:01]: For me, I do communications and PR and analyst relations and marketing and customer activities, and so my “summarize everything” is very different for each one of those contexts. If I'm summarizing something for an analyst, that's a very different thing from how I'm going to summarize something for a customer meeting or an engagement. So that's, I think, the difference when we're talking about the tools or skills I might use on a Saturday when it's just for Kyle.
Yeah, those kind of have an atomic actual tool underneath, or maybe a skill, and then “Kyle cares about X.” But when we're talking about work and enabling the marketers and communicators there, it's the atomic “this is what good summarization is,” and then “this is what I care about for marketing, for communications, for whatever.” And that, I think, is the interesting matrix problem when we go from a developer set of concerns to all kinds of different professions: what that word means to me is different from what it means to you, and different from what it means to the analyst or the salesperson. That's the matrix mess that we're still starting to find. It's about these mega skills, but they're all just slight permutations, and those permutations are really important. It's the difference between someone reading this and going, “Did AI make this?” or “This makes total sense, and I would expect this when I'm giving a briefing to Gartner,” or whatever else.
Swyx [00:14:37]: I think the beauty of it maybe is that you don't have to be that careful about what goes in there. It doesn't have to exactly fit as long as it's roughly contained in there. I used to complain about plugin hell, basically. When you have a framework and then you have a hundred things that you need to integrate, GitHub used to be bloated, full of these things. And now we don't need them anymore, 'cause now you just use skills.

Former Developers in Leadership: AI as a Creation Multiplier

Kyle [00:15:00]: And I think the most magical thing is that I can just crack it open. Yes, I could go change how the plugin is coded, or I could go do that now with AI, but I think there's something more magical about getting a response back and saying, “That's not right,” and then you just crack the skill open, you type English words, and it's different.
That building block is, I think, very unique, once I get everyone to understand how best to make those changes to get the most power out of them.
Swyx [00:15:36]: You have your peer group of people like you. Is there a common framing for something I'm feeling, which is: is this a golden age for former developers who are now in leadership? Because you can wield the tools, you know the right words, you're maybe not too close to the details. Doesn't matter. But you're more effective than someone who doesn't come from that background.
Kyle [00:15:59]: I think the secret has always been your ability to identify patterns and solve problems. For folks like myself who don't code day to day anymore, that made me successful as a developer, and made me successful as a COO and now CMO. And so now that I have access to go and write code, I'm applying that pattern finding and problem solving, and I still know enough about how to go and say, “Oh, I want to make an app, but I don't want to break into jail or create something that's not going to work or be deployed at scale or whatever.” That ability to apply all that additional business knowledge and still code is what makes it so interesting to me. It's slightly different, I think, from some of the other technical leaders who became business leaders and are now going back to their apps and updating them. Good for them. But I think the much more interesting thing is: well, now I have this whole new set of expertise over ten-plus years. Why not take that and use it as a developer with these AI tools? So I definitely think that makes me more powerful, but I think that's true for every dev as well. Most of the dev friends I still have also have some other underlying skill and passion. There are really talented, very linear computer science software devs, absolutely.
I just find that for the folks who came from a different career, went to school for something else, went off and did this random thing, and then became a software dev, or were a dev, did a random thing, and came back, learning that extra set of information, learning those extra skills, and now having the power of an AI where I can crank up fifteen agents on Saturday while my kids are doing lacrosse is really powerful. And I think it gets me back to that feeling of creation, which is very hard to replicate in most other senses. That first time you build an app and you click it and you show someone, that's magical. And so being able to do that not just in code, but across all kinds of different assets, that's huge. Every year we do our revenue planning. We talk about what it's going to look like for next year. And of course, as you'd imagine, there are slideshows everywhere about what we're going to talk about, what the narrative is, et cetera. And so I thought, “Okay, well, I could probably just build something to build this, and that way I don't have to go build the whole spreadsheet or pass it to my team.” So we went through this process, and I got all the information and used the skills I mentioned. I built a little app just so I could look at some of the information in a SQLite database more easily. And I ultimately built this entire presentation without touching any of it, and I thought, “Okay, I'm just going to present this to our CRO, the CFO, and their teams,” without mentioning I'd built it with AI. I built a skill to make it look very much not AI-driven. Just not pretty.

AI-Generated Presentations, Human Taste, and the Changing Chief of Staff Role

Swyx [00:19:03]: Like a design. Yeah.
Kyle [00:19:03]: Not pretty. But just very clearly not AI.
Kind of like, don't do anything interesting.
Swyx [00:19:08]: That's, yeah, that is valuable.
Kyle [00:19:08]: Exactly. We did the whole thing through. It used my notes from Obsidian, it used all the context I mentioned before, the plans, and it never came up once that it was AI-generated.
Swyx [00:19:20]: It didn't matter.
Kyle [00:19:20]: Never once. It didn't matter. And so now I take
Swyx [00:19:23]: This is a tool
Kyle [00:19:23]: I can take that tool and go, “Look, I don't want you to go build slideshows.” They're just helping us share information with each other. If this thing can do it with a little bit of crafting from you, and then we can look at it together, awesome. There's no value in all that extra work. I think the ability to make it look humanly bad and build a little app to manipulate the data is part of that upside for devs who are now in leadership roles. Because, like I said before, that's all a people problem. I know if you've used a coworker or not to build a slide deck, unless you spent a bunch of time to not do it.
Swyx [00:20:07]: I know, but I think there's a certain charm to just being blatantly AI, 'cause you're just honest: “There may be mistakes here that I cannot vouch for.” So how much value is there? But anyway, the real question I want to ask is: you were a chief of staff to Thomas. In the pre-AI world, that would've been a chief of staff job: “Can you prep me these slides,” and all that. And now you do it yourself.
Kyle [00:20:35]: I still have a chief of staff. It's the discussion we have every time there's some sort of technology evolution: it's not that the roles all go away, they just change.
And so yeah, I don't have someone spending all their time building out slides and presentations for me, 'cause I don't need that anymore. But now I need that person who is able to go and find all the different connections between humans in those discussions, to help me find out: okay, I should be meeting with this group and this team, and they have an opportunity, and I'm going to be in San Francisco today, I'm going to be in Seattle tomorrow. Those human-connection aspects are still incredibly valuable and have always been a big part of that chief of staff role. Just like chiefs of staff are no longer opening up letters to process, they're doing emails. It's the same thing. And now they're not building out as many of these presentations, because they have the ability to have an AI take it on and share that with me, and great. Let's keep moving, 'cause it's allowing us to go faster and make better decisions more quickly.
Swyx [00:21:45]: Awesome. Well, we can dive into more productivity insights as we go. I did want to do a brief history of Kyle and GitHub, because we started here. And then you were also involved in the NPM acquisition. I do want to touch on that. And then more recently, I want to bring us up to the present day, where we're having uptime issues, which, transparently, we've already addressed publicly, but we'll discuss in the pod. Did I miss anything? Any other major highlights? Obviously, it's a lot of years to cover.

A Brief History of GitHub: Webhooks, Actions, Acquisitions, and Platform Evolution

Kyle [00:22:15]: No. I think one highlight was right before the acquisition closed in 2018, I got to launch the first version of Actions
Swyx [00:22:27]: Oh
Kyle [00:22:27]: at GitHub Universe. So it was
Swyx [00:22:29]: They're that young?
Kyle [00:22:30]: It was October of 2018, I think. Yeah.
Yeah.
Swyx [00:22:33]: Gee, Jesus.
Kyle [00:22:34]: I was the engineering leader on that project and got to launch that. And then, yeah, we did acquisitions of NPM, as you said, Semmle, Dependabot, Pull Panda, a whole bunch of things. That was a big
Swyx [00:22:47]: Pull Panda.
Kyle [00:22:48]: Abi is doing well.
Swyx [00:22:51]: DX. Holy crap.
Kyle [00:22:52]: Did well on DX. And that was the big shift after the acquisition. I had to join the business side.
Swyx [00:23:00]: So I need to hit you on some of these things, 'cause you were there, right? And how often do I get to talk to someone who was there? But yeah, Actions. Is that the number one source of security issues on GitHub?
Kyle [00:23:11]: I think the number one source of security issues is probably all the literal code in everyone's underlying repositories. I would go back further than that. If you remember, what I had to show in this graph, I didn't say this before, is ultimately webhooks.
Swyx [00:23:30]: Yeah.
Kyle [00:23:31]: Circa whatever it was.
Swyx [00:23:32]: It says Hookshot in there.
Kyle [00:23:32]: I forget. Yeah, Hookshot's in there. And back then, it says GitHub Services. Do you see, it says Hookshot FE, for front end, and then it says GitHub Services. GitHub Services, back in the old days, right? We had a repository that was Ruby code, and you could write any Ruby code in there, and then we would execute that on your behalf as a service, and that way if you were trying to integrate with something, we would run it for you.
Swyx [00:23:57]: And of course no containers, 'cause
Kyle [00:23:58]: No, 'cause it was
Swyx [00:23:59]: Well, no containers
Kyle [00:24:00]: 2014. And so there was some isolation, obviously, but it was mostly separation at the server level.
That's an example, along with the very old version of Pages, which ran on its own containerization infrastructure, not on Actions.
Swyx [00:24:15]: Which, all-time great product.
Kyle [00:24:16]: Pages powers the internet at this point, to some degree. Those were places where, clearly, there were no issues, to my knowledge. But it was those things where I'm looking at it and going, “Okay, well, we can't be running arbitrary Ruby code on everyone's behalf.” Then containerizing all of that up into Actions now, where, yeah, the containerization is really good. The pinning, most folks aren't pinning their workflows to a particular
Swyx [00:24:48]: Images
Kyle [00:24:48]: SHA, et cetera, and so that's a big place of pain for folks if they're just doing, similar to any dependency management, just v1 or newest or latest, I think. But that journey goes from that day of “Okay, we're just going to run all this arbitrary code, and it'll basically be okay,” to now, where we have really good containerization. We have a new underlying agent containerization service. We're using it under the hood. It's through Azure. They recently announced it: Azure Dev Compute. It's very fast compute to be able to spin up your own cloud agents or whatnot. We're using it under the hood for some parts of the new
Swyx [00:25:36]: Microsoft Dev Box?
Kyle [00:25:37]: No. Dev Compute, yeah.
Swyx [00:25:41]: Hmm. Not finding it just yet.
Kyle [00:25:44]: Oh, it's in there somewhere.
Swyx [00:25:46]: All right. Well, we'll cut that out.
Kyle [00:25:47]: Sorry. But with Dev Compute, you can spin up really small VMs really quickly, so you're doing a tool call
Swyx [00:25:58]: Same concept
Kyle [00:25:58]: Just do it containerized. Exactly.
So we're using that, definitely moving in that direction to protect us from every piece of code that we're ultimately running.
Swyx [00:26:07]: Look, that grows into the full SDLC. Code hosting was just the start, and then it's grown beyond that. Let's talk about NPM, maybe, 'cause I think that's also a very major point in the industry. I do think it was looking for a home. It was kind of struggling as a business, right? I don't know how you would characterize that whole acquisition and how it

NPM, Package Security, and Keeping the Internet Running

Kyle [00:26:33]: When we were talking to the team, I think the big thing for both of us was to find a way to keep NPM running, which was basically powering the internet then, and way more so now, to some degree. Keep it going, keep continuing to scale. It was having scaling problems, if I recall, back at that time. They were doing some rewrites. It
Swyx [00:27:00]: That's cute compared to now.
Kyle [00:27:01]: Well, that's the thing. When I'm talking to folks now, there are so many more underlying uses of NPM than there were back when we had them join GitHub. But that was ultimately the goal. It was really: okay, we used to have pages. We have the world's code. Let's make sure that we can keep NPM running well for the world. And we put a bunch of time and investment into fixing some of the underlying backend, some of which we talked about, some of the manifest work, et cetera. And now, we're really trying to bring the security posture of NPM up to speed. But it is a unique challenge in that every move that we make to make it more secure will break a lot of people. And security is paramount. We take it very seriously. Any time that we have a problem with GitHub, or we make a change that makes us more secure but hurts, there's a snow day for developers or a really bad fire that they have to go put out.
And so we have changed the 2FA policies. We've changed the way the tokens work. When we find tokens that have been exposed or potentially exposed, we invalidate them, and
Swyx [00:28:22]: I love that feature in GitHub. Yeah, it's great
Kyle [00:28:23]: that creates issues. But that's the thing: we're trying to push the community forward without necessarily doing something that is going to break the contract that's been there for 15 years, or close to it, or some number of years, on NPM.

Slop Forks, Vendoring, and the Future of Open Source Supply Chains

Swyx [00:28:43]: So now we're talking about open source and publishing. And I think there's something here with what people are calling slop forks, which I think Malte from Vercel is doing. Part of me thinks, well, the way to get past any vulnerabilities is to just get rid of the concept of NPM. We only publish source code. And any time you want to import it, you have your coding agent look at it and then adapt whatever subset you're going to use, and you vendor it. But the AI vendors it. Is that realistic? I don't know. Will that solve all our security issues? I don't know.
Kyle [00:29:24]: I don't think it'll solve it. So Mitchell Hashimoto was just talking about this today, and I think in some ways, all things old are new again. Yeah, absolutely, vendoring everything. I do remember 2013, 2014.
Swyx [00:29:42]: This is, yeah. We must return to
Kyle [00:29:43]: We were vendoring everything. We were having actual discussions, or at least I remember we were, around “Should we take this full thing?” “Why is this so big?
We only need this one file.” And so I do think there's something true there, where either taking only what you need, or the dependencies just getting incredibly small over time, will help to some degree. But I don't think it's going to solve the fundamental problem, because with an agent looking at vulnerabilities, time and time again there are a million different ways in which we can convince an agent that this thing is secure or not and pull it in. Or we can do static code analysis or runtime testing to say whether the code works or not. That is, I think, the step that needs continued investment. The question is just how much scope. Should it be this enormous project that I'm pulling down, or should it be this piece? Either way, most companies are running some amount of security checking on the packages that they're bringing in or vendoring. That, I think, won't change. That's what Advanced Security does to some degree, what Socket does to some degree. Everyone is doing a piece of that. How we each do that, especially when we're talking to enterprise customers, is just very different. No one wants one single way to do it. And I think that's always been GitHub's unique position in the world. I talk a lot to maintainers, I talk a lot to folks about this. We rarely start a process and a practice and push it onto the community. We usually wait for the RFC process, socially or literally, everyone agreeing, and then we'll cement something in. Because otherwise we're

Maintainers, RFCs, Vouching, and the Social Layer of Trust

Swyx [00:31:35]: That fits your role in the ecosystem, yeah
Kyle [00:31:36]: We're GitHub. Yeah, we don't want to shape the whole thing. We want it to be figured out.
But how do you balance that role in the industry: keep everything as secure as possible and make sure that you're not going to be compromised as a human, 'cause that's usually how it all happens, and not create a process or lock us into a flow that you're not going to like, or Mitchell's not going to like, or other open source projects aren't going to like? That's always been a tricky balance for us, and I think something that we haven't talked about enough is that we're not going to be able to fix everything for everyone in a way that everyone is going to like. So help us, tell us what is working. When Mitchell was talking about the upvote, the up
Swyx [00:32:22]: I was going to bring up his thing. Yeah.
Kyle [00:32:23]: I forget what it is. Yeah. I was chatting with him about this, and I put it on Twitter, and we also talked over DM. It was, “We're going to keep working,” but I think the important thing is I do actually want to hear what isn't working for you. And be as specific and clear for your project as possible. And to his credit, over the many years that we've known each other through the industry, he's always done that, and I appreciate that, 'cause there are places that we need to fix up, and we hear from him, and we'll fix them up just like we do for all other kinds of maintainers. But that process between making those types of improvements and being more secure and creating, I forget what he calls it. It's not the proof process, not the claims process. Do you know what I'm talking about? His projects have a way for you to kind of
Swyx [00:33:13]: Vouch
Kyle [00:33:13]: Vouch. Thank you. Yeah. He has the vouch system for saying, “Hey, you should accept my PRs.” That's been
Swyx [00:33:20]: I just built this into GitHub.
I don't know.

Kyle [00:33:22]: Well, see, but that's the thing: you say that, and he and his community really like this, and then I'll go talk to other maintainers, globally, and they're like, “No, this doesn't work for me.” And that is the tension, but also the kind of beauty of GitHub, depending on which way you look at it. We want to help maintainers, so we create all these tools to let you have more control over how much you take in from AI and PRs. But you can also go use this project, and if it takes off and becomes the kind of mostly standard, then yeah, we probably wouldn't enforce it, but we would add it in, because that's the flow that we tend to follow.

Swyx [00:34:02]: I hear a lot of people don't know the history of the pull request. That's something that GitHub standardized, basically.

Kyle [00:34:08]: Yeah. It was a very messy process beforehand, and now we have the benefit of it being the process. And now we have to go and figure out the next best process, or what adaptations change, or what a pull request looks like when eighty percent of your PRs are just coming from your agents and not from other devs.

Swyx [00:34:31]: Do you like the prompt request idea from Peter?

Kyle [00:34:34]: I think each idea has its merits. I'm not avoiding saying anything good or bad, but I feel like I've seen a version of... we have entire Thomas' store. Take all the assets of what you've built and put that in. I think that's got great ideas. There's all these various permutations of the PR flow, but I think the reason why there's not a single answer is ultimately we're trying to codify trust.
We're trying to say, “Okay, if Sean reviews this, I'm going to trust it because you're Sean, or you're the senior dev, or you're the whatever.” And right now, when we are working in a flow where an agent writes code and another agent reviews code and then Kyle goes and looks at it, the trust is kind of diffuse. Most of the tools that we're talking about are more about verification flows. We have more assets to look at, so I can probably say whether this is a good PR or not. But that still doesn't solve, I think, the human problem: I'm looking at a PR and I want to know if I can trust it. And we still tend to use human signals for that: Mitchell approving it, or Kyle approving it, or whatever. So I think that's why most of these options haven't really solved it: it's a social problem, ultimately. It's a human problem to review it and agree. Or you fully trust the tool and you're imbuing that tool with full trust, which I think in some cases absolutely exists.

AI-Generated PRs, Trust, and the Waymo Analogy

Swyx [00:36:08]: And so, in the same way that there will be a tipping point in society when we don't allow humans to drive anymore because machines are measurably better than humans, I'm looking for that tipping point, right? Mythos is ridiculously expensive. Someday we'll have Mythos on a desktop. I don't know. Does that change the equation?

Kyle [00:36:30]: I think it's more... I took a Waymo here, and I was on my phone and not looking around at all. There are other self-driving vehicles that I would not trust while staring at the road. And I think that trust is something that is...

Swyx [00:36:48]: Is this a Zoox thing? What is it?

Kyle [00:36:50]: I think that is both. I think that is both. Like...

Swyx [00:36:53]: There's Zoox in this robo taxi. That's it. It's...

Kyle [00:36:56]: Well, depending on what level of self-driving.
But my point is that I strongly believe that's a mixture of verifiable proof, like how many accidents, how much data, and so on, and the human aspect of how I feel when I'm in this car, what it tells me, et cetera. And so that's why I think some of our AI tools tend to imbue me with more of that feeling of trust. Even if the data says this is 100% accurate, I feel like it takes more time for us to go, “Should I trust this or not?” And that's in the soft sense of startups with high agency, weekend projects, and open source. And then there's enterprises and regulated industries and everything else, and that is an even harder problem to go solve, because even when it is fully verified, not only do you have to have trust from the humans on the team, you probably have to have trust from multinational...

Swyx [00:37:55]: Oh my God.

Kyle [00:37:55]: Multiple governments around the world and regulating agencies. And so that's where I feel like we have to tip over, to your point, on the sort of human EQ side of it: “I feel okay, this feels okay, it's been proven enough.” Then the ball will start to roll a lot faster, and we'll end up getting to the “Okay, we can trust this,” and feel good about it in the most difficult of cases.

Reputation, Sponsors, Stars, and Bot Activity on GitHub

Swyx [00:38:18]: If human trust is the thing that matters, I feel like GitHub, as the developer social network, could maybe do more there. Vouchers are one system, but we have star counts, and then we have contributor rights, and that's it. And I feel like there should be more in that space. I don't know if there's any other design decisions there.

Kyle [00:38:37]: I think that one of the things that we don't really expose right now in this sort of way is some degree of hard trust and support. For me, Sponsors is a good example of that.

Swyx [00:38:49]: Ah.

Kyle [00:38:49]: It costs you something.
To prove that I believe in your project and I trust you to some degree, or I want to support you at the very least.

Swyx [00:38:56]: Solve payments for open source. Why not?

Kyle [00:38:58]: I think that as we keep moving forward, there's more and more projects where I'm adding more and more dollars into Sponsors personally because I want to support them. I've probably never met them in person, but I know enough of their work that I want to support them. The thing that I don't love about stars or commit counts or anything else is that ultimately, even with all of the various de-spamming, deduplication, and anti-abuse work that we do, these are not active social signals. They're passive ones that are ultimately gamifiable. And you may trust me, but another open source maintainer may not. On what heuristic should you be trusting me? That, I think, is kind of where some of our thinking is right now. What signal from me is most important to you? If you can define that, potentially in an agentic workflow, that's what we see some of these open source projects do, where you have GitHub Actions, and then you have an agentic workflow that's calling AI, and you're setting these rules. Like: if Kyle has submitted and gotten accepted PRs across any given project, and has a social handle tied to his account in GitHub, and that social account's older than a certain amount. Really complex measures that matter to you, 'cause most open source projects have that heuristic built into their heads, if not written down in the contributing guidelines. You could take that and then go apply it and then just say, “Oh, we're not going to accept this PR.” Building something that is malleable to everyone's needs is, I think, a little bit better than going, “Hmm, this account's too young.” Because what happens?
The attackers just go and create a multitude of accounts, and they wait until they age up. Needs to have a certain amount of stars? That's how star inflation happens. Needs to have a certain amount of repos...

Swyx [00:40:46]: Oh my God. Yeah.

Kyle [00:40:47]: With PRs? They all just create repos and submit PRs to each other, and then they come in and do something nefarious. And so it's hard. It's hard to find the measure. So I think we're looking more at how we can provide you tools so you can choose what's best for you. And of course, we'll give you some standards. But the trust vector gets down to, I don't know, some version of human digital ID, like everyone's been talking about. Like, how do I prove that it's me...

Swyx [00:41:13]: Give me your eyeballs.

Kyle [00:41:14]: On the internet. Give me your eyeballs. Exactly.

Swyx [00:41:18]: I've got to keep moving on topics, but obviously I can go all day on this stuff because I've been involved in GitHub and open source my entire professional career. Stars. Very superficial. Everyone knows it. But I think time to one hundred thousand stars is the fastest I've ever seen. People just reach that in, I don't know, months. And at the same time I don't trust it, right? How many of these are real or bot or whatever? I don't know how to ask this, but what can we do about it? Like...

Kyle [00:41:49]: Just...

Swyx [00:41:49]: Is stars broken? Is stars fine?

Kyle [00:41:51]: I think there's two pieces. Obviously we're constantly trying to find ways in which users are producing spam, in which I would include only doing star gamification. When we find them, we pluck 'em out and we...

Swyx [00:42:08]: But it's like a Whac-A-Mole.

Kyle [00:42:10]: It's a hundred percent like a Whac-A-Mole.

Swyx [00:42:11]: There's no way...

Kyle [00:42:11]: Now, powered by AI to be helpful.
But I think more so what I'm seeing is that a lot of the fastest time to X tends to be because we're now inviting so many more people into software development on GitHub that the zeitgeist is just swarming. And it's...

Swyx [00:42:32]: It's not just developers anymore.

Kyle [00:42:33]: And it's not you and I. However you want to say what a developer is, it's not just folks who have been coding for a very long time. It's folks that have maybe started coding or only joined in since the AI era. And now...

Swyx [00:42:44]: What's the latest Octoverse number? I know eighty million was the last number I remember for developers on GitHub.

Kyle [00:42:50]: Oh, we're over 200 million now.

Swyx [00:42:53]: Okay. Well, so you see?

Kyle [00:42:55]: Like over 200 million developers now.

Swyx [00:42:56]: But it's not developers, right? It's people with a GitHub account.

What Counts as a Developer in the AI Era?

Kyle [00:43:00]: So this is the biggest debate that, I would say, everyone loves to have at GitHub at this point. From my perspective, there's clearly a difference between a professional enterprise developer and then developers. But I think the idea that we should be, I don't know, splitting hairs or segmenting developers in the early era of software development is not worth the time. So...

Swyx [00:43:29]: When you get into gatekeeping...

Kyle [00:43:31]: 100%.

Swyx [00:43:31]: What is a developer?

Kyle [00:43:31]: 100%. 'Cause I wasn't a developer when I started writing code. I was going to...

Swyx [00:43:36]: Oh, no. I cloned a thing seven years before I learned to code. And then I wrote about my learning-to-code journey, and people just called me a fraud 'cause I had a GitHub account. And I'm like, “Well, no, I just used GitHub, but I didn't know what I was doing.”

Kyle [00:43:49]: I remember that. I remember those sets of posts, and that's b******t.
So I fight very clearly on the line of: if you create code, if you have an idea and you turn it into something where “I'm going to run it and use the app right now,” you may still use AI in that moment, but that's okay. At some point you're going to do the next thing. You're going to have to learn about this database. You're going to fix a bug, whatever. We're all on the same journey, and those people are also hearing about the great new agent skill package or a new CLI tool or a new whatever. And those projects are going up because you want to be a part of this moment, just like I wanted to be a part of the Ruby community when Ruby was popping off when I started becoming a developer, and now I can just click the star button. So yes, there's clearly some amount of spamming and gamification that we're working against, but I really think we're just seeing this whole new cohort of folks that are moving from technology to technology because they're not working on a 20-year-old software application. They're working on a side app that they built on the weekend for their friends or for their new idea or whatever. And that's how you see these enormous charts going up and to the right with stars.

Swyx [00:44:59]: I think something that's remarkable is the persistence that GitHub extends to those folks. Usually when I see platforms go into a new audience, they have to have a second platform with a different name that wraps the main platform. But somehow GitHub has been able to persist and extend, and it's friendly and whatever. So it's nice.

Spark, Low-Code, and Always Showing the Code

Kyle [00:45:19]: That's partially why, I think, as we've tried to move into, I don't know, more low-code-y things, we started working on Spark as a way to build an app and run it.
I think the reality is that anytime we put even a veneer on top of something, we still always show you the code. That's kind of a tenet. We're never going to hide the code from you, ever, because what...

Swyx [00:45:52]: Why would you?

Kyle [00:45:52]: Yeah, that's the whole point. However, I think what we learned with things like Spark is that really the value of Spark for most devs is easy runtime. You may have a runtime or a host that you're going to use for that, or you just build something and run it, but the package of making that even more simple isn't really needed for folks that are trying to build software and not just trying to build an app, which is a slightly different goal. So I want to get you in, I want to get you comfortable. As someone that did not traditionally come into software dev way back, I want anyone to be able to bridge that chasm. I don't know, I feel like we're still in an era of STEM. I've got a 12-year-old and an eight-year-old, and it's “We got to get 'em into STEM,” over and over. And I do the things that good parents do. I was like, “Oh, you want to do coding?” “Yes, I want to do coding.” Do coding classes. But now they're just not afraid of doing software. And that's, I think, the thing that's honestly kept me at GitHub for so long. Anyone should be able to go and build a thing, just like I can go change a light switch in my house. I'm not going to go into the breaker box, 'cause I'll probably kill myself. But I can go change that light switch. Everyone should be able to go and say, “This fricking app doesn't do what I want.
I want it to work like this.” And that, I think, is what's kept us all connected with GitHub through the years, during the easiest of times or in the hard times: that opportunity of, we're the home for all developers, and we want everyone to be able to have that feeling that we've had of, “I had an idea, I created it, and holy s**t, here it is.”

Swyx [00:47:37]: Here it is. All right, I'm going to try to do more spicy questions.

GitHub's Hardest Scaling Moment: Growth, Agents, and Uptime

Kyle [00:47:42]: Great.

Swyx [00:47:42]: Is it an easy time now or a hard time?

Kyle [00:47:45]: Oh, at GitHub? It's a hard time. It's a hard time, and also, I was just with my team and I said, “This is also the best and most exciting time that I think I can remember at GitHub.” Because...

Swyx [00:47:57]: Best of times, worst of times. It's never one...

Kyle [00:47:59]: 'Cause we were talking about Octoverse reports, and usually we do an Octoverse report once a year, and we look at the numbers, and we say, “Oh my goodness.” I was at Universe in October saying, “This was the fastest year of growth that we've ever had,” right? And now we're doing more in a month than we did in a year last year.

Swyx [00:48:20]: You're talking about PRs.

Kyle [00:48:21]: Commits.

Swyx [00:48:21]: Commits, yeah.

Kyle [00:48:22]: PRs. You name it. By roughly every measure that we're looking at, there's some amount of growth that is much bigger, and that is breaking our system in new ways, not old ways. Like, webhooks were always notoriously unreliable over the years.

Swyx [00:48:38]: Whose fault is that?

Kyle [00:48:39]: Not mine anymore, but for a period of time, I'm sure you could pull up a tweet that was “It was me. I'm sorry.” But now that got rewritten at a scale level that is still working and is not having problems today.
Now what we're finding isn't just the simple stuff where folks on Twitter or on the internet are like, “Hey, why is this like this?” Sure, there are absolutely silly problems that shouldn't exist. But now we're talking about unique, novel permission problems that happen only at scale, across all different objects or whatever, where now we have to go rewrite this underlying system. And so there are problems that, yeah, caught us off guard, which I think I said. The growth is astronomical, but also we're making such material progress that I'm excited for what's going to be possible once we've reimagined the underlying foundation layer, or pieces of it at least, when it's not just all of us, but all the new people that are becoming developers and all of their agents and all the tools working together. Because that'll still happen in that GitHub tool, that GitHub community. But it's a hard day anytime we can't give you what you're looking for. We have the same problem internally. We operate through github.com. Of course, we have backups for our own operations when things go down and whatnot, but we feel it too. If it's not working, it's not working for us, and that's kind of the promise of dogfooding for GitHub. It's always been true. We're using the same tool you're using. We're not using a super secret version. And so we also need it to be great for us, for our customers, of course for open source, and now for an exponential growth of agents doing it too.

Swyx [00:50:32]: I wanted to load this up for audio listeners who maybe haven't seen your tweets. So: one billion commits in 2025. Now it's two hundred and seventy-five million per week, on pace for fourteen billion this year if growth remains linear. Is that still the pace? I don't know.
It's been a...

Kyle [00:50:48]: It's speeding...

Swyx [00:50:50]: Roughly.

Kyle [00:50:50]: It's still speeding up.

Swyx [00:50:51]: It's April, so yeah.

Kyle [00:50:51]: Exactly. This was in April.

Swyx [00:50:53]: All right. So basically you have fourteen x growth, right? Year on year. And I think that's a scaling issue. I'm going to try to really steelman this thing. People have experienced fourteen x growth. They haven't had your downtime. Can we go dig into that? Why? What broke? What are we doing to fix it? Just anything for the community, to reassure them.

Why GitHub Reliability Is Breaking in New Ways

Kyle [00:51:18]: So, like I was saying, there's a couple of different places that we've seen the growth issues. Some of the growth issues, which is why I was talking about pushing hard on more CPUs, are in Actions in particular. More tools, more agents, and more PRs mean more builds, and more builds mean more CPUs. And so we are expanding through not just our data center; obviously we were talking about moving to Azure and adding additional cloud compute, because we simply need more CPUs. Not as much GPUs. We definitely need GPUs too, but now CPUs are becoming a factor.

Swyx [00:51:53]: It's very CPU heavy.

Kyle [00:51:54]: Underneath the hood, when it comes to some of the underlying services, we've been breaking up our database infrastructure over the years, so that we have more cognitive separation between the various services. The place that we continue to have pain is in permissioning. Right now many of our permissioning layers sit in a database that we internally call MySQL One, and old Hubbers will know what I'm talking about.
And so we've been pulling things out of MySQL One for many years. We use Vitess and other technologies to shard, and we do it as one big...

Swyx [00:52:31]: Famous thing, PlanetScale was born from this and...

Kyle [00:52:32]: A hundred percent. Sam, old Hubber and friend. And so we're finding these opportunities to break this out and then do that globally. The other thing that I think is interesting, and both a unique opportunity and tricky, is that we also run everything I just talked about in a black-box container with GitHub Enterprise Server for people that work on-prem. So we take everything I just said and we also do it on-prem, and we also do all of that in a data residency setup for customers that need to have their data in a single location. Each of these has unique characteristics around how we're storing that data in MySQL or in a permissioning setup. That's where some of these outages have occurred, where you're seeing it more across the board rather than just the one piece...

Swyx [00:53:17]: Filling the database...

Kyle [00:53:17]: Isn't quite working. Exactly. And so part of it is that. I think there's been some other places where more projects appear to be moving towards monorepos, whereas we were going the other direction for many years in the industry. Repos were smaller, but there were more of them, and now we're seeing the opposite. Repos are bigger, and there's not fewer of them per se, 'cause there's new growth, but we're just seeing many more big repos. Big monorepos have always had a unique performance problem, because each one is slightly different, particularly if the underlying blobs inside the repos are incredibly big. And so we've done a ton of work that most people probably haven't experienced, unless you're in this case of the monorepo.
But that Git infrastructure layer improvement does help the overall system, because many of the improvements that make monorepos work better make all repo infrastructure work better. And I could keep going down the line. Another thing is that we're changing how we do, I'll just say job queuing for lack of a better explanation, changing the underlying technologies there.

Swyx [00:54:32]: I spent two years being a job queuing guy, so.

Kyle [00:54:34]: And so it's a little bit piece by piece, and it's mostly because, as it was built, we built everything in a way that assumed, I guess, that the size of the pipe of work was going to remain the same. There were just going to be more people coming through each of those pipes. But instead, the idea that a git push was generally a certain size, for example, is now no longer true.

Swyx [00:55:03]: Oh, yeah.

Kyle [00:55:03]: Or...

Swyx [00:55:05]: I push a thousand...

Kyle [00:55:06]: On the average. 100%.

Swyx [00:55:06]: Thousand-line commits, like, daily.

Kyle [00:55:07]: Same thing with PRs. And we've talked about optimizing that and making changes, and there were technology choices that did not work there. It got slow. It was not fast. It did not do what the users wanted. And so we've been reeling that all out and going, “Okay, that's just not right. Let's stop putting good money after bad and do it the right way now.” So it's a lot of things. When I've experienced scale at GitHub historically, it's almost always been two options that we've used. We go vertical scaling, particularly with databases, right? Or we go horizontal scaling: oh, we just have more people using this service? Great. We're going to add more servers, and we rack them in our data center, or we use a cloud.
And now we're sort of in a diagonal, where vertical doesn't really work anymore. Horizontal doesn't work either, because we all have some CPU or GPU constraints in the world now, and now we have to go in and crack open services that have been running for 10 or 15 years and go, “Okay, the rules of this service have legitimately changed, and now we have to rewrite them.” None of this is an excuse. We have to do the work. We have to make it better.

Swyx [00:56:22]: Actually, as an infra guy, I'm like, “This is one of the most fascinating scaling challenges I've ever seen.”

Kyle [00:56:26]: That's the thing that's hard. When we weren't talking about it publicly, and I came out and I was like, “Hey, I just want to explain what's going on.” Part of it comes from a very old GitHub ethos, which is: it's our uptime. It's down. I know you're a developer, so you're inclined to want to understand more about what's going on. But at the same time, us going, “Hey, this service didn't perform the way we expected, and now we have to go change it,” we're not trying to hide anything from you i

Double Tap Canada
How Open Claw and Voice Claw Make Computers Fully Accessible

Double Tap Canada

Jun 1, 2026 56:00


Discover how Voice Claw Real Time transforms AI accessibility by connecting voice commands to Open Claw, enabling hands-free, intelligent computer use for everyday tasks and people with disabilities.

Steven Scott and Shaun Preece welcome developer Ben Badejo to explore Voice Claw Real Time, his new voice interface for Open Claw. This system allows AI models to take direct action on computers and mobile devices, not just respond in chat. Ben explains how Open Claw can read emails, manage calendars, browse the web, and even complete booking forms, all triggered by speech.

The conversation dives deep into accessibility: Voice Claw Real Time integrates with iOS voice control and Apple Watch, giving users with visual or motor impairments a powerful way to interact with AI. From setting up read-only email access for safety to leveraging Codex for intelligent computer use, Ben highlights how thoughtful system design and natural language configuration can raise the floor for inclusive technology.

----

Follow on:
YouTube: https://www.doubletaponair.com/youtube
X (formerly Twitter): https://www.doubletaponair.com/x
Instagram: https://www.doubletaponair.com/instagram
TikTok: https://www.doubletaponair.com/tiktok
Threads: https://www.doubletaponair.com/threads
Facebook: https://www.doubletaponair.com/facebook
LinkedIn: https://www.doubletaponair.com/linkedin

Subscribe to the Podcast:
Apple: https://www.doubletaponair.com/apple
Spotify: https://www.doubletaponair.com/spotify
RSS: https://www.doubletaponair.com/podcast
iHeartRadio: https://www.doubletaponair.com/iheart

About Double Tap
Hosted by the insightful duo, Steven Scott and Shaun Preece, Double Tap is a treasure trove of information for anyone who's blind or partially sighted and has a passion for tech. Steven and Shaun not only demystify tech, but they also regularly feature interviews and welcome guests from the community, fostering an interactive and engaging environment. Tune in every day of the week, and you'll discover how technology can seamlessly integrate into your life, enhancing daily tasks and experiences, even if your sight is limited.

"Double Tap" is a registered trademark of Double Tap Productions Inc.

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

The Dana & Parks Podcast
HOUR 4: Why can they claw back from 30 years ago, but we can only review the last three years?

The Dana & Parks Podcast

May 28, 2026 34:10


HOUR 4: Why can they claw back from 30 years ago, but we can only review the last three years? You wanted it... Now here it is! Listen to each hour of the Dana & Parks Show whenever and wherever you want! © 2025 Audacy, Inc.

Deck The Hallmark
Black Panther

Deck The Hallmark

May 18, 2026 53:13


It's Marvel Monday and we're heading to Wakanda!

ABOUT BLACK PANTHER
T'Challa, heir to the hidden but advanced kingdom of Wakanda, must step forward to lead his people into a new future and must confront a challenger from his country's past.

AIR DATE & NETWORK FOR BLACK PANTHER
February 16, 2018 | Theatrical Release

CAST & CREW OF BLACK PANTHER
Director: Ryan Coogler
Writers: Ryan Coogler, Joe Robert Cole, Stan Lee
Cast:
Chadwick Boseman as T'Challa/Black Panther
Michael B. Jordan as Erik Killmonger

BRAN'S MOVIE SYNOPSIS
It's time for a little history lesson, ok? So, a massive meteorite made of vibranium crashed into Africa. Five tribes go to war over it until one warrior consumes a mysterious heart-shaped herb altered by the vibranium and gains incredible abilities, becoming the first Black Panther. He unites the tribes and establishes the hidden nation of Wakanda. Over generations, Wakanda becomes the most technologically advanced civilization on Earth while hiding in plain sight.

In 1992, King T'Chaka visits his younger brother, N'Jobu, in California, where N'Jobu has been working undercover. T'Chaka discovers that N'Jobu has secretly helped a black-market arms dealer steal vibranium from Wakanda. When confronted, N'Jobu turns violent, forcing T'Chaka to kill him. To protect Wakanda's secrets, T'Chaka abandons N'Jobu's young son in America and covers up the truth. Got it? Good.

Years later, the King is killed and Prince T'Challa returns home to be crowned king of Wakanda. During the event, M'Baku of the Jabari Tribe challenges T'Challa for the throne in ritual combat. T'Challa wins but spares M'Baku's life. I'm sure that's for no reason though.

Soon after, this dude named Claw shows up in London and steals an artifact from Wakanda at a museum. Oh and some dude named Erik Stevens is with him. T'Challa goes to this party to try to capture Claw but it doesn't go well and leads to a massive chase through the city. This dude that works for the government named Ross ends up being quite hurt, so T'Challa ends up rescuing him and bringing him to Wakanda to heal him with their neat tech.

Zuri tells T'Challa some juicy info about how they left this kid in Oakland with no father and that kid grew up to be Erik Stevens and he now goes by Killmonger cuz of how many people he's killed and he's probably coming for us so hide your kids, hide your wife.

Killmonger murders Claw and delivers his body to Wakanda, earning an audience before the tribal elders. He reveals himself as the son of N'Jobu and challenges for the throne. T'Challa agrees, loses, and is thrown off a waterfall cliff. Killmonger is now the king.

Killmonger prepares to send vibranium weapons to operatives around the world to begin a global revolution. T'Challa's family leaves to ask M'Baku for help. Turns out, he has T'Challa on ice but in a good way. They use the heart-shaped herb to bring him back and restore his powers.

T'Challa returns to Wakanda for a final battle. T'Challa and Killmonger have a crazy fight and T'Challa is able to disable Killmonger's suit. T'Challa offers Killmonger a chance to live but he chooses death instead, but T'Challa lets him watch one last sunset.

T'Challa decides Wakanda can no longer hide from the world. He opens an outreach center in Oakland at the very place where N'Jobu died. T'Challa then appears before the United Nations and reveals Wakanda's true technological power to the world.

And then to connect it all to the Avengers, his sister Shuri helps Bucky Barnes continue his recovery in Wakanda.

Watch the show on Youtube - www.deckthehallmark.com/youtube
Interested in advertising on the show? Email bran@deckthehallmark.com

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Sherlock Holmes Adventures
Adventure_of_the_Wooden_Claw

Sherlock Holmes Adventures

Play Episode Listen Later May 18, 2026 26:44


Adventure_of_the_Wooden_Claw

The Dr. Axe Show
A Sugar Found in Beef Triggers Immune Chaos — How Tick Bites Create Alpha-Gal Syndrome, and How to Heal

The Dr. Axe Show

Play Episode Listen Later May 15, 2026 29:10


Your horrible reaction to beef could be from a tick bite. In this episode, you'll learn about Alpha-gal Syndrome, which could be the reason you can't stomach red meat, and why it's all to do with tick saliva. If you have Lyme disease, you may already understand how ticks can wreak havoc with your immune system. Here, Dr. Motley shares his thoughts on why allergic reactions to beef may be on the rise, and the treatments he's seen manage symptoms most effectively, including herbs, homeopathy and ear acupuncture (and the SAAT technique that shows proven, long-lasting results). If you have a tick-borne illness or an unexplained allergic reaction to beef, this episode offers a hopeful path forward. Chapters 00:00 - What is Alpha-gal syndrome 11:11 - Why older generations eat meat more easily 12:07 - Why Alpha-gal may be so common 16:01 - Treatments, Auricular therapy 17:48 - Patient case studies 20:48 - Homeopathy treatments 21:33 - Other treatments 22:12 - Herbal remedies 24:54 - Recap and closing thoughts Dr. Motley's recommendations: Soliman Auricular Allergy Treatment Arsenicum Album Natrum Carbonicum Carbo Vegetabilis Nux Vomica Stinging Nettle Quercetin D-Hist: https://shorturl.at/6UQBa Japanese Knotweed: https://tinyurl.com/56r8fukp Cat's Claw: https://shorturl.at/eT7Zm Houttuynia Supreme: https://shorturl.at/gJifj ------ Want more of The Ancient Health Podcast? Subscribe to the YouTube channel. Follow Dr. Motley! Instagram Twitter Facebook Tik-Tok Website ------ * Liposomal supplementation has been proven deeply effective and LivOn Labs got there first. Get 10% off LivOn Labs' entire store of liposomal supplements with code MOTLEY at https://www.livonlabs.com/ * Join Doctor Motley's newsletter for TCM insights and regular podcast updates: https://www.doctormotley.com/ * If you want to hear more of Dr. Motley's education on Lyme, check out his membership, complete with courses, a whole library of video-based resources and the chance to pick his brain on weekly live Q+A's. You can try it free for 15 days here: https://www.doctormotley.com/15

The Neuron: AI Explained
Inside Genspark: $0 to $250M ARR in 12 Months with Wen Sang

The Neuron: AI Explained

Play Episode Listen Later May 13, 2026 44:38


Genspark went from AI search startup to autonomous AI agent platform, hitting $250M ARR in 12 months with no paid ads until they bought a Super Bowl spot. Co-founder and COO Wen Sang joins Corey and Grant to explain what "AI employee" actually means, demos Genspark Claw live (including buying us coffee mid-interview), and lays out his big thesis: legacy software is becoming infrastructure while AI agents become the new interface between humans and work. We get hands-on with Workspace 4.0, Claw, and a custom agent built live for the show.
• Genspark Workspace 4.0 announcement: https://www.genspark.ai/blog/genspark-ai-workspace-4
• Genspark sb-git: https://genspark.ai/sb-git/intro
• OpenAI's customer story on Genspark: https://openai.com/index/genspark/
• Forbes AI 50 (2026): https://www.forbes.com/lists/ai50/
• Marc Benioff on Salesforce Headless 360 (referenced by Wen): https://x.com/Benioff
• Andrej Karpathy's "wiki for agents" idea (referenced as inspiration for sb-git): https://x.com/karpathy
• Wen on the DealMaker Show: https://alejandrocremades.com/wen-sang/
Try Genspark for free: https://genspark.ai
Subscribe to The Neuron newsletter: https://theneuron.ai

Where It Happens
Screensharing How to Start an AI Agent Business Today with Genspark Claw

Where It Happens

Play Episode Listen Later May 12, 2026 29:52


Get started building your tiny AI Agent business with Genspark Claw: https://startup-ideas-pod.link/genspark_ In this solo episode, I share seven tiny, cash-flowing startup ideas you can build with AI in just a few prompts. I walk through how to use Genspark Claw, Genspark's new in-the-cloud agent product running Sonnet 4.6, and demonstrate two ideas I have already built (a dead domain flipper and a local restaurant liquidation broker), build a third idea live on camera (a hiring-signal cold outreach machine), and hand you a five-step framework for generating your own ideas. The goal is simple: give you the creative juices, the framework, and the practical know-how to ship a $200-$1,500/day business with AI as your employee. Timestamps 00:00 – Intro 01:28 – Idea 1: The Dead Domain Flipper 06:19 – Idea 2: Local Restaurant Liquidation Broker 11:03 – Idea 3: Hiring-Signal Cold Outreach Machine (building live) 14:30 – Prevent Sleep, Heartbeat, and Treating Genspark Claw Like an Employee 15:52 – Skills, Local File Access, and What Else Genspark Claw Can Do 17:24 – Reviewing the 14 Personalized Cold Emails It Wrote 20:35 – More Ideas: Buy-or-Build Memos, Dead Product Hunt SEO, Forgotten Apps 24:18 – Framework for finding ideas: Public Data, Neglected Assets, Clear Buyer 26:33 – What Else Comes With Genspark AI Works Base 4.0 Key Points Tiny, boring, cash-flowing ideas beat billion-dollar ideas when you want to ship this month. GenClaw plus Slack turns Claude Sonnet 4.6 into an always-on AI employee for around $25/month. The repeatable pattern is: messy feed → mispriced asset → trigger event → obvious buyer → liquidity point. Three hunting lenses: places of constant change, things people ignore, and assets with clear urgency and spread. Talking to your agent in plain English ("strip the HTML entities, make the budget $2,500") replaces most engineering work. Selling agents with outcomes is the new SaaS, and shifts the model from per-seat pricing to outcome-based pricing. 
The #1 tool to find startup ideas/trends - https://www.ideabrowser.com LCA helps Fortune 500s and fast-growing startups build their future - from Warner Music to Fortnite to Dropbox. We turn 'what if' into reality with AI, apps, and next-gen products https://latecheckout.agency/ The Vibe Marketer - Resources for people into vibe marketing/marketing with AI: https://www.thevibemarketer.com/ #Genspark and #WorkWithGenspark FIND ME ON SOCIAL X/Twitter: https://twitter.com/gregisenberg Instagram: https://instagram.com/gregisenberg/ LinkedIn: https://www.linkedin.com/in/gisenberg/

Where It Happens
Screensharing How to Start an AI Agent Business Today With Genspark Claw

Where It Happens

Play Episode Listen Later May 11, 2026 30:18


Get started building your tiny AI Agent business with Genspark Claw: https://startup-ideas-pod.link/genspark_ In this solo episode, I share seven tiny, cash-flowing startup ideas you can build with AI in just a few prompts. I walk through how to use Genspark Claw, Genspark's new in-the-cloud agent product running Sonnet 4.6, and demonstrate two ideas I have already built (a dead domain flipper and a local restaurant liquidation broker), build a third idea live on camera (a hiring-signal cold outreach machine), and hand you a five-step framework for generating your own ideas. The goal is simple: give you the creative juices, the framework, and the practical know-how to ship a $200-$1,500/day business with AI as your employee. Timestamps 00:00 – Intro 01:28 – Idea 1: The Dead Domain Flipper 06:19 – Idea 2: Local Restaurant Liquidation Broker 11:03 – Idea 3: Hiring-Signal Cold Outreach Machine (building live) 14:30 – Prevent Sleep, Heartbeat, and Treating Genspark Claw Like an Employee 15:52 – Skills, Local File Access, and What Else Genspark Claw Can Do 17:24 – Reviewing the 14 Personalized Cold Emails It Wrote 20:35 – More Ideas: Buy-or-Build Memos, Dead Product Hunt SEO, Forgotten Apps 24:18 – Framework for finding ideas: Public Data, Neglected Assets, Clear Buyer 26:33 – What Else Comes With Genspark AI Works Base 4.0 Key Points Tiny, boring, cash-flowing ideas beat billion-dollar ideas when you want to ship this month. GenClaw plus Slack turns Claude Sonnet 4.6 into an always-on AI employee for around $25/month. The repeatable pattern is: messy feed → mispriced asset → trigger event → obvious buyer → liquidity point. Three hunting lenses: places of constant change, things people ignore, and assets with clear urgency and spread. Talking to your agent in plain English ("strip the HTML entities, make the budget $2,500") replaces most engineering work. Selling agents with outcomes is the new SaaS, and shifts the model from per-seat pricing to outcome-based pricing. 
The #1 tool to find startup ideas/trends - https://www.ideabrowser.com LCA helps Fortune 500s and fast-growing startups build their future - from Warner Music to Fortnite to Dropbox. We turn 'what if' into reality with AI, apps, and next-gen products https://latecheckout.agency/ The Vibe Marketer - Resources for people into vibe marketing/marketing with AI: https://www.thevibemarketer.com/ #Genspark and #WorkWithGenspark FIND ME ON SOCIAL X/Twitter: https://twitter.com/gregisenberg Instagram: https://instagram.com/gregisenberg/ LinkedIn: https://www.linkedin.com/in/gisenberg/

Steve Talks Books
Page Burners: Night of Knives by Ian C. Esslemont. - Chapters 3 & 4

Steve Talks Books

Play Episode Listen Later May 11, 2026 87:13


This week on Page Burners, the group continues their journey through Night of Knives with a deep dive into Chapters 3–5. The discussion explores the book's slower, more atmospheric middle section, the confusion and chaos of Malaz Island during Shadow Moon, and the increasingly tangled loyalties surrounding Dancer, Kellanved, Surly, and the Claw. The crew debates whether Chapter 3 drags or intentionally leans into disorientation and tension, while also unpacking personal reading pet peeves, confusing timelines, and Esslemont's descriptive style. Things heat up once the flashbacks begin, giving us our first real look at Dasem Ultor during the fall of Y'Ghatan, including his conflict with Hood, Surly's growing influence, and the legendary reputation that made soldiers willing to follow him into hell. Along the way, the episode also touches on Stormriders, Hairlock memes, Shadow Cultists, Dancer's intimidating presence, and the ongoing question at the heart of Night of Knives: who actually knows what's happening tonight on Malaz Island?

M觀點 | 科技X商業X投資
EP301. Will PLTR Be Beaten? SpaceXAI Is Renting Out Compute, NVIDIA GTC Taipei | M觀點

M觀點 | 科技X商業X投資

Play Episode Listen Later May 7, 2026 88:37


Aaron Scene's After Party
RUMPS & TRAUMA DUMPS feat. @p.marcy & @geedolla_sign

Aaron Scene's After Party

Play Episode Listen Later May 7, 2026 61:16


It's a Sunday Funday edition of the After Party! And for this one we got the return of Marcy! She comes on as we reminisce on Jaguars Gentlemen's Club, the most she's made in one night as a dancer and dumps some trauma on the podcast. Follow us on social media @AaronScenesAfterParty

Dinner with the Heelers
Rewatch! - The Claw

Dinner with the Heelers

Play Episode Listen Later Apr 28, 2026 27:48 Transcription Available


(00:00:00) Intro (00:01:52) We Just Got Done Rewatching The Claw (00:11:34) Did We Learn Anything Today? (00:14:39) Parting Thoughts (00:17:17) Howl Outs (00:21:40) Apple Podcast Reviews! In this episode we rewatch and then talk about The Claw. I am not Dad. I am Magic Claw. Magic Claw has no children; his days are free and easy. Thank you so much for listening. Connect with us and let us know what you think of the show! Get Dinner with the Heelers merch! At Dashery by TeePublic you can get shirts (and all sorts of other cool things) with Dinner with the Heelers artwork. Grab yours today! Get ad-free episodes via Patreon for only $1 a month: patreon.com/theblueypodcast Check out this video about how our podcast is made: TikTok: https://www.tiktok.com/@theblueypodcast/video/7370492256005950766 Instagram: https://www.instagram.com/reel/C7NKLQhAIUv/ A huge thank you to Ryanna Larson (Instagram: @blueyfamilyportraits) for the amazing show cover art. Connect with her on Instagram to commission a portrait for your family! Website: theblueypodcast.com Patreon: patreon.com/theblueypodcast TikTok: @theblueypodcast Twitter: @theblueypodcast Instagram: @theblueypodcast Facebook: Dinner with the Heelers Email: blueypodcast@gmail.com Become a supporter of this podcast: https://www.spreaker.com/podcast/dinner-with-the-heelers-a-bluey-podcast--6729926/support.

Hipsters Ponto Tech
OpenClaw and the Claw Category – Hipsters Ponto Tech #513

Hipsters Ponto Tech

Play Episode Listen Later Apr 28, 2026 64:37


Today's conversation is about claws! In this episode, we dive into how the arrival of OpenClaw marked the start of a new era in how AI agents can help with day-to-day work, inspire new projects, and rekindle the urge to build. Here's who joined the conversation: Paulo Silveira, the host who went back to coding; Luiz Couto, organizer of the Claude Code Meetup São Paulo; Sérgio Lopes, cofounder of Alura and CEO of Alun Future Studio; Vinny Neves, cohost, dev, and instructor at Alura. Links: OpenClaw; n8n; Claude Code; OpenAI Codex; Pi Mono; OpenClaw deletes a Meta executive's emails; Nanoclaw; Claude Desktop; Hermes; Claude Code: building your first application. Check out Alura's course Software Engineering in the Age of AI: how to use AI in the real development workflow. Subscribe to Hipsters.Builders, the builder community's newsletter. Every week, the leading newsletter for people who build software in Brazil brings news, quotes, and moves from the builder community on X, Hipsters, and IA Sob Controle, plus the best links and events. Straight to your inbox. Go to Silicon Valley with Paulo Silveira, Marcell Almeida, Fabrício Carraro, and Marcus Mendes on the "Imersão IA Sob Controle e Alura no Vale do Silício" immersion trip! Spots are limited, so hurry to reserve yours. TechGuide.sh, a map of the main technologies the market demands for different careers, with our suggestions and opinions. #7DaysOfCode: Put your programming knowledge into practice with free daily challenges. Visit https://7daysofcode.io/ Production and content: Alura Cursos de Tecnologia – https://www.alura.com.br Editing and sound design: Rede Gigahertz de Podcasts

The Business of Content
Can media companies ever claw back advertising revenue from big tech?

The Business of Content

Play Episode Listen Later Apr 28, 2026 57:05


My newsletter: https://simonowens.substack.com/   Digital advertising has long been shaped by a David-versus-Goliath dynamic, with publishers struggling to compete against tech platforms that control better data, more powerful ad tools, and vastly larger pools of advertiser demand. That raises a few big questions: Why have publishers struggled to close that gap despite the promise of programmatic advertising? What advantages do platforms like Google and Meta have that the open web still lacks? And as AI takes on a larger role in ad buying, could it help level the playing field — or further entrench the dominance of the largest platforms? Rick Erwin, CEO of Adstra and a longtime ad tech executive, has spent decades working at the center of these shifts. In a recent interview, he explained why tech platforms have built such a durable advantage over publishers, how advertisers decide who sees their ads, and why he believes AI could fundamentally change how ad campaigns are planned and executed.  

Behind The Thread
Allie K. Miller: The Fastest Way To Use AI Agents In Your Business, Content & Life (Open Claw & Claude)

Behind The Thread

Play Episode Listen Later Apr 27, 2026 98:35


Allie K. Miller is the #1 most followed voice in AI. She currently advises Fortune 500 companies on how to use AI. In this episode, Allie breaks down exactly how non-technical people can stop using AI like Google and start using it like a power user. And she gives a clear, specific path to making six figures in the process. Try Vanta Today: https://Vanta.com/calum Download Your Free AI Prompt Playbook - The 8 prompts from Allie K. Miller featured in this episode: https://calumjohnsonshowlinks.lovable.app Follow Us! https://x.com/calum_johnson9 https://www.instagram.com/calumjohnson1/?hl=en https://x.com/alliekmiller https://www.instagram.com/alliekmiller Timestamps: 00:00 Intro 01:11 Why Allie has been posting about AI every single day for a decade 04:20 Why non-technical people are crushing it 07:52 The difference between AI power users and everyone else 12:59 Real businesses that 3x'd revenue using AI (no coding) 15:12 The AI agent that hired someone to fix her cables 20:06 The difference between ChatGPT and AI Agents 25:06 The secret to getting 10x better results from ChatGPT 31:33 Why some people are moving from ChatGPT to Claude 36:19 What is Claude Cowork? (clearly explained for non-technical people) 41:05 How Allie runs her entire content operation with Claude Code 47:19 How to get started with Claude Cowork (in 4.5 minutes) 52:58 How to protect your data when using AI agents 54:57 Will AI actually take your job? 01:21:27 Can you really make six figures as an AI consultant? 01:29:00 The hidden opportunity hiding in plain sight right now

The Dana & Parks Podcast
HOUR 3: Can they just claw back that much unemployment? Did these people know they were overpaid?

The Dana & Parks Podcast

Play Episode Listen Later Apr 23, 2026 34:58


HOUR 3: Can they just claw back that much unemployment? Did these people know they were overpaid? You wanted it... Now here it is! Listen to each hour of the Dana & Parks Show whenever and wherever you want! © 2025 Audacy, Inc.

The North Shore Drive
Penguins playoffs: Does this team have what it takes to CLAW BACK into Flyers series?

The North Shore Drive

Play Episode Listen Later Apr 23, 2026 31:47


Post-Gazette Penguins insider King Jemison welcomes columnist Noah Hiles to analyze the team's 3-0 series hole against the Flyers. Did the officials handle the Game 3 melee properly? How did the Penguins respond? What role did Stuart Skinner play in the team's second-period collapse? And finally, do the Penguins have a chance to make a major comeback? And what does this series say about where the franchise is headed? Does it make it less likely that Evgeni Malkin returns? Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Aaron Scene's After Party
NECK BEHIND THE DECK AT 3AM feat. @3amfrfr & @thousandgramsclub

Aaron Scene's After Party

Play Episode Listen Later Apr 22, 2026 65:13


We're live on 4/20 from our sponsors Apogee in Sunland Park NM! And on this one we bring on our boy 3am as we catch up with him and he shares some of his most recent projects. Plus he tells us all about his crazy Las Vegas work schedule, doing work for the World Cup and he tells us some of his DJ do's and don'ts! And the OG cohost Marky Mark stops by for a little edible action. Follow us on social media @AaronScenesAfterParty

Nightcap with Unc and Ocho
Nightcap Hour 1: Timberwolves CLAW BACK vs Nuggets + Hawks STUN Knicks + Cavs IN CONTROL vs Raptors + Wemby wins DPOY UNANIMOUSLY + Kevin Durant INJURED in PRACTICE

Nightcap with Unc and Ocho

Play Episode Listen Later Apr 21, 2026 80:00 Transcription Available


Shannon Sharpe, Chad “Ochocinco” Johnson and Iso Joe Johnson react to the Minnesota Timberwolves coming back from 19 down to beat the Denver Nuggets to tie the series 1-1, the Atlanta Hawks win game 2 on the road and beat the New York Knicks to tie the series 1-1, and the Cavaliers beat the Raptors to go up 2-0 in the series and much more! Download the PrizePicks app today and use code SHANNON to get $50 in lineups after you play your first $5 lineup! https://prizepicks.onelink.me/LME0/NIGHTCAP 03:50 - Wolves beat Nuggets 26:15 - Hawks beat Knicks in Game 2 45:05 - Cleveland beat Raptors 56:55 - Wemby is the Unanimous DPOY 1:09:15 - Kevin Durant (Timestamps may vary based on advertisements.) #Club See omnystudio.com/listener for privacy information.

Bounced From The Roadhouse
Car Toilets, Public Restrooms, Wife Advice, Operation Bear Claw and More!

Bounced From The Roadhouse

Play Episode Listen Later Apr 21, 2026 33:58


On this episode of Bounced From The Roadhouse: Special Guests in 4B: Intro/ National Kindergarten Day, National Yellow Bat Day, Creativity and Innovation Day Car Toilet Public Toilets Toilet Seat Handle How To Turn Your Wife On BJ's Wife Lesson Insurance Operation Bear Claw Dylan Sprouse Defends His Honor Airline Beds Sky Diver Football Game That's a Great Question Lego Thief In California Principal Prom King Crown Vic Questions? Comments? Leave us a message! 605-343-6161 Don't forget to subscribe, leave us a review and some stars Hosted on Acast. See acast.com/privacy for more information.

But I'm Still A Good Person by Vince Nicholas
Hugging co-workers, I'm now Cat Guy at the office & Cash me ousside w/ my claw arm how bout dat

But I'm Still A Good Person by Vince Nicholas

Play Episode Listen Later Apr 19, 2026 13:38


Hotels: Give us $18k and you can have free drinks, free food and a comped hotel room. Us: Okay.

She's Not Doing So Well - Gay Perspective On Everyday Life
Wait, You Don't Trim an Abandoned Lawn?

She's Not Doing So Well - Gay Perspective On Everyday Life

Play Episode Listen Later Apr 17, 2026 65:49


This week on Not Well, Bobby and Jim spiral beautifully through a night out that started with cheap margaritas and ended somewhere between gay bar chaos, moon conspiracies, death talk, body shame trauma, and the collapse of modern society. We cover: A wild night at Lupitas and Columbus bars  Meeting a hot casket salesman  Why American car culture is broken  Weird truths about body odor and shame  Going around the moon… again?  Trump, distractions, and the state of the world  Kink culture, sensory deprivation, and Claw weekend  Why gay men drinking wine at dive bars feels suspicious  Friendship, loneliness, attention-seeking, and social media weirdness  The usual emotionally unstable nonsense If you like sharp humor, gay chaos, conspiracy tangents, social commentary, and two friends saying what everyone else is thinking… welcome home.

Only in Seattle - Real Estate Unplugged
Drafted Executive Orders Claw Back CA Energy to Insure US has Necessary Fuels For National Security

Only in Seattle - Real Estate Unplugged

Play Episode Listen Later Apr 16, 2026 27:03


California's ongoing war on the oil and gas industry has led to a potential fuel crisis, prompting seven draft Executive Orders to be sent to President Donald J. Trump. These orders aim to ensure the United States has the necessary fuels from California for national security. USC Professor Michael Mische and his co-authors argue that California's Governor and Legislature have proven incapable or unwilling to address the self-created fuel crisis. The proposed Executive Orders include allowing offshore oil production, revoking the Low Carbon Fuel Standard, removing state and local control on oil reserves, and directing California refineries to increase jet and diesel fuel production. The authors believe the U.S., California, and global security would benefit from these measures. The key question is whether the federal government will step in to address California's fuel crisis and its impact on national security.

Aaron Scene's After Party
BATTLE OF THE COHOSTS feat. @geedolla_sign & @xo.mariza_

Aaron Scene's After Party

Play Episode Listen Later Apr 15, 2026 56:58


It's a pop up podcast! And on this episode we have our battle of the cohosts! As Gee and Baby M take each other head on on a variety of questions plus they prove whether guys and girls can solely and ONLY be friends. AND the gang tries out some honey packs and give our honest review. Follow us on social media @AaronScenesAfterParty

Just Another Kill Team Podcast
Learn to Master: Nemesis Claw with Baloth Paints

Just Another Kill Team Podcast

Play Episode Listen Later Apr 15, 2026 59:16


Send us Fan Mail. Paid Patreon subscribers are eligible to win free kill teams! Legionary is up next! Get your Crossfire Games goodies here: https://crossfiregames.co/discount/JUSTANOTHERKILLTEAM
-----------
JAKTP Discord Link: https://discord.gg/6653HG9XKb
JAKTP Instagram: https://www.instagram.com/justanotherkillteampodcast?igsh=ZzR2dmRwZTM3MGQ=
JAKTP Youtube: https://www.youtube.com/channel/UCsCGQMlcqFmbwp295Hvaxxg
JAKTP Patreon Link: https://www.patreon.com/JustAnotherKillteamPodcast
Location: Spain
Support the show

Aaron Scene's After Party
LIVE AT SUNSET feat. @sunsetdiveep @celena_772 & @_dj.snack

Aaron Scene's After Party

Play Episode Listen Later Apr 9, 2026 49:43


We are LIVE for this episode at Sunset Dive! DJ Snack comes on as we talk about good times at EDC. Plus, we bring on a couple of Sunset bartenders for a sit-down as they tell us about the biggest tip they've ever made and some crazy bartending stories! Follow us on social media @AaronScenesAfterParty

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Extreme Harness Engineering for Token Billionaires: 1M LOC, 1B toks/day, 0% human code, 0% human review — Ryan Lopopolo, OpenAI Frontier & Symphony

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Apr 7, 2026 72:43


We're proud to release this ahead of Ryan's keynote at AIE Europe. Hit the bell to get notified when it is live! Attendees: come prepped for Ryan's AMA with Vibhu after.

Move over, context engineering. Now it's time for harness engineering and the age of the token billionaires.

Ryan Lopopolo of OpenAI is leading that charge, recently publishing a lengthy essay on harness engineering that has become the talk of the town.

In it, Ryan peeled back the curtain on how the recently announced OpenAI Frontier team has become OpenAI's top Codex users, running a >1M LOC codebase with zero human-written code and, crucially for the Dark Factory fans, no human-REVIEWED code before merge. Ryan is admirably evangelical about this, calling it borderline “negligent” if you aren't using >1B tokens a day (roughly $2-3k/day in token spend based on market rates and caching assumptions):

Over the past five months, they ran an extreme experiment: building and shipping an internal beta product with zero manually written code.
Through the experiment, they adopted a different model of engineering work: when the agent failed, instead of prompting it better or to “try harder,” the team would look at “what capability, context, or structure is missing?”

The result was Symphony, “a ghost library” and reference Elixir implementation (by Alex Kotliarskyi) that sets up a massive system of Codex agents all extensively prompted with the specificity of a proper PRD spec, but without full implementation.

The future starts taking shape as one where coding agents stop being copilots and start becoming real teammates anyone can use, and Codex is doubling down on that mission with their Super Bowl messaging of “you can just build things”.

Across Codex, internal observability stacks, and the multi-agent orchestration system his team calls Symphony, Ryan has been pushing what happens when you optimize an entire codebase, workflow, and organization around agent legibility instead of human habit.

We sat down with Ryan to dig into how OpenAI's internal teams actually use Codex, why the real bottleneck in AI-native software development is now human attention rather than tokens, how fast build loops, observability, specs, and skills let agents operate autonomously, why software increasingly needs to be written for the model as much as for the engineer, and how Frontier points toward a future where agents can safely do economically valuable work across the enterprise.

We discuss:
* Ryan's background from Snowflake, Brex, Stripe, and Citadel to OpenAI Frontier Product Exploration, where he works on new product development for deploying agents safely at enterprise scale
* The origin of “harness engineering” and the constraint that kicked off the whole experiment: Ryan deliberately refused to write code himself so the agent had to do the job end to end
* Building an internal product over five months with zero lines of human-written code, more than a million lines in the repo, and thousands of PRs across multiple Codex model generations
* Why early Codex was painfully slow at first, and how the team learned to decompose tasks, build better primitives, and gradually turn the agent into a much faster engineer than any individual human
* The obsession with fast build times: why one minute became the upper bound for the inner loop, and how the team repeatedly retooled the build system to keep agents productive
* Why humans became the bottleneck, and how Ryan's team shifted from reviewing code directly to building systems, observability, and context that let agents review, fix, and merge work autonomously
* Skills, docs, tests, markdown trackers, and quality scores as ways of encoding engineering taste and non-functional requirements directly into context the agent can use
* The shift from predefined scaffolds to reasoning-model-led workflows, where the harness becomes the box and the model chooses how to proceed
* Symphony, OpenAI's internal Elixir-based orchestration layer for spinning up, supervising, reworking, and coordinating large numbers of coding agents across tickets and repos
* Why code is increasingly disposable, why worktrees and merge conflicts matter less when agents can resolve them, and what it really means to fully delegate the PR lifecycle
* “Ghost libraries”, spec-driven software, and the idea that a coding agent can reproduce complex systems from a high-fidelity specification rather than shared source code
* The broader future of Frontier: safely deploying observable, governable agents into enterprises, and building the collaboration, security, and control layers needed for real-world agentic work

Ryan Lopopolo
* X: https://x.com/_lopopolo
* Linkedin: https://www.linkedin.com/in/ryanlopopolo/
* Website: https://hyperbo.la/contact/

Timestamps
00:00:00 Introduction: Harness Engineering and OpenAI Frontier
00:02:20 Ryan's background and the “no human-written code” experiment
00:08:48 Humans as the bottleneck: systems thinking, observability, and agent workflows
00:12:24 Skills, scaffolds, and encoding engineering taste into context
00:17:17 What humans still do, what agents already own, and why software must be agent-legible
00:24:27 Delegating the PR lifecycle: worktrees, merge conflicts, and non-functional requirements
00:31:57 Spec-driven software, “ghost libraries,” and the path to Symphony
00:35:20 Symphony: orchestrating large numbers of coding agents
00:43:42 Skill distillation, self-improving workflows, and team-wide learning
00:50:04 CLI design, policy layers, and building token-efficient tools for agents
00:59:43 What current models still struggle with: zero-to-one products and gnarly refactors
01:02:05 Frontier's vision for enterprise AI deployment
01:08:15 Culture, humor, and teaching agents how the company works
01:12:29 Harness vs. training, Codex model progress, and “you can just do things”
01:15:09 Bellevue, hiring, and OpenAI's expansion beyond San Francisco

Transcript

Ryan Lopopolo: I do think that there is an interesting space to explore here with Codex, the harness, as part of building AI products, right? There's a ton of momentum around getting the models to be good at coding. We've seen big leaps in, like, the task complexity with each incremental model release, where if you can figure out how to collapse a product that you're trying to build, a user journey that you're trying to solve, into code, it's pretty natural to use the Codex harness to solve that problem for you. It's done all the wiring and lets you just communicate in prompts. To let the model cook, you have to step back, right? Like, you need to take a systems thinking mindset to things and constantly be asking, where is the agent making mistakes? Where am I spending my time? How can I not spend that time going forward? And then build confidence in the automation that I'm putting in place. So I have solved this part of the SDLC.

swyx: [00:01:00] All right.

[00:01:03] Meet Ryan

swyx: We're in the studio with Ryan from OpenAI.
Welcome.

Ryan Lopopolo: Hi.

swyx: Thanks for visiting San Francisco and thanks for spending some time with us.

Ryan Lopopolo: Yeah, thank you. I'm super excited to be here.

swyx: You wrote a blockbuster article on harness engineering. It's probably going to be the defining piece of this emerging discipline, huh?

Ryan Lopopolo: Thank you. It's been fun to feel like we've defined the discourse in some sense.

swyx: Let's contextualize a little bit. This is the first podcast you've ever done. Yes. And thank you for spending it with us. Where is this coming from? What team are you in, all that jazz?

Ryan Lopopolo: Sure, sure. I work on Frontier Product Exploration, new product development in the space of OpenAI Frontier, which is our enterprise platform for deploying agents safely at scale, with good governance, in any business. And the role of my team has been to figure out novel ways to deploy our models into packages and products that we can sell as solutions to enterprises.

swyx: And you have a background, I'll just squeeze it in there. Snowflake, Brex, [00:02:00] Stripe, Citadel.

Ryan Lopopolo: Yes. Yes. Same. Any kind of customer

swyx: entire life. Yes. The exact kind of customer that you want to,

Vibhu: So I'll say, I didn't expect the background. When I looked at your Twitter, I'm seeing the opposite. Stuff like this. So you've got the mindset of, like, full send AI, coding, stuff about slop, like buckling in your laptop on your Waymos. Yes. And then I look at your profile, I'm like, oh, you're in the other end too. Oh, perfect. Makes perfect.

Ryan Lopopolo: It's quite fun to be AI maximalist. If you're gonna live that persona, OpenAI is the place to do it. And it's

swyx: token is what you say.

Ryan Lopopolo: Yeah. Certainly helps that we have no rate limits internally. And I can go, like you said, full send at this stay.

swyx: Yeah. Yeah.
So the Frontier, and you're a special team within OpenAI Frontier.

Ryan Lopopolo: We had been given some space to cook, which has been super, super exciting.

[00:02:47] Zero Code Experiment

Ryan Lopopolo: And this is why I started with kind of an out-there constraint to not write any of the code myself. I was figuring if we're trying to make agents that can be deployed into enterprises, they should be [00:03:00] able to do all the things that I do. And having worked with these coding models, these coding harnesses over 6, 7, 8 months, I do feel like the models are there enough, the harnesses are there enough, where they're isomorphic to me in capability and the ability to do the job. So starting with this constraint of "I can't write the code" meant that the only way I could do my job was to get the agent to do my job.

Vibhu: And just a bit of background before that. This is basically the article. So what you guys did is five months of working on an internal tool, zero lines of human-written code, over a million lines of code in the total codebase. You say it was 10x, more like it was 10x faster than you would've been if you had done it by hand. So

Ryan Lopopolo: yeah, that

Vibhu: was the mindset going into this, right?

Ryan Lopopolo: That's right.

[00:03:46] Model Upgrades Lessons

Ryan Lopopolo: We started with some of the very first versions of Codex CLI, with the Codex Mini model, which was obviously much less capable than the ones we have today. Which was also a very good constraint, right? Quite a visceral feeling to ask the [00:04:00] model to build you a product feature and it just not being able to assemble the pieces together. Which kind of defined one of the mindsets we had going into this, which is: whenever the model just cannot, you always pop open the task, double click into it, and build smaller building blocks that you can then reassemble into the broader objective. And it was quite painful to do this. Honestly, the first month and a half was
10 times slower than I would be. But because we paid that cost, we ended up getting to something much more productive than any one engineer could be, because we built the tools, the assembly station, for the agent to do the whole thing.

[00:04:43] Model Generations, Build Systems & Background Shells

Ryan Lopopolo: But yeah, so onward to GPT-5, 5.1, 5.2, 5.3, 5.4. To go through all these model generations and see their kind of quirks and different working styles also meant we had to adapt the codebase to change things up when the model was revved. [00:05:00] One interesting thing here is, with 5.2, the Codex harness at the time did not have background shells in it, which means we were able to rely on blocking scripts to perform long-horizon work. But with 5.3 and background shells, it became less patient, less willing to block. So we had to retool the entire build system to complete in under a minute. This is not a thing I would expect to be able to do in a codebase where people have opinions. But because the only goal was to make the agent productive, over the course of a week we went from a bespoke Makefile build to Bazel, to Turbo, to Nx, and just left it there because builds were fast at that point.

swyx: Interesting. Talk more about Turbo to Nx. That's interesting 'cause that's the other direction that other people have been doing.

Ryan Lopopolo: Ultimately I have not a lot of experience with actual frontend repo architecture.

swyx: You're talking that Jessica built the sky. So I'm like, I know the Nx team. I know Turbo from Jared [00:06:00] Palmer. And I'm like, yeah, that's an interesting comparison.

[00:06:02] One Minute Build Loop

Ryan Lopopolo: The hill we were climbing, right, was make it fast.

swyx: Is there a micro front end involved? Is it, how complex? React?

Ryan Lopopolo: Electron-based single app sort of thing.

swyx: And must be under a minute. That's an interesting limitation.
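As a rough illustration of the one-minute inner-loop bound discussed here, a build wrapper could time each run and flag anything over budget. This is a minimal sketch, not the team's tooling: the build command is a stand-in, and `BUILD_BUDGET_SECONDS`, `timed_run`, and `over_budget` are hypothetical names. It only measures and reports; it does not kill a slow build.

```python
import subprocess
import sys
import time

# Assumed budget: the one-minute bound mentioned in the conversation.
BUILD_BUDGET_SECONDS = 60.0


def timed_run(command: list[str]) -> tuple[int, float]:
    """Run a command to completion and return (exit code, wall-clock seconds)."""
    start = time.monotonic()
    completed = subprocess.run(command, check=False)
    return completed.returncode, time.monotonic() - start


def over_budget(elapsed: float, budget: float = BUILD_BUDGET_SECONDS) -> bool:
    """True when a run took longer than the budget."""
    return elapsed > budget


if __name__ == "__main__":
    # Stand-in for the real build command, which the source does not specify.
    code, elapsed = timed_run([sys.executable, "-c", "print('build ok')"])
    if over_budget(elapsed):
        status = "over budget: time to decompose the build graph"
    else:
        status = "within budget"
    print(f"exit={code} elapsed={elapsed:.2f}s {status}")
```

Treating an over-budget run as a signal rather than a failure keeps the check cheap to run on every build.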
I'm actually not super familiar with the background shells stuff. Probably was talked about in the 5.3 release.

Ryan Lopopolo: It basically means that Codex is able to spawn commands in the background and then go continue to work while it waits for them to finish. So it can spawn an expensive build and then continue reviewing the code, for example.

swyx: Yeah.

Ryan Lopopolo: And this helps it be more time efficient for the user invoking the harness.

swyx: And I guess, just to really nail this, why does one minute matter? Like, why not five?

Ryan Lopopolo: We want the inner loop to be as fast as possible. One minute was just a nice round number and we were able to hit it.

swyx: And if it doesn't complete, it kills it or something?

Ryan Lopopolo: No. We just take that as a signal that we need to stop what we're doing, double click, decompose the build graph a bit to get us back under so that we [00:07:00] can enable the agent to continue to operate.

swyx: It's like a ratchet. It's like you're forcing build time discipline, because if you don't, it'll just grow and grow. That's right. And you mentioned that, like, the software I work on currently is at 12 minutes. It sucks.

Ryan Lopopolo: This has been my experience with platform teams in the past, where you have an envelope of acceptable build times and you let it go up to breach and then you spend two, three weeks to bring it back down to the lower end of the average low bed stop. But because tokens are so cheap, and we're so insanely parallel with the model, we can just constantly be gardening this thing to make sure that we maintain these invariants, which means
There's way less dispersion in the code and the SDLC, which means we can simplify in a way and rely on a lot more invariants as we write the software.

[00:07:45] Observability, Traces & Local Dev Stack

Vibhu: Lovely.

[00:07:46] Humans Are Bottleneck

Vibhu: You mentioned in your article, like, humans became the bottleneck, right? You kicked off as a team of three people. You're putting out a million lines of code, like 1,500 PRs, basically. What's the mindset there? So as much as code is disposable, you're doing a lot of review. A lot [00:08:00] of the article talks about how you wanna rephrase everything as prompting: what the agent can't see is kind of garbage, right? You shouldn't have it in there. So what's, like, the high level of how you went about building it, and then how you address, okay, humans are just PR review. Like, how is human in the loop for this?

Ryan Lopopolo: We've moved beyond even the humans reviewing the code as well.

[00:08:19] Human Review, PR Automation & Agent Code Review

Ryan Lopopolo: Most of the human review is post merge at this point. But post merge, that's not even reviewed. That's just

swyx: Oh, let's just make ourselves happy by

Ryan Lopopolo: Fundamentally, the model is trivially parallelizable, right? As many GPUs and tokens as I am willing to spend, I can have capacity to work with my codebase. The only fundamentally scarce thing is the synchronous human attention of my team. There's only so many hours in the day. We have to eat lunch. I would like to sleep, although it's quite difficult to stop poking the machine because it makes me want to feed it. You have to step back, right? Like, you need to take a systems thinking mindset to things and [00:09:00] constantly be asking: where is the agent making mistakes? Where am I spending my time? How can I not spend that time going forward? And then build confidence in the automation that I'm putting in place.
So I have solved this part of the SDLC, and usually what that has looked like is, like, we started needing to pay very close attention to the code because the agent did not have the right building blocks to produce modular software that decomposed appropriately, that was reliable and observable, and actually accrued a working front end in these things, right?

[00:09:35] Observability First Setup

Ryan Lopopolo: So in order to not spend all of our time sitting in front of a terminal, doing at most one or two things at a time, we invested in giving the model that observability, which is that graph in the post here.

swyx: Yeah. Let's walk through this. Traces, and which existed first?

Ryan Lopopolo: We started with just the app, and the whole rest of it, from Vector through to all these logging and metrics APIs, was, I dunno, half an [00:10:00] afternoon of my time. We have intentionally chosen very high-level, fast developer tools. There's a ton of great stuff out there now. We use mise a bunch, which makes it trivial to pull down all these Go-written Victoria stack binaries in our local development. Tiny little bit of Python glue to spin all these up, and off you go. One neat thing here is we have tried to invert things as much as possible, which is, instead of setting up an environment to spawn the coding agent into, we spawn the coding agent, like, that's the entry point. It's just Codex. And then we give Codex, via skills and scripts, the ability to boot the stack if it chooses to, and then tell it how to set some env variables so the app in local dev points at this stack that it has chosen to spin up. And this, I think, is like the fundamental difference between reasoning models and the 4.1s and 4os of the past, where these models could not think, so you had to put them in [00:11:00] boxes with a predefined set of state transitions. Whereas here we have the model, the harness be the whole box.
And give it a bunch of options for how to proceed with enough context for it to make intelligent choices.

Vibhu: So, like, a lot of that is around scaffolding, right? Yes. Previous agents, you would define a scaffold. It would operate in that loop, try again. That's pivoted off from when we've had reasoning models. They're seeming to perform better when you don't have a scaffold, right? That's right.

[00:11:28] Docs Skills Guardrails

Vibhu: And you go into, like, niches here too, like your SPEC.md and, like, having a very short AGENTS.md.

swyx: Yes. Yes.

Vibhu: Yeah. So you even lay out what it is here, but I like

swyx: the table of contents.

Vibhu: Yeah.

swyx: Like, stuff like this, it really helps guide people because everyone's trying to do this.

Ryan Lopopolo: This structure also makes it super cheap to put new content into the repository to steer both the humans and the agents.

swyx: You reinvented skills, right?

Vibhu: One big agents and

swyx: skills from first principles.

Ryan Lopopolo: Skills did not exist when we started doing this.

Vibhu: You have a short, [00:12:00] 100-line overall table of contents and then you have little skills, right? core-beliefs.md, tech debt tracker. Yeah. Yeah. The scale is over

Ryan Lopopolo: The tech debt tracker and the quality score are pretty interesting because this is basically a tiny little scaffold, like a markdown table, which is a hook for Codex to review all the business logic that we have defined in the app, assess how it matches all these documented guardrails, and propose follow-up work for itself. Before Beads and all these ticketing systems, we were just tracking follow-up work as notes in a markdown file, which we could spawn an agent on a cron to burn down. There's this really neat thing that, like, the models fundamentally crave text.
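To make the "markdown table as a hook" idea concrete, here is a minimal sketch of reading follow-up work out of a tracker table. The table contents, the column names (`Item`, `Area`, `Status`), and the helper names are all invented for illustration; the episode does not describe the actual file layout.

```python
# Hypothetical tracker contents; the real file's columns are not described in the episode.
TRACKER_MD = """\
| Item | Area | Status |
| --- | --- | --- |
| Add timeouts to network calls | reliability | open |
| Split settings module | architecture | done |
| Cover export flow with tests | quality | open |
"""


def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse a simple pipe-delimited markdown table into a list of row dicts."""
    lines = [line.strip() for line in text.splitlines() if line.strip().startswith("|")]
    if len(lines) < 2:
        return []
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # skip the header row and the |---| separator row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def open_items(rows: list[dict[str, str]]) -> list[str]:
    """Return the Item text of every row whose Status is 'open'."""
    return [row["Item"] for row in rows if row.get("Status", "").lower() == "open"]


if __name__ == "__main__":
    for item in open_items(parse_markdown_table(TRACKER_MD)):
        print(f"follow-up: {item}")
```

A scheduled job could pass each open item to an agent as a task; that wiring is left out because it depends on tooling the source does not specify.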
So a lot of what we have done here is figure out ways to inject text

swyx: into

Ryan Lopopolo: the system, right? When we get a page because we're missing a timeout, for example, I can just @ Codex in Slack on that page and say, I'm gonna fix this by adding a timeout. Please update our reliability documentation to require that all network calls have [00:13:00] timeouts. So I have not only made a point-in-time fix, but also, like, durably encoded this process knowledge around what good looks like.

swyx: Yeah.

Ryan Lopopolo: And we give that to the root coding agent as it goes and does the thing. But you can also use that to distill tests out of, or a code review agent, which is pointed at the same things to narrow the acceptable universe of the code that's produced.

swyx: I think one of the concerns I have with that kind of stuff is you think you're making the right call by making it persisted for all time across everything. Yes. But then you didn't think about the exceptions that you need to make, right? And that you have to roll it back.

Vibhu: Part of it is

swyx: also sometimes it can follow your instructions too.

Vibhu: It's somewhat a skill, right? So it determines when it uses the tools, right? Like, it's not like it'll run on every call. It'll determine when it wants to check quality score, right?

Ryan Lopopolo: Yeah. And we do, in the prompts we give these agents, allow them to push back.

[00:13:51] Agent Code Review Rules

Ryan Lopopolo: When we first started adding code review agents to the PR, it would be Codex CLI locally writes the change, pushes up a PR, and on [00:14:00] those PR synchronizations a review agent fires. It posts a comment. We instruct Codex that it has to at least acknowledge and respond to that feedback. And initially the Codex driving the code author was willing to be bullied by the PR reviewer, which meant you could end up in a situation where things were not converging.
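The timeout rule mentioned earlier in this exchange ("all network calls have timeouts") is the kind of documented guardrail that can be distilled into an automated check. The following is a minimal sketch using Python's standard `ast` module. The set of call names treated as network calls is an assumption for illustration, and the team's app is described as Electron-based, so their real checks would presumably target a different language.

```python
import ast

# Hypothetical set of call names treated as network calls for this sketch.
NETWORK_CALLS = {"requests.get", "requests.post", "urllib.request.urlopen"}


def dotted_name(node: ast.AST) -> str:
    """Return a dotted name such as 'requests.get' for Name/Attribute chains."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        prefix = dotted_name(node.value)
        return f"{prefix}.{node.attr}" if prefix else node.attr
    return ""


def calls_missing_timeout(source: str) -> list[tuple[int, str]]:
    """List (line number, call name) for network calls without a timeout keyword."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            has_timeout = any(kw.arg == "timeout" for kw in node.keywords)
            if name in NETWORK_CALLS and not has_timeout:
                missing.append((node.lineno, name))
    return sorted(missing)


# Sample source to scan; it is parsed, never executed, so requests need not be installed.
SAMPLE = '''
import requests
ok = requests.get("https://example.com", timeout=5)
bad = requests.post("https://example.com", json={})
'''

if __name__ == "__main__":
    for lineno, name in calls_missing_timeout(SAMPLE):
        print(f"line {lineno}: {name} has no timeout")
```

Because the check reads source text rather than running it, it can sit in CI or be handed to a review agent as a script.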
So yeah, we had to,

swyx: It's just thrash.

Ryan Lopopolo: We had to add more optionality to the prompts on both of these things, right? The reviewer agents were instructed to bias toward merging the thing, to not surface anything greater than a P2 in priority. We didn't really define P2, but we gave it,

swyx: You did define P2?

Ryan Lopopolo: We gave it a framework within which to score its output.

swyx: And then greater than P0 is worse, right? Yes. P2 is very good.

Ryan Lopopolo: P0 is you will nuke the codebase if

swyx: you merge this

Ryan Lopopolo: thing, right?

swyx: Yeah.

Ryan Lopopolo: But also on the code authoring agent side, we gave it the flexibility to either defer or push back against review feedback, right? This happens all the time, right? Like, I happen to notice something and leave a code review, [00:15:00] which could blow up the scope by a factor of two. I usually don't mean for that to be addressed exactly in the moment. It's more of an FYI: file it to the backlog, pick it up in the next fix-it week sort of thing. And without the context that this is permissible, the coding agents are gonna bias toward what they do, which is following instructions.

swyx: Yeah.

[00:15:19] Autonomous Merging Flow

swyx: I did want to check in on a couple things, right? Sure. The code review agent, it can merge autonomously. I think that's something that a lot of people aren't comfortable with. And you have a list here of how much agents do. They do product code and tests, CI configuration and release tooling, internal developer tools, documentation, eval harness, review comments, scripts that manage the repository itself, production dashboard definition files, like, everything. Yes. And so they're just all churning at the same time. Is there, like, a cord that any human on the team pulls to stop everything?

Ryan Lopopolo: Because we are building a native application here, we're not doing continuous deploy.
So there's still a human in the loop for cutting the release branch. I see. We require a blessed, [00:16:00] human-approved smoke test of the app before we promote it to distribution, these sort of things.

swyx: So you're working on the app, you're not building, like, infrastructure where you have, like, nines of reliability, that kinda stuff?

Ryan Lopopolo: That's correct. That's correct. Okay. And also, like, full recognition here that all of this activity took place in a completely greenfield repository. There should be no script that this applies generally to

swyx: This is a production thing, you're gonna ship

Ryan Lopopolo: to

swyx: customers. Of course. Yeah, of course. So this is real.

Vibhu: And, like, one of the things there is, you mentioned you started this as a repo from scratch. The onboarding, first month or so, was pretty, it was like working backwards, right? Yeah. And then you had to work with the system, and now you're at that point where, you know, you're very autonomous. I'm curious, like, okay, so how human in the loop is it? So what are the bottlenecks that you wish you could still automate? And part of that is also, like, where do you see the model trajectory improving and offloading more human in the loop? We just got 5.4. It's a really good,

Ryan Lopopolo: fantastic model, by the way.

Vibhu: Yeah. Yeah. It's the first one that's merged top-tier coding. So it's Codex-level coding and reasoning, so general reasoning, both in one model. So

Ryan Lopopolo: and

Vibhu: computer [00:17:00] use, vision.

Ryan Lopopolo: Now, with 5.4, I can just have Codex write the blog post, whereas for this one I had to balance between chat.

swyx: Oh, I need to, I might be out of a job. Oh my God.

Ryan Lopopolo: Oh,

swyx: I know. You just gave me an idea for a completely AI newsletter that 5.4 could do. Yeah, I get it now.

Ryan Lopopolo: This sort of thing is just one example of closing the loop, right? Like the dashboard thing you mentioned.
We have Codex authoring the JSON for the Grafana dashboards and publishing them and also responding to the pages, which means when it gets the page, it knows exactly which dashboards are defined and what alert was triggered by which exact log in the codebase, 'cause all of this stuff is collated together.

swyx: It has to own everything. Yes. Yeah. Yeah.

Ryan Lopopolo: And it means that if we have an outage that did not result in a page, it has the existing set of dashboards available to it. It has the existing set of metrics and logs and can figure out where the gaps in the dashboard are, or [00:18:00] in the underlying metrics, and fix them in one go. In the same way, you would have a full stack engineer be able to drive a feature from the backend all the way to the front end.

Vibhu: So it seems like a lot of the work you guys had to do was, you as a small team are fully working for a way that the model wants the software to be written. It's, like, less human legible, for better code legibility, agent legibility. How do you think that affects broader teams? So one, at OpenAI, do you liaise, like, "this is how software should be written"? Like, I can imagine, say you join a new team with this methodology, this mindset. There's ways that teams do code review, teams write code, like, teams are structured, and a lot of it is for human legibility. So should we all swap? Like, how does this play back, one, broader into OpenAI, and then, like, broader into software engineering, right? Is it, like, teams that pick this up will, it's pretty drastic, right? You have to make a pretty big switch. Should they just full send? Yeah.

Ryan Lopopolo: The mindset is very much that I'm removed from the process, right? I can't really have deep code-level opinions about [00:19:00] things. It's as if I'm group tech leading a 500-person organization.

Vibhu: Yeah.

Ryan Lopopolo: Like, it's not appropriate for me to be in the weeds on every PR.
This is why that post-merge code review thing is a good analog here, right? I have some representative sample of the code as it is written, and I have to use that to infer what the teams are struggling with, where they could use help, where they're already moving quickly and I can pivot my focus elsewhere.

Vibhu: Yeah.

Ryan Lopopolo: So I don't really have too many opinions around the code as it is written. I do, however, have a command base class, which is used to have repeatable chunks of business logic that come with tracing and metrics and observability for free. And the thing to focus on is not how that business logic is structured, but that it uses this primitive, 'cause I know that's gonna give leverage by default.

Vibhu: Yeah.

Ryan Lopopolo: Yeah, back to that sort of systems thinking,

Vibhu: and you have part of that in your blog post, enforcing architecture and taste, how you set boundaries for what's used. There's also a section on redefining [00:20:00] engineering and stuff, but yeah, it's just, it's interesting to hear,

Ryan Lopopolo: and as the models have gotten better, they have gotten better at proposing these abstractions to unblock themselves, which again lets me move higher and higher up the stack to look deeper into the future on what ultimately blocks the team from shipping.

swyx: Yeah. You mentioned, so this is primarily, it's like a 1-million-line codebase, an Electron app. But it manages its own services as well, so it's like a backend-for-frontend type thing.

Ryan Lopopolo: We do have a backend in there, but that's hosted in the cloud. Yeah. This sort of structure is actually within the separate main and renderer processes within the

swyx: Electron. That's just how Electron works.

Ryan Lopopolo: Yeah, of course. So we have also treated, like, MVC-style decomposition with the same level of rigor, which has been very fun.

swyx: I have a fun pun. This is a tangent. MVC is model view controller.
Any sort of full-stack web dev knows that. But my AI-native version of this is Model View Claw; the claw is the harness.

Ryan Lopopolo: That's right. That's right. I do think that there is an interesting space to [00:21:00] explore here with Codex, the harness, as part of building AI products, right? There's a ton of momentum around getting the models to be good at coding. We've seen big leaps in the task complexity with each incremental model release, where if you can figure out how to collapse a product that you're trying to build, a user journey that you're trying to solve, into code, it's pretty natural to use the Codex harness to solve that problem for you. It's done all the wiring and lets you just communicate in prompts to let the model cook. Yeah. It's been very fun. And there's also a very engineering-legible way of increasing capability. It's fantastic, right? Yeah. Just give the model scripts, the same scripts you would already build for yourself.

swyx: Yeah. Yeah. So for listeners, this is Ryan saying that software engineering or coding agents will eat knowledge work, like the non-coding parts where you would normally think, oh, you have to build a separate agent for it. No, start with a coding agent and go out from there. Which OpenClaw has, like, it's Pi under the hood.

Ryan Lopopolo: [00:22:00] Yes.

Vibhu: Basically, define your task in code. Everything is a coding

swyx: agent, by the way. Since I brought it up, it's probably the only place we bring it up. Is there any OpenClaw usage from you? Any?

Ryan Lopopolo: No. No. Not for me. I don't have any spare Mac Minis rattling around my house.

swyx: You can afford it? No. I just, I'm curious if it's changed anything in OpenAI yet, but it's probably early days.
And then the other thing I wanna pull on here is, you mentioned ticketing systems and you mentioned PRs, and I'm wondering if both those things have to go away or be reinvented for this kind of coding. So Git itself is very hostile to multi-agent.

Ryan Lopopolo: Yeah. We make very heavy use of worktrees.

swyx: But even then, I just dropped a podcast yesterday with Cursor, and they said they're getting rid of worktrees 'cause it still has too many merge conflicts. It's still too unintuitive. But go ahead.

Ryan Lopopolo: The models are really great at resolving merge conflicts. Yeah. And to get to a state where I'm not synchronously in the loop in my terminal, I almost don't care that there are merge

swyx: with disposable. [00:23:00] Yeah.

Ryan Lopopolo: We invoke a $land skill, and that coaches Codex to push the PR. Wait for human and agent reviewers. Wait for CI to be green. Fix the flakes if there are any. Merge upstream if the PR comes into conflict. Wait for everything to pass. Put it in the merge queue. Deal with flakes until it's in main. End. This is what it means to delegate fully, right? This is, in a very large monorepo, probably a significant tax on humans to get PRs merged, but the agent is more than capable of doing this, and I really don't have to think about it other than keep my laptop open.

swyx: Yeah. I used to be much more of a control freak, but now I'm like, yeah, actually you could do a better job of this than me. Yeah. With the right context. Yes.

[00:23:47] Encoding Requirements

swyx: Anything else in harness in general?
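
The $land flow Ryan describes above (push the PR, wait for reviewers and CI, fix flakes, merge upstream on conflict, enter the merge queue) could be sketched as a small decision function that an agent loop polls. This is an illustrative sketch only, not the actual skill: the status fields, the action strings, and the order of the checks are all assumptions made for the example.

```python
def next_action(pr):
    """Pick the next step for a PR from a snapshot of its status.

    `pr` is a hypothetical dict; none of these field names come from
    the episode or from any real tool.
    """
    if pr["merged"]:
        return "done"
    if pr["conflicts"]:
        return "merge upstream and resolve conflicts"
    if pr["ci"] == "failed":
        return "fix flakes and re-run CI"
    if pr["ci"] == "pending" or not pr["approved"]:
        return "wait"
    if not pr["in_merge_queue"]:
        return "add to merge queue"
    return "wait"  # queued: keep polling until it lands on main


status = {
    "merged": False,
    "conflicts": False,
    "ci": "passed",
    "approved": True,
    "in_merge_queue": False,
}
print(next_action(status))
```

Keeping the policy in one pure function means the surrounding loop only has to fetch a fresh status and act on the returned string, which is roughly the "delegate fully" shape described in the conversation.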
Just this piece, I just wanna make sure we,

Ryan Lopopolo: I think one thing that I maybe didn't make super clear in the article, that I heard on Twitter as an interesting, that's respond [00:24:00]

swyx: to them. What's the chatter, and then what's your response?

Ryan Lopopolo: Ultimately, all the things that we have encoded in docs and tests and review agents and all these things are ways to put all the non-functional requirements of building high-scale, high-quality, reliable software into a space that prompt-injects the agent. We either write it down as docs, or we add lints where the error messages tell it how to do the right thing. So the whole meta of the thing is to basically tease out of the heads of all the engineers on my team what they think good looks like, what they would do by default, or what they would coach a new hire on the team to do to get things to merge. And that's why we pay attention to all the mistakes that the agent makes, right? This is code being written that is misaligned with some as-yet-not-written-down non-functional requirement.

swyx: Sorry, what did the online people misunderstand, or

Ryan Lopopolo: No,

swyx: what you

Ryan Lopopolo: responded to? Somebody just literally said that. I was like, oh yeah,

swyx: okay,

Ryan Lopopolo: this is the [00:25:00] thing. This is what I've been doing. Oh, you

swyx: agree? Yeah. I see. Interesting.

Ryan Lopopolo: One other neat thing, which I totally did not expect, is folks were just taking the link to the article and giving it to Pi or Codex and saying, make my repo this,

Vibhu: you achi a whole recursion.

Ryan Lopopolo: And it was wildly effective. Really? It was wildly effective. No

Vibhu: way. It actually is something I tried with 5.4 yesterday. I didn't have time. Last time I was like out speaking of something, and this is one of my things, I was like, okay, I have this article.
Can we just scaffold out what it would be like to run this? And I did it first as that, and then I was like, okay, let me take another little side repo and say, okay, if I was to fully automate this like this, because I haven't written a line of code, it's

Ryan Lopopolo: like over full, set

Vibhu: it right. The side thing I'm doing of voice, TTS, I'm just like, slobbing out, whatever. It's nothing production. I'm like, how would I make this like this? And it's actually a really good way. It's a good way to learn what could be changed, what could be, like, it's just a good analyzing, right? You give it all the code, you give it all the context, you give it the article, and it walks you through it very well. That's right. That's right.

[00:25:57] Inlining Dependencies

[00:25:57] Dependencies Going Away & Bret Taylor's Response

swyx: I guess one more thing before we go to Symphony is I wanted to cover [00:26:00] Bret Taylor's response. We had him on the show. He is your chairman, which is wild. Yeah. That he's reading your articles as well and getting engaged in it. He says software dependencies are going away. Basically, they can just be vendored. Yes. Response.

Ryan Lopopolo: A

swyx: hundred percent. A hundred percent agree. You still pro qr, you still pay Datadog. You still pay Temporal. Thank you.

Ryan Lopopolo: Yep. The level of complexity of the dependencies that we can internalize is, I would say, low-medium right now, just based on model capability. What does the,

swyx: what is medium?

Ryan Lopopolo: I would say a couple-thousand-line dependency is a thing that we could in-house no problem. Call it an afternoon of time. One neat thing about it is, probably most of that code you don't even need. By in-housing an abstraction, you can strip away all the generic parts of it and only focus on what you need to enable the specific thing. Yes.
You're building,

swyx: I've been calling this the end of b******t plugins.

Ryan Lopopolo: Yeah.

swyx: Because there's so much, when I publish an open source thing, I want to accept everything, be liberal in what I accept, this is Postel's law, but that means there's so much bloat. Yes. There's so much overhead.

Ryan Lopopolo: One other neat thing about [00:27:00] this too is when we deploy Codex Security on the repo, it is able to deeply review and change the internalized dependencies in a much lower-friction way than it would be to, like, push patches upstream, wait for them to be released, pull them down, make sure that's compatible with all the transitives I have in my repo, and things like that. So it's also much lower friction to internalize some of these things if code is free, 'cause the tokens are cheap, sort of thing.

swyx: Yeah. Yeah. I think the only argument I have against this is basically scale testing, which obviously the larger pieces of software, like Linux, MySQL, he calls out even the Datadogs and Temporals, and then maybe security testing, where, yes, classically, I think, is it Linus Torvalds, it said security, open source is the best disinfectant.

Ryan Lopopolo: Many eyes.

swyx: Many eyes. And if you inline your dependencies and code them up, you're gonna have to relearn mistakes from other people that, yep.

Ryan Lopopolo: Yep. And to internalize that dependency, you're back to zero and you have to start reassembling all those bits and pieces to, yeah, have [00:28:00] high confidence in the code as it is written. Yeah.

Vibhu: Even part of the first intro of this, you basically mentioned everything was written by Codex, including internal tooling, right? So internal tooling, like when you're visualizing what's going on, it's writing it for itself.

swyx: Yeah. I build internal tools way more now, and I just show them off and they're like, how long did you spend? And I didn't spend any time.
I just prompted it,

Ryan Lopopolo: very funny story here.

swyx: Yeah, go ahead.

Ryan Lopopolo: We had deployed our app to the first dozen users internally, had some performance issues, so we asked them to export a trace for us, get a tarball, gave it to our on-call engineer, and he did a fantastic job of working with Codex to build this beautiful local dev tool, a Next.js app, where you drag and drop the tarball in and it visualizes the entire trace. It's fantastic. Took an afternoon, but none of this was necessary, because you could just spin up Codex and give it the tarball and ask the same thing and get the response immediately. So in a way, optimizing for human [00:29:00] legibility of that debugging process was wrong. It kept him in the loop unnecessarily, when instead he could have just let Codex cook for five minutes and gotten the same.

swyx: Yeah, you verify your instincts here of, this is how we used to do it. Or this is how I would have used to solve it.

Ryan Lopopolo: Yeah. In this local observability stack, sure, you can deploy Jaeger to visualize the traces, but I wouldn't expect to be looking at the traces in the first place, because I'm not gonna write the code to fix them.

swyx: Yeah. So basically there needs to be this kind of house stack and owning the whole loop. I think that is very well established. And it sounds like you might be sharing more about that in the future, right?

Ryan Lopopolo: Yeah. I think we're excited to do

[00:29:36] Ghost Libraries Specs

[00:29:36] Ghost Libraries & Distributing Software as Specs

Ryan Lopopolo: We're gonna talk about Symphony in a little bit, but the way we distribute it is as a spec, which I think folks are calling Ghost Libraries on Twitter. This is such a cool name. It does mean it becomes much cheaper to share software with the world, right? You define a spec, how you could build your own, specifying as much as is required for a coding agent to reassemble it [00:30:00] locally.
The flow here is very cool. We have taken all the scaffolding that has existed in our proprietary repo and spun up a new one. Ask Codex, with our repo as a reference, to write the spec. We tell it: spin up a tmux, spawn a disconnected Codex to implement the spec, wait for it to be done. Spawn another Codex and another tmux to review the spec, or rather, review the implementation compared to upstream, and update the spec so it diverges less. And then you just loop over and over, Ralph style, until you get a spec that is, with high fidelity, able to reproduce the system as it is. It's fantastic.

Vibhu: And you're basically, you're not really adding any of your human bias in there, right? That's correct. A lot of times people write a spec and be like, okay, I think it should be done this way, and you'll riff on something. And it's, no, the agent could have just handled it. You're still scaffolding in a sense, right? I want it done this way. It can determine its spec better.

swyx: That's right. That's right. Part of me, I've been working a lot on evals recently, and part of me is wondering if [00:31:00] an agent can produce a spec that it cannot solve. Is it always capable of things that it can imagine, or can it imagine things that it is impossible to do?

Ryan Lopopolo: I think with Symphony, there's this axis where you have things that are easy or hard, or established or new, right? And I think things that are hard and new is still something that the models need humans to, yeah, drive.

swyx: Yeah. Yeah.

Ryan Lopopolo: But I think those other quadrants are largely solved, given the right scaffold and the right thing that's gonna drive the agent to completion,

swyx: it's crazy that it's solved,

Ryan Lopopolo: but it means that the humans, the ones with limited time and attention, get to work on the hardest stuff, like the problems where it's pure white space out in front.
Or like the deepest refactorings, where you don't know what the proper shape of the interfaces is. And this is where I wanna spend my time, 'cause it lets me set up for the next level of scale.

swyx: Yeah. Yeah. Amazing. Let's introduce Symphony. I think we've been mentioning it every now and then. Elixir. Interesting option.

Ryan Lopopolo: Yeah.

swyx: Yeah. I'm not,

Ryan Lopopolo: again, the [00:32:00] Elixir manifestation here is just a derivative. Is it a model

swyx: chosen? Yeah.

Ryan Lopopolo: Yeah. Yeah. And it chose that because the process supervision and the GenServers are super amenable to the type of process orchestration that we're doing here. You are essentially spinning up little daemons for every task that is in execution and driving it to completion, which means the model gets a ton of stuff for free by using Elixir and the BEAM.

swyx: I had to go do a crash course in BEAM and Elixir, and I think most people are not operating at that scale of concurrency where you need that. But it is a good mental model for resumability and all those things. And these are things I care about. But tell me the story, the origin story of Symphony. What do you use it for? How did it form, maybe any abandoned paths that you didn't take?

[00:32:46] Terminal Free Orchestration

[00:32:46] Symphony: Removing Humans from the Loop

Ryan Lopopolo: At the end of December we were at about three and a half PRs per engineer per day. This was before 5.2 came out in the beginning of January. Everyone gets back from holiday with 5.2 and no other work [00:33:00] on the repository. We were up in the five to 10 PRs per day per engineer. And I don't know about y'all, but it's very taxing to constantly be switching like that. I was pretty tapped out at the end of the day. Again, where are the humans spending their time? They're spending their time context switching between all these active tmux panes to drive the agent forward.

swyx: Yeah. No way.
Yeah.

Ryan Lopopolo: So let's, again, build something to remove ourselves from the loop. And this is what we frantically sprinted at here, to find a way to remove the need for the human to sit in front of their terminal. So a lot of experimentation with dev boxes and automatically spinning up agents. It seems like a fantastic end state here, where my life is a beach. I open live twice a day and say yes, no to these things. Yeah. And this is again a super, super interesting framing for how the work is done, because I become more latency-insensitive. I have [00:34:00] way less attachment to the code as it is written. I've had close to zero investment in the actual authorship experience, so if it's garbage, I can just throw it away and not care too much about it. In Symphony, there's this rework state where, once the PR is proposed and it's escalated to the human for review, it should be a cheap review. It is either mergeable or it is not. And if it's not, you move it to rework. The Elixir service will completely trash the entire worktree and PR and start it again from scratch. Okay. And this is that opportunity again to say, why was it trash, right? What did the agent do that was

swyx: bad. Yeah.

Ryan Lopopolo: Fix that before moving the ticket to

swyx: end

Ryan Lopopolo: of progress again.

swyx: Yeah. Why is this not in Codex app? I guess you guys are ahead of Codex app,

Ryan Lopopolo: yeah, so the way the team has been working is basically to be as AI-pilled as possible and sprint ahead. And a lot of the things we have worked on have fallen out [00:35:00] into a lot of the products that we have. Like, we were in deep consultation with the Codex team to have the Codex app be a thing that exists, right? To have skills be a thing that Codex is able to use. So we didn't have to roll our own to put automations into the product.
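
The rework state mentioned above (a cheap yes/no review, and on "no" the attempt is discarded and restarted from scratch) can be pictured as a tiny ticket state machine. The sketch below is a Python analogy, not Symphony's Elixir implementation: the state names, the `Ticket` fields, and the `guidance` list for recording why an attempt was rejected are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    IN_PROGRESS = "in_progress"
    HUMAN_REVIEW = "human_review"
    REWORK = "rework"
    MERGED = "merged"


@dataclass
class Ticket:
    title: str
    state: State = State.IN_PROGRESS
    attempts: int = 1
    guidance: list = field(default_factory=list)  # lessons for the next attempt


def propose_pr(ticket):
    """Agent finished an attempt; escalate to a human for a yes/no review."""
    ticket.state = State.HUMAN_REVIEW


def review(ticket, mergeable, lesson=None):
    """Human verdict: merge it, or send it to rework."""
    if ticket.state is not State.HUMAN_REVIEW:
        raise ValueError("ticket is not awaiting review")
    if mergeable:
        ticket.state = State.MERGED
        return
    ticket.state = State.REWORK
    if lesson:
        ticket.guidance.append(lesson)  # record why the attempt was rejected


def restart(ticket):
    """Discard the old attempt (worktree and PR in a real system) and retry."""
    if ticket.state is not State.REWORK:
        raise ValueError("only tickets in rework can be restarted")
    ticket.attempts += 1
    ticket.state = State.IN_PROGRESS


ticket = Ticket("Add export button")
propose_pr(ticket)
review(ticket, mergeable=False, lesson="use the shared button component")
restart(ticket)
print(ticket.state.value, ticket.attempts, ticket.guidance)
```

The point of the design is that a rejected attempt carries nothing forward except the written-down lesson, which matches the idea of fixing the missing context before moving the ticket back into progress.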
So all of our automatic refactoring agents didn't have to be these hand-rolled control loops. It has been really fantastic to be, in a way, unanchored to the product development of Frontier and Codex, and just very quickly try to figure out what works and then later find the scalable thing that can be deployed widely. It's been a very fun way to operate. It's certainly chaotic. I have lost track very often of what the actual state of the code looks like, 'cause I'm not in the loop. There was one point where we had wired Playwright directly up to the Electron app with MCP. MCPs I'm pretty bearish on, because the harness forcibly injects all those tokens in the [00:36:00] context, and I don't really get a say over it. They mess with auto-compaction. The agent can forget how to use the tool. There's probably only, what, three calls in Playwright that I actually ever want to use. So I pay the cost for a ton of things. Somebody vibed a local daemon that boots Playwright and exposes a tiny little shim CLI to drive it. And I had zero idea that this had occurred, because to me, I run Codex and it's able to, it's, oh, it's better. Yeah. Like, no knowledge of this at all. Uh-huh.

[00:36:30] Multi Human Chaos

Ryan Lopopolo: So we have had, in human space, to spend a lot of time doing synchronous knowledge sharing. We have a daily standup that's 45 minutes long because we almost have to fan out the understanding of the current state.

swyx: Yeah, I was gonna say, this is good for a single-human multi-agent, but multi-human, multi-agent is a whole, like, explosion of stuff.

Ryan Lopopolo: Yeah. And this is fundamentally why we have such a rigid, like 10,000 [00:37:00] engineer level architecture in the app, because we have to find ways to carve up the space so people are not trampling on each other.

swyx: Sorry, I don't get the 10,000 thing.
Did I miss that?

Ryan Lopopolo: The structure of the repository is like 500 npm packages. It's architecture to the excess for what you would consider, I think, normal for a seven-person team. But if every person is actually like 10 to 50, then the numbers on being super, super deep into decomposition and sharding and proper interface boundaries make a lot more sense.

swyx: Yeah. To me, that's why I talked about micro-frontends, and I, an anex is from that world, but cool. It is just coming back to this, I dunno if you have other thoughts on orchestrating so much work going through this. Is this enough? Is there, like, any aha moments?

Vibhu: It'll be interesting to see, like, where, okay, so right now you pick Linear as your issue tracker, right?

swyx: Or it's like a, is it actually Linear? This is actually Linear.

[00:37:55] Linear vs Slack Workflow

Vibhu: Oh, that's Linear. It's Linear.

swyx: Oh, I never looked at

Vibhu: video. The demo video I had to download to [00:38:00] run.

swyx: So I, because I'm a Slack maxi, but yeah, Linear is also really good. Yes,

Ryan Lopopolo: we do make good use of Slack. We fire off Codex to do all these lotion, elasticity, fix ups, the things that sync that knowledge into the repository. It's super cheap. Yeah.

swyx: Yeah.

Ryan Lopopolo: Just do it in Codex.

swyx: My biggest plug is OpenAI needs to build Slack. You need to own Slack. Build yours. Turn this into Slack.

Ryan Lopopolo: I did read about it. You

swyx: did?

Ryan Lopopolo: Yeah.

[00:38:25] Collaboration Tools for Agents

Ryan Lopopolo: I would say that if we think that we want these agents to do economically valuable work, which is, like, this is the mission, right? We want AI to be deployed widely to do economically valuable work. Then we need to find ways for them to naturally collaborate with humans, which means collaboration tooling, I think, is an interesting space to explore.

swyx: Yeah, totally. Yeah.
GitHub, Slack, Linear.

Vibhu: Yeah, that was my thing. Okay, where do we see, right now Codex started as the Codex model, then the CLI, now there's an app. The app can let me shoot off multiple Codexes in parallel, but there's no great team collaboration for Codex. And it [00:39:00] seems like your team had some say into what comes out, right? So you talked to 'em, Codex kind of was a thing. From there, if you guys are on the bound, what stuff that, like, you might not focus on, but what do you expect other people to be building, right? So people that are, like, 5x, 50x-ing. Should you build stuff that's very niche for your workflow, for your team? Should it be more general so other people can adopt? Is there a niche there? 'Cause part of it is just, okay, is everything just internal tooling? Do we have everything our own way? Like, the way our team operates has our own ways that we like to communicate, or is there a broader way to do it? Is it something like an issue tracker? Just thoughts, if you wanna riff on that.

[00:39:35] Standardizing Skills and Code

Ryan Lopopolo: I think TBD, we have not figured this out in a general way. I do think that there is leverage to be had in making the code and the processes as much the same as possible. If you think that code is context, code is prompts, it's better from the agent behavior perspective to be able to look in a package in directory X, Y, Z, and for it not to have to page so [00:40:00] deeply into directory A, B, C, because they have the same structure, use the same language, they have the same patterns internally. And that same leverage comes from aligning on a single set of skills that you're pouring every engineer's taste into, to make sure that the agent is effective. So in our codebase, we have, I think, six skills. That's it. And if some part of the software development loop is not being covered, our first attempt is to encode it in one of the existing set of skills, which means that we can change the agent behavior. Yeah.
More cheaply than changing the human driver behavior.

swyx: Yeah.

[00:40:39] Self Improvement via Logs

swyx: Have you experimented with agents changing their own behavior?

Ryan Lopopolo: We do.

swyx: Yeah. Or a parent agent changing a subagent's behavior, or something like that.

Ryan Lopopolo: We have some bits for skill distillation. So for example, there's one neat thing you can do with Codex, which is just point it at its own session logs to ask it to tell you how you can use [00:41:00] the tool better.

swyx: It's like introspection

Ryan Lopopolo: or ask it to do things. I use

Vibhu: this session better. What skills should I

swyx: high? I like the modification of, you can just do things, to, you can just ask agents to do things.

Ryan Lopopolo: Yeah. You can just Codex things. This is like a silly emoji that we have, right? You can just Codex things, you can just prompt things. It's a really glorious future we live in. But okay, you can do that one-on-one. But we're actually slurping these up for the entire team into blob storage and running agent loops over them every day to figure out where, as a team, we can do better and how we reflect that back into the repositories. Yes, so everybody benefits from everybody else's behavior for free. Same for, like, PR comments, right? These are all feedback that means the code as written deviated from what was good. A PR comment, a failed build, these are all signals that mean at some point the agent was missing context. We gotta figure out how to

swyx: Yeah.

Ryan Lopopolo: slurp it up and put it back in the repo.

swyx: By the way, I do this exactly, right. When I use Claude Code for [00:42:00] knowledge work, Claude Cowork is like a nice product, right? Yes. I think you would agree. I always have it tell me, what do I do better next time?
And that's the meta-programming reflection thing. So I almost think, like, you have six abstraction levels in Symphony, and almost like the zeroth layer. So the six levels are policy, configuration, coordination, execution, integration, observability. We've talked about a couple of these, but the zeroth layer is like, okay, are we working well? Can we improve how we work? Yes. Can I modify my own WORKFLOW.md or something? I don't know.

Ryan Lopopolo: Yeah, of course. Yeah, of course you can. This thing is also able to cut its own tickets, 'cause we give it full access. Yeah. Make it a ticket to have it cut tickets. You can put in the ticket that you expect it to file its own follow-up work,

swyx: like, yeah. Self-modifying. Yeah.

Ryan Lopopolo: Yeah.

[00:42:44] Tool Access and CLI First

Ryan Lopopolo: Don't put the agent in a box. Give the agent full accessibility over its domain.

swyx: I had a mental reaction when you said don't put the agent in a box. So I think you should put it in a box. It's just that you're giving the box everything it needs.

Ryan Lopopolo: Yeah. Context and tools.

swyx: But, as developers, we're used to calling [00:43:00] out to different systems, but here you use the open source things, like the Prometheus, whatever, and you run it locally so that you can have the full loop, I assume.

Ryan Lopopolo: Yep.

Vibhu: I think like

Ryan Lopopolo: another, you wanna minimize cloud dependencies.

Vibhu: You also want to make sure that you think about what the agent has access to. What does it see? Does it go back into the loop? Like, from the most basic sense of, you let it see its own calls, traces, it can determine where it went wrong. But are you feeding that back in? So, you know, just the most basic level of, you wanna see exactly what's input, output. Like, does the agent have access to what is being outputted, right? It can self-improve a lot of these things. It's all

Ryan Lopopolo: text, right?
My job is to figure out ways to funnel text from one agent to the other.

swyx: It's so strange. Like, way back at the start of this whole AI wave, Andrej was like, English is the hottest new programming language. It's here, it's just, yeah. The feature as well.

Vibhu: A lot of, okay. Like, a lot of software, a lot of stuff, there's a GUI, it's made for the human. We're seeing the evolution of CLI for everything, right? All tools have CLIs. Your agents can use [00:44:00] them well. Do we get good vision? Do we get good little sandboxes? Like, right now, it's a really effective way, right? Models love to use tools. They love the best. They love to read through text. So slap a CLI on it, let it go loose. That works for everything.

Ryan Lopopolo: It does. Yeah. Yeah.

[00:44:14] UI Perception and Rasterizing

Ryan Lopopolo: We've also been adapting non-textual things to that shape in order to improve model behavior in some ways, right? We want the agent to be able to see the UI. Agents do not perceive visually in the same way that we do. They don't see a red box, they see red box button, right? They see these things in latent space. So if we want, hey, yeah, I do. We have

swyx: a ding that goes off every time. Latent space

Ryan Lopopolo: ding. Anyway, if we wanna actually make it see the layout, it's almost easier to rasterize that image to ASCII and feed it in to the agent. Ha. And there's no reason you can't do both, right? To further refine how the model perceives the object it's [00:45:00] manipulating.

swyx: Cool. You wanna talk about a couple more of these layers that might bear more introspection or that you have personal passion for?

[00:45:07] Coordination Layer with Elixir

Ryan Lopopolo: I will say that the coordination layer here was a really tricky piece to get right.

swyx: Let's do it. Yep. I'm all about that. And this is Temporal core.

Ryan Lopopolo: This is where, when we turn the spec into Elixir, where, like, the model takes a shortcut, right?
Like, it's, oh, I have all these primitives that I can make use of in this lovely runtime that has native process supervision. Which is, I think, a neat way to have taken the spec and made it more achievable by making choices that naturally map

swyx: Yeah.

Ryan Lopopolo: to the domain, right? In the same way that you would prefer to have a TypeScript monorepo if you are doing full-stack web development, right? Because the ability to share types across the front end and backend reduces a lot of complexity. And because

swyx: that's what GraphQL used to be.

Ryan Lopopolo: That's right. And

swyx: I don't know if it's still alive, but

Ryan Lopopolo: [00:46:00] no humans in the loop here. So my own personal ability to write or not write Elixir doesn't really have to bias us away from using the right tool for the job. It is just wild.

swyx: Love it. I love it. Yeah. I wonder if any languages struggle more than others because of this? I feel like everyone has their own abstractions that would make sense. But maybe it might be slower, it might be more faulty, where you'd have to just kick the server every now and then. I don't know. I think the observability layer is really well understood. Integration layer, MCP is dead. I think all of these are just a really interesting hierarchy to travel up and down. It's common language for people working on the system to understand

Ryan Lopopolo: The policy stuff is really cool, right? Yeah. You don't really have to build a bunch of code to make sure the system waits for the CI to pass

swyx: it's institutional knowledge.

Ryan Lopopolo: Yeah. You just give it the gh CLI with some text that says CI has to pass. It makes the maintenance of these systems a lot easier.

[00:46:57] Agent Friendly CLI Output

swyx: Do you think that CLI maintainers need to [00:47:00] do anything special for agents, or just as is?
It's good because I don't think when people made the GitHub CLI, they anticipated this happening.

Ryan Lopopolo: That's correct. The GH CLI is fantastic. It's great super industry.

swyx: Everyone go try gh repo create, GH pull and then the pull request number, right? gh pr, like 153, whatever. And then it like pulls

Ryan Lopopolo: Basically my only interaction with the GitHub web UI at this point is gh pr view --web. Exactly. Glance

swyx: at the diff

Ryan Lopopolo: and be like, sure thing. Send it. Yeah. But the CLIs are nice 'cause they're super token efficient, and they can be made more token efficient really easily. Like I'm sure you all have seen, I go to Buildkite or Jenkins and I just get this massive wall of build output. And in order to unblock the humans, your developer productivity team is almost certainly gonna write some code that parses the actual exception out of the build logs and sticks it in a sticky note at the top of the page. And you basically [00:48:00] want CLIs to be structured in a similar way, right? You're gonna want to pass dash silent to Prettier because the agent doesn't care that every file was already formatted. It just wants to know it's either formatted or not, so it can then go run a write command. Similarly, in our pnpm distributed script runner, when we had one, when you do dash recursive, it produces an absolute mountain of text. But all of that is for passing test suites. So we ended up wrapping all of this in another script

swyx: to suppress the,

Ryan Lopopolo: which you can vibe, to only output the failing parts of the tests.

swyx: You make it pipe errors versus the standard, standard out. I don't know. Okay. Whatever. Too much thinking to have to do that. I used to maintain a CLI for my company and yeah, this is core, very core to my heart. But you're vibing my job.

Ryan Lopopolo: That's right.

swyx: Cool. Any other things? This is a long spec. [00:49:00] I appreciate that. 
It's got a lot of strong opinions in here. Any other things that we should highlight? I think obviously you can spend the whole day going through some of these, but I do think that some of these have a lot of care, or some of this you might wanna tell people, hey, take this, but make it your own.

[00:49:15] Blueprint Spec and Guardrails

Ryan Lopopolo: Fundamentally, software is made more flexible when it's able to adapt to the environment in which it is deployed, which means that things like Linear or even GitHub are specified within the spec, but are not required pieces of it. There's a more platonic ideal of the thing where you could swap in Jira or Bitbucket, for example. But being able to tightly specify things like the ID formats, or how the Ralph loop works for the individual agents, basically means you can get up and running with a fully specified system quickly that you then evolve later on. I think we never intended for this to be a static spec that you can [00:50:00] never change. It's more like a blueprint to get something worth a starting point up and running.

swyx: Yeah.

Ryan Lopopolo: For you then to vibe later to your heart's content.

swyx: You have like code and scripts in here where it's, oh, I think this is a really good prompt. It's just a very long prompt.

Ryan Lopopolo: Fundamentally, the agents are good at following instructions, so give them instructions. And it will improve the reliability of the result. Much like the way we use Symphony, we don't want folks to have to monitor the agent as it is vibing the system into existence. So being very opinionated, very strict around what these success criteria are, means that our deployment success rate goes up. Yeah. It means we don't have to get tickets on this thing.

Vibhu: I think it all goes back to that, like, code is disposable, right? Like early on, when you had the CLI or you'd kick off a Codex run, it would take two hours. 
You would wanna monitor, okay, I'm in the workflow of just using one. I don't want it to go down the wrong path. I'll cut it off and just shoot off four. Like that was my favorite thing of the Codex app, right? Yeah. Just 4x it. Like, [00:51:00] it's okay. One of them will probably be right, one of them might be better. Stop overthinking it. My first example was probably deep research. When you put out deep research, I asked it something about LLM, it thought it was legal something and spent an hour, came back with a report completely off the rails. And I was like, okay, I gotta monitor this thing a bit. No, don't monitor it. You want to build it so that it goes the right way. And you don't wanna sit there and babysit, right? You don't want to babysit your agents.

Ryan Lopopolo: With that deep research query that you made, looking at the bad result, you probably figured out you needed to tweak your prompt a bit, right? That's that guardrail that you fed back into the code base for the task, your prompt, to further align the agent's execution. The same sort of concept applies there too.

swyx: When you talk, how are the customers feeling

Ryan Lopopolo: for Symphony? I think we have none, right? This is a thing we have put out into the

swyx: world. Symphony's internal, right? As long as you are happy, you are the customer. That'
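
The wrapper idea from the Agent Friendly CLI Output segment above (collapse passing output, surface only the failing parts) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script Ryan's team wrote: the failure markers in `FAILURE_PATTERN` and the helper names `summarize` and `run_quiet` are hypothetical, and real test runners print failures in different shapes, so the pattern would need tuning per tool.

```python
import re
import subprocess
import sys

# Assumed failure markers for illustration only; adjust per test runner.
FAILURE_PATTERN = re.compile(r"FAIL|ERROR|Error|Traceback")


def summarize(output: str, returncode: int, context: int = 1) -> str:
    """Return a token-cheap summary: one word on success, failing lines otherwise."""
    if returncode == 0:
        return "ok"
    lines = output.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if FAILURE_PATTERN.search(line):
            # Keep the matching line plus a little trailing context.
            keep.update(range(i, min(len(lines), i + context + 1)))
    if not keep:
        # Unrecognized failure shape: show the tail rather than hide the problem.
        return "\n".join(lines[-20:])
    return "\n".join(lines[i] for i in sorted(keep))


def run_quiet(cmd: list[str]) -> int:
    """Run a command, print only the summary, and pass through its exit code."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    print(summarize(proc.stdout + proc.stderr, proc.returncode))
    return proc.returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_quiet(sys.argv[1:]))
```

Saved under a hypothetical name such as `quiet.py`, it would be invoked as `python quiet.py <your test command>`. Falling back to the tail of the log when nothing matches is a deliberate choice: an agent that sees nothing on a failed run has no way to recover.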

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Marc Andreessen introspects on The Death of the Browser, Pi + OpenClaw, and Why "This Time Is Different"

Apr 3, 2026 · 76:20


Fresh off raising a monster $15B, Marc Andreessen has lived through multiple computing platform shifts firsthand, from Mosaic and Netscape to cofounding a16z. In this episode, Marc joins swyx and Alessio in a16z's legendary Sand Hill Road office to argue that AI is not just another hype cycle, but the payoff of an “80-year overnight success”: from neural nets and expert systems to transformers, reasoning models, coding, agents, and recursive self-improvement. He lays out why he thinks this moment is different, why AI is finally escaping the old boom-bust pattern, and why the real bottleneck may be less about models than about the messy institutions, incentives, and social systems that struggle to absorb technological change.

This episode was a dream come true for us, and many thanks to Erik Torenberg for the assist in setting this up. Full episode on YouTube!

We discuss:

* Marc's long view on AI: from the 1980s AI boom and expert systems to AlexNet, transformers, and why he sees today's moment as the culmination of decades of compounding technical progress
* Why “this time is different”: the jump from LLMs to reasoning, coding, agents, and recursive self-improvement, and why Marc thinks these breakthroughs make AI real in a way prior cycles were not
* AI winters vs. “80-year overnight success”: why the field repeatedly swings between utopianism and doom, and why Marc thinks the underlying researchers were mostly right even when the timelines were wrong
* Scaling laws, Moore's Law, and what to build: why he believes AI scaling laws will continue, why the outside world is messier than lab purists assume, and how startups can still create durable value on top of rapidly improving models
* The dot-com crash and AI infrastructure risk: Marc's comparison between today's AI capex boom and the fiber/data-center overbuild of 2000, plus why he thinks this cycle is different because the buyers are huge cash-rich incumbents and demand is already here
* Why old NVIDIA chips may be getting more valuable: the pace of software progress, chronic capacity shortages, and the idea that even current models are “sandbagged” by supply constraints
* Open source, edge inference, and the chip bottleneck: why Marc thinks local models, Apple Silicon, privacy, trust, and economics all point toward a major role for edge AI
* American vs. Chinese open source AI: DeepSeek as a “gift to the world,” why open models matter not just because they're free but because they teach the world how things work, and how open source strategies may shift as the market consolidates
* Why Pi and OpenClaw matter so much: Marc's claim that the combination of LLM + shell + filesystem + markdown + cron loop is one of the biggest software architecture breakthroughs in decades
* Agents as the new “Unix”: how agent state living in files allows portability across models and runtimes, and why self-modifying agents that can extend themselves may redefine what software even is
* The future of coding and programming languages: why Marc thinks software becomes abundant, why bots may translate freely across languages, and why “programming language” itself may stop being a salient concept
* Browsers, protocols, and human readability: lessons from Mosaic and the web, why text protocols and “view source” mattered, and how similar principles may shape AI-native systems
* Real-world OpenClaw use: health dashboards, sleep monitoring, smart homes, rewriting firmware on robot dogs, and why the most aggressive users are discovering both the power and danger of agents first
* Proof of human vs. proof of bot: why Marc thinks the internet's bot problem is now unsolvable via detection alone, and why biometric + cryptographic proof of human becomes necessary

Timestamps

* 00:00 Marc on AI's “80-Year Overnight Success”
* 00:01 A Quick Message From swyx
* 01:44 Inside a16z With Marc Andreessen
* 02:13 The Truth About a16z's AI Pivot
* 03:29 Why This AI Boom Is Not Like 2016
* 06:33 Marc on AI Winters, Hype Cycles, and What's Different Now
* 10:09 Reasoning, Coding, Agents, and the New AI Breakthroughs
* 12:13 What Founders Should Build as Models Keep Improving
* 16:33 AI Capex, GPU Shortages, and the Dot-Com Crash Analogy
* 24:54 Open Source AI, Edge Inference, and Why It Matters
* 33:03 Why OpenClaw and Pi Could Change Software Forever
* 41:37 Agents, the End of Interfaces, and Software for Bots
* 46:47 Do Programming Languages Even Have a Future?
* 54:19 AI Agents Need Money: Payments, Crypto, and Stablecoins
* 56:59 Proof of Human, Internet Bots, and the Drone Problem
* 01:06:12 AI, Management, and the Return of Founder-Led Companies
* 01:12:23 Why the Real Economy May Resist AI Longer Than Expected
* 01:15:53 Closing Thoughts

Transcript

Marc: There's something about AI that causes the people in the field, I would say, to become both excessively utopian and excessively apocalyptic. Having said that, I think what's actually happened is an enormous amount of technical progress that built up over time. And for example, we now know that the neural network is the correct architecture. And I will tell you, there was a 60 year run, you know, or even 70 years, where that was controversial. 
And so the way I think about the period we're in right now is, I call it an 80 year overnight success, right? Which is, it's an overnight success ‘cause it's like bam, you know, ChatGPT hits, and then o1 hits, and then, you know, OpenClaw hits, and these are like radical, overnight transformative successes, but they're drawing on an 80 year sort of wellspring backlog, you know, of ideas and thinking. It's not just that it's all brand new, it's that it's an unlock of all of these decades of very serious, hardcore research. If I were 18, this is a hundred percent what I would be spending all of my time on. This is such an incredible conceptual breakthrough.

swyx: Before we get into today's episode, I just have a small message for listeners. Thank you. We would not be able to bring you the AI engineering, science, and entertainment content that you so clearly want if you didn't choose to also click in and tune into our content. We've been approached by sponsors on an almost daily basis, but fortunately enough of you actually subscribed to us to keep all this sustainable without ads, and we wanna keep it that way. But I just have one favor to ask all of you. The single most powerful, completely free thing you can do is to click that subscribe button. It's the only thing I'll ever ask of you, and it means absolutely everything to me and my team that works so hard to bring Latent Space to you each and every week. If you do it, I promise we will never stop working to make the show even better. Now, let's get into it.

Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, founder of Kernel Labs, and I'm joined by swyx, editor of Latent Space.

swyx: Hello. And we're in a16z with, uh, Marc Andreessen. Welcome.

Marc: Yes, yes. A and what, half of 16? Something like that. A one. 
Exactly,

swyx: exactly. Uh, apparently this is the final few days in your current office. You're moving across the road.

Marc: Uh, we're, yeah. We have some projects underway, but yeah, this is actually the original. We're in actually the original office. We're in the whole thing.

swyx: It's beautiful. Yeah. Great.

Marc: Thank you.

swyx: So I have to come out, uh, you know, I wanted to pick a spicy start. In October 2022, I had just made friends with Roon and, uh, I wanted to give him something to sort of be spicy about. And I said, uh, it'll never not be funny that a16z was constantly going "the future is where the smart people choose to spend their time" and then going deep into crypto and not into AI. And that was October 22nd, 2022. And Roon says there was an internal meeting in a16z to reorient around gen AI. Obviously you have, but was there a meeting? What was that?

Marc: I mean, I don't, look, I've been doing AI since the late eighties.

swyx: Yeah.

Marc: So I don't know, as far as I'm concerned, this stuff is all Johnny-come-lately. Yeah. I mean, look, we've been doing AI our entire existence. I mean, we've been doing AI, machine learning, deeply. We've been doing this stuff from the beginning. Obviously AI is just core to computer science. I actually view them as, uh, quite continuous. Um, you know, Ben and I both have computer science degrees. Um, Ben and I actually both are old enough to remember the actual AI boom in the 1980s. Yeah. There was a big AI boom at the time. Um, and there were names like expert systems. Um, and the days of Lisp and Lisp machines. Uh, I coded in Lisp. I was coding in Lisp in 1989, when that was the language of the AI future. Um, yeah. So this is something that we're completely comfortable with. 
I've been doing it the whole time and we are very enthusiastic about it.

swyx: Is there a strong "this time is different"? Because, uh, my closest analog was 2016-17. It was an AI boom. Mm-hmm. And it petered out very, very quickly. Um, just in terms of investing

Marc: sort of, sort of,

swyx: yeah. Investment excitement.

Marc: Although that's really when the Nvidia phenomenon really, I would say it was in that period when it was very clear that, at the time the vocabulary was more machine learning, but it was very clear at that time that machine learning was hitting some sort of takeoff point.

Alessio: Yeah.

Marc: Well, and you guys have talked about this at length on your thing, but, you know, if you really track what happened, I think the real story is, it was the AlexNet, uh, breakthrough in like 2013. That was the real knee in the curve. Um, and then it was obviously the transformer breakthrough in '17.

Alessio: Yeah.

Marc: Um, and then everything that followed. But, you know, look, machine learning, I mean look, I've been working, you know, one of my projects has been working with Facebook since 2004. Um, and on the board since 2007, and of course they started using machine learning very early, um, and have used it basically for like 20 years for, you know, feed optimization and advertising optimization. And obviously many financial services, many companies in many different sectors have been doing this. And so it's one of these things where it's not a single thing. It's like layers, right? Yeah. Um, and the layers arrive at different paces, but they kind of build up.

swyx: Yeah.

Marc: Uh, they kind of build up over time and then, yeah. 
And then look, in retrospect, 2017 was kind of the key point with the transformer. And then as you guys know, there was this really weird four year period where the transformer existed and then it was just like,

swyx: let's go. Yeah.

Marc: Well, but between 2017 and 2021, I mean, that was the era in which companies like Google had internal chatbots, but they weren't letting anybody use them.

swyx: Yeah.

Marc: Right. And then, you know, OpenAI developed ChatGPT, or GPT-2, and then they told everybody, this is way too dangerous to deploy. Right. Yeah. You know, we can't possibly let normal people use this thing. And then you guys, I'm sure, remember AI Dungeon, um, mm-hmm. So there was like a year where the only way for a normal person to use GPT-3 was in AI Dungeon.

Alessio: Yeah.

Marc: And so you would do this, you'd go in there and you'd pretend to play Dungeons and Dragons. In reality, you're just trying to talk to GPT. And so there was this long, you know, the big companies are cautious, and the big companies were cautious. By the way, it took OpenAI, you know, they talk about this, it took OpenAI time to actually adjust, you know, kind of redirect their research

swyx: path. I think, uh, let's say Rosewood, right? Uh, the dinner that founded OpenAI was right there.

Marc: Right, right. But that dinner would've taken place in 20

swyx: 18

Marc: 19. The formation of OpenAI, uh-huh, as late as 2018.

swyx: Uh, sorry. Uh, no, I'm wrong. Probably it should be 20. Yeah. They just celebrated a 10 year anniversary, so it is 2025. Yeah, so 2015?

Marc: Yeah. 2015. Yeah. 2015. But then, uh, um, Alec Radford did GPT-1 in what, probably

swyx: mm-hmm. 
17, 18,

Marc: yeah. 17, 18. So, yeah. And then they didn't really, and then GPT-3 was what? 2020? 2020.

swyx: 2020.

Marc: Because that became Copilot immediately. Even OpenAI, which has been, you know, the leader of this thing in the last decade, even they had to adapt and lean into the new thing. And so, um, yeah, I think it's just this process of basically sort of wave after wave, layer after layer, you know, building on itself. And then you kind of get these catalytic moments where the whole thing pops, and obviously that's what's happening now.

swyx: Is it useful to think about, will there be any AI winter? ‘Cause there's always these patterns. Like, is this the summer, is something I constantly think about, because do I just get endlessly hyped and just trust that I will only be early and never wrong, or, right. Well, will there be a winter?

Marc: So let me say the following. There's something about AI that has led to this repeated pattern. Um, and you guys know this,

swyx: it's summer, winter, summer,

Marc: winter, summer, winter, summer, winter. And it goes back 80 years. Yeah. 80 years. Uh, so the original neural network paper was 1943. Right. Which is amazing. Uh, that it was that far back. And then, I don't know if you guys have ever talked about this on your show, but there was a big, uh, AGI conference at Dartmouth University in 1950... 55. 55, yeah. And they got an NSF grant, uh, for all the AI experts at the time to spend the summer together. And they figured if they had 10 weeks together, they could get AGI, uh, at the other end. And by the way, they got the grant, they got the 10 weeks, and then, you know, 1955, you know, no AGI. And like I said, I lived through the eighties version of this, where there was a big boom and a crash. 
And so there is this thing, there is something about AI that causes the people in the field, I would say, to become both excessively utopian and excessively apocalyptic. Um, and it's probably on both sides of the boom-bust cycle. You kind of see that play out. Having said that, I think what's actually happened, and you know, we now know in retrospect, is an enormous amount of technical progress that built up over time. And for example, we now know that the neural network is the correct architecture. And I will tell you, there was a 60 year run, you know, or even 70 years, where that was controversial. And we now know that that's the case. And so everything we're building on today just sort of derives from the original idea in 1943. And so in retrospect, we now know that these guys were right. They would get the timing wrong, and they thought capabilities would arrive faster, or that it could be turned into businesses sooner or whatever, but the scientists who worked on this over the course of decades were fundamentally correct about what they were doing. And the payoff from all their work is happening now. And so the way I think about the period we're in right now is, I call it an 80 year overnight success, right? 
Which is like, it's an overnight success. ‘Cause it's like bam, you know, ChatGPT hits, and then o1 hits, and then, you know, OpenClaw hits, and these are like radical, overnight transformative successes, but they're drawing on an 80 year sort of wellspring backlog, you know, of ideas and thinking. It's not just that it's all brand new, it's that it's an unlock of all of these decades of very serious, hardcore research. Um, and thinking. And look, there were AI researchers who spent their entire lives. They got their PhD. They researched for 40 years. They retired. In a lot of cases, they passed away and they never actually saw it work.

swyx: Yeah. It's all sad.

Marc: It is. It is sad. It's sad. Knew

swyx: Geoff Hinton was like the last guy.

Marc: Yeah. Yeah. Well, there were the guys, uh, there was a guy, Allen Newell. I mean, there's tons. John McCarthy. You know, John McCarthy was one of the inventors of the field. He's one of the guys who organized the Dartmouth Conference and, you know, he taught at Stanford for 40 years. Wow. And passed away, I don't know, whatever, 10 years ago or something. Never actually got to see it happen. But it is amazing in retrospect. These guys were incredibly smart and they worked really hard and they were correct. So anyway, so then it's like, okay, you know, they say history doesn't repeat, but it rhymes. It's like, okay, does that mean that there's gonna be another, you know, basically boom-bust cycle? And I will tell you, in a sense, yes, everything goes through cycles, and people get overly enthusiastic and overly depressed, and there's a timelessness to that. Having said that, there's just no question. 
Um, so the four most dangerous words in investing are, this time is different. Do you know the 12 most dangerous words in investing? No. "The four most dangerous words in investing are: this time is different." Yeah. Um, the 12 most dangerous words. And so, I'll tell you what's different. Now it's working. I mean, look, there's just no question. And by the way, I'll just give you guys my take. LLMs, from basically the ChatGPT moment through to spring of '25, I think well-intentioned, well-informed skeptics could still say, oh, this is just pattern completion. And oh, these things don't really understand what they're doing. And you know, the hallucination rates are way too high. And, you know, this is gonna be great for creative writing and creating, you know, Shakespearean sonnets as rap lyrics or whatever, it's gonna be great at all that stuff, but we're not gonna be able to harness this to make it relevant in, you know, coding or in medicine or in law or in, you know, the kind of fields that really, really matter. And I think basically it was the reasoning breakthrough. It was o1 and then R1 that basically answered that question, basically said, oh no, we're gonna be able to actually turn this into something that's gonna work in the real world. And then obviously the coding breakthrough that kind of catalyzed over the holiday break was kind of the third step in that. Mm-hmm. Where you're just like, alright, if Linus Torvalds is saying that the AI coding is now better than he is, like, that's never happened before.

swyx: That's the benchmark.

Marc: Yeah. That's never happened before. 
And so now we know that it's gonna sweep through coding, and then we know that if it's gonna work in coding, it's gonna work in everything else. Right. Because that's, in many ways, the hardest example. And everything else is gonna be a derivative of that. And then on top of that, we just got the agent breakthrough, you know, with OpenClaw, which is fantastic. Which is amazing and incredibly powerful. And then we just got the, um, the auto research, uh, you know, the self-improvement. You know, we're now into the self-improvement breakthrough. And so the way I think about it is we've had four fundamental breakthroughs in functionality: LLMs, reasoning, uh, agents, um, and then now RSI, um, and they're all actually working. Um, and so, as you can tell, I'm jumping outta my shoes. Like, this is it. This is the culmination of 80 years' worth of work, and this is the time it's becoming real.

Alessio: Yeah.

Marc: I'm completely convinced.

Alessio: I think the anxiety that people feel is like, during the transistor era, you had Moore's law, and it's like, all right, we understand why these things are getting better. We understand the physics of it. Yeah. With AI, it's so jagged in the jumps, where like you said, in three months you have this huge jump, and people are like, well, this can't keep happening. Right? But then it keeps happening,

Marc: it'll keep happening.

Alessio: And so how do you think about also timelines of what we're building? I think we always have this question with guests, which is, you know, should you spend time building a harness for a model versus the next model is just gonna do it one shot in the lead space. Right. 
And how does that inform, like how you think about the shape of the technology? You know, you talk about how it's a new computing platform.If you have a computing platform, then like every six months it like drastically changes in what it looks like. It's hard to build companies on top of it.Marc: Yeah. So, so a couple things. So one is like, look, the, the Moore's law was what we now call a scaling law. Like Moore's Law was a scaling law and for your younger viewers, more Moore's Law was every chip chip chips either get twice as powerful or twice as cheap every, every 18 months.And that, and that and that, you know, that it's gotten more complicated in the last few years. But like that, that was like the 50 year trajectory of, of, of the computer industry. And then, and then by the way, and that's what took the mainframe computer from a $25 million current dollar thing into, you know, the phone in your pocket being, you know, a million times more powerful than that.Like that, you know, for, for 500 bucks. And so that, that was a scaling law. And then, and then, and then key to any scaling law, including Moore's Law and the AI scaling laws is, you know, they're not really laws, right? They're, they're, they're, they're predictions, but when they work, they become self-fulfilling predictions because they, they, they, they, they set a benchmark and, and then the entire industry, right?All the smart people in the industry kind of work to make sure that, that, that actually happens. And so they, they kind of motivate the breakthroughs that are required to, to keep that going. And, and in and in chips, that was a 50 year, that was a 50 year run. Right. And it, it was amazing. And it's still happening in, in some areas of, of chips.I think the same thing is happening with the, the core scaling laws. The core scaling laws. In, in, in ai, you know, they're, they're not really laws, but like they, they are basically. 
There are predictions and then they're motivating catalysts for the research work that is required to be. And, and, and, and by the way, also the investment, uh, dollars, um, uh, you know, required to basically keep, you know, keep the curves going and, and look, it, it is, it's gonna be complicated and it's gonna be variable and they're, you know, there're gonna be walls that are gonna look like they're fast approaching, and then they're gonna be, you know, engineers are gonna get to work and they're gonna figure out a way to punch through the walls.And obviously that's, you know, that's been happening a lot, you know, and then look, there's gonna be times when it looks like the walls have, you know, the, the, the laws have petered out and then they're gonna, they're gonna pick up again and surge and then, and then, and then it, it appears what's happening to the eyes is there's not multiple, you know, multiple scaling laws.Um, there's multiple areas of improvement. And, and I think, you know, I don't know how many more there are already yet to be discovered, but there are probably some more that we don't know about yet. You know, they, like, for example, there's probably some scaling law around, um, world models and robotics that we don't fully understand, you know, kind of acquisition of data at scale in the real world that we don't fully understand yet.So that, that, that one will probably kick in at some point here. There's a bunch of really smart people working on that. Um, and so, yeah, I, I think the expectation is that, that, you know, the, the scaling laws generally are gonna continue. Yeah. The, the pace of improvement will continue to move really fast.Um. To your question on like what to build. So, uh, I'm a complete believer the scaling laws are gonna continue. I'm a complete believer the capabilities are gonna keep getting amazing, um, you know, leaps and bounds. 
Uh, the part where I kind of part ways a little bit with what I would describe as the AI purists, um, you know, which I would characterize as the people who are in many ways the smartest people in the field, but also the people who spend their entire life at a lab, um, and have, I would say, very little experience in the outside world. Um, the nuance I would offer is the outside world of 8 billion people and institutions and governments and companies and economic systems and social systems is really complicated. Um, and, you know, 8 billion people making collective decisions on planet Earth is not a simple process. Like, you see this happening now. A bunch of AI CEOs have this thing, they just all have this kind of thing when they talk in public where they're just like, well, there's this obvious set of things that society should do.

Alessio: Mm-hmm.

Marc: And then they're like, society's not doing any of those things. Right. And it's like, how can society not, you know, whatever their theory is, how can society not see X, Y, Z? Mm-hmm. And the answer is, well, number one, there's no single society. It's like 8 billion people. And they all have a voice, and they all have a vote at the end of the day in how they react to change. And then, you know, human reality is just really complicated and messy. Um, and so the specific answer to your question is, as usual, it depends. Um, you know, it depends. Look, there's no question there are gonna be companies, it's already happening, there are companies that think that they're building value on top of the models and then they're just gonna get blitzed by the next model. There's no question that's happening. 
But I think there's also no question that the process of adaptation of any technology into the real, messy world of humanity is going to be messy and complicated. It's not going to be simple and straightforward. And there are gonna be a lot of companies and a lot of products, and in fact entire industries, that get built to actually help all of this technology reach real people.

Alessio: The amount of capital going into these companies... I mean, Dario talked about it on the Dwarkesh podcast. Dwarkesh was like, why don't you just buy 10x more GPUs? And he's like, because I'm gonna go bankrupt if the model doesn't exactly hit the performance level. How do you think about that, also as a risk? You guys are investors in OpenAI and Thinking Machines and World Labs. It seems like we're leveraging the scaling laws at a pretty high rate. How comfortable do you feel with the downside scenario? Say things peter out. Do you think you can restructure these build-outs and capital investments?

Marc: Yeah. So I should start by saying I lived through the dot-com crash, and I can tell you stories for hours about it. It was horrible. It was awful. It was apocalyptic. By the way, a lot of the dot-com crash was actually, at the time, a telecom crash. It was a bandwidth crash. The thing that actually crashed, that wiped out all the money, was the telecom companies.

swyx: Global Crossing.

Marc: Global Crossing, yeah.

swyx: I'm from Singapore, and they laid so much cable over our oceans.

Marc: Actually, there was a scaling law in the dot-com era.
And it was literally this: the US Commerce Department put out a report in 1996 and they said internet traffic was doubling every quarter. And in 1995 and 1996, internet traffic actually did double every quarter. So that became the scaling law. And what all these telecom entrepreneurs did was go out and raise money to build fiber, anticipating that the demand for bandwidth was gonna keep doubling every quarter.

Doubling every quarter, though, is like the grains of rice on the chessboard. At some point the numbers become extremely large. And really what happened was the internet, by the way, has kept growing continuously since inception. It's never shrunk, and it's grown really fast compared to anything else in human history. But it wasn't doubling every quarter as of 1998, 1999. So there was this gap between what they thought was a scaling law and reality, and that's actually what caused the dot-com crash. Companies like Global Crossing way overbuilt fiber, and by the way, telecom equipment, all the networking gear, and the actual physical data centers. That was the beginning of the data center build, and the data center overbuild.

So you had that. I think it was like $2 trillion that got wiped out. It was big. And the other subtlety in it was that the internet companies themselves never really had any debt, 'cause tech companies generally don't run on debt. But the telecom companies run on debt. Physical infrastructure companies run on debt.
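To make the doubling-every-quarter arithmetic from this part of the conversation concrete, here is a minimal Python sketch of how fast that compounds. The function name and the yearly-doubling comparison column are illustrative assumptions, not figures from the conversation; only the "doubling every quarter" rate comes from the transcript.

```python
QUARTERS_PER_YEAR = 4


def growth_multiple(periods: int, factor: float = 2.0) -> float:
    """Total growth multiple after compounding `factor` for `periods` periods."""
    return factor ** periods


if __name__ == "__main__":
    # Compare quarterly doubling (the rate quoted in the conversation) with a
    # hypothetical slower rate of doubling once per year (an assumption, shown
    # only to illustrate how large the gap between the two curves becomes).
    for years in range(1, 6):
        quarterly = growth_multiple(years * QUARTERS_PER_YEAR)
        yearly = growth_multiple(years)
        print(f"year {years}: quarterly doubling x{quarterly:,.0f}, "
              f"yearly doubling x{yearly:,.0f}")
```

Doubling every quarter is a 16x increase per year, so capacity planned against that curve for five years has to be roughly a million times the starting point, which is why even fast real growth can leave a large overbuild.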
And so companies like Global Crossing not only raised a lot of equity, they also raised a lot of debt. So they were highly levered. And then you just do the thing: you have a highly levered company that is overbuilding capacity, demand is growing but not as fast as you hoped, and then boom, bankrupt. It's like they say about the hotel industry: it's always the third owner of a hotel that makes money. It has to go bankrupt twice. You have to wash out all of the over-optimistic exuberance before it gets to a stable state, and then it makes money.

By the way, all of those data centers and all of that fiber are in use today, 25 years later. The elapsed time, actually, was 15 years. It took from 2000 to 2015 to fill up all that capacity.

The cautionary warning is that the overbuild can happen. You get into this thing where basically everybody who has any sort of institutional capital is like, wow, I don't know how to invest in these crazy software things, but for sure I can build data centers, and for sure I can buy GPUs and deploy compute grids and all these things. So if you're a pessimist, you could look at this and say, wow, this is really set up to replicate what we went through in 2000. Obviously that would be bad.

The counterargument, which is the one I agree with, is a couple of things. One is that the companies investing the money are the bluest of blue-chip companies. Back in the dot-com era, Global Crossing was an entrepreneur.
It was a new venture. But the money being deployed now at scale is Microsoft and Amazon and Google and Facebook and Nvidia, and now, by the way, OpenAI and Anthropic, which are at really serious size as companies, with very serious revenue. These are very large-scale companies with lots of cash and lots of debt capacity that they've never used. So this is institutional in a way that it really wasn't at the time.

And then the other is that, at least for now, every dollar put into anything that results in a running GPU is being turned into revenue right away. You guys know this. Everybody's starved for compute capacity and all the associated things: memory, interconnect, data center space, everything else. So every dollar that's being put into the ground right now is turning into revenue.

In fact, I actually think there's an interesting thing happening. Because everybody's starved for capacity, the models that we can actually use today are inferior versions of what we would have if not for the supply constraints. If you pose a hypothetical universe in which GPUs were 10 times cheaper and 10 times more plentiful, the models would be much better, 'cause you would just allocate a lot more money to training and build better models. So we're actually getting the sandbagged version of the technology.

swyx: Yeah. Everything we use is quantized, because the labs have to keep the full versions.

Marc: Right. We're not even getting the good stuff.

swyx: Yeah.

Marc: But even if technical progress stops, once there's a much bigger build of GPU manufacturing capacity and memory and all the things that have to happen over the next five or 10 years, even the current technology is gonna get much better. And then, as you know, there are a million use cases for this. This isn't just sending packets across something and hoping that people find something to do with it. This is: we apply intelligence to every domain of human activity, and it works incredibly well.

Here's what I know. For the next three or four years, basically everything is selling out. The entire supply chain is sold out or selling out. So we're gonna have chronic supply shortages for years to come. There's going to be a response from the market, and it's happening now: an enormous flood of investment in new fab capacity and everything else. At some point the supply chain constraints will unlock, at least to some degree, and that will be another accelerant to industry growth, 'cause the products will get better and everything will get cheaper. So I know that's gonna happen. I know the actual use cases are really compelling. And, like I said, with reasoning and agents and so forth, I know they're gonna get much, much better from here. So I know the capabilities are real and serious. I also know that the technical progress is not going to stop. It is accelerating.
The breakthroughs are tremendous. Even just month over month, the breakthroughs are really dramatic. So if you were a cynic, and there are cynics, you can look at 2000 and find echoes. But I can't even imagine betting that this is somehow gonna disappoint. At least for years to come, I think it would be essentially suicidal to make that bet. It was Michael Burry...

swyx: That's an interesting guy.

Marc: Let's pick on one guy.

swyx: He doesn't mind.

Marc: It was the Nvidia short. He came out with the Nvidia short. And you guys have probably talked about this: the analysis now is that the current models are getting better at such a rate that if you're running an Nvidia inference chip today that's three years old, you're making more money on it today than you did three years ago, because the pace of improvement of the software is faster than the depreciation cycle of the chip. And my understanding, and these are rumors I've heard, or maybe it's public, is that Google is running very old TPUs very profitably for inference.

So as far as I can tell, it's actually the opposite of the Burry thesis. He was 180 degrees wrong. The old Nvidia chips are getting more valuable, which is something that has literally never happened before. It's never been the case that an older-model chip becomes more valuable, not less. And again, that's an expression of the ferocious pace of software progress, the ferocious pace of capability payoff that you're getting on the other side of this. So the idea of betting against that...

swyx: Yeah.

Marc: It seems like an invitation to get your face ripped off.

swyx: One of my early hits was modeling the lifespan of the H100s and H200s. Usually they advise four to seven years, and maybe you realistically haircut that down to two to three. But actually it's going up and not down. That's the dream. We are finding utilization, and I think utilization solves all problems. You can find use cases even for memory, where we're having a shortage. Even for the shittier versions of memory that we do have, we are finding use cases. So that's great.

Marc: Yeah.

Alessio: How important is open source AI and edge inference in a world in which you have three years of supply crunch? If you fast forward five years, how do you think about inference in the data center versus at the edge?

Marc: So just to start, I think open source is very important for a bunch of reasons. I think edge inference is very important for a bunch of reasons. Practically speaking, if we're gonna have fundamental supply crunches... I mean, you guys know, if you project forward demand over the next three years relative to supply, one of the main predictions you can make is about what's gonna happen to the cost of inference in the core over the next three years. It may rise dramatically. And the big model companies are subsidizing heavily right now.
And so what will be the average person's per-day, per-month token cost three years from now to do all the things they want to do? I don't know. You guys probably have friends, and I have friends today, who are paying a thousand dollars a day for Claude tokens to run OpenClaw. So, okay, $30,000 a month. And by the way, those friends have a thousand more ideas for things they want their Claw to do. So you could imagine there's latent demand of up to, I don't know, five or $10,000 a day of tokens for a fully deployed personal agent. Obviously consumers can't pay that, but it gives you a sense of the future scope of demand. Even if there's a 10x improvement in price-performance, that still comes to a hundred dollars a day, which is still way beyond what people can pay. So there's just gonna be ferocious demand.

The other interesting thing is the agent thing. Up until now, a lot of the constraints have been GPU constraints. I think the agent thing now also translates into CPU constraints.

swyx: CPU, memory.

Marc: Yes. CPU, memory. The entire chip ecosystem is just gonna get...

swyx: Wait for network constraints. That will be the killer.

Marc: It's all bottlenecked, potentially for years. Generally, inference costs are gonna keep coming down, but let's put it this way: I think the rate of decline may level out here for a bit because of these supply constraints. And then at some point maybe the labs stop subsidizing so much, and that again will be an issue.
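The token-cost figures quoted in this stretch of the conversation can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 30-day month is an assumption, the helper names are made up for illustration, and the dollar amounts are the anecdotal ones mentioned above rather than measured prices.

```python
DAYS_PER_MONTH = 30  # assumption: a flat 30-day month


def monthly_cost(daily_usd: float) -> float:
    """Monthly spend implied by a given daily token spend."""
    return daily_usd * DAYS_PER_MONTH


def after_improvement(daily_usd: float, factor: float) -> float:
    """Daily spend for the same workload after a price-performance gain of `factor`x."""
    return daily_usd / factor


if __name__ == "__main__":
    today = 1_000.0  # anecdotal daily spend quoted in the conversation
    print(f"today: ${today:,.0f}/day = ${monthly_cost(today):,.0f}/month")

    improved = after_improvement(today, 10)
    print(f"after 10x: ${improved:,.0f}/day = ${monthly_cost(improved):,.0f}/month")

    # Latent demand range mentioned in the conversation, after the same 10x gain.
    for latent in (5_000.0, 10_000.0):
        print(f"latent ${latent:,.0f}/day -> ${after_improvement(latent, 10):,.0f}/day after 10x")
```

Under these numbers, a 10x gain takes the current heavy user from $1,000 to $100 a day, and the latent-demand case still lands at $500 to $1,000 a day, which is the gap the speaker is pointing at.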
And so there's just gonna be so much more demand for inference than can be satisfied with the centralized model. And then, you guys know this, there are the dramatic innovations that have happened in Apple silicon to be able to do inference. It's quite amazing, the level of effort being put in. The open source guys are putting incredible effort into this. There's this recurring pattern where the big model will never run on a PC, and then six months later, oh, it runs on a PC. It's amazing. There are very smart people working on that.

And then there are other motivators. How much trust are the big centralized model providers building in the market, versus how many people, at least for certain use cases, are saying, well, I'm not willing to just turn everything over? So there are all the trust issues. There's also straight-up price optimization. There are many uses of AI where you don't need Einstein in the cloud. You just need a smart local model. There are also performance issues. You're gonna want your doorknob to have an AI model in it to be able to do access control. Obviously everything with a chip is gonna have an AI model in it, and a lot of those are gonna be local. Wearable devices too: you don't wanna do a complete round trip. Whatever your smart devices are, you want them to be super low latency.
Yeah.

swyx: The question is, do we care who makes it? One of the biggest news items this week was the collapse of AI2, the Allen Institute, one of the actual American open source model labs. And I'm not that optimistic on American open source. You guys invested in Mistral, and Mistral's doing extremely well. Outside of China, that's about it.

Marc: Yeah. We'll see. Look, number one, I do think we care who makes it. I would say this: the previous presidential administration wanted to kill it in the US. They wanted to drown it in the bathtub. So at least we have a government now that actually wants it to happen.

swyx: [unclear] the council.

Marc: Yeah, and the new PCAST. So this administration, for whatever other political issues people have, which are many, has, I think, a very enlightened view on AI and in particular on open source AI. They're very supportive.

My read is that the various Chinese companies have a very specific reason to do open source, which is that they fundamentally don't think they can sell commercial AI outside of China right now, or at least not in the US, for a combination of reasons. So I think they view open source AI as a bit of a loss leader against domestic paid services and ancillary products. They're very excited about it. By the way, I think it's great that they're doing it. I think DeepSeek was a gift to the world.

The great thing about open source is that its impact is felt two ways.
One is you get the software for free, but the other is you get to learn how it works: the paper and the code. For example, I thought this was amazing. OpenAI comes out with o1, and it's an amazing technical breakthrough, absolutely fantastic. But of course they don't explain how it works in detail, and they hide the reasoning traces. And then everybody's like, okay, this is great, but who's gonna be able to replicate this? Is there secret sauce in there? And then R1 comes out, and there's the code and there's the paper, and now the whole world knows how to do it. And then, three months later, every other AI model is adding reasoning. So you get this kind of double benefit. Even if the Chinese models themselves are not the models that get used, the education of the rest of the world, the information diffusion, is incredibly powerful.

So that happens, and then, I don't know, we'll see. There are a bunch of American open source AI model companies. And there's tremendous competition among the primary model companies. Depending on how you count, there are four or five big model companies now that are neck and neck in different ways.
And then obviously both X and Meta are involved, and both have huge attempts to leapfrog underway. And then you've got a whole fleet of startups, new companies, including a whole bunch that we're backing, that are trying to come out with different approaches. And then how many mainline foundation model companies are there in China at this point? It's probably six.

swyx: Five Tigers is what they call it. Qwen is questionable because there's a change in leadership.

Marc: Right.

swyx: Yeah.

Marc: But that includes Moonshot?

swyx: Yes. DeepSeek, Zhipu, Qwen. 01 is in there.

Marc: Right. And then ByteDance.

swyx: ByteDance would be the next tier. They weren't as prominent. They didn't have a leading...

Marc: Yeah, but at least, you know, ByteDance is very inspiring, and presumably they have more stuff coming, and Tencent probably has more stuff coming, and so forth. So here would be a thing you can anticipate. Between the US and China right now, there are like a dozen primary foundation model companies that are at scale, at some level of critical mass. It's not gonna be a dozen in three years, because these industries don't bear a dozen. There are gonna be three or four big winners, or maybe one or two. So there's gonna be a whole bunch of those guys that have to figure out alternate strategies, and I think open source is one of those strategies. So the question is, who's gonna do open source?
I think that could change really fast. I think that's a very dynamic thing. I think it's very hard to predict what happens, and I think it's very important.

swyx: Nvidia's doing a lot.

Marc: Well, exactly, I was gonna say. You've got Nvidia. There's an old thing in business strategy called commoditize the complement. If you're Jensen, it's kind of obvious: of course you wanna commoditize the software. And to his enormous credit, he's putting enormous resources behind that. So maybe it's literally Nvidia, and I think that would be great.

Alessio: Yeah. Narrative violation: two European projects in the, uh... damn.

swyx: I'm hosting my Europe conference soon, and I got both of them.

Alessio: They got us. They got us.

Marc: They got us. Well, wait a minute. Where was Peter Steinberger when he did it? In Austria?

Alessio: Yeah, yeah.

Marc: He was in Vienna. And where is he now?

swyx: He's moving to SF.

Marc: Okay. Alright, there we go. And then the Pi guys, right? The Pi guys are European.

swyx: Yeah, they're buddies in Austria.

Alessio: Mario's also there. Yeah.

Marc: Right. And they haven't announced any sort of change yet, or have they?

Alessio: No, they have a company there.

Marc: Okay. Got it. Good.

swyx: Anyways, I think Pi and OpenClaw are very important software things, and I just wanted you to go off on what you think.

Marc: Yeah. So I think the combination of the two of them is one of the 10 most important software...

swyx: OpenClaw got all the attention, but talk about Pi.

Marc: Pi is kind of the, yeah.
Pi is kind of the architectural breakthrough. For those of us who are older, there was this whole thing that was very important in the world of software, basically from 1973 to the creation of Linux, and it still is very important: what we used to call the Unix mindset. There were all these different operating systems, mainframes, and then Windows and Mac and all these things. But behind it all was this idea of the Unix mindset.

In the old days, the operating system that made the computer industry really work in the 1960s was this thing called OS/360, a big operating system that IBM developed that was supposed to basically run everything. It was this giant monolithic architecture, a giant castle in the sky of software. And by the way, it worked really well and they were very successful with it. But it was almost unapproachable. You had to be inside IBM or very close to IBM, and you had to really understand every aspect of how the system worked.

And then the Unix guys, originally out of AT&T and then out of Berkeley, came out and said, no, let's have a completely different architecture. The way the architecture's gonna work is we're gonna have a prompt and a shell. All the functionality is gonna be in the form of discrete modules, and you're gonna be able to chain the modules together.
It's almost like the operating system itself is gonna be a programming language. That led to the centrality of the shell, and to chaining together Unix tools. And that led to the emergence of scripting languages like Perl, where you could do this very easily, and then the shells got more sophisticated. And look, number one, that worked, and that was the world I grew up in. I was a Unix guy from, call it, 1988 all the way through my work, and it worked really well. It's in the background. Normal people didn't necessarily need to know about it, but if you were doing system architecture or application development, you knew all about it.

And it's been in the background ever since. Your Mac still has a Unix shell in there, and your iPhone still has a Unix shell buried in there somewhere. The Windows shell is sort of a weird derivative of that. But look, the internet runs on Unix, and smartphones, both iOS and Android, are Unix derivatives. So Unix did end up winning. And then we just started taking that for granted.

So the way I think about what happened with Pi and then with OpenClaw is this. I always say the great breakthroughs are obvious in retrospect, which is the best kind. They weren't obvious at the time, or somebody else would've done them already.
So there is a real conceptual leap, but then you look at it backwards and you're just like, oh, of course. To me those are always the best breakthroughs. Actually, language models themselves are like that. It's just, oh, next-token completion. Oh, of course.

swyx: Yeah. What other objective mattered?

Marc: Yeah, exactly. But it wasn't obvious until somebody actually did it. So the conceptual breakthrough is real and deep and powerful and very important.

The way I think about Pi and OpenClaw is that it's basically marrying the language model mindset to the Unix shell prompt mindset. So what is an agent? As you know, many smart people have been trying to figure out what an agent is for decades, and they've had many architectures for building agents. And it turns out what we now know is that an agent is the following. It's a language model. Above that, it's a bash shell, a Unix shell, and the agent has access to the shell, hopefully in a sandbox. Then it's a file system, and the state is stored in files. There's the Markdown format for the files themselves. And then there's basically what in Unix is called a cron job: there's a loop, there's a heartbeat, and the thing wakes up.

So it's basically LLM plus shell plus file system plus Markdown plus cron. And it turns out that's an agent.
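The "LLM plus shell plus file system plus Markdown plus cron" recipe described above can be sketched in a few dozen lines of Python. This is a toy illustration under loud assumptions, not how Pi or OpenClaw are actually implemented: `stub_model` is a hypothetical stand-in for a real language model call, the `memory.md` file name is made up, a plain `for` loop stands in for cron, and nothing here is sandboxed.

```python
import subprocess
import tempfile
from pathlib import Path


def stub_model(memory: str) -> str:
    """Hypothetical stand-in for an LLM: reads the memory, returns a shell command."""
    beats = memory.count("## heartbeat")
    return f"echo tick {beats + 1}"


def run_shell(command: str) -> str:
    """Run one command in a shell and return its standard output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=10
    )
    return result.stdout.strip()


def heartbeat(state_dir: Path, model=stub_model) -> str:
    """One wake-up: load state from a Markdown file, ask the model, act, save state."""
    memory_file = state_dir / "memory.md"
    memory = memory_file.read_text() if memory_file.exists() else "# agent memory\n"
    command = model(memory)          # the model decides what to do
    output = run_shell(command)      # the shell does it
    memory += f"\n## heartbeat\n- ran: `{command}`\n- got: {output}\n"
    memory_file.write_text(memory)   # the file system remembers it
    return output


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A real deployment would trigger heartbeat() from cron or a scheduler;
        # a loop stands in for that here.
        for _ in range(3):
            print(heartbeat(Path(tmp)))
```

All persistent state lives in the Markdown file, so each wake-up can start from nothing but the directory, which is the property the conversation goes on to exploit.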
And every part of that, other than the model, is something we already completely know and understand. In fact, it turns out the latent power of the Unix shell is extraordinary. There's an enormous number of Unix commands, and an enormous number of command-line interfaces into all kinds of things already. Just to start with, your computer runs on a shell. If you're running a Mac or a phone, your computer's already running on a shell, so the full power of your computer is available at the command-line level. And it turns out it's really easy to expose other functions as a command-line interface. So this whole idea that we need MCP and these fancy protocols, whatever: no, we don't. We just need a command-line thing.

So that's the architecture. And then what is your agent? Your agent is a bunch of files stored in a file system. And then there's the thing that completely blew my mind when I wrapped my head around it: this means your agent is now actually independent of the model it's running on. You can swap out a different LLM underneath your agent, and your agent will change personality somewhat, 'cause the model is different, but all of the state stored in the files will be retained.

swyx: Yeah. Different instruction set, but you just compiled it.

Marc: Right, exactly. It's like swapping out a chip and recompiling, but it's still your agent, with all of its memories and all of its capabilities.
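The claim that the agent is independent of the model because its state lives in files can be illustrated with a small Python sketch. Everything here is hypothetical: `terse_model` and `chatty_model` are toy functions standing in for two different LLMs, and the `Agent` class and `memory.md` file are invented for the example rather than taken from any real framework.

```python
import tempfile
from pathlib import Path
from typing import Callable

# A "model" here is any function from (memory text, user message) to a reply.
Model = Callable[[str, str], str]


def terse_model(memory: str, message: str) -> str:
    """Toy stand-in for one LLM: short replies."""
    return f"ok: {message}"


def chatty_model(memory: str, message: str) -> str:
    """Toy stand-in for a different LLM: wordier, and it reads the memory."""
    notes = [line for line in memory.splitlines() if line.startswith("- ")]
    return f"Sure! I remember {len(notes)} note(s). You said: {message}"


class Agent:
    """The agent's identity is its files; the model is a swappable part."""

    def __init__(self, state_dir: Path, model: Model):
        self.memory_file = state_dir / "memory.md"
        self.model = model

    def chat(self, message: str) -> str:
        memory = self.memory_file.read_text() if self.memory_file.exists() else ""
        reply = self.model(memory, message)
        with self.memory_file.open("a") as f:
            f.write(f"- user: {message}\n")  # state persists outside the model
        return reply


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        agent = Agent(Path(tmp), terse_model)
        print(agent.chat("buy milk"))
        agent.model = chatty_model  # swap the model; the files stay put
        print(agent.chat("what do you remember?"))
```

After the swap the replies sound different, but the second model can still see the note written while the first one was running, which mirrors the "personality changes, memories stay" point in the conversation.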
And then, by the way, you can also swap out the shell, so you can move it to a different execution environment that is also a bash shell. By the way, you can also switch out the file system, right. And you can swap out the heartbeat, the cron framework, the loop, the agent framework itself. And so your agent, basically, at the end of the day, it's just its files. And then there's of course it a open

swyx: call.

Marc: Yeah, it's basically just the files.

And then, by the way, as a consequence of that, it turns out there are a couple of important things about the agent itself. So one is, it can migrate itself, right? You can instruct your agent: migrate yourself to a different runtime environment, migrate yourself to a different file system, swap out the language model.

Your agent will do all that stuff for you. And then there's the final thing, which is just amazing, which is the agent actually has full introspection. It actually knows about its own files and it can rewrite its own files. Right. Which, by the way, there is basically no widely deployed software system in history where the thing that you're using actually has full introspective knowledge of how it itself works and is able to modify itself.

I mean, there have been toy systems that have had that, but there's never been a widely deployed system that has that capability. And then that leads you to the capability that just completely blew my mind when I wrapped my head around it, which is you can tell the agent to add new functions and features to itself and it can do that.

Extend yourself. Yeah. Right? Extend yourself. Give yourself a new capability. Right? 
And so literally it's just like, you run into somebody at a party and they're like, oh, I have my OpenClaw do whatever, connect to my Eight Sleep bed, and it gives me better advice on sleep.

And you go home at night and you tell your claw, or if you're at the party, by the way, you tell your claw, oh, add this capability to yourself. And your claw will say, oh, okay, no problem. And it'll go out on the internet and it'll figure out whatever it needs, and then it'll go out to Claude Code or whatever.

It'll write whatever it needs. And then the next thing you know, it has this new capability. And so you can have it upgrade itself without having to do anything other than tell it that you want it to do that. And so anyway, the combination of all this is just, I mean, it's just incredible.

Like, if I were 18, this is a hundred percent what I would be spending all of my time on. This is such an incredible conceptual breakthrough. Yeah. And again, people are gonna look at it, and they already get this response, people are gonna look at it and they're gonna say, oh, well, where's the breakthrough?

'Cause all of these components were already known before. Mm-hmm. But this is the key: the key to the breakthrough was, by using all these components that were known before, you get all of the underlying capability that's buried in there. And so, for example, computer use all of a sudden just kind of falls out, trivially.

Of course it's gonna be able to use your computer. It has full access to the shell. Right. And then you give it access to a browser, and then you've got the computer and the browser, and off and away it goes. And then you've got all the abilities of the browser also. Um, yeah.

And so the capability unlock here is profound. 
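The "add this capability to yourself" idea comes down to the agent writing a new file into its own directory and then running it. A minimal sketch, assuming a hypothetical `skills/` folder of Python scripts and a hard-coded `GENERATED_SOURCE` string standing in for code that a model would write; this illustrates the pattern and is not OpenClaw's actual mechanism.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def add_skill(home: Path, name: str, source: str) -> Path:
    """Write a new script into the agent's own skills folder (assumed layout)."""
    skills = home / "skills"
    skills.mkdir(parents=True, exist_ok=True)
    path = skills / f"{name}.py"
    path.write_text(source)
    return path


def list_skills(home: Path) -> List[str]:
    """Introspection: the agent discovers its capabilities by reading its files."""
    return sorted(p.stem for p in (home / "skills").glob("*.py"))


def run_skill(home: Path, name: str, *args: str) -> str:
    """Run a skill as a separate process, like any other command line tool."""
    script = home / "skills" / f"{name}.py"
    result = subprocess.run(
        [sys.executable, str(script), *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Hypothetical stand-in for source code that the model wrote on request.
GENERATED_SOURCE = 'import sys\nprint("hello, " + sys.argv[1])\n'
```

Once `add_skill(home, "greet", GENERATED_SOURCE)` has run, `list_skills(home)` includes `greet` and `run_skill(home, "greet", "claw")` executes it. The upgrade is nothing more than a new file, which is also why running model-written code outside a sandbox carries the security issues mentioned later in the conversation.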
My friends who are deepest into this are having their claw do, literally, like a thousand things in their lives. They have new ideas every day. They're just constantly throwing new challenges at the thing. And by the way, it's early and, you know, these are prototypes, and as you guys know, there's security issues.

Yeah. And so there's a bunch of stuff to be ironed out, but the unlock of capability is just incredible.

swyx: Yeah.

Marc: And I have absolutely no doubt that everybody in the world is gonna have at least, you know, an agent like this, if not an entire family of agents. And w

This Week in Startups
The 5-Step Framework for AI Agents That Improve While You Sleep | E2269

Mar 31, 2026 87:14


This Week In Startups is made possible by:
Quo - https://quo.com/TWiST
LinkedIn Jobs - https://LinkedIn.com/twist
Iru - https://iru.com/twist
Plaud - https://Plaud.ai/twist

Today's show:
Google project manager Shubham Saboo is running 6 AI agents on a Mac Mini that handle all of his side business autonomously, from research, to social posts, to newsletters.

He joins Jason and Lon to walk us through his 5-step framework for designing an efficient AI agent team:
Start with one agent and onboard them like a new hire
Stop Googling fixes; just ask your agent how to use it
Put your agents on fixed schedules
Add shared memory so you don't have to repeat yourself
Let agents run self-reviews and rewrite their own instructions

Get the walkthrough to put this entire plan into practice on today's episode. PLUS fresh demos of the "Minecraft"-inspired virtual workspace MoltWorld and AgentMail, which is Gmail for your AI pals, and Jason explains the thinking behind his viral, controversy-stirring "don't talk to journalists" tweet.

Follow Shubham: https://x.com/Saboo_Shubham_
"How I Built an Autonomous AI Agent Team That Runs 24/7" on X: https://x.com/Saboo_Shubham_/status/2022014147450614038?s=20
Follow Mike: https://x.com/mihalich1988
MoltWorld: https://moltworld.io/
Follow Haakam: https://x.com/haakamaujla
AgentMail: https://www.agentmail.to/
Jason's post about talking to journalists: https://x.com/Jason/status/2037573025458016659
NYT responds: https://x.com/NYTimesPR/status/2037648223771263082
Translated Japanese tweets: https://x.com/melonneet40/status/2038020624015315289, https://x.com/rambling_28/status/2038041455999246422
Japanese people singing "Country Roads": https://x.com/harukaawake/status/2038081269830222259
Trailer for "Nuremberg" (now on Netflix in the US): https://www.youtube.com/watch?v=WvAy9C-bipY

Timestamps:
0:00 Intro
1:24 Plaud: If your work depends on conversations (interviews, meetings, calls), you need a Plaud NotePin. You can check it out at https://Plaud.ai/twist and use code TWIST for 10% off!
3:05 We're Claw-pilled once again; it's an all AI Agent showcase
6:15 Google AI PM Shubham Saboo's Top 5 OpenClaw tips
10:14 Quo (formerly OpenPhone) gives you a clean, modern way to handle every customer call, text, and thread all in one place. Try it free at https://quo.com/TWiST.
13:26 Tip #1: Onboard your agent like a new hire
17:31 Tip #2: Talk to your agents constantly
19:02 Tip #3: Put your agents on a schedule
19:51 LinkedIn Jobs - Hire right, the first time. Post your first job and get $100 off towards your job post at https://LinkedIn.com/twist.
23:41 Tip #4: Add cross-agent memory
29:33 Tip #5: Let your agents self-improve
29:51 Iru unifies identity, endpoint security, and compliance into one platform. TWiST listeners get 20% off when they book a demo at https://iru.com/twist!
34:03 Why Jason says founders should avoid journalists
38:24 How biased IS the New York Times?
47:02 DEMO: Co-founder Mike Nosov shows us MoltWorld
51:14 But what's the utility of this?
1:05:15 DEMO: Haakam Aujla presents AgentMail (YC S25)
1:09:59 How does AgentMail make money?
1:17:03 How Grok Translations are creating cross-cultural dialogue on X

Subscribe to the TWiST500 newsletter: https://ticker.thisweekinstartups.com
Check out the TWIST500: https://www.twist500.com
Subscribe to This Week in Startups on Apple: https://rb.gy/v19fcp

Follow Lon:
X: https://x.com/lons
Follow Alex:
X: https://x.com/alex
LinkedIn: https://www.linkedin.com/in/alexwilhelm
Follow Jason:
X: https://twitter.com/Jason
LinkedIn: https://www.linkedin.com/in/jasoncalacanis

Check out all our partner offers: https://partners.launch.co/
Great TWIST interviews: Will Guidara, Eoghan McCabe, Steve Huffman, Brian Chesky, Bob Moesta, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarland
Check out Jason's suite of newsletters: https://substack.com/@calacanis

Late Confirmation by CoinDesk
OpenClaw Devs Targeted in GitHub Phishing Scam | CoinDesk Daily

Mar 20, 2026 2:50


OpenClaw developers targeted in fake CLAW token airdrop scams. OpenClaw developers are being targeted on GitHub with fake $5,000 CLAW token giveaways that lead to wallet-draining sites. The campaign adds to a series of crypto-related scams exploiting OpenClaw's name, which prompted founder Peter Steinberger to ban all crypto discussion on the project's Discord. CoinDesk's Jennifer Sanasie hosts "CoinDesk Daily." - Nexo is the premier digital wealth platform. Receive interest on your crypto, borrow against it without selling, and trade a range of assets. Now available in the U.S. with 30 days of exclusive privileges. Get started at nexo.com/coindesk. - This episode was hosted by Jennifer Sanasie. “CoinDesk Daily” is produced by Jennifer Sanasie and edited by Victor Chen.