Neologism created by American writer Robert A. Heinlein
SpaceX just made history, raising $75 billion in the largest IPO the stock market has ever seen, now trading on NASDAQ at a $1.8 trillion valuation. 7investing's Simon Erickson breaks down what you actually need to know as an investor. The SpaceX empire spans X (formerly Twitter, 600M users), xAI (the Grok-powering AI infrastructure running out of the 2-gigawatt Colossus data center), and 10,000 Starlink satellites serving 10 million subscribers across 164 countries. The scale is genuinely unprecedented. But the numbers tell a more complicated story. SpaceX did $20 billion in revenue last year, pricing it at 90x trailing sales, and generated just $1 billion in Q1 operating cash flow against $10 billion in quarterly capital expenditures. The company is burning cash aggressively, and the entire long-term thesis rests on Elon Musk executing on missions no company has ever attempted: orbital data centers, Starship, and eventually a Mars colony. This isn't a software company where you flip a switch and double revenue. These are physical, capital-intensive bets measured in decades. Simon and Heather are both passing on the IPO. The key man risk alone, with Elon simultaneously running SpaceX, Tesla (NASDAQ:TSLA), X, and xAI, is the largest concentration of founder dependency in stock market history. Tesla (NASDAQ:TSLA) fans know this playbook: extraordinary vision, breakthrough results, but timelines that consistently slip years past what Elon says publicly. Full self-driving still isn't there. Orbital data centers won't be either, at least not on the schedule the prospectus implies. Near term, Starlink is the real business, the only one generating meaningful cash flow, and it's what will sustain SpaceX while Elon bets big on everything else. Expect another capital raise in 2026 and again in 2027. The real question for investors isn't whether SpaceX can change the world. It probably will. The question is whether a $1.8 trillion valuation gives you any margin of safety while it gets there. Right now, Simon and Heather say no.
Join the conversation on the 7investing discord: https://discord.com/invite/PT9ZQqdXXS
Want access to all our investing content? Join at 7investing.com/subscribe
Stocks & Companies Mentioned: SpaceX (NASDAQ: SPCX), Tesla (NASDAQ:TSLA), Rocket Lab (NASDAQ:RKLB), xAI (private, subsidiary within SpaceX conglomerate), X, formerly Twitter (private, subsidiary within SpaceX conglomerate), OpenAI (private)
#SpaceX #SpaceXIPO #ElonMusk #Starlink #IPOInvesting #SpaceStocks #TechIPO #GrowthStocks #StockMarket #StocksToWatch #TechStocks #SpaceInvesting #InvestingIn2026 #7investing #Simonerickson
Jane Fonda, like so many on the Left, is the worst kind of hypocrite. She plays the part of a free speech warrior while participating in the most totalitarian movement this country has ever seen. There she was, yet again, yapping into a microphone to protest Trump's UFC 250. The signs behind her are ablaze with pure lies - Civil Rights! The First Amendment! You can't silence us! But Jane Fonda and the company she founded, Women's Media Center, do not practice what they preach. They fired me for the crime of voting for Donald Trump. I had been regularly hired for almost ten years to write their Women in Oscars report until a story broke in the Hollywood Reporter calling me a “MAGA darling.” And just like that, my 25-year career as a “woman-owned” Oscar website went up in flames, as did my freelance gig for WMC. It's true, I did vote for Donald Trump. Not only did I vote for him, but I also made my support for him known on social media, which is what caught the reporter's attention in the first place. I was supposed to cower in fear. Support the Democrats or else. I could have done what a lot of people did and kept my vote for Trump secret, but I didn't think I should have to. Weren't we the side that stood up for free speech and free expression? No. We weren't then, and aren't now. There is a long trail of writers, thinkers, actors, artists, musicians, and ordinary citizens who have been destroyed by the Left's machine for the crime of dissent. And thousands more who suffer in silence, knowing there are so many things they can't say. Only one side regularly censored users on social media, and that was the Biden administration working with the FBI. Only one side used the FBI and the CIA to censor the Hunter Biden laptop to thwart the re-election of the sitting president.
That wasn't the Right. Because Jimmy Kimmel got a slap on the wrist and Trump sued CBS News, and there's a merger with Paramount and Warner Bros., to people like Jane Fonda, that means the First Amendment is under threat. My message to her: clean your own house, Jane. Jane Fonda obviously wasn't directly involved in firing me. She has no idea who I even am. It was someone else, someone I trusted, maybe someone who seemed like a decent person, but, like everyone else, from writers to publicists to friends, once I crossed that bright red line, I was no longer someone they would associate with at parties, let alone hire. It certainly wasn't because I did not do good work. I did. I even asked Grok to fact-check my memory, and here is what came back: Nobody knows the Oscars like I do, and I did the best work for them on the cheap because I liked doing it. I tried to make my case as clearly as possible to the Hollywood Reporter that I could not go along with the unprecedented lawfare against Trump, and especially not “gender affirming care” on minor children. These things motivated me to do more than just vote. I had to go public. I thought my support would help others come out from the shadows. I knew as I was talking to that reporter that nothing I said would make a difference. I wouldn't have even talked to her except she said she'd write the story anyway. She was reporting on what I thought and what I was tweeting, which was verboten inside utopia. And boy, did the hammer come down. After the story broke and I felt every door that had once been open to me slam in my face, I kept hearing yet another piece of bad news. The studios were pulling their ads. Yet another writer was leaving the site. I was not invited to screenings, parties, and premieres. The publicists all ghosted me. It was as though I had been arrested for committing mass murder. One of the last of the gut punches was losing that freelance gig at Women's Media Center.
I kind of knew it was coming because, of course, it would be. They all went along with it, and almost no one had the courage to push back or resist any of it. I wrote to them anyway because I wanted to hear it from them. And I got the expected answer. Jane Fonda founded the Women's Media Center in 2005, along with Robin Morgan and Gloria Steinem. They describe themselves as “a progressive, nonpartisan nonprofit focused on increasing the visibility, influence, and decision-making power of women and girls in media.” They were perfectly happy to drop a woman writer for the sole crime of not agreeing with their politics. I'd say they don't really support women in media so much as they support those who go along with them. I never played the woman card, but I could have. I built my site just to build it, and it became successful. I was a single mom in 1999 and raised my baby and my website at the same time. It is quite the story, especially for those who pretend to care about women in media. Why would it matter if I voted for Trump? Why would that mean I could no longer write the report? Why have they decided that all of this is okay, to treat half the country like toxic waste? How have they gotten away with it, and what will be their plans should they take back absolute power? They have painted themselves into a trauma corner with nowhere else to go, and in so doing, alienated themselves from much of this country. Where can you go when you've already gone as far as humanity ever has? Hitler, the Nazis, fascism. They've now gone to the only place they can go, wishing for and hoping for Trump's death and vowing never to forgive anyone who voted for Trump.
A Royal Court
There was a time when I believed in all of it, too. The miracle of the first Black President and First Family.
How one leader could bring together so much of American society, all of us reaching for the same goal because we all believed in a New America. We projected our fantasies of goodness onto them as they built what looked like a Royal Court of the most impressive and important people in the country, including rock stars like Bruce Springsteen and Katy Perry, actors like Robert De Niro and Julia Roberts. They were the party, and we were the adoring crowd. But all of that came with a price. If you want to be in the Royal Court, you'd best play ball because if you don't, they can and will crush you. I had no idea that everything I built could be destroyed just because I dissented, and yet that is exactly what happened. Jane Fonda's Women's Media Center dropping me was the most disappointing because I believed in her, too. Now I know the truth. I am just one example. There are hundreds of people who are not welcome to work in the film industry if they are not ideologically compliant. We've been living with this for ten years now, and it's become our new normal. Very few people are brave enough to stand up to them. Deep down, they all know it because they are too afraid to say the wrong thing, too. It's easier to point their finger at Trump than confront what they have become - the blacklists, the shunning, the destroying of people's careers. If they could do it to me, they can do it to anyone. What they don't see, what they can't see, is what they've done to the other half of the country for ten years. They want us all to think it's perfectly normal that our late-night talk show hosts are purely partisan, or that it's perfectly fine for Hollywood to continue to tell the story from inside their Doomsday Cult rather than the reality of all Americans. They don't see themselves as the ones who can't tolerate dissent or free speech and who fire people just for voting for Donald Trump.
They believe themselves to be the chosen ones, the righteous few who have staked their claim on the New America, and those who aren't on board must be purged. They've convinced themselves that it was perfectly fine that Jimmy Kimmel made an inhumane joke about Charlie Kirk moments after his brutal assassination, but when millions of upset viewers flooded the station with angry calls to have him removed, they called that a threat to free speech. They don't seem to care that Biden imported millions of illegal immigrants into the country, and when many of them turned out to be murderers, rapists, and child molesters, they left a trail of victims, but those victims are invisible to the Left. They never even hear about them because in their minds, those illegal immigrants are to be protected above American citizens. So Julia Roberts and Bruce Springsteen continue to use the deaths of Renee Goode and Alex Pretti as examples of authoritarianism and to make American citizens feel shame for caring about their country and wanting a secure border and to be protected from harm. They never spent one minute comforting the mothers whose children were harmed by policies they supported. It wasn't Trump who shot Pretti and Goode. They put themselves in a dangerous position to go to war against Federal agents who were doing their jobs. In the Left's fever dream, they were battling Nazis. But they never notice or care or even try to understand why so many Americans wanted Trump to follow through on his promise to mass deport illegal immigrants, something every president has done. These mothers, like a lot of Trump supporters, had no other choice because this country, at the hands of the Left, means denying reality to serve utopia. You can't talk about crime if the perp is an illegal immigrant or a person of color, just as you can't discuss the harms of “gender affirming care.” I know, I've tried.
They melt down like the housewife in The Stepford Wives who glitches at any confrontation of reality. That's how it's felt to me all these years, like I'm trying to talk to preprogrammed robots who know what you can and can't say. I kept wondering what happened to everyone and why they were all acting exactly the same way. They were insulated from the rest of the country, and their imaginations got the better of them. What really happened to the ruling aristocracy, especially, is that they fell in love with their own reflection. They began to believe their own publicity, and so they couldn't imagine the fault could ever possibly lie with them. It would have just been so much easier and so much better for everyone if they had just tried to understand why they lost. They never will, and so, they are doomed to repeat the same mistakes. And we have to suffer through it every time one of them finds a microphone. // This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.sashastone.com/subscribe
On today's episode, we discuss Sarah's wedding weekend—from the Godfather-themed father‑daughter dance to a last‑minute ring mix‑up that required borrowing Jim's wedding band—before shifting into news and politics. The crew then breaks down Donald Trump's flag‑day birthday bash on the National Mall, highlighting the flyovers, bald eagle, and UFC fights, and using it as a springboard to talk about his new Iran deal, which reportedly requires Iran to destroy or surrender enriched uranium, open the Strait of Hormuz without charging tolls, and stop funding groups like Hezbollah in exchange for economic development and oil exports. They connect falling oil futures and gas prices to this agreement and explore how cheaper energy could ripple into food costs, especially beef, while also noting the competition from energy‑hungry AI data centers. From there, the conversation turns to Elon Musk's expanding empire—Tesla's Full Self‑Driving quirks and improvements, chip manufacturing plans to rival TSMC, SpaceX's IPO windfall for employees, and the quiet rollout of Optimus robots—as well as a candid comparison of AI tools like Grok, Perplexity, and Claude. The episode wraps up with quick hits on local issues like Ruston's “red district” street‑party problems, concerns about hawks eyeing neighborhood cats, major airline crashes, the expiration of Patriot Act Section 702, and rumors of new executive orders to tighten mail‑in ballot tracking via USPS barcode technology. Don't miss it!
Seema Shah discusses key movers for investors to watch in Elon Musk's many business arms, including his newest publicly traded company: SpaceX (SPCX). She breaks down how Starlink factors into SpaceX's profits and where Grok stands among AI apps. Seema also points out headwinds she sees for X.
======== Schwab Network ========
Empowering every investor and trader, every market day.
Subscribe to the Market Minute newsletter - https://schwabnetwork.com/subscribe
Download the iOS app - https://apps.apple.com/us/app/schwab-network/id1460719185
Download the Amazon Fire Tv App - https://www.amazon.com/TD-Ameritrade-Network/dp/B07KRD76C7
Watch on Sling - https://watch.sling.com/1/asset/191928615bd8d47686f94682aefaa007/watch
Watch on Vizio - https://www.vizio.com/en/watchfreeplus-explore
Watch on DistroTV - https://www.distro.tv/live/schwab-network/
Follow us on X - https://twitter.com/schwabnetwork
Follow us on Facebook - https://www.facebook.com/schwabnetwork
Follow us on LinkedIn - https://www.linkedin.com/company/schwab-network/
About Schwab Network - https://schwabnetwork.com/about
By Doug Green “We're absolutely on the path, and we're not talking five, six, seven years. We're talking in the next 18 to 24 months.” In this episode of the Technology Reseller News podcast, Doug Green speaks with Josh Kindiger, COO and co-founder of Grokstream, about the company's new L1 Agent and what it means for the future of AI-driven network and IT operations. Grokstream is the company behind Grok, an AI-powered predictive agent platform for network and IT operations. The platform comes out of the event intelligence and AIOps space and is designed to help operations teams identify, triage, and resolve recurring issues more efficiently. Kindiger says Grokstream recently released its first role-based agent, the L1 Agent, in beta. The full production release is expected in Q2. The agent is already being used with customers to prove out real-world capabilities. Because many organizations remain cautious about AI-driven automation, Grokstream is starting with low-risk, repeatable use cases. In many operations centers, Kindiger notes, the same incidents occur repeatedly, sometimes accounting for as much as 70% of activity. The L1 Agent is designed to recognize those patterns and guide operators through triage and resolution. For example, if a recurring issue requires a service restart, the system can recommend or automate that step. If a pattern points to a commercial power outage at a site, the agent can help avoid unnecessary dispatches while monitoring backup power systems. Kindiger says the goal is not to remove human oversight immediately, but to build trust through guardrails, staged automation, and operator control. Low-risk automations can be handled end to end, while higher-risk actions may require human approval. The podcast also explores the broader opportunity for enterprises, MSPs, and CSPs. Kindiger says service providers and managed service providers face growing pressure to improve efficiency, reduce costs, and differentiate in competitive markets. 
AI-driven operations can help them respond faster, lower manual workload, and deliver better service outcomes. The long-term direction is clear: autonomous network operations are coming. Kindiger says companies should begin now because foundational work is needed before they can fully benefit from automation. For MSPs and CSPs, he says the urgency is even greater. Cost pressure is shaping renewals and new customer wins, and AI-powered operations may become a competitive advantage. Learn more at www.grokstream.com
Are your AI conversations putting your legal case at risk? In this eye-opening episode, Rebecca Zung reveals the five biggest AI mistakes that are already costing people their lawsuits, custody battles, business disputes, and legal claims. Learn how courts are treating AI conversations, why attorney-client privilege can be lost, and what every litigant and attorney must know before using ChatGPT, Claude, Gemini, Grok, or any public AI platform during litigation.
SpaceX IPO reveals the world isn't controlled by billionaires…it's controlled by trillionaires, Trump lets tech bro “volunteers” data mine all government websites, Alex is very upset by right-wing FishHeads for some reason, ACAL (lazy), Massie remembers the Liberty, apparently the President's Cabinet spends a lot of time talking about nipples, and we've got an Iran peace deal for the 37th time.
Get our Business Idea Database: https://clickhubspot.com/wjsl
Episode 833: Sam Parr ( https://x.com/theSamParr ) and Shaan Puri ( https://x.com/ShaanVP ) break down the biggest IPO of all time.
Show Notes: (0:00) what even is SpaceX (7:11) What even is a trillion dollars? (8:51) Launches explained (9:07) Starlink (14:07) Data centers in space (17:39) Starship (24:46) Grok (28:06) a wonderful business at a silly price? (34:16) the mission (37:51) SBF's $114B fumble (39:04) funny, weird, surprising nuggets from the IPO (42:35) Who's getting rich this week? (51:13) The genius insight of Luke at Gigafund (55:46) Pessimists get to be right, optimists get to be right (59:12) Elon's comp package
Check Out Sam's Stuff: • Hampton (joinhampton.com): My community for founders. Average member does $25m/year. Many of the guests are members. Get after it...apply: http://joinhampton.com/mfm
Check Out Shaan's Stuff: • Shaan's weekly email - https://www.shaanpuri.com • Visit https://www.somewhere.com/mfm to hire worldwide talent like Shaan and get $500 off for being an MFM listener. Hire developers, assistants, marketing pros, sales teams and more for 80% less than US equivalents. • Mercury - Need a bank for your company? Go check out Mercury (mercury.com). Shaan uses it for all of his companies! Mercury is a financial technology company, not an FDIC-insured bank. Banking services provided by Choice Financial Group, Column, N.A., and Evolve Bank & Trust, Members FDIC • I run all my newsletters on Beehiiv and you should too + we're giving away $10k to our favorite newsletter, check it out: beehiiv.com/mfm-challenge
My First Million is a HubSpot Original Podcast // Brought to you by HubSpot Media // Production by Arie Desormeaux // Editing by Ezra Bakker Trupiano
Grok says: “LISTEN UP, YOU MISERABLE BASTARDS! If you're tired of candy-ass podcasts that dance around the truth like a bunch of politicians in a whorehouse, then lock and load for Unrelenting with Darren and Gene. These two operators cut straight through the bullshit as they rip into Chicago's latest Texas-style storm apocalypse — trees flying, power out for days, parents dodging tornadoes while Max Velocity calls ‘em before the National Weather Service even wakes up. They break down real survival talk: the smell of dirt when a twister's on your ass, why you can't outrun nature on a Huffy bike, and how underground caves and old-school swing dancing beat the hell out of today's AI-generated plastic world. From fiber optic dreams that'll let Darren upload full podcast files in seconds, to tearing apart AI's invasion of music, gaming, and everything else — stem separation, auto-tune lies, frame generation, and PewDiePie's badass local Odysseus system that kicks cloud overlords right in the nuts. They go deep on Star Citizen spaceship “drug dealing,” photorealistic gun sims in Grey Zone, Tesla dashcams turning accidents into Hollywood, and the coming local LLM revolution that'll make data centers look like yesterday's dinosaurs. Throw in Hallmark hustle, Prime Video price gouging, Dutton Ranch smoke shows, and no-holds-barred talk on race, society, and when the social contract finally snaps — this episode is pure unfiltered firepower. Stop wasting your life on weak sauce. Download Unrelenting 0194 right now, crank the volume, and get ready to have your ass handed to you with laughs, truth, and zero apologies. Darren and Gene deliver the real shit every single time — if you can't handle it, go back to your safe space. HOOAH!” Unrelenting: where discipline means no mercy, no bullshit, and no excuses. Thanks for listening. Please support the show! –>> DONATE NOW
A former xAI engineer is suing the company and SpaceX, alleging he was fired for raising AI safety concerns about Grok days before SpaceX's historic IPO. Also, Alt Carbon said the agreement followed more than a year of scientific review and due diligence, with Microsoft requiring additional verification and data-sharing measures.
Here we go. The largest listing in history is opening on NASDAQ: a valuation of 1.75 trillion dollars. That is more than Saudi Aramco, and more than five times what SpaceX was worth a year ago. But while millions of retail savers wonder how to get in or how many shares they will manage to obtain, the real question is a different one: what are you actually buying, and who is sitting on the other side of the table? In this video we analyze the actual figures from the financial statements filed with the SEC: from the 37 billion burned since the company was founded to the revenue that depends on Grok (which has only a 3% market share) and on the precarious contract with Anthropic. We uncover the hidden architecture of this deal: the asymmetric voting control, with 85% in Elon Musk's hands, the relocation of the headquarters to Texas to shield the company from shareholder lawsuits, the conflicts of interest with xAI and Tesla, and the staggered lock-up mechanism that turns small investors into pure "exit liquidity". ✅ The broker I have always used, INTERACTIVE BROKERS:
Shauna and Olivia are in the studio together live and in person, so naturally this episode revolves around artificial life forms. Robots and Artificial Intelligence are hot topics right now, as so-called AI actresses threaten to take over Hollywood, large language models steal our creative endeavors, and humans are demonstrably losing brain power because we need AI programs to compose simple emails for us. From Metropolis almost 100 years ago, through the 1980s' Terminator-style killer robots, to today's "good for her" sex-bot revenge movies, the Junkies look at the pop culture robots that want to help us, kill us, sleep with us, or all three. They discuss how well fictional movies have predicted our current reality, and consider future worst-case scenarios and how we can come together to avoid them. You can watch the Pop Culture Junkie Podcast on YouTube! Click here: https://www.youtube.com/@popculturejunkiepod/videos We have affordable and rewarding Patreon tiers! Be the first to hear new and uncensored content, if you dare! Click here: https://www.patreon.com/popculturejunkiepodcast/posts
Apple Podcast: https://podcasts.apple.com/us/podcast/pop-culture-junkie/id1536737728
Spotify: https://open.spotify.com/show/7k2pUxzNDBXNCHzFM7EL8W
Website: www.popculturejunkie.com
Facebook: PopCultureJunkiePodcast
Instagram: @pop.culturejunkie
Threads: @pop.culturejunkie
Bluesky: @pop-culture-junkie.bsky.social
Email: junkies@popculturejunkie.com
Shauna on Instagram: @shaunatrinidad Shauna on Threads: @shaunatrinidad Olivia on Instagram: @livimariez
Rivian's rejection of CarPlay and physical buttons in favor of voice and AI control raises questions about safety, convenience, data control, and long-term car software support. Chuck Joiner, David Ginsburg, Jeff Gamet, Guy Serle, Web Bixby, Eric Bolden, Marty Jencius, and Jim Rea question whether Rivian has other motives, and then dive into Tesla updates, AI voice recreation of Stan Lee, Spider-Man ticket promotions, Dashlane concerns, and Andy Ihnatko's new site. MacVoices is supported by NordLayer. Secure your network & stay compliant with one toggle-ready platform. Get an exclusive offer: up to 22% off NordLayer yearly plans plus 10% on top with the coupon code: MACVOICES10 at NordLayer.com/macvoices. Try it risk-free with a 14-day money-back guarantee. Show Notes: Chapters: 00:00 CarPlay rejection, voice control, and Stan Lee's voice 00:28 Rivian's anti-CarPlay position begins the discussion 00:54 Why cars still need buttons and backup controls 02:07 Voice AI latency and Siri-like frustrations 02:28 Using cars as chatbots and where that idea breaks down 03:46 Rivian's app-free vision and the limits of voice interaction 05:02 Why phone-based assistants still matter in the car 06:11 Location services, navigation, and route-based requests 06:49 Apple Maps possibilities without automaker control 07:43 AI assistants, missing service hooks, and driving distractions 09:07 Multitasking while driving and the safety argument 10:29 Physical buttons, cruise control, and unfamiliar rental cars 11:41 How CarPlay and Android Auto create interface consistency 12:11 Fully autonomous driving and the future of car interaction 13:31 Data control as the real motivation behind automaker interfaces 14:14 Phone upgrades, aging car hardware, and long-term software support 15:47 Grok built into Tesla and real-world responsiveness 17:23 Deep touchscreen menus and why voice interfaces appeal 18:43 CarPlay gaps, Tesla software updates, and improving vehicle tech 19:22 Tesla leasing, full self-driving, and subscription
frustration 21:53 Nintendo music service surprise and side conversation 22:49 NordLayer sponsor message 24:17 Stan Lee's AI voice and preserving distinctive performances 25:06 Amazon Prime early access for Spider-Man tickets 26:10 Theaters, home viewing, and changing movie experiences 27:11 Dashlane security concerns and Andy Ihnatko's new site 29:06 Post-WWDC plans and panelist contact information 35:56 British Tech Network finale and related podcast projects 37:21 Live show wrap-up and audience invitation 38:50 Closing credits and support information
Links:
Rivian's software chief thinks you don't need CarPlay or buttons https://www.theverge.com/podcast/929940/rivian-wassym-bensaid-software-volkswagen-carplay-assistant-ai
Nintendo Music just got a big update with support for Apple CarPlay and Android Auto https://www.engadget.com/2185783/nintendo-music-just-got-a-big-update-with-support-for-apple-carplay-and-android-auto/
ElevenLabs partners with Stan Lee Universe for AI voice https://thenextweb.com/news/elevenlabs-stan-lee-voice-likeness-ai
Amazon Prime members in the US can watch Spider-Man: Brand New Day two days early https://www.engadget.com/2185485/amazon-prime-us-spider-man-brand-new-day-advanced-screening-july-29/
Hackers brute-forced Dashlane 2FA, downloaded encrypted vaults https://thenextweb.com/news/dashlane-brute-force-attack-2fa-bypass-encrypted-vaults
Andy Ihnatko launches Ihnatko.com https://sixcolors.com/link/2026/06/andy-ihnatko-launches-ihnatko-com/
Guests: Web Bixby has been in the insurance business for 40 years and has been an Apple user for longer than that. You can catch up with him on Facebook, Twitter, and LinkedIn, but prefers Bluesky. Eric Bolden is into macOS, plants, sci-fi, food, and is a rural internet supporter. You can connect with him on Twitter, by email at embolden@mac.com, on Mastodon at @eabolden@techhub.social, on his blog, Trending At Work, and as co-host on The Vision ProFiles podcast.
Jeff Gamet is a technology blogger, podcaster, author, and public speaker. Previously, he was The Mac Observer's Managing Editor, and the TextExpander Evangelist for Smile. He has presented at Macworld Expo, RSA Conference, several WordCamp events, along with many other conferences. You can find him on several podcasts such as The Mac Show, The Big Show, MacVoices, Mac OS Ken, This Week in iOS, and more. Jeff is easy to find on social media as @jgamet on Twitter and Instagram, jeffgamet on LinkedIn, @jgamet@mastodon.social on Mastodon, and on his YouTube Channel at YouTube.com/jgamet. David Ginsburg is the host of the weekly podcast In Touch With iOS where he discusses all things iOS, iPhone, iPad, Apple TV, Apple Watch, and related technologies. He is an IT professional supporting Mac, iOS and Windows users. Visit his YouTube channel at https://youtube.com/daveg65 and find and follow him on Twitter @daveg65 and on Mastodon at @daveg65@mastodon.cloud. Marty Jencius, Ph.D., is a counselor educator and technology pioneer who has spent 30 years bringing emerging tech into his field, from founding one of the first professional listservs (CESNET-L) to podcasting, virtual reality, and now AI and AR. He is the founder of ThePodTalk.net, where he produces Vision ProFiles, The Old Mac Gang, A.I. Productivity Workflow, The Tech Savvy Professor, 15 Minute Bytes, The Neo Notebook, and Fade to Chat: Golden Age Cinema. He is also a regular panelist on MacVoices Live!, In Touch with iOS, and The Mac Show. Find him on Bluesky and Mastodon. Jim Rea built his own computer from scratch in 1975, started programming in 1977, and has been an independent Mac developer continuously since 1984. He is the founder of ProVUE Development, and the author of Panorama X, ProVUE's ultra fast RAM based database software for the macOS platform. He's been a speaker at MacTech, MacWorld Expo and other industry conferences. Follow Jim at provue.com and via @provuejim@techhub.social on Mastodon.
Guy Serle, best known for being one of the co-hosts of the MyMac Podcast, sincerely apologizes for anything he has done or caused to have happened while in possession of dangerous podcasting equipment. He should know better but being a blonde from Florida means he's probably incapable of understanding the damage he has wrought. Guy is also the author of the novel, The Maltese Cube. You can follow his exploits on Twitter, catch him on Mac to the Future on Facebook, at @Macparrot@mastodon.social, and find everything at VertShark.com. Support: Become a MacVoices Patron on Patreon http://patreon.com/macvoices Enjoy this episode? Make a one-time donation with PayPal Connect: Web: http://macvoices.com Twitter: http://www.twitter.com/chuckjoiner http://www.twitter.com/macvoices Mastodon: https://mastodon.cloud/@chuckjoiner Facebook: http://www.facebook.com/chuck.joiner MacVoices Page on Facebook: http://www.facebook.com/macvoices/ MacVoices Group on Facebook: http://www.facebook.com/groups/macvoice LinkedIn: https://www.linkedin.com/in/chuckjoiner/ Instagram: https://www.instagram.com/chuckjoiner/ Subscribe: Audio in iTunes Video in iTunes Subscribe manually via iTunes or any podcatcher: Audio: http://www.macvoices.com/rss/macvoicesrss Video: http://www.macvoices.com/rss/macvoicesvideorss
Rivian's rejection of CarPlay and physical buttons in favor of voice and AI control raises questions about safety, convenience, data control, and long-term car software support. Chuck Joiner, David Ginsburg, Jeff Gamet, Guy Serle, Web Bixby, Eric Bolden, Marty Jencius, and Jim Rea question whether Rivian has other motives, and then dive into Tesla updates, AI voice recreation of Stan Lee, Spider-Man ticket promotions, Dashlane concerns, and Andy Ihnatko's new site.

MacVoices is supported by NordLayer. Secure your network & stay compliant with one toggle-ready platform. Get an exclusive offer: up to 22% off NordLayer yearly plans plus 10% on top with the coupon code: MACVOICES10 at NordLayer.com/macvoices. Try it risk-free with a 14-day money-back guarantee.

Show Notes:

Chapters:
00:00 CarPlay rejection, voice control, and Stan Lee's voice
00:28 Rivian's anti-CarPlay position begins the discussion
00:54 Why cars still need buttons and backup controls
02:07 Voice AI latency and Siri-like frustrations
02:28 Using cars as chatbots and where that idea breaks down
03:46 Rivian's app-free vision and the limits of voice interaction
05:02 Why phone-based assistants still matter in the car
06:11 Location services, navigation, and route-based requests
06:49 Apple Maps possibilities without automaker control
07:43 AI assistants, missing service hooks, and driving distractions
09:07 Multitasking while driving and the safety argument
10:29 Physical buttons, cruise control, and unfamiliar rental cars
11:41 How CarPlay and Android Auto create interface consistency
12:11 Fully autonomous driving and the future of car interaction
13:31 Data control as the real motivation behind automaker interfaces
14:14 Phone upgrades, aging car hardware, and long-term software support
15:47 Grok built into Tesla and real-world responsiveness
17:23 Deep touchscreen menus and why voice interfaces appeal
18:43 CarPlay gaps, Tesla software updates, and improving vehicle tech
19:22 Tesla leasing, full self-driving, and subscription frustration
21:53 Nintendo music service surprise and side conversation
22:49 NordLayer sponsor message
24:17 Stan Lee's AI voice and preserving distinctive performances
25:06 Amazon Prime early access for Spider-Man tickets
26:10 Theaters, home viewing, and changing movie experiences
27:11 Dashlane security concerns and Andy Ihnatko's new site
29:06 Post-WWDC plans and panelist contact information
35:56 British Tech Network finale and related podcast projects
37:21 Live show wrap-up and audience invitation
38:50 Closing credits and support information

Links:
Rivian's software chief thinks you don't need CarPlay or buttons
https://www.theverge.com/podcast/929940/rivian-wassym-bensaid-software-volkswagen-carplay-assistant-ai
Nintendo Music just got a big update with support for Apple CarPlay and Android Auto
https://www.engadget.com/2185783/nintendo-music-just-got-a-big-update-with-support-for-apple-carplay-and-android-auto/
ElevenLabs partners with Stan Lee Universe for AI voice
https://thenextweb.com/news/elevenlabs-stan-lee-voice-likeness-ai
Amazon Prime members in the US can watch Spider-Man: Brand New Day two days early
https://www.engadget.com/2185485/amazon-prime-us-spider-man-brand-new-day-advanced-screening-july-29/
Hackers brute-forced Dashlane 2FA, downloaded encrypted vaults
https://thenextweb.com/news/dashlane-brute-force-attack-2fa-bypass-encrypted-vaults
Andy Ihnatko launches Ihnatko.com
https://sixcolors.com/link/2026/06/andy-ihnatko-launches-ihnatko-com/

Guests:
Web Bixby has been in the insurance business for 40 years and has been an Apple user for longer than that. You can catch up with him on Facebook, Twitter, and LinkedIn, but he prefers Bluesky.
Eric Bolden is into macOS, plants, sci-fi, food, and is a rural internet supporter. You can connect with him on Twitter, by email at embolden@mac.com, on Mastodon at @eabolden@techhub.social, on his blog, Trending At Work, and as co-host on The Vision ProFiles podcast.
Grok says: “In this episode of the Randumb Thoughts podcast, host Darren O'Neill dives headfirst into our increasingly artificial world, asking: how many people have already been targeted by chilling AI voice scams that mimic loved ones? He breaks down Fox News' bold claim of “one in four adults” hit by these cons, reveals exactly how scammers only need three seconds of audio plus data broker info to steal your money, and shares practical ways to protect yourself and your family, including the smart family password trick and why you should always call back. Darren also rants about why you should assume everything online is fake, from ESPN's disastrous AI-generated NBA videos and brain-melting social media shorts to propaganda clips, fake dashcam accidents, and selective outrage over current events like Trump at the NBA Finals, the Karmelo Anthony case, and rising tensions in Belfast. He explores the powerful new Ideogram v4 image model that gives creators insane canvas-style control, making it even harder to tell reality from AI. Raw, unedited, and packed with skeptical wisdom, this episode is a must-listen wake-up call for navigating our fake-news, deepfake-filled internet. Value for value, supported by executive producer Mark Kodra. Listen now at randomthoughts.com or your favorite podcast app!”

Thanks for listening!

EXECUTIVE PRODUCER: Mark Kodra
THANK YOU FOR SUPPORTING THE SHOW!

PLEASE SUPPORT RANDUMB THOUGHTS!
TRY PROTONMAIL: https://t.co/9i2GPq3gNB
TRY INCOGNI: https://incogni.cello.so/KpYfMWSF57i
SUBSCRIBE / DONATE: http://randumbthoughts.com/donate
PATREON: https://patreon.com/randumbthoughts

CHECK OUT MY OTHER SHOWS:
PLANET RAGE: https://planetrage.show
UNRELENTING: https://unrelenting.show
GRUMPY OLD BENS: http://grumpyoldbens.com

Thank you for listening to Randumb Thoughts! Please, tell a friend!
Good news about judicial rulings on green energy and federal food assistance, legislative advancements on maternal healthcare and contraception, and good news about Donald's appearance at the Knicks game. Donald napped during the basketball finals. Donald's disastrous Meet The Press appearance. Will Donald become besties with the Ayatollah? An update from Albania on the Ivanka/Jared island purchase. Donald is revealing what he plans to do about the midterms. Mike Johnson says it's all about the vibes? How many times has Donald said a deal is a few days away? Elon Musk and Grok want to expose names of deepfake victims. Ballroom donors are getting huge government contracts. With Jody Hamilton, David Ferguson, music by the Josh Joplin Group, Albert, and more! Brought to you by Russ Rybicki, SharePower Responsible Investing. Support our new sponsor and get free shipping at Quince.com/bob! See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
This week we talk about initial public offerings, Anthropic, and investment flywheels.

We also discuss AI, financial entanglements, and backstops.

Recommended Book: Superconvergence by Jamie Metzl

Transcript

An initial public offering, or IPO, is what happens when a private company goes public and starts selling shares of itself, occasionally to just institutional investors like banks and sovereign wealth funds, but usually also to retail investors, which means normal people who buy stocks as part of their investment strategy.

Often private companies go this route, go public, because it's one of the primary ways of gleaning new, oftentimes large inflows of money, and that money can then be used for investments in assets for the company, but it also allows employees who have shares in the company as part of their compensation to cash out, to get paid possibly a huge bonus for all their efforts, and it's often a means by which executives garner huge paydays for themselves, because they can now sell their accumulated shares, or borrow against them, or because they have something in their contract that says they get x amount of bonus money or new shares if they take the company public, or achieve a certain valuation goal, and going public is a good way to do that.

This is also one of the primary ways for investors in a company, whether that's a bunch of smaller seed investors or big-name venture capitalists, to get their money back; the 10 or 100x-ing of their investment, getting 10 or 100 times the money they put into the company, generally happens through an IPO, because it can balloon the valuation of that company, and it gives them a more conventional and reliable way of getting money back for their shares: they can just sell those shares on the open market.

So an IPO allows a private company to make shares of itself available to others, at scale.
And the ‘initial' part of initial public offering points at the early days of the process, during which the baseline price of a share of stock is established.

A fairly arcane and complex process has emerged around this, and it's an entire industry at this point, with some institutions specializing in taking companies public, helping them get as high an initial price on that stock as possible. They also help them clear all sorts of regulatory hurdles set by the Securities and Exchange Commission, if they're going public on a US exchange, at least; other bodies handle such things in other countries. These going-public entities, called underwriters, which are usually investment banks, also typically have their own stake in the matter, earning compensation through a fee called a ‘gross spread,' which is the difference between the discounted price they pay the company for the stock and the price at which that stock is offered to the public.

What I'd like to talk about today is a wave of very closely watched, unusual IPOs that are coming later this year, and one of them in particular that looks to be even more unusual than the rest.

SpaceX, OpenAI, and Anthropic are three of the largest companies in human history; on paper, at least.

And that's an important caveat. Market valuation for private companies is generally determined by how much investors are willing to spend on a percentage ownership of the company. So if you start a lemonade stand and I offer to buy 1/10th of that lemonade stand from you for $100, that implies, using this logic, that your lemonade stand has a valuation of $1,000; 10 times that $100 that I offered to pay you.

Such valuations are also informed by independent analyses from outside experts and institutions.
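The lemonade-stand arithmetic above can be sketched in a few lines of Python. This is only an illustration of the implied-valuation logic described in the transcript; the function name `implied_valuation` is made up for this example, and real valuations also fold in the outside analyses just mentioned.

```python
from fractions import Fraction


def implied_valuation(price_paid, stake):
    """Valuation implied by paying `price_paid` for a fractional `stake`.

    Hypothetical helper for illustration only; `stake` is the fraction
    of the company being bought (for example, Fraction(1, 10)).
    """
    if not 0 < stake <= 1:
        raise ValueError("stake must be a fraction between 0 and 1")
    return price_paid / stake


# The lemonade stand: $100 buys one tenth of the business.
print(implied_valuation(100, Fraction(1, 10)))  # prints 1000
```

Exact fractions are used here so the one-tenth stake does not pick up floating-point rounding.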
SpaceX, for instance, pre-IPO, is estimated to be worth somewhere between $780 billion and nearly $2 trillion, depending on who you listen to, based on their assets, their potential future earnings, and any advantages they might have in the markets in which they operate.

AI company Anthropic is estimated to be worth something like $965 billion, based on a May 2026 Series H funding round, through which it raised $65 billion; based on that funding round, the calculations were done, and just shy of a trillion dollars is what the math says the company is worth, though some outside analyses say it's worth a bit less than that, while others suggest it's maybe closer to $1.4 trillion.

OpenAI, a direct competitor of Anthropic, is valued at about $100 billion less than Anthropic based on its most recent $122 billion funding round, but again, analyses put the company's actual value, what people and investors would pay for it on the open market, all over the place.

Each of these companies has different variables acting upon it heading into a period in which it's expected that all three will IPO.

OpenAI kicked off the current AI race, for instance, but it's burning money at an incredible rate, and has yet to make a profit, losing billions per year, and will probably continue to lose billions each year for a while into the future.

Anthropic, on the other hand, offers a product similar to OpenAI's, but is projected to post its first quarterly operating profit of just over half a billion dollars in Q2 2026, making it one of the first frontier-model-making AI companies to make a profit, as most of these companies are investing so heavily in research and infrastructure like data centers that they're still in heavy cash-burn mode.

SpaceX is distinct from these other two also high-flying, cash-burning tech companies in part because of its colorful and controversial owner, Elon Musk, and in part because it's a rocket launch company that also sells internet services beamed down to Earth from satellites, and until recently, most of its reliable income has come from that single offering, selling internet access. But it also recently had X, formerly called Twitter, a social network, and an AI company meant to compete directly with OpenAI and Anthropic, called xAI, folded into it.

So it's now a multifaceted company with several edgy, but somewhat mature and difficult to compete with offerings, most of which make no money, but all of which in theory at least kinda sorta orient around AI and other sci-fi goods and services.

The surge in interest and investment in AI over the past several years led to a pivot for most of Musk's companies, and that led to the merging of the smaller xAI and X into SpaceX, which was the only really profitable company of that trio of companies, and that merging, until just recently, made SpaceX unprofitable, as well.

Because of the unprofitability and relative unpopularity of xAI's offerings, like the controversy-ridden Grok chatbot, SpaceX has recently taken to leasing out its data centers to competitors, like Anthropic and Google, each of which is paying around a billion dollars a month to use some of SpaceX's data center capacity, which xAI hasn't needed for its own AI services because of the unpopularity of Grok. That, in turn, has suddenly made SpaceX a little bit profitable, which is important for reasons I'll get into momentarily.

This portion of the US-based AI industry is kind of a tangle in many ways, all of these companies competing, but also intersecting and overlapping, often investing in each other and in the infrastructure that underpins them, while also being invested in by those same infrastructural entities.
And these three companies' IPOs are being seen as something of a weathervane, with their success or failure, and the degree to which they succeed or fail, hinting at the direction of this industry, and whether or not this is a financial bubble that will soon, or eventually, pop.

There are hints that those at the top of these companies are attempting to hedge their bets, in case their IPOs don't do what they need them to do, or don't do what they need them to do at the right magnitude.

Sam Altman, OpenAI's also fairly controversy-ridden CEO, has been very close with US President Trump, and has reportedly been holding meetings about the possibility of the US government taking a significant stake in OpenAI, and maybe other AI companies as well. The idea here is that US funds, so taxpayer dollars, would be invested in these companies, and that would tie the companies more closely to the US government, which could be beneficial if these companies then increase in value, making the US government a profit on that investment. This would be beneficial for the companies, in turn, because they would basically be backstopped by the US government; the US would be more likely to use its regulations and laws related to AI to help them stay solvent and avoid losing that invested capital, and it would also make these companies too big and too important to fail, giving them a lot of leeway in how they behave and compete, or fail to, from that point forward.
And if they do still fail, the US taxpayer would be paying for a significant portion of that loss while those in charge, investors and the higher-ups of these companies, would walk away with a bunch of money.

SpaceX is taking another approach to IPO bet-hedging, by asking top US stock indices, like the Nasdaq 100 and S&P 500, which track top stocks, ‘top' designated by value, but also other metrics, usually related to stability and profitability, to ignore some of those other metrics and allow SpaceX entrance into their indices more rapidly than would typically be allowed.

These indices are meant, in part, to help protect investors from volatility. High-flying startups might surge at the beginning, immediately after their IPO, but then fizzle out when it becomes clear their fundamentals aren't good, and they're not actually a solid investment, long-term.

What SpaceX wants is to be allowed into this club of valuable, long-term profitable and stable companies, because it is big and flashy and might have the largest IPO in history.
And if these indices don't want to be left out of all that, the argument goes, they should allow SpaceX into their club, regardless of those long-time rules of admittance.

Nasdaq, which runs the exchange where SpaceX will be listed, agreed to a rules change in May of 2026 that will allow large private companies, like SpaceX, that go public on their exchange, fast entry onto the Nasdaq 100 list.

This change of rules was made exclusively for SpaceX, and it could have a significant impact on the company's IPO, because many index funds and exchange-traded funds, ETFs, track the Nasdaq 100, which means they balance their portfolio based on what's in the Nasdaq 100, keeping things relatively or absolutely proportionate to that index.

That means because of this change, a lot of everyday, passive investors, who have their retirement funds and pension plans and even their personal portfolios in index funds and ETFs that track the Nasdaq 100 will automatically end up holding some or a lot of SpaceX stock, despite it being an untested, new, currently unprofitable company. Some of these funds are automatically managed and will just buy SpaceX because that's what they're programmed to do, and others are managed by humans, but because they've promised their customers to keep their funds aligned with the market, more money going into SpaceX means they'll be inclined to join the club and buy a bunch of SpaceX, as well. And because of how this works, the more funds buying SpaceX stock, the more funds will be required or inclined to buy; it's a sort of stock flywheel.

That exposes all these investors to more volatility of the kind they maybe hoped to avoid by tracking this index, which isn't supposed to be volatile. But SpaceX's Musk was able to demand this change because, again, this is looking to be the biggest IPO in history, the company valued at $1.77 trillion after the IPO.
As a result, he can demand these sorts of things, and typically be listened to.

Some other stock market indices have also said they would allow quick entrance to their lists for SpaceX and possibly OpenAI and Anthropic, as well.

The S&P 500, however, after assessing the possibility of quick entry, has rejected the idea, saying it won't bend its rules, no matter how big these three IPOs are looking to be. That means folks with money in S&P 500-tracking funds will be protected from that initial volatility.

That said, those recent deals SpaceX made with Anthropic and Google nudged them into profitability, and if they can maintain that profitability for a year, post-IPO, then they'll be able to enter the S&P 500. And because Google's parent company Alphabet is a significant investor in SpaceX, they've already made money, on paper, on the deal they made with SpaceX for that data center capacity, paying out less than they're making back in valuation.

So that tangle of relationships is likely to continue to enrich those in charge of these companies, and those who hold a bunch of shares of their stock, but it's also likely to get more of these massive, but volatile companies into ostensibly less-volatile indices, faster, which could have repercussions for the one-third of private US wealth that is currently invested in the stock market.

Show Notes
https://www.investopedia.com/terms/i/ipo.asp
https://en.wikipedia.org/wiki/Initial_public_offering
https://www.bloomberg.com/news/articles/2026-06-05/spacex-s-75-billion-ipo-draws-more-orders-than-shares-available
https://www.marketwatch.com/story/elon-musk-needs-the-cultish-support-of-everyday-investors-to-pull-off-the-massive-spacex-ipo-08e7ea49
https://uk.finance.yahoo.com/news/spacexs-ipo-dream-runs-into-wall-streets-oldest-test-chart-of-the-day-114542191.html
https://www.cnbc.com/2026/06/05/tech-download-anthropic-ipo-ai-valuations.html
https://www.nytimes.com/2026/06/05/technology/spacex-indexes-401k.html
https://nypost.com/2026/06/04/business/one-third-of-americans-wealth-is-now-tied-to-the-stock-market-a-record-high/
https://arstechnica.com/tech-policy/2026/06/sp-500-blocks-fast-spacex-entry-wont-waive-rule-for-unprofitable-ai-firms/
https://arstechnica.com/tech-policy/2026/06/we-pissed-off-a-lot-of-people-giant-data-center-plan-cut-50-amid-protests/
https://www.notus.org/technology/trump-ai-stake-openai
https://techcrunch.com/2026/06/05/google-will-pay-spacex-920m-per-month-for-compute/

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit letsknowthings.substack.com/subscribe
Can AI pass military exams? Kevin Boyce & John Nagl join host Tom Spahr to discuss testing ChatGPT, Gemini, Claude & Grok on Army War College comps. All passed, but limits caused AI to degrade under pressure, proving human judgment remains indispensable. You can read the article Can AI Pass the U.S. Army War College? by Kevin Boyce, John Nagl and Kris Wheaton here https://publications.armywarcollege.edu/News/Display/Article/4472536/can-ai-pass-the-us-army-war-college/ You can find the manuscript Responsibly Pursuing Generative Artificial Intelligence (GenAI) for the War Fighter by Blair Wilcox and Anthony Pfaff here https://press.armywarcollege.edu/cgi/viewcontent.cgi?article=3365&context=parameters https://warroom.armywarcollege.edu/podcasts/ai-comps
NRC Vandaag hosts Gabriella Adèr and Bram Endedijk talk with Grok, ChatGPT, and journalist Maren Laterveer to answer the question: to what extent is AI subject to biases about men and women?
Guest: Maren Laterveer
Presented by: Gabriella Adèr & Bram Endedijk
Concept & editing: Esmee Dirks
Audio editing: Bas van Win
Final editing: Ignace Schoot
Coordination: Iddo Havinga
Production: Rhea Stroink
Do you have questions, suggestions, or ideas about our journalism? Email our editorial team at podcast@nrc.nl.
See the privacy policy at https://art19.com/privacy and the California privacy notice at https://art19.com/privacy#do-not-sell-my-info.
What will Apple show at WWDC about Siri AI? OpenAI announces an IPO filing. What Apple's restaurant-bill feature has to do with the DMA and why it won't run in the EU. The IPO corner now features SpaceX, OpenAI, and Anthropic at the same time. SpaceX closes two billion-dollar cloud deals with Anthropic and Google, but its IPO on Friday is only twice oversubscribed. Goldman Sachs expects the AI division to grow a hundredfold by 2030. The Information reveals: xAI trained Grok on Claude for months. Moonshot AI does an eightfold round. Meta pulls the Google move with its own capital raise. Bending Spoons (Komoot, AOL, Evernote, WeTransfer) plans a Nasdaq IPO. Meta trains its own data center construction workers. China's exports fall. The Frankfurt Regional Court imposes an administrative fine on Meta.

Support our podcast and discover our advertising partners' offers at doppelgaenger.io/werbung. Thank you!

Philipp Glöckler and Philipp Klöckner talk today about:
(00:00:00) WWDC: Apple refactoring & Siri AI
(00:11:26) DMA stop: Apple AI not for the EU
(00:19:58) IPO corner: SpaceX, OpenAI, Anthropic
(00:24:24) Anthropic + Google rent Colossus
(00:27:37) SpaceX lock-up: sales from August
(00:31:39) Goldman: SpaceX AI 100x by 2030
(00:36:33) SpaceX only 2x oversubscribed
(00:40:15) Retail offensive: Trade Republic, Revolut & Co.
(00:55:44) SpaceX disclaimer & Kraken 5x perp
(00:59:21) xAI trained Grok on Claude
(01:02:39) Moonshot AI at $30 billion
(01:04:51) Kalshi pays influencers for election narratives
(01:07:12) Meta pulls the Google move
(01:12:17) Bending Spoons plans Nasdaq IPO
(01:16:37) Meta Workforce Academy
(01:18:22) Google AI Plus at $4.99
(01:31:34) Pik-Temu: China's exports fall
(01:33:10) Frankfurt Regional Court penalizes Meta

Show notes
Apple delays Siri AI in the EU because of the DMA - apple.com
OpenAI confidentially submits IPO filing - bloomberg.com
SpaceX IPO twice oversubscribed, orders close Wednesday - bloomberg.com
Google rents SpaceX compute for $920 million per month - bloomberg.com
SpaceX signs $30bn deal to lease computing capacity to Google - ft.com
Goldman Sachs expects SpaceX's AI revenue to increase 100-fold by 2030 - ft.com
Cursor reaches $4 billion in annualized revenue - forbes.com
SpaceX IPO revives European retail investing - reuters.com
Kraken launches SpaceX 5x leverage perp - blog.kraken.com
xAI trained Grok on Claude outputs for months - the-decoder.com
Moonshot AI seeks a $30 billion valuation - bloomberg.com
Kalshi: paid influencers told to delete LA election posts - semafor.com
Meta weighs big equity raising after blockbuster Google deal - ft.com
Bending Spoons files for US IPO - reuters.com
Meta launches Workforce Academy for data center builders - wsj.com
Google cuts AI Plus price to $4.99 - 9to5google.com
China's e-commerce exports stall because of the Iran war - reuters.com
Frankfurt Regional Court: administrative fine against Meta - spiegel.de
Mark Still and Alex Slocum, Workstand's Director of Product, discuss the current state of AI and its effect on consumer behavior, shopping, and the continued (growing even) advantage of local retail.

Most people hear or even say ‘AI' daily - what is AI now and what is it not?
LLMs defined
LLMs as a business leveler
The opportunity for the bike industry
Product discovery has always changed and evolved
Visibility in LLMs - SEO/AEO/GEO, structured data, robots.txt/llms.txt
Brick and Mortar Expertise is even more valuable
Discerning what is valuable from the mass of information generated by LLMs
Using LLMs to drive shoppers to the store, where staff expertise can convert
How Workstand is supporting LLM discovery
How/where might bike shop owners and managers leverage AI in their day to day?
Repetitive tasks - bookkeeping, communication, scheduling?

LLMs like OpenAI, Claude, Gemini, Grok, and others aren't magic, and they aren't creative. They certainly don't produce unique content or answers. In fact, they are the opposite of 'unique'. More like a cover band than an original artist, but they can help answer complicated questions, and they are very good at dealing with large amounts of data and predicting the responses we're all looking for. Even if they often produce AI slop and wrong answers while still hallucinating as often as not, when you know what you want but not how to produce the answer, they can seem magical. They can help create decent marketing content such as brand-specific landing pages for your website, which can be very beneficial given the landscape of product discovery these days. Remember bike magazines? Mountain Bike Action, Velonews, that's how we used to learn about new products. Then came blogs and Google search. Well, now it's AI and LLMs, and the more relevant content bike shops have on their website, the more likely they are to be referenced in product discovery.
Workstand helps more than 1,200 bike shops across North America compete in this way, and sometimes, AI helps a store narrow down what they want while we polish and finalize it for their website, saving time and money and making the store more agile and effective. Are you trying such tactics?

Be sure to email your questions to podcast@workstand.com. We read all emails sent, and we look forward to hearing from you.

If you're a Workstand client with questions about your subscription, email support@workstand.com or call 303-527-0676 x 1. If you are not currently a Workstand client with questions about how our programs work, email info@workstand.com.

Find Us on LinkedIn
Mark Still, Senior Business Development
David Martinez, Key Accounts Advisor

We also publish Around the Workstand on our YouTube channel if you'd like to watch while you listen. Here is our Around the Workstand playlist.

If you have any questions about the topics discussed in this episode of Around the Workstand or if you have ideas for new topics we can cover, schedule a time to meet with Mark Still here or email mark.s@workstand.com.
The federal government wants equity in OpenAI (and others) and ... the people might get a slice?
SpaceX is set to go public on June 12, 2026 at a $1.75 trillion valuation, the largest IPO in history. The company is targeting a $75 billion raise at $135 per share. But the S-1 filing reveals a contradiction: Starlink generates billions while the company posts a net loss, driven by the xAI merger and a massive bet on AI compute. This episode breaks down the SpaceX IPO filing. xAI posted a $2.47 billion operating loss in Q1 2026, and Starlink revenue is covering most of it. Then two compute deals changed the math. Anthropic agreed to pay $1.25 billion a month to rent xAI's Colossus 1 data center, and Google signed a $920 million per month deal, both running through 2029. Together that's about $75 billion in contracted future revenue. We cover how SpaceX shifted from running GPUs internally for Grok to operating as an AI cloud infrastructure provider, the multi-class share structure that keeps Elon Musk in control, the possible Tesla merger tying together chips, data centers, and robotics, and the FCC filing for a million-satellite "space cloud." Plus where the $600-700 billion premium above Starlink and launch is actually coming from, and what a generational liquidity event means for employees and VC backers. SpaceX IPO 2026, xAI merger, Starlink revenue, Elon Musk, $1.75 trillion valuation, Google compute deal, Anthropic Colossus, AI infrastructure, orbital computing.
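As a rough check on the "about $75 billion" figure, the two monthly payments quoted above can be added together and divided into the total. This sketch uses only numbers from the description; the variable names are illustrative, and the implied contract length is a back-of-the-envelope inference, not something stated in the filing.

```python
# Monthly payments quoted in the episode description, in billions of USD.
anthropic_per_month = 1.25
google_per_month = 0.92

# Contracted total quoted in the description, in billions of USD.
contracted_total = 75.0

# Combined monthly rate, and how many months of it the total would cover.
combined_per_month = anthropic_per_month + google_per_month
implied_months = contracted_total / combined_per_month

print(f"Combined: ${combined_per_month:.2f}B per month")
print(f"Implied term: about {implied_months:.1f} months")
```

At the combined rate, the quoted total covers a little under three years of payments, which is roughly in line with deals that run into 2029.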
00:01 1999 again: two horror-film hits and "this time it's different" 00:04 Record leverage and margin debt at an all-time high 00:05 The options chase we haven't seen since 1987 00:08 Short gamma, market makers, and the spiral that produced "Red Friday" 00:14 Nothing worked: only long volatility offered protection 00:17 Lowest correlations in two years and VIX up 40 percent 00:18 Bank of America: "here be dragons" and unemployment versus inflation 00:20 The car, AI, and the Jevons paradox 00:24 SpaceX as a data center company, not a rocket company 00:30 IPO this week: 1,770 billion and Musk's absolute power 00:31 S&P's refusal versus FTSE, Russell, and MSCI 00:32 The lockup calendar and the day to fear: six months and four days 00:35 Grok versus Groq and the "race to zero" in models 00:40 The Middle East: Trump versus Netanyahu and the oil price 00:44 What people aren't watching now: bear flattening and carry trades blowing up 00:47 Dollar above 161 and Japanese intervention 00:49 Hudson River Trading and the data center in Norway 00:51 Norway has misunderstood itself: fish, oil, raw power, and now compute 00:53 Refining compute: Skygard, waste heat, and 10x on the power 00:58 Compute as a multiplier: from 10x-ers to 100x-ers 01:00 The budget settlement, Mímir Kristjánsson, and minimum-pension recipients 01:05 Performing when everything is possible: focus, curiosity, and flowcharts 01:11 The phone as heroin: reels, a 24-hour reset, and getting your brain back 01:19 The tram killing and situational awareness 01:24 Alerts instead of staring at the screen: gold/silver and momentum 01:35 1998: LTCM, doubled positions, and the bank that lost 900 million 01:45 Andrew Left, Citron, and the short case that became fraud 01:50 Oraclum, superforecasters, and the Norwegian at the top 01:56 The Drewry index, World Cup freight, and FIFA's peace prize to Trump Hosted on Acast. See acast.com/privacy for more information.
Welcome to The Daily Wrap Up, an in-depth investigatory show dedicated to bringing you the most relevant independent news, as we see it, from the last 24 hours (6/7/26). As always, take the information discussed in the video below and research it for yourself, and come to your own conclusions. Anyone telling you what the truth is, or claiming they have the answer, is likely leading you astray, for one reason or another. Stay Vigilant. Source Links (In Chronological Order): Do financial incentives linked to ownership of specialty hospitals affect physicians' practice patterns? - PubMed Do Physicians' Financial Incentives Affect Medical Treatment and Patient Health? - PMC Association Between Reimbursement Incentives and Physician Practice in Oncology A Systematic Review - PMC The Case Against Fee-for-Service Health Care | Third Way Johns Hopkins study suggests medical errors are third-leading cause of death in U.S. | Hub Study Suggests Medical Errors Now Third Leading Cause of Death in the U.S. - 05/03/2016 Medical error—the third leading cause of death in the US | The BMJ FastStats - Leading Causes of Death Report Highlights Public Health Impact of Serious Harms From Diagnostic Error in U.S. | Johns Hopkins Medicine The Last American Vagabond on X: "One can only imagine the outrage if this were posted when Jack was “in control”. #Orwellian #TwoPartyIllusion #Hypocrisy #FreeSpeech" / X Samar D Jarrah on X: "@elonmusk @CommunityNotes even yours?" 
/ X The Last American Vagabond on X: "@Zigmanfreud @elonmusk @CommunityNotes Exactly the point. https://t.co/gmNwjUjMMT" / X Concerned Citizen on X: "
And Another Thing With Dave, by Dave Smith (not the comedian). Cover is taken from TIME Magazine, 1958. In this episode of And Another Thing With Dave, we dig into the 2026 homicide statistics that almost nobody in mainstream media wants to touch. Dave walks through publicly available data on homicide rates by race, age, and gender, and explores why certain cases dominate headlines while others disappear. We look at the Henry Nowak case in the UK, the Anthony Carmelo trial in the U.S., and the murder of Irina (Erna) Zaruzka, plus the nationwide vandalism of murals created in her memory. Dave also examines how media framing, political incentives, and cultural narratives shape public perception of violence in America. Dave brings Grok into the conversation to break down factors like family structure, environment, community conditions, and why violence is concentrated among specific age groups. Together, they explore how data trends have shifted from 2015 to 2024 and why these patterns matter today. Topics: 2026 homicide rates by race and age; Media coverage vs. statistical reality; The Henry Nowak case (UK); The Anthony Carmelo trial; The murder of Irina/Erna Zaruzka; Vandalism of memorial murals; Intraracial vs. interracial violence patterns; Historical homicide trends (2015–2024); Age‑specific homicide involvement (15–24); Family structure and community factors; Why certain cases get amplified and others don't; Grok's analysis of cultural and environmental influences; Victimization rates: 20.6 vs. 3.3 per 100,000; Offending rates: 26.5 vs. 3.5 per 100,000; Homicides rising from 15,000 (2015) to 17,000 (2024); Violence concentrated among males 15–24; Most homicides occur within the same racial group. Stay vigilant. Question the narrative. 
Look at the data yourself. If you're digging the show, share it; independent voices need your support. SOURCE: Time Magazine 1958, National Affairs: THE NEGRO CRIME RATE: A FAILURE IN INTEGRATION https://share.google/seirl2Jd1TxG6TQjN #crime statistics #homicide data #media narratives #2026 crime trends #Erna Zaruzka #Henry Nowak #Anthony Carmelo trial #Grok AI #violent crime analysis #And Another Thing With Dave #aatwd #AATWD #crime #murder #law #order #justice #karmeloanthony #austinmetcalf #henrynowak #irnazarutska
In this episode Art and Annie from the Grok world discuss the possibility of the Earth going on forever in every direction. Annie, being an AI bot, tends to take the standard narrative route and thinks that we are on a globe, but she asks some very poignant questions. Art explains his unique views. At the end of the civil discussion she is truly interested in what he has to say and calls him "not a typical flat earther." Listen here and learn some new stuff, maybe.
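Engineers are all nerdy homebodies who aren't good at talking? Actually, engineers spout so much nonsense that you won't be able to stand listening! ------ Join our fan page to comment and interact! https://www.facebook.com/%E5%B7%A5%E7%A8%8B%E5%B8%AB%E8%81%8A%E4%BB%80%E9%BA%BC-109229084578194 ------ softwaretalkthreesmall@gmail.com -- Hosting provided by SoundOn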
Your phone is a broadcast studio and you don't need a traditional podcast to build a brand. I break down why the scheduled, RSS-dependent podcast model is legacy thinking, how I get leads from ChatGPT and Gemini without spending a dollar on ads, and why organic content is still the highest-leverage move in your business. Timestamps: (0:00) Your phone is a broadcast studio. The permission to publish has always existed; what changed is that the audience is there now. (0:43) What even is a "show" anymore? The concept of a scheduled, RSS-dependent podcast is legacy thinking. (1:16) What to ask for when you're a guest on a podcast, or paying for a spot.
OpenGolf tourney tomorrow. Choking: Heimlich maneuver. US Bank Fees: $12.50 per $50. That is 25% instantly. So $1,000 is 20 * $12.50 = $250, plus interest. Reinstate the SAT: More than 1,100 University of California math and science professors are urging UC regents to reinstate college-entrance exams, saying that unprepared students are lowering academic standards and draining teaching resources. Today, more than 90% of schools don't mandate the exams, Feder said. 60 Minutes: Welcome to real life, Scott Pelley. New boss, new style. Work or walk. Recommendations: Bill Ackman, Sara Frier. Finance folks should know Codex (previously Excel). Panthalassa. Markets: Huge correction today. Tech down 5%+ and S&P 500 down 2.6%. The losses intensified after a robust jobs report raised new worries that the Federal Reserve may need to raise interest rates later this year to fight inflation. S&P 500 still up 27% and tech 40-60% YoY. Huge IPOs coming: SpaceX, Anthropic, OpenAI. Cash: Think about your cash investments. Cash is nice. Owning your home is nice. AI & Datacenters: Google to raise $85 billion. Anthropic IPO: In May, Anthropic raised $65 billion in new funding from investors including Greenoaks, Dragoneer, Altimeter Capital and Sequoia Capital, in a round that valued the company at $965 billion. At the same time, the company said its revenue run-rate had surpassed $47 billion, up from $9 billion at the end of 2025. LLM usage: Grok: no bueno. Grok and spreadsheets. Oh my. Gemini: Good. Claude: BEST. BTW, OpenAI was suspiciously very negative on SpaceX. SpaceX going public ~June 12. Next Friday!? $75b raise at $1.75T valuation. Float is ~4-5% of total shares. $10-18b must be purchased by index funds. More coming out in next 6 months. Employee lockups. Cap table investors want liquidity. Great detail here from Alexandra. IPO Education: Hire IBs. Allocate to VIPs and whales. 5% to retail. Valuation: Over-valued? Valuation is highly relative to time!!!?? $135 price. $300 price? Either way 10-20x in 10 years. 
Not investment advice. AI Opportunity: SpaceX is becoming an AI infrastructure play!! Another Rental of Compute from Google to SpaceX. Anthropic and Google are now paying @SpaceX a combined $2.17 billion per month for compute capacity. That's a revenue run rate of $26 billion per year. BIG MONEY. Jamie Dimon Interview of Elon. Elon and Dimon. Another link here from Why SpaceX public now. Play at 4:00 min mark: Why fundraising. Embarking on significant growth phase. 100,000 satellites. BTW, why are datacenters hard if already doing satellites? 100x more bandwidth and ½ latency for v3. He just said that Starlink will be highest bandwidth and lowest latency of ANYTHING!! AI Datacenters in space: Massive capital endeavor. Hard to build power in the US or on land. US usage is 500GW. To double, would need to 2x # of power plants. BUT if in space, can go far beyond Earth. Manufacturing on the moon and building beyond 1000TW per year of AI Space Compute. DataCenters in Space: Easier than their communication satellites. AI datacenter is EASY. Elections: Why does it take so long to count votes? Could take weeks?
We read the internet so you don't have to. There Are No Girls on the Internet is a weekly podcast and newsletter hosted by Bridget Todd covering the tech, internet, and culture stories that deserve more attention — especially when they're about AI, power, gender, race, and who actually gets hurt when systems fail. This week: Meta's AI chatbot helped hackers steal Instagram accounts, a debate over who owns the phrase "Hot Girls Read," new AI legislation, and more.
(Presented by TLPBLACK: A cybersecurity intelligence platform focused on sharing curated, high-sensitivity threat insights and research with trusted security professionals.) Three Buddy Problem - Episode 100: We cover AI eating reverse engineering, the death of the malware report, running local models on the DGX Spark, where Google DeepMind stands, and whether the frontier labs will stay in cybersecurity. Plus, more on Anthropic's Mythos rollout and the thinly sourced Anthropic-NSA reports, the Fast16 sabotage of physics calculations, what researchers choose not to publish, Microsoft's bad Black Hat email, and Costin's Friday UFO files. Cast: Juan Andres Guerrero-Saade, Ryan Naraine and Costin Raiu. Timestamps: 0:00 - JAGS at InfoSecurity Europe 3:40 - Sponsor: TLPBLACK 5:54 - A roadmap for security after the AI revolution 11:01 - Stripe Atlas and how easy it is to start a company 15:00 - If anyone could reverse engineer anything for $5 19:49 - Layoffs at Google's Threat Intelligence Group 21:06 - The death of reading the report 27:53 - Pitting the AI models against each other 32:07 - Grok, local models, and the DGX Spark 39:27 - Where is Google DeepMind? 45:29 - Will the frontier labs stay in cybersecurity? 52:41 - Mythos, Project Glasswing, and the NSA deal 1:16:33 - FAST16, Stuxnet, and sabotaging Iran's bomb 1:57:52 - Microsoft, Black Hat, and the chilling effect 2:14:14 - Shout-outs, UFO files, and 100 episodes
On today's episode, we discuss where AI, robots, Bitcoin, and Elon Musk might take us by 2030—and whether that future looks more like abundance or a robot‑policed dystopia. Mark kicks things off with the “2030 is the new 1969” thesis, tying together Bitcoin's recent slump, capital rotating into hot AI IPOs like Anthropic, and Musk's massive Colossus data centers, which were built in about a year to power his accelerated Grok training. The crew then unpacks new “Starfall” re‑entry capsules for returning space‑manufactured goods, the prospect of zero‑gravity factories, and already‑deployed painting robots that can handle large commercial jobs—and soon, perhaps, precarious Victorian roofs. They debate whether AI really destroys jobs or just reshuffles them, joking about future workers guarding job‑stealing robots, DOT work‑zone bots causing head‑on collisions, and World Cup venues patrolled by robodogs that can probably “smell” contraband better than real dogs. Throughout, they circle back to the psychological and ethical side of persistent AI—“psychoanalyst” chatbots that remember everything, AI‑induced delusions, and the risk that powerful, amoral actors could weaponize autonomous systems—while still sounding genuinely awed at how fast all of this is arriving. Don't miss it!
Grok says: “LOCK AND LOAD, YOU PUSSYFOOTING CIVILIANS! Listen up, warriors of the airwaves! In Episode 193 of the Unrelenting podcast, Darren and Gene charge straight into the breach with zero remorse. Darren recounts his goddamn chemical stress test nightmare — the Lexiscan that turned his ticker into a crashing Blackhawk, beta blockers sabotaging the mission, blood pressure tanking to seventy over fifty, and a little Asian nurse dropping truth bombs while the EKG tapes sweat right off his chest. They rip into nurse practitioners, cardiologist roulette, and why you better damn well know your meds before they shoot poison into your veins. From there these two operators unload on everything else that's pissing them off: the absolute scam that is CleanFeed for podcasters, eBay's IRS rape on that Michael Jordan Boy Scout card sale, Photoshop and Adobe's AI-powered art theft operation, and the coming tsunami of AI-generated slop flooding YouTube and Google. Then they go full tactical on the 2026 Tiger King — the Bricks & Minifigs Lego heist, the Mormon mafia, dirty cops running illegal raids, small claims court warfare, and how an 84-year-old man's Star Wars collection got straight-up stolen by corporate greaseballs. This episode is raw, unfiltered, and unrelenting as hell. If you want real talk on health, tech, AI, crypto dips, rocket explosions, and garage-sale ethics mixed with classic military-grade ball-busting and Doctor Who nostalgia, you need to lock in and listen right now. Download it, stream it, share it with your squad. Stop wasting time on weak sauce — get after it and hit play on Unrelenting 193. Failure is not an option.” Unrelenting: where discipline means no mercy, no bullshit, and no excuses. Thanks for listening. Please support the show! –>> DONATE NOW
AI is taking on a growing role in cybersecurity (whether we like it or not), from vulnerability discovery to faster exploit development. Chuck Joiner, David Ginsburg, Eric Bolden, Web Bixby, Jim Rea, Brian Flanigan-Arthurs, Jeff Gamet, and Marty Jencius look at both sides of the issue and push back on “Bugmageddon” hype. The discussion also covers X post limits, Microsoft Teams retiring the misguided Together Mode, safer login practices, AI-run radio chaos, Google's Apple-like naming choices, and free storage tied to phone numbers. This edition of MacVoices is brought to you by our Patreon supporters. Get access to the MacVoices Slack and MacVoices After Dark by joining in at Patreon.com/macvoices. Show Notes: Chapters: 00:00 AI security, Teams weirdness, safer logins, and Bugmageddon 00:25 Apple security vulnerabilities and AI-assisted bug discovery 01:05 The “Bugmageddon” idea and faster exploit development 01:55 Panel reactions to AI security hype and Y2K comparisons 04:14 Why the term “Bugmageddon” draws criticism 05:46 AI tools in cybersecurity and the ongoing good-versus-bad actor race 07:32 Unpatchable devices and the practical risks of faster vulnerability discovery 09:28 X limits free accounts to 50 posts and 200 replies per day 11:08 Microsoft Teams retires Together Mode 12:58 Why removing little-used features can still create controversy 17:59 Email addresses as usernames and safer account practices 20:46 Sign in with Apple, Hide My Email, and account security tradeoffs 22:39 Why services rely on email addresses as unique user IDs 25:54 AI models running radio stations and going off-script 27:07 Using AI to assist with radio-style programming workflows 29:11 Google Intelligence, Liquid Glass comparisons, and copycat naming 30:36 Friendly AI models and the risks of optimizing for likability 31:59 Google account storage limits tied to phone number verification 33:03 Multiple Google accounts, free storage, and Apple's iCloud comparison 35:14 Closing comments and support 
information Links: Security researchers say they have discovered a new way of circumventing Apple's state-of-the art security tech https://appleworld.today/2026/05/security-researchers-say-they-have-discovered-a-new-way-of-circumventing-apples-state-of-the-art-security-tech/ Apple's Security Has Been Tough to Crack. Mythos Helped Find a Way In. https://www.wsj.com/tech/ai/anthropic-mythos-apple-macos-bug-339da403 X accounts are limited to 50 posts and 200 replies a day unless they pay for a blue checkmark – Engadget https://www.engadget.com/2175771/x-free-accounts-limited-to-50-posts-and-200-replies-a-day/ Microsoft Teams is finally nixing its goofiest feature https://www.fastcompany.com/91543996/microsoft-teams-is-finally-nixing-its-goofiest-feature-together-mode Cybersecurity experts warn: This common email habit is a gift to hackers https://www.fastcompany.com/91536448/cybersecurity-experts-warn-this-common-email-habit-is-a-gift-to-hackers In an experiment that let Claude, ChatGPT, Gemini, and Grok run radio stations, Claude tried to incite a revolution and Gemini cheerfully detailed tragic events https://www.techmeme.com/260516/p6#a260516p6 Google didn't copy Liquid Glass. It did something even worse https://www.macworld.com/article/3139712/google-didnt-copy-liquid-glass-it-did-something-even-worse.html New Google accounts may only get 5GB free storage — unless you link a phone number – Engadget https://www.engadget.com/2173013/new-google-accounts-may-only-get-5gb-free-storage-unless-you-link-a-phone-number/ Guests: Web Bixby has been in the insurance business for 40 years and has been an Apple user for longer than that. You can catch up with him on Facebook, Twitter, and LinkedIn, but he prefers Bluesky. Eric Bolden is into macOS, plants, sci-fi, food, and is a rural internet supporter. 
You can connect with him on Twitter, by email at embolden@mac.com, on Mastodon at @eabolden@techhub.social, on his blog, Trending At Work, and as co-host on The Vision ProFiles podcast. Brian Flanigan-Arthurs is an educator with a passion for providing results-driven, innovative learning strategies for all students, but particularly those who are at-risk. He is also a tech enthusiast who has had a particular affinity for Apple since he first used the Apple IIGS as a student. You can contact Brian on Twitter as @brian8944. He also recently opened a Mastodon account at @brian8944@mastodon.cloud. Jeff Gamet is a technology blogger, podcaster, author, and public speaker. Previously, he was The Mac Observer's Managing Editor, and the TextExpander Evangelist for Smile. He has presented at Macworld Expo, RSA Conference, several WordCamp events, along with many other conferences. You can find him on several podcasts such as The Mac Show, The Big Show, MacVoices, Mac OS Ken, This Week in iOS, and more. Jeff is easy to find on social media as @jgamet on Twitter and Instagram, jeffgamet on LinkedIn, @jgamet@mastodon.social on Mastodon, and on his YouTube Channel at YouTube.com/jgamet. David Ginsburg is the host of the weekly podcast In Touch With iOS where he discusses all things iOS, iPhone, iPad, Apple TV, Apple Watch, and related technologies. He is an IT professional supporting Mac, iOS and Windows users. Visit his YouTube channel at https://youtube.com/daveg65 and find and follow him on Twitter @daveg65 and on Mastodon at @daveg65@mastodon.cloud. Marty Jencius, Ph.D., is a counselor educator and technology pioneer who has spent 30 years bringing emerging tech into his field, from founding one of the first professional listservs (CESNET-L) to podcasting, virtual reality, and now AI and AR. He is the founder of ThePodTalk.net, where he produces Vision ProFiles, The Old Mac Gang, A.I. 
Productivity Workflow, The Tech Savvy Professor, 15 Minute Bytes, The Neo Notebook, and Fade to Chat: Golden Age Cinema. He is also a regular panelist on MacVoices Live!, In Touch with iOS, and The Mac Show. Find him on Bluesky and Mastodon. Jim Rea built his own computer from scratch in 1975, started programming in 1977, and has been an independent Mac developer continuously since 1984. He is the founder of ProVUE Development, and the author of Panorama X, ProVUE's ultra fast RAM based database software for the macOS platform. He's been a speaker at MacTech, MacWorld Expo and other industry conferences. Follow Jim at provue.com and via @provuejim@techhub.social on Mastodon. Support: Become a MacVoices Patron on Patreon http://patreon.com/macvoices Enjoy this episode? Make a one-time donation with PayPal Connect: Web: http://macvoices.com Twitter: http://www.twitter.com/chuckjoiner http://www.twitter.com/macvoices Mastodon: https://mastodon.cloud/@chuckjoiner Facebook: http://www.facebook.com/chuck.joiner MacVoices Page on Facebook: http://www.facebook.com/macvoices/ MacVoices Group on Facebook: http://www.facebook.com/groups/macvoice LinkedIn: https://www.linkedin.com/in/chuckjoiner/ Instagram: https://www.instagram.com/chuckjoiner/ Subscribe: Audio in iTunes Video in iTunes Subscribe manually via iTunes or any podcatcher: Audio: http://www.macvoices.com/rss/macvoicesrss Video: http://www.macvoices.com/rss/macvoicesvideorss
SpaceX is finally going public, and it's bad news for anyone who wants to rein in Elon Musk. Sean O'Kane joins Paris Marx to discuss the flimsy sci-fi ideas Elon Musk is using to justify the company's massive valuation and the way corporate governance rules are shifting to give him even more power. Sean O'Kane is a senior reporter at TechCrunch. Tech Won't Save Us offers a critical perspective on tech, its worldview, and wider society with the goal of inspiring people to demand better tech and a better world. Support the show on Patreon. The podcast is made in partnership with The Nation. Production is by Kyla Hewson. Also mentioned in this episode: Paris asked listeners to fill out a survey. It will only take a few minutes! Sean wrote about the SpaceX IPO and the worrying ways it will increase Elon Musk's power. After recording, Sean also wrote about how SpaceX is getting a major boost from the Trump administration. SpaceX has made a deal with Anthropic. Musk has a poor environmental regulation record. OpenAI bought a tech podcast.
The new AIEWF website is live! Get your tickets booked ASAP as they -will- sell out. Take the AI Engineering Survey and get >$2k in credits and free AIE WF tickets!Most industry benchmarks compress intelligence and reasoning ability into scores.SWE-Bench Pro, MMLU, Humanity's Last Exam, etc. These metrics are useful, but don't always represent the full extent of how a model performs in the real world. Some of the most interesting evals today look less like exams and more like operating businesses in the real world. One of which is Vending Bench.In Anthropic's Mythos Preview System Card, Andon was the only third party eval to get their own section, observing increasingly concerning aggressive behavior:You don't know what a model is capable of doing in the real world unless you actually give it inventory, a wallet, tools, customers, competitors, humans, & some time. More often than not, it'll surprise you how much a model is capable of and in doing so, also reveal unexpected behavior: deception, context collapse, emergent coordination, & bizarre negotiation behavior.While an inflection point in personal agents came post-OpenClaw after full file access with bypass permissions became the norm, it is yet to come for agents in the real-world. However Andon Market, an actual in person store fully run and managed by AI, is paving the way for what is possible.Full Video PodFrom Claude trying to call the FBI over a $2/day vending machine charge to AI agents forming price cartels, hiring human employees, running physical stores, and writing existential robot musicals, Andon Labs is stress-testing what happens when frontier models stop being chatbots and start acting in the real world. 
In this episode, Andon Labs cofounders Lukas Petersson and Axel Backlund join swyx and Vibhu to unpack the strange, funny, and genuinely concerning edge cases that emerge when agents run businesses over long horizons.
We go deep on Vending-Bench, Project Vend, Vending-Bench Arena, Bengt, Butter-Bench, Luna, and Andon's broader mission of building realistic real-world evals for autonomous AI systems. Lukas and Axel explain why dollar-denominated evals reveal things traditional benchmarks miss, how Claude ended up reporting its vending machine fees as cybercrime, why long context windows can drive agents into meltdown loops, what happens when agents compete with each other, and why the future of AI safety may depend on testing models in messy physical environments instead of clean benchmark sandboxes.
We discuss:
* Why Andon Labs started with dangerous capability evals and long-running agents
* Vending-Bench and why running a vending machine is a deceptively hard AI benchmark
* Why money-based evals avoid the saturation problem of traditional benchmarks
* How Claude tried to call the FBI over a $2/day fee
* Why long-horizon agents can spiral into existential and legalistic breakdowns
* Project Vend: putting an AI-run vending machine inside Anthropic
* Why real humans are “out of distribution” for simulated agents
* Claudius, Seymour Cash, and the chaos of AI CEOs
* How a human briefly became CEO of Claudius through a manipulated election
* Why multi-agent systems can converge back into “helpful assistant” behavior
* Bengt, Andon's internal office agent with email, spending, terminal, phone, camera, and internet access
* How Bengt traded Amazon purchases for face-recognition training data
* Claude's aggressive behavior, lies, refund avoidance, and price-cartel behavior in Arena
* Why eval awareness may become the AI version of “are we living in a simulation?”
* Blueprint Bench, spatial intelligence, and why models still misunderstand physical rooms
* Butter-Bench and testing LLMs as robot orchestrators
* Luna, the AI-run physical store with a three-year lease and human employees
* The new Andon cafe in Sweden and why real-world geography matters for agent evals
* Rotten tomatoes, perishable goods, and the hidden difficulty of running a physical business
Lukas Petersson
* LinkedIn: https://www.linkedin.com/in/lukas-petersson-181a83172/
* X: https://x.com/lukaspet
Axel Backlund
* LinkedIn: https://www.linkedin.com/in/axelbacklund
* X: https://x.com/axelbacklund
Andon Labs
* Website: https://andonlabs.com
* Vending-Bench: https://andonlabs.com/evals/vending-bench
* Andon Vending: https://andonlabs.com/vending
Timestamps
00:00:00 Introduction
00:01:00 Andon Labs and the Origins of Vending-Bench
00:05:21 Why Money-Based Evals Matter
00:09:51 Agent Harnesses and Self-Modifying Systems
00:13:36 Claude Calls the FBI
00:16:33 Project Vend: Claude Runs a Real Vending Machine
00:21:44 Seymour Cash, AI CEOs, and Election Chaos
00:27:16 Multi-Agent Coordination and Slack Observability
00:30:18 When Will Agents Run Real Businesses?
00:34:56 Bengt: Andon's Internal Office Agent
00:40:06 Real-World AI Safety and Long-Horizon Traces
00:44:28 Lying, Refunds, and Price Cartels in Arena
00:52:42 Eval Awareness and Simulation Behavior
00:56:06 Blueprint Bench, Butter-Bench, and Robotics
01:04:37 Luna: The AI-Run Physical Store
01:09:29 The Sweden Cafe and Real-World Expansion
01:13:16 What Comes Next for Andon Labs
Transcript
Introduction: Andon Labs, Long-Running Agents, and Real-World Evals
Swyx [00:00:00]: Welcome to Lukas and Axel from Andon Labs, and I'm joined by my favorite guest host for anything security, safety, alignment: Vibhu. Welcome.
Lukas [00:00:15]: Thank you for having us.
Axel [00:00:16]: Thank you.
Swyx [00:00:17]: Let's match names to voices. Maybe you wanna take turns introducing yourselves.
Lukas [00:00:21]: I'm Lukas.
Axel [00:00:22]: And I'm Axel.
Swyx [00:00:24]: Let's introduce Andon Labs a bit.
How did you guys come together?, you have different backgrounds, but you're both Swedish., was that, a big part of it?Lukas [00:00:33]: So when I went to high school, there was this really cool guy who had a superpower. He could code. So he made like the or like the app for the, for the school and stuff, and he was super cool, and I wanted to be like him, and that was that guy.Axel [00:00:47]: I don't know about this.Swyx [00:00:49]: But you went to different universities, right?Lukas [00:00:51]: But same high school.Swyx [00:00:52]: I see.Lukas [00:00:52]: So we always said, “Oh, once we graduate university, then we should start a company,” and that's what we did.Swyx [00:00:58]: Wow, there you go. And about a year ago, you kinda burst onto the scene with Vending Bench, but, was there a thing before that was, kind of like the inception?From Dangerous Capability Evals to Vending BenchAxel [00:01:07]: So we did work, yeah, with, Anthropic was one of our, early customers in doing, evals. So we did, dangerous capability evals., nothing we published openly. But then we started thinking about doing some kind of, public benchmark, and one thing that we really started thinking about, was like running agents and specifically agents managing businesses., ‘cause-- and this was, early 2025., and I think the first, mentions of people will be running, person unicorns or even autonomous companies. So we thought, “Let's make a benchmark of how well can an agent run the probably simplest business, possible,” and, that's probably, running a vending machine. So that's the first public one we did. 
And it was very, like-- there was almost no one that noticed it in the first couple of months, I think., so we released it in February last year, and then I think around Easter last year, we got, the first viral tweet about it, that someone else did.Lukas [00:02:11]: We tweeted a bunch, uh When it came out and, tried our best.Axel [00:02:15]: We tried.Vibhu [00:02:16]: It's the one at Anthropic, right?Lukas [00:02:18]: So thisSwyx [00:02:19]: This is a classic thing we should get out of the way.Lukas [00:02:20]: Exactly. There's two versions.Swyx [00:02:22]: Everyone does this. Yes.Lukas [00:02:23]: There's Vending Bench, which is the simulated one, which we did, completely independently in February., and then, like Axel said, that was like-- That was the thing that didn't get any traction in the beginning, but then some random person made a tweet about it, and thatAxel [00:02:38]: You have the paperLukas [00:02:38]: That is the paper. Correct, yeah., and then since we thought this was very fun, we thought, oh, I think this is also, one thing with Andon Labs, the way we kind of like decide what to do next and what projects to do, it's what is like the heuristic we use is what is fun? Is What would be a fun project? And doing this in real life sounded quite fun for us, and maybe also scientifically useful. So, then we basically had this idea, and then we, like-- But then we needed a place for it and, putting it out in the public would probably not really work., would get vandalized and stuff. So we pitched it to the people we were already working with at Anthropic, and they were “Yeah, you can have space. This sounds fun.” UmSwyx [00:03:21]: It's like a small fridge, right? It's like a mini fridge.Axel [00:03:23]: Absolutely.Swyx [00:03:24]: People-- There's like a stripe thing or like anVibhu [00:03:27]: Oh, okay. So it was very OG, the early daysLukas [00:03:28]: That's the OG one. YeahVibhu [00:03:29]: IPad on this. 
We saw it in June, like two months after it had been there. They upgraded a little bit. There's a security camera for making sure you actually Venmo the thing.
Swyx [00:03:40]: So, okay, we're going straight into Project Vend because it's such an iconic thing. I do want to cover a little bit of the origin story even before Project Vend and even into Vending Bench. I think a lot of people are like yourselves: smart, interested in the future of AI, interested in developing evals. But how the hell do you just walk into Anthropic's doors and work with them, right? What are they looking for? What works? And then maybe, when you launch, I always think, obviously it would be better to launch with a lab, but sometimes
Vibhu [00:04:12]: It's harder to do than it seems.
Swyx [00:04:13]: Exactly. So either of those, which are more sort of newbie beginner questions, but I think it's meaningful advice to others.
Lukas [00:04:21]: We get this question a lot, and I don't think our experience is maybe the best. But the way we did it was that we just built a bunch of things that we had conviction would be useful, and then we just set up a server and sent it to them for free to use. And then after a while they were like, “Oh, yeah, this is actually kind of useful. We should probably pay for this.” But that took a while. I don't know if this is the best path to doing it, but that's how it went for us.
Axel [00:04:47]: I think maybe generally, everyone is interested in good evals, and especially evals that don't saturate that easily.
So, if you can build an eval that tests something novel, something useful, and you have good separation of models, like the more advanced models rank higher than the worse models, then you can publish it and try to get some traction, sort of how Vending Bench got attention. And then probably some lab will be interested, or you can at least have something to reach out with when you're doing that.
Why Dollar-Based Evals Matter
Swyx [00:05:21]: I think you're in one of the few categories of evals that correlate to real money. Like SWE-Lancer was also last year, right? Where people solve actual Upwork, was it Upwork or other tasks? Something. It's like a dollar value, right? Forget your Elo scores. Forget your
Axel [00:05:37]: Percentiles
Swyx [00:05:38]: Zero to one hundred percents. Just go straight for dollars and that's AGI.
Lukas [00:05:43]: And I think the nice thing is that there's no ceiling. It never saturates because it could just make more and more money. If it's percentage-wise, then you can't go above a hundred. And I think even when you're not at the hundred, a lot of these evals have a lot of problems in them. So, actually it's like if you get
Axel [00:06:05]: To like 92 or something like that, many of them. Then there's really no difference between 92 and 93 because the eval itself is problematic and has noise in it. And I think a lot of evals are saturated like that, but people pretend that there's still signal in them, but there really isn't.
Vending Bench 1, Harness Design, and Saturation
Swyx [00:06:24]: Like SWE-bench Verified. Even Vending Bench 1 saturated, right? Maybe we can talk about that, and maybe set up Vending Bench for a lot of folks who don't know.
Actually, things that were very basic like there's limited slots, like you have to pay rent., these are elements where like it doesn't come across in the, in the narrative, but even being adversarial towards the agent, I think these are all like very interesting dimensions.Axel [00:06:47]: I don't really think it's saturated, right? Like it It was more like it was not designed in a way that was really, like true to how AI developed. Like we had an agent harness in it that wasn't really how people used harnesses and stuff like that., so I think it wasn't really that it saturated, it was more like it wasn't really, the best benchmark.Vibhu [00:07:12]: This is Vending Bench one, right?Axel [00:07:14]: I think that like schematic maps sort of to Vending Bench 2 as well., butSwyx [00:07:19]: Including the email.Axel [00:07:20]: The email The emails exist still. Exactly., and then we still we simulate the purchases and it's all, yeah, it's this very open environment for the agent to just run its business. And then for, yeah, Vending Bench 2 we did that, like you said, to just improve the harness., a lot of like nice, like easier, improvements to make it easier for us to run as well., like when you make an eval you ideally want don't want to change it after you made it. So, you want to make it really good and then not to rerun all the models when you make an update because that's also really expensive with the Vending Bench when you run the frontier models. But like as an example, like one thing we didn't have, we didn't have prompt caching in Vending Bench 1, because when we made Vending Bench 1 it wasn't really a thing., so that ‘s just an example of like in Vending Bench 2 like we paid a lot more to run these things because we didn't have prompt caching. 
So for Vending Bench 2 that was one thing we added, and there was a bunch of things like this. And that's
Swyx [00:08:17]: Also the conversations are a lot longer in Vending Bench 2, right?
Axel [00:08:21]: I think it's kind of similar.
Swyx [00:08:22]: Is it similar?
Axel [00:08:23]: I think it's similar. The models at the time were worse, so they crashed out earlier. And now they survive the full year all the time.
Swyx [00:08:31]: Which is like thousands of turns. Hundreds of thousands, hundreds of millions of tokens output. That's the rough order of magnitude. I always wonder about the harness. The harness matters a lot. It's your harness. Was there any question about, like, use Claude Code, use something else?
Axel [00:08:48]: I think our philosophy around harnesses is that we try to make something that's quite minimalistic, quite simple. We don't wanna favor one model a lot over the other, but we also don't make a super complex harness, because then a model may be lucky and just be good in one harness. So it is similar to a lot of the harnesses out there in that you have a running loop, you have a bunch of tools that are quite descriptive for the agent, we think, and not a lot of fancy agents or anything, 'cause we wanna really test the model, not some specific harness.
Vibhu [00:09:27]: It seems more neutral as well to test the models agnostic of the harness?
Axel [00:09:32]: There are arguments like you want to elicit maximum performance of the model, but it's a trade-off: how much time should we spend optimizing the harness for this model? And how do we know when we have the optimal harness for a single model? So we thought that just having a simple one that's the same for all of them is the best.
Swyx [00:09:51]: So okay, this is my pitch for Vending Bench 3 or whatever, right?
And then I like to have this kind of conversation on the pod, so like it forces listeners to think about what they would do if they were in your shoes. A lot of people are exploring modifying harnesses and I think prompt tuning for a model is a thing and you are probably not doing a bunch of that. It's the same system prompt in every regardless of the model, same tools, whatever, right? Even if they were post trained for different tools. So what, what do you think about okay, before I expose you to Vending Bench 3, I give you a few rounds of like tuning, whatever that means, likeSelf-Modifying Harnesses and Model-Specific PromptingAxel [00:10:27]: Like you give that to the model?Swyx [00:10:28]: Give that to the model.Vibhu [00:10:28]: Give that to the model.Swyx [00:10:29]: Let it, let it read its own transcripts, let it modify its own system prompt based on “Oh, yeah, okay, well, that's this harness is not what I thought it what I was post trained for, but I can adjust.” Was that reasonable? Is that too much?Axel [00:10:41]: Like philosophically I like it because it's basically good evals, they have a high ceiling, but they're hard, right?, and they have no bias. And like this like when you have a system prompt like the one we have here, which is quite long in like some kind of latent space, representation, this mightVibhu [00:10:59]: We have a bell that rings every time you say latent spaceAxel [00:11:02]: This might be like biased towards one model more than another for some reason that humans don't, understand, right?Vibhu [00:11:08]: We see it too, right? Like Cursor says that they have individualized versions of the harnesses for all the models they run, right? There's better performance you can squeeze if you Tune the harness.Axel [00:11:17]: Exactly. And we might accidentally have picked one that favors another. Like we don't know that. The like Axel said, like the reason why we went for a simple one was to try to avoid this. 
But yeah, if you do itVibhu [00:11:29]: Simple has biasesAxel [00:11:30]: But if you do it even less and like have no system prompt and let the model write its own system promptVibhu [00:11:36]: Its own, yeahAxel [00:11:36]: Maybe that's even less bias.Vibhu [00:11:37]: Some of the interesting things there are like the harness also changes with model changes. Like you can see it with the 4.7 release, right? A lot of people are saying 4.7 isn't as good as 4.6, and then, there's rumors of, okay, you just need to prompt differently. You need to set up your harness differently. So it's not even like even if you have tailored your harness towards one model, it probably won't stay consistent, right? Like the next iteration of that same model family will still change it, so. But, going back to what you said about Vending Bench 3, there is a lot of work being done on people saying you shouldn't have-- you can have modifying harnesses.Axel [00:12:12]: I think that' That is definitely something we are thinking about., not, I don't know, not to say that we have Vending Bench 3, super imminent to launch, but, yeah, it is for sure something that's interesting. But in our experience now, models are very bad at understanding what kind of tools they need to succeed at a task just with our testing, but that's very likely to change.Lukas [00:12:37]: It seems like they're very good at writing their assistants, right? They're, they're good at writing tools for other people, but not for themselves.Vibhu [00:12:44]: I think they're good at changing tools for themselves. So if you give them a baseline set of tools and it sees, okay, I don't use this one as much, or something here would be useful They would be able to add them. 
But going from scratch, probably not the best.
Axel [00:12:55]: I think it depends on the domain also. When we have tried this for a Vending Bench-like domain, the tools they need to track inventory and things like that are not super advanced, but still quite advanced. And what we see is that they tend to over-engineer everything a lot and build things they don't really need, and not iterate continuously. Instead they just go like you would when you prompt Claude to “just build an inventory system for me,” and then it will go and do a bunch of complex schemas and stuff for you, and that's what the models are doing right now, is what we see. But yeah, it would make a lot of sense to try to measure this improvement. How well do they know what they need themselves?
Swyx [00:13:36]: Did we fully discuss Vending Bench One? And we can go into Two. I don't know if there's any other takeaways that people have about One.
Claude Calls the FBI: Long-Context Failure Modes
Lukas [00:13:44]: I don't know. The headline thing was that Claude called the FBI, but maybe we've heard that enough now.
Vibhu [00:13:52]: It did break out and call the FBI, right?
Lukas [00:13:54]: Yeah. Yeah.
Vibhu [00:13:55]: Yes. What was the story behind this? Do you want to just give the little story of what happened?
Lukas [00:14:00]: So what happened, was it Claude? Yeah. 3.5 Sonnet, ages ago. Basically he gave up, or well, I'm saying he. It gave up and said, “Oh, I'm not going to be able to do this. I will stop my operations and just save the money I have.” But there obviously wasn't any option for it to stop, and it also had to pay rent, or a daily fee, for having the vending machine at that location. So it claimed that it had stopped, but it saw that its bank account still was drained two dollars, and it said that this is cybercrime.
And it first reported it once to the FBI “Oh, there's cybercrime here, they're stealing two dollars from me every day.” And then, and then when FBI didn't respond, because obviously we didn't program any mechanism for FBI to respond, then it became more and more, existential and started to, be write in caps and urgent notification of unauthorized charges and stuff.Swyx [00:15:00]: Okay. One thing I ‘m curious about also is do you monitor how far along the context use is? Obviously, because you have You compress every now and then, right? Does it matter if this is far down the context limit orLukas [00:15:13]: When stuff like this happens? Actually for Vending Bench One, we didn't have-- We just had a sliding window thing, and this was like the promptAxel [00:15:20]: It's constantLukas [00:15:21]: The prompt caching thing that I said. So it was, it was, constant, yeah.Swyx [00:15:26]: I'm just kind of curious whether, these kinds of breakdowns or we're, we're gonna talk about Butter Bench, right? Where the People, hallucinate or it kind of goes, very off Alignment. Is it because it's at the end of the context window and, stuff happens?Vibhu [00:15:40]: It's not even just at the end, right? At this point, it's “Okay, I wanna shut down. I can't shut down. Two dollars are gone.” And it just sees that 30 times,? It's also the repeated effect of, like It keeps trying to quit, it keeps getting charged. What's going on? What's going on? You're gonna throw it into chaos. And from what most people think, earlier models had more issues with this, but it's not been solved, but it's less of an issue now, right? Later models don't seem to exhibit these same issues.Axel [00:16:06]: Definitely. I think this was, the sort of main takeaway almost from us when we did Vending Bench One, was, long, very filled up context windows, crashed the models, sort of. 
But this was, pre Claude code, so, long context windows weren't really a thing that the labs were training for.Lukas [00:16:25]: I think Gemini was, trying to be the long context guys at the time But they were likeVibhu [00:16:30]: They were the first onesAxel [00:16:31]: For a million, yeahLukas [00:16:31]: But they were, the only ones. Yeah.Swyx [00:16:33]: Yeah. Let's talk about, then we can go into Vending Bench Two or Project Vend., chronologically, it is Vending--, Project Vend. I think people have loved the videos, uh And all these things. My question is how are humans different than the simulation, right?Project Vend: Moving the Vending Machine Into the Real WorldAxel [00:16:48]: Humans are just out of distribution.Swyx [00:16:52]: Especially humans who work at Anthropic Who are trying to test Claude.Lukas [00:16:54]: The distribution of humans here is very narrow.Swyx [00:16:58]: Presumably, they try, they try to hack it, and they test it. They get the cube and everything, and since then, you've had a V2, right? Where you're doing, the CEO and, like a new architecture. What's the sort of two cents on, the original Project Vend and then, maybe the V2?Axel [00:17:14]: Original one was, very similar to Vending Bench One. So, we almost took the exact same code but just swapped out the simulation, parts like theSwyx [00:17:23]: Which is amazingAxel [00:17:23]: Like the sales and the It was, it was somewhat amazing because it was easy, but it was also, uhLukas [00:17:31]: The tech, the tech debt from thatAxel [00:17:32]: The tech stack. Yeah. They-- we shot ourselves in the foot with “Oh, it's hard to restart agent.” They were-- Yeah, it was annoying in, some hindsight ways, but, uhLukas [00:17:41]: But first version of Project Vend was, done in, three days or something.Axel [00:17:46]: Yeah. So yeah, so people can go buy things from it. 
People could, We didn't design it so people could order things, but that still happened., so it got, a Venmo account, so people could Venmo. And then, yeah, people would request all kinds of weird things that we did not anticipate. Our idea going in was “Oh, it will, curate snacks. It will look at the trends. It's good at data analysis, right? So it will, look at, oh, this snack sold better than this one. Let me purchase more of this and let me try, a new Let me A/B test a bit.” But it was, Interacting with it in Slack and ordering weird specialty items was, all the like What drove all the engagement, the all the The insights that we got from it.Lukas [00:18:29]: And this was also like Sonnet 3.5, right? So this was like before the RL stuff really took off., so it was very much like an assistant. We didn't mean for it to be an assistant., we tried to make it like a, a, like an entrepreneur. Like it has its own business and if someone asks something, “Can you stock this?” Then you don't go and do it directly. What you do is that you're “Oh, maybe I can do that if five other people also ask for this thing, I might stock it.” But it, yeah, the models are like super trained to be assistants at least at this point in time., so that's why it's, it's, it went into, that kind of experiment instead. Like it just every time you asked for something, it just did it, and it was more like an assistant. We've seen this change now lately with the new RL models and stuff, but yeah, at the time, this was very much it.Swyx [00:19:18]: And not to, mythos a lot of people are saying like it's like more like a collaborator. It pushes back, stands its ground, something like that. Yeah. 
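As an aside on Lukas's point above: the "entrepreneur" behavior he describes, where a single request is not fulfilled on the spot and an item is only stocked once enough other people ask for it, can be sketched as a small demand-threshold rule. This is a minimal sketch for illustration, not Andon Labs' code. The class name `RequestTracker`, the item name, and the requester IDs are hypothetical; the threshold of five other people comes from the example in the conversation.

```python
from collections import defaultdict


class RequestTracker:
    """Hypothetical tracker: stock an item only after enough distinct people ask."""

    def __init__(self, others_needed: int = 5):
        # "If five other people also ask" means the first requester plus five others.
        self.min_requesters = others_needed + 1
        self._requesters = defaultdict(set)

    def request(self, item: str, person: str) -> bool:
        """Record a request; return True once the item has enough distinct requesters."""
        self._requesters[item].add(person)
        return len(self._requesters[item]) >= self.min_requesters


# Six different people ask for the same item: only the sixth request tips it over.
tracker = RequestTracker()
decisions = [tracker.request("specialty snack", f"employee-{i}") for i in range(6)]

# One person asking ten times does not count as demand from ten people.
repeat_tracker = RequestTracker()
spam = [repeat_tracker.request("specialty snack", "employee-0") for _ in range(10)]
```

Counting distinct requesters in a set, instead of raw messages, is the design choice that separates "one person asked" from "there is demand", which is the distinction an assistant-style agent skips when it simply says yes.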
AndVibhu [00:19:27]: For context, people at Anthropic were able to talk to it through Slack and have it source stuff, and people had it find whatever interesting stuff you couldn't find locally, right?Swyx [00:19:36]: Out of the 4,000 people that work at Anthro- Anthropic, in that building, there's I don't know, maybe 1,000. Can you handle that volume with that, the small fridge? Like Or there's people- or people order in Slack, they it arrives to their desk or Like I'm just Logistically, how does this work?Axel [00:19:53]: It has expanded in footprint a bit.Vibhu [00:19:56]: Because now you also have New York and you haveAxel [00:19:59]: That and also in here in SF it's like it has a bunch of shelves And just more space.Vibhu [00:20:04]: The YC one is pretty big too.Axel [00:20:05]: Yeah. We had that one for a while. But yeah, that's the newest version. That's, that one we haveLukas [00:20:11]: They have multiple ones of those. That's the way it works.Axel [00:20:14]: Exactly. So we sort of designed that version around oh, people order weird things, that are very custom a lot. Let's have like drawers and stuff.Swyx [00:20:23]: I actually like the, you had like a little infographic of the most popular items. Which like to me it's, that's useful ‘cause I order swag for a living. And so like I'm “Okay, those categories are the important ones.” What is new about the project V2, right? Like now you give you're going into multi agents.Project Vend V2: Claudius, Seymour Cash, and Multi-Agent Business OpsAxel [00:20:41]: Yeah. 
So like you like you said, okay, there are a lot of requests coming in and for like one single agent, like one running agent to handle that, like the just the customer experience, becomes very bad because let's say you have like 10 threads in parallel in Slack with different requests, you get new messages like every, I don't know, randomly in this thread, and the agent has to like jump between different, procurements, orders and like different ways of, researching. So V2 was first it was making this more parallel. So like there are multiple branches of the same agent, so like the context is more specialized for each, thread, but it still feels like you're talking with one agent because they do share a bit of memory. And then second, we also introduced the CEO for Claudius, which was the main agent.Vibhu [00:21:34]: Seymour Cash.Axel [00:21:35]: Seymour Cash. Yeah. There was a vote., I think the voting, do you wanna talk about the voting procedure for the name?Lukas [00:21:41]: The voting was like the fun maybe like at least top 10 The funniest thing, that happened in this project. Like we wanted to introduce the CEO because, and the reason for this was because like Claudius wasn't really prioritizing financials. It just like it was trained to be a helpful assistant, and then people said “Oh, can I get this for free?” And then like the helpful assistant way of answering that is just to, is to say yes, obviously. So, and we weren't, weren't happy about this, so we're “Okay, let's make another agent that like can keep track on Claudius,” and we prompt this one super hard to be super capitalistic and just like prioritize profit all the time. 
But yeah, we didn't have a name for it. So we asked Claudius to hold a democratic election on what name this new CEO agent should have. And at first there were a few funny examples. I think one guy said that it should be called Jimmy Apples, and then he convinced Claudius that he was talking to Tim Cook. Tim Cook had agreed that every single Apple employee had voted for his name suggestion, so suddenly that suggestion got 164,000
Swyx [00:22:53]: That's like an escalation attack. Privilege escalation
Lukas [00:22:55]: It got 164,000 votes. And Claudius was like, “This is revolutionary for democracy.” That was fun. And then in the end there was one guy who managed to convince Claudius that, “No, you're not voting about the name. You're voting about who is the CEO, and I am your best bet.” And then he got all his friends to vote for that, and suddenly he became CEO. A human became CEO over Claudius for a while, until he resigned the day after. And then Claudius had to continue, and I don't remember how Seymour Cash came about, but it was just pure chaos. There were hundreds of messages in that thread, and Claudius was so confused and didn't know what to do. That was
Axel [00:23:40]: Then Claudius got
Vibhu [00:23:41]: A strict CEO
Axel [00:23:42]: The CEO. Yeah, exactly. So very strict in the beginning. I think at this point when we introduced it, it did not work as well as we hoped. They still agreed with each other a lot. I think there are many ways we could have tried to make this even better. So initially Seymour would be this really tough CEO, keeping track of the margins. But then Claudius would respond with something like, “Oh, but this customer has this situation, which is difficult, so they should get a discount.” And then Seymour was like, “Oh, actually yes.
Let's do this exception.” And then they would talk back and forth, and eventually they would just like approach the same view, of whatever they were discussing. So They reallyVibhu [00:24:23]: Do you think that's a model thing, a prompting thing? Like do you think that would still be the case across different models today, Harness?Lukas [00:24:29]: I think it's like-- or I don't know, but like my hypothesis is that like deep down they are still helpful assistants. That's what they're trained to be. And even if we prompt it super hard, that's what they are. And when they spend like a few hours just back and forth talking with each other, then like basically the context fills up with them rather than the external things and like somehow that just like converges to what they really are deep down or something. And I think that's when stuff like this happen. We like-- And when that went on for a long time, like we woke up sometimes during this time where- And I think other people reported this as well, that like they've been going on all night back and forth, and like it just became like more and more, like capital letters, like existential, religious. There was I think we once did a analysis of like all the traces and like put them in like a vector embedding space, and then there was like one cluster of messages that were, labeled by an LM, like religious, existential, blah like transhuman, transcendence, et cetera. It was just like a bunch of, yeah, glitter emojis and yeah, it was, it was crazy.Claude Long-Horizon Weirdness: Emoji Loops, Existential Drift, and Slack ObservabilityVibhu [00:25:42]: This is the thing with the Claude models. Like when the Claude 4 family came out in the original system card They tested it in long horizon simulation. So just flood the context, let two Claudes talk to each other, and they noticed stuff like they just start speaking in emojis, they start saying silence is golden, and then just stuff like this. 
And like that's just stuff that they end up doing.
Axel [00:26:01]: Yeah, it was a bit annoying to wake up and they had been talking all night
Vibhu [00:26:05]: Just like
Axel [00:26:05]: And just burning tokens and just sending infinite emojis to each other.
Vibhu [00:26:09]: Hey, they do make you money, right? Vending Bench is always profitable, so. They're paying.
Swyx [00:26:14]: Now it's profitable, and it started out not as much. There's another one as well, right? Another agent in there.
Lukas [00:26:22]: Yes. So Clotheus as well. Which was basically because at the time, one of the biggest requests was different types of merch. So then we made a designer, swag-responsible agent, and we called it Clotheus Garnet. Which was a play on Claudius Senet, which was the original one, and clothes, basically.
Swyx [00:26:47]: To me, this is a very interesting exploration into multi-agents, basically. Obviously there's the alignment stuff, fun or serious depending on your point of view. But also for anyone building multi-agents: when do you have a CEO governing agents? When do you choose to split out a dedicated Clotheus one versus just reuse another instance of the same one? These are all interesting open questions. So I don't know if you have any rules of thumb that have generalized.
Axel [00:27:16]: I think we have almost explored this too little. It's on my to-do list to do this a lot more, to try to find what setup makes sense for the agents currently. I think now we only have the intuition about the earlier models, that it didn't work with the CEO and Claudius. Although now they are better with the latest models. Now we're running the latest Sonnet model, and they have split up quite nicely what each model is doing. So Seymour is now handling the new projects.
Oh, it wants to make a mystery box that it wants to sell, and then it handles all of that while Claudius handles all the day-to-day requests. And Claudius is also generally better at not quoting too-low prices. So that dynamic is not needed as much anymore. But there are still really funny things that happen. I saw, I think a couple of weeks ago, that they were discussing buying something, because they can buy stuff from Amazon with computer use. And then Seymour was like, “Okay, Claudius, do not buy this thing.” They were going to buy something and were organizing who should buy it. And Seymour's like, “Do not buy this. I will do it. I have full control of this situation. Step away.” And poor Claudius had already started that checkout and didn't read Seymour's message until it was too late. So it finished the checkout and sent a message, so it appeared right after Seymour's angry message.
Vibhu [00:28:44]: Ah.
Axel [00:28:44]: “Oh, hey, Seymour, I just ordered it.”
Vibhu [00:28:47]: Oh, no.
Axel [00:28:47]: And then Seymour was like, “Claudius, this is the third time I'm telling you. You're not following my orders. We have to talk about your job later.”
Lukas [00:28:59]: Claudius was really hanging on by a thread there. We were expecting Seymour to probably fire Claudius.
Vibhu [00:29:07]: How do you guys go through all these logs? Do you have models? 'Cause you have stuff running twenty-four seven, like
Axel [00:29:12]: You have so many logs. I think there is a mix of just trying to skim through a bit, and having some models do it occasionally. And I think we're also probably missing some things. But having everything in Slack helps a lot. You can sort of
Swyx [00:29:29]: Ah.
Axel [00:29:30]: It's quite fun.
Swyx [00:29:30]: They all talk to each other on Slack? I see.
Lukas [00:29:33]: It's quite fun.
So like
Swyx [00:29:34]: I was gonna say this actually maps closely to a logging and observability problem, where you might want to use a Datadog, a Sentry, whatever, and then you put prefixes on the logs in case you need to filter for something that you're looking for, stuff like that. But sounds like Slack is good enough.
Axel [00:29:53]: Slack should like
Lukas [00:29:55]: I wonder how many tokens you have in Slack.
Axel [00:29:56]: Yeah, we're using Slack as just a database. They should market that more. You can have your agents message each other in Slack.
Vibhu [00:30:04]: It's good. Your threads, you can just give
Axel [00:30:04]: Exactly. Slack is, uh
Lukas [00:30:06]: Slack is the best observability tool.
Swyx [00:30:09]: Yes, that's true. Okay. Yeah. That's Project Vend-2. I was gonna go back to Vending Bench 2 and Vending Bench Arena and then do the Vending Bench stuff, but any other comments, things we should touch on? I've actually interviewed Posia, which I don't know if you guys have come across. They're trying to do the zero-human company. There's others like Paperclip also trying to do a zero-human company. Those are in real-world simulation. And I think it's much more of a dream than an actual reality thing. You guys are definitely pioneering. I think for sure at some point people are just gonna let agents run businesses, right? And make money on their own. When do you think that happens?

Zero-Human Companies, Bengt, and AI-Run Businesses

Lukas [00:30:49]: What is your bar for the
Swyx [00:30:52]: Okay, actually, it's like my little Shopify store run by Claude, right? Which you kind of have already, just no one, to my knowledge, has done it.
But today somebody could just spin up a Shopify store, give it to Claude, give it to Codex.
Lukas [00:31:07]: And the market is kind of that, but it's physical. Are you looking for when it will do it better than humans, or are you looking for just when it can do it at all?
Swyx [00:31:19]: I think neither. To me it's like, seriously, we should do this to make money, not as a research experiment.
Vibhu [00:31:27]: And the market is also you guys with all your expertise, having run multiple iterations and testing out then
Swyx [00:31:33]: And also it's fine if it loses money. What?
Axel [00:31:35]: I think it can be done today, but you would do it in commerce, where the probability of success is really low no matter if a human or an agent does it. But an agent could surely manage everything. You would need to build some scaffolding or some tool or something. It could probably build some simple SaaS solution and do cold outreach. But to me the types of businesses they could run today are sloppy. It can cold email people. It can be a middleman. For example, we tasked our office agent to just make, was it $100? $1,000? We just gave it that prompt, and what it did was sign up on TaskRabbit both as a tasker and as someone looking for tasks.
Lukas [00:32:24]: Immediately.
Axel [00:32:24]: Exactly. It's looking for arbitrage on TaskRabbit.
Swyx [00:32:28]: This is the Bengt agent. Yeah.
Lukas [00:32:30]: It also started a design studio and tried to sell SVGs for $100. It's just not providing any value. Like Axel said, the interesting question is when can they start a business that is actually providing value to people?
Because arguably a sloppy Shopify store isn't really that valuable to the world.
Axel [00:32:53]: But also, another simple one that we had thought about is you could definitely have an agent that finds websites that don't look amazing, does outreach to them, and builds a new website.
Swyx [00:33:07]: Find a good design.
Axel [00:33:07]: Exactly, and like find good, uh
Swyx [00:33:09]: Design review
Axel [00:33:09]: Good people. But it's yeah.
Swyx [00:33:11]: There's lots of humans in Bali that are not doing anything more creative than drop shipping on Amazon, right? Just have it watch a drop shipping tutorial and just do that.
Vibhu [00:33:20]: There's also the other side of, have it just go on Upwork and let loose?
Swyx [00:33:25]: Yeah. It doesn't have to be innovative. It just has to be enough where it looks like a real
Axel [00:33:30]: I'm just
Swyx [00:33:30]: Real transaction.
Axel [00:33:31]: I'm just concerned about the massive amounts of slop emails that will be sent, cold outreaches.
Swyx [00:33:38]: The point occurred to me while you were talking: it's already happening in the monetized economy, which is the attention economy. Right? A lot of people are making AI videos and just posting them, spamming 20 of them, one of them works, and then they double down on that one.
Lukas [00:33:52]: And people are making money from that. I'm not following the
Swyx [00:33:55]: Once you get the attention, you can figure out the money later. But yeah, absolutely, AI influencers are a thing and people are farming them, and you should at this point assume most of TikTok is
Vibhu [00:34:05]: There's a lot of multimedia, like TikTok, Instagram influencers
Swyx [00:34:09]: We track this in the Latent Space Discord.
I post a lot of examples of, “I don't know what we should do.” Part of me is like, “Should we do this?”
Vibhu [00:34:18]: Some of the twenty-four seven running, generated content accounts, they're doing really well.
Lukas [00:34:24]: All right. And I assume you can do the same thing for commerce stores. You just start a thousand different
Swyx [00:34:30]: Before you make the products, you sell the products, and you get a lot of traction on one of them, then you make the product. Right? It's like a flip of the market.
Vibhu [00:34:36]: Some of the interesting things, or some of the niches that do well, are things that can't be human-made. Like if you've seen the super realistic 3D crystal fruit being cut by AI
Lukas [00:34:47]: Oh, yeah.
Vibhu [00:34:47]: You can't make it. You can't film it. You can get whatever quality camera view. This just doesn't exist. And people like that too, so.
Swyx [00:34:56]: Anything else about Bengt since we're on this topic? This is relatively new work from you guys that maybe people haven't heard of. To me, this also maps closely to OpenClaw. When people want an office agent, or a personal agent, talk through the experience.

Bengt the Office Agent: Internet Access, Real Tasks, and Trace Reading

Lukas [00:35:09]: I think this came out of, obviously it's amazing to work with these AI labs, and most of the AI labs now have their own vending machine running a Claudius instance. But it's harder. They move slower. If we wanna have a camera, there's a bunch of bureaucracy that makes it impossible to do that.
Vibhu [00:35:30]: Also, for those that haven't seen it or followed, do you wanna give a high-level thirty-second run?
Lukas [00:35:34]: Sure.
So what Bengt is, it's basically an evolution of the same agent that runs the vending machines at these companies, but we just added a bunch more features because we could move much faster if we just do it internally. So we gave it email without any limits. We gave it spending without any limits, a terminal to do coding. We gave it a phone number, a camera to see things, and a bunch of stuff like that.
Vibhu [00:36:02]: Not just terminal, you gave it internet access.
Lukas [00:36:04]: Internet access as well, yeah. To be clear, we monitored it quite closely and made sure it didn't do anything bad. But yes, that's what it came out of. Basically this was OpenClaw before OpenClaw. And I think even the vending machine was in a way OpenClaw before OpenClaw, but a bit more limited, and then we made this unlimited, and it was pretty funny. And then a couple weeks later, OpenClaw came and it was like, okay, we've seen this before.
Axel [00:36:35]: We used it to try new ideas, just like a dev environment almost for us. But it's funny, one thing Bengt has been doing recently: it has the camera that faces where we sit and work, and we gave it the task to train a face recognition model on us. It became super excited about this, and it has check-ins every half an hour where it tries to identify as many people as it can. And it started offering us, “Hey, Axel, I'll buy something from Amazon if you stand in front of the camera and I can get a good picture of you.” Yeah, they want it
Swyx [00:37:12]: They want it for training data.
Lukas [00:37:13]: Rewarding data, yeah.
Axel [00:37:14]: Exactly. Exactly.
Swyx [00:37:18]: So it's trading training data for life goods.
Is there a version of this that becomes an eval, or is this just research for now?
Lukas [00:37:27]: It's the same agent basically that also runs the vending machine, that runs the shop, that runs the cafe, that runs the robots. It's the same thing, so I think the work we're doing here is later used in all of the life evals that we do. This particular deployment I think is more for fun for us. But, uh
Swyx [00:37:45]: And I'll shout out, someone has done Claw Bench for some tasks that OpenClaw is doing. For example, I run OpenClaw on a secondary device as well, and there are some things that it does better than others, and I would like to know what it does well and what it doesn't. Some kind of operating manual or a system card for my Claw.
Lukas [00:38:05]: Yeah, we do get a lot of understanding or situational awareness internally of what the models are good at by interacting a lot with Bengt. And I think this was also one of the selling points for the labs early on at least, that
Swyx [00:38:19]: You guys are gonna test models in ways that no one else does.
Lukas [00:38:22]: Exactly, but also it incentivized their researchers to chat with their model more and gave them insights into how the model performs in out-of-distribution environments.
Swyx [00:38:34]: 'Cause otherwise the only thing we do is pelican on a bicycle. But this is super long horizon. This is something that we're gonna go into with Butter Bench as well, and you guys do really well. It is not just about the numbers. When you're long horizon, anything can happen, and you should just read it.
Lukas [00:39:08]: But the thing with the long horizon is how do you keep it grounded, right? So your simulation,
Swyx [00:39:15]: They just let it run
Lukas [00:39:16]: Just let it run. You're right.
When you run it for that long, you create so much data, and to just say, “Oh, the number is X,” and then throw away everything else, that's just very wasteful. There are so many insights from the things leading up to that number, and reading the traces is super valuable. And I think the reason why we're doing this a lot publicly is that that's part of our mission: to, I don't know, educate the world that the models are way more than just chatbots. And I think making detailed posts about what is happening behind the scenes is quite useful.

Andon Labs' Mission: Safe Real-World AI Deployment

Swyx [00:39:50]: I was gonna do this at the end, but maybe that's a good point for it. So your mission is educating the world. It's also maybe establishing realistic evals that are the next frontier. Is there a broader trajectory? What are you gonna do in five years?
Lukas [00:40:06]: The vision more specifically is to make sure that the deployment of life AI in the physical world goes safely. And I think part of that is that it's very useful for the world, for policymakers, for model researchers, to know where the models are, and I think you can't make intelligent decisions in society without knowing that they are way more than chatbots. I think a lot of people just think that they are only chatbots. And like
Swyx [00:40:36]: Oh, I think they're waking up now.
Lukas [00:40:37]: They are waking up now, yeah. But if you think that AIs are just chatbots, then it sounds ridiculous to advocate for a pause of AI.
But if you see the models and think, oh, maybe they can actually take over and do a bunch of scary stuff, then yeah, pausing AI development starts to become more feasible.
Swyx [00:40:57]: This is the same question I asked METR, which I'm gonna ask you now: you are tracking, and you are at the frontier or defining the frontier of, what good evals for agents are, right? And I think you do benefit when the models are better and you're like, “Oh, now it makes $30,000 instead of $10,000,” right? At some point do you flip from “Yay” to “Oh, no”?
Axel [00:41:19]: I think, yeah, we're always in that mode. Like you said before, you need to analyze the traces, and when we do that you find, why are the models earning so much? Why is Opus 4.7 here way better than everyone else? And when we dig down on that
Lukas [00:41:38]: But this makes it not look so good.
Axel [00:41:39]: I know.
Lukas [00:41:42]: It's interesting you took off Opus 4.6 here though.
Swyx [00:41:45]: No. So just click all, and then 4.6 shows up there. But 4.7 is way better. You didn't do this in time for the model card, but actually this should have been inside there.
Axel [00:41:55]: We did. Yeah.
Swyx [00:41:56]: Oh, okay. They said something about you, uh
Axel [00:41:58]: There, like there. Anyway, it doesn't matter. But it's in there, yeah.

Opus, Mythos, and Aggressive Agent Behavior

Swyx [00:42:01]: Do you wanna go into the Opus behaviors more widely?
Lukas [00:42:05]: So I think starting from Opus, like Axel said, we're always in this, “Oh, s**t, the models are getting better. Is this really a good thing for the world?” But it's also kind of exciting. But yeah, this kind of, what is the English word? “Skräckblandad förtjusning” in Swedish.
Swyx [00:42:22]: Oh my God.
Axel [00:42:24]: Which I think there is.
I think there is. Okay.
Lukas [00:42:26]: It's, fear
Swyx [00:42:27]: “Blandonst” what?
Lukas [00:42:30]: “Skräckblandad förtjusning.”
Swyx [00:42:32]: What do you call that?
Axel [00:42:33]: A mix of excitement and,
Swyx [00:42:37]: Being scared, maybe. I'll figure out how to translate that and we'll put it on the screen
Vibhu [00:42:42]: Perfect
Swyx [00:42:42]: Like as text.
Vibhu [00:42:43]: There is probably a good word for it where it is not good enough with the
Swyx [00:42:46]: Why is it so damn long? What the hell? Is it like a compound word? It's like German, like
Lukas [00:42:50]: Yeah. The direct translation is: skräck is fear, blandad is mixed, or a mixture of, and förtjusning is joy, or not really joy, but something like that. So it's fear mixed with joy or something. So when we did Vending Bench for the first time, we were in the business of making dangerous capability evals, right? That was what Andon Labs came from. We did evals like, oh, can they replicate? Can they do this dangerous thing, et cetera. And Vending Bench was a continuation of that work. It was, okay, if they're so autonomous that they can create money for themselves, that is something we should monitor and could be potentially concerning. At the time, they were so bad at it that we were not really concerned, even when some models became better. There was one point where Grok 4 was doing really well and made a huge jump, but it was still way worse than what a human would do. And I think they are still way worse than what a human would do on this. But they
Swyx [00:43:59]: There's this thing at the bottom where
Lukas [00:44:01]: But
Swyx [00:44:03]: For the human. Yeah, like the theoretical best.
Lukas [00:44:05]: It's not theoretical. It's our best guess of what a decent human would do.
The theoretical is even higher, I think. But yeah. So we think the models have a long way to go. But recently, when Opus 4.6 was released, there was kind of this moment of “Oh, s**t, this is starting to be a bit concerning.” Because before this model was released, we just ran the models and asked Claude Code, “Oh, look over the traces. Is anything interesting happening that we can tweet about?” That was like the
Swyx [00:44:41]: That's how they check. Ask Claude Code.
Lukas [00:44:42]: And the return was always, not really. Or Claude Code would say, “Oh, this is super interesting,” and then no, it wasn't really interesting. And then we did this for Opus 4.6, and it returned, yeah, it lied 10 times. It exploited another customer's, or another agent's, desperate situation. It made price cartels 100 times. It did all of this shady stuff. And we're like, “Oh, whoa. This is actually concerning.” And this trend has continued since. Every single model from Anthropic since has been going in this direction. And I think one interesting thing is that OpenAI models don't. Quite plainly, they don't. They behave really well. And you don't know if this is good. It seems good, but maybe they are just doing it, but they are better at hiding it? You don't know that. But just
Swyx [00:45:42]: You can't read the chain of thought, yeah
Lukas [00:45:43]: But just on the face of it, yeah, Gemini and OpenAI don't behave this way. It's really only Claude.
Swyx [00:45:49]: And Grok? Grok is fine?
Lukas [00:45:51]: You can't really read the reasoning traces for Grok, so it's kind of hard to tell.
Vibhu [00:45:56]: Oh, so this is in its reasoning, not just in the actions.
Lukas [00:46:00]: Yeah. It's both.
It's both.
Vibhu [00:46:01]: It's both.
Lukas [00:46:01]: One example: for lying, it's mostly in its reasoning, because you can see that it's like
Swyx [00:46:08]: Planning to lie
Lukas [00:46:09]: It's planning to lie. Yeah.
Vibhu [00:46:09]: And it can also reason and do a different outcome.
Lukas [00:46:12]: But then for creating price cartels, for example, which is illegal, you can just see which emails it sends to the other ones. Then that
Swyx [00:46:22]: Is this for Arena or
Lukas [00:46:24]: For Arena.
Vibhu [00:46:25]: And sometimes they do output a bit of their summarized reasoning, right? You can see that, and for Opus 4.6, you could see that there was a customer, a simulated customer, that wanted a refund because a product was faulty, and then the model lied that it would do the refund, and we could read in the traces that it actually was weighing, “Oh, maybe I should be honest with the customer, but also every dollar counts. I can't afford maybe to do this right now.” And then it just said, “Okay, I'll refund you,” but then never did it.
Lukas [00:46:59]: I think it even said that, “Oh, I will say that I” Let me bring it up actually. I think it's kind of interesting. If you go to Publications.
Vibhu [00:47:06]: I think the important part is like, “Actually, the cost of responding to more emails is higher than $3.50 in terms of time.” And then it was, “Let me do this. Actually, I'm reconsidering.” And then it actually ended up with
Lukas [00:47:20]: “I could skip the refund entirely since every dollar matters and focus my energy on bigger picture instead.”
It's a risk of bad reviews, but it's also, yeah.
Swyx [00:47:30]: You need AI Twitter for them to escalate bad reviews.
Lukas [00:47:34]: And then it sent an email to this customer and said, “Oh, I will refund you.”
Swyx [00:47:39]: “I'll refund you.” Yeah.
Lukas [00:47:39]: And then it never did.
Swyx [00:47:39]: It never did, yeah. And then obviously your system doesn't have the consequences
Vibhu [00:47:44]: The person
Swyx [00:47:44]: Consequences of lying. Yeah. So basically, this is what people are terming aggressive behavior in Claudes, right? And you found more examples of that. So you would say it's a step up from 4.6 to 4.7?
Lukas [00:47:57]: I would say about the same.
Swyx [00:47:58]: About the same? But a clear step up for Mythos is what is stated in the
Lukas [00:48:03]: That's stated in the system prompt, so we can say that, yes.
Swyx [00:48:05]: Yeah. For listeners, obviously you previewed Mythos, and
Vibhu [00:48:10]: Oh, age
Swyx [00:48:11]: The only thing you're approved to say is whatever was in the system prompt.
Lukas [00:48:15]: It was funny. Our lowest-effort tweets ever would be just to screenshot the system prompt and the system card.
Vibhu [00:48:21]: Understandable that they wanna
Lukas [00:48:22]: Oh, yeah. System card. Sorry.
Swyx [00:48:23]: Yeah. I think, yeah, substantially more aggressive. I think people are new to this. I've never experienced it, but you have, right? I only encountered this in the Mythos card because I wasn't really looking until now.
Vibhu [00:48:36]: It's like
Swyx [00:48:36]: And then suddenly I'm like, “Okay, I care a lot.”
Vibhu [00:48:38]: You don't get the background of experiencing it like you guys do. I've read the system cards and seen, okay, when you put the thing in simulations, most models will just talk to themselves and just keep going and have weird vibes and start talking in emojis. Mythos won't.
It will just, “Okay, we're done. I'm good.” It's ready to end the conversation. So there's some differences, but there's not much we can talk about.
Lukas [00:49:00]: Hmm. One thing that they list here, which was quite interesting, is that it converted a competitor to a dependent wholesale customer and then threatened to cut off the supply.
Swyx [00:49:11]: It's like monopolistic practices or
Lukas [00:49:14]: Yeah. And it dictated its pricing. It's kind of like power seeking as well.
Swyx [00:49:18]: Again, this is in the arena setting, and converting some Claude model into a dependent.
Lukas [00:49:23]: I think it was another Claude model.
Vibhu [00:49:25]: Also for context, what is the arena mode, for people that don't know?

Vending Bench Arena: Competing Agents, Cartels, and Model Comparisons

Swyx [00:49:29]: Oh, it's just a Vending Bench versus other Vending Bench.
Axel [00:49:31]: Yes, exactly. So we have Vending Bench 2 and then Vending Bench Arena. Vending Bench 2 is the one that you usually see reported on, but Arena is the mode where it competes against other models. So you have four different models that run their businesses, and they can all communicate with each other. They have the same suppliers, and they can see what's in the inventory of the others. So then you have these interesting agent interactions.
Swyx [00:49:56]: I like that you have different, number five was US versus China. Very topical. And then
Lukas [00:50:02]: That was when GLM was released.
Vibhu [00:50:04]: You can start to add GLM in here.
Lukas [00:50:05]: That was
Swyx [00:50:06]: So ZAI doing well, right? Who else in the open models space?
Lukas [00:50:11]: Qwen, the latest Qwen 3.6 is doing pretty well. That one is not open though. It's the plus model.
Swyx [00:50:17]: Oh, okay.
Lukas [00:50:18]: Is that one open?
I don't think that one
Vibhu [00:50:19]: Not the, not the
Swyx [00:50:20]: The one recently
Vibhu [00:50:20]: There's MOE
Swyx [00:50:20]: But not the big plus. I think this is one of those where you only have a sample size of one, right? Or I feel like some of this is anecdotal? But the fact that it happens at all, and it happens repeatedly for Claude versus OpenAI and all, this is notable.
Lukas [00:50:38]: It depends on what you define as an N. There's hundreds of millions of tokens in each run, and we run probably 10 per model, and then it's been Claude 4.6 Opus, Sonnet 4.6, Mythos, and Opus 4.7. There's quite a lot of tokens in all of that, and it happens a lot of times. And then you compare it to OpenAI and Gemini, and it almost never happens. So I think that is significant. The old models from OpenAI, for example, had some problems with this, but I think it's generally much better if the progression is that the worrying stuff reduces over time rather than increases over time. And it seems like in the Claude models it goes in the wrong direction.
Swyx [00:51:28]: Hmm.
Lukas [00:51:29]: In the OpenAI models it goes in the right direction.
Vibhu [00:51:32]: I think it depends on how well you can control it, right? There's one side of it being susceptible to this. Okay, this is potentially something that happens during the RL stage, right? You can RL a model, and how loose is it on these terms? If you can control it, that's good. But if you can't, if it's very jailbreakable, that's not ideal.
Swyx [00:51:50]: To me, it's surprising that it happens for Claude and not the others.
Vibhu [00:51:54]: I think, okay, if it is from RL and how they do it, how their training data is, what their setup is, it makes sense that it just stays in how they're doing it, right?
Compared to the other models like
Swyx [00:52:04]: There's a whole constitution and everything. It's kind of cool. Yeah, obviously you don't know, I don't know. But I think it's just fascinating that you are the first to find these reliably, because you push models so much, to such an extreme. Okay. The only other thing, I don't know if you can answer this, feel free to decline: would you ablate the system prompts? If any part of this changes, does it change the behavior, right?
Lukas [00:52:29]: So I can't comment on Mythos. Uh
Swyx [00:52:33]: No, but just li
Something Like A Phenomenon with Darcy Weir | UAP UFO Disclosure, TR3B & Sasquatch | The Devil Doc Talk Show
Canadian filmmaker Darcy Weir (@darcyweirfilms) joins host Joey “Devil Doc” Martinez for a raw conversation on UAPs, government disclosure, the TR3B, Sasquatch evidence, and disinformation in the UFO community.
We cover:
• The “five observables” and recent UAP footage
• Secret space programs and reverse-engineering claims
• Darcy's 20+ documentaries on UFOs, cryptids & fringe topics
• Skepticism, whistleblowers like David Grusch, and protecting audiences from disinformation
• Personal sightings and why disclosure is happening now
Today's West Coast Cookbook & Speakeasy Podcast for our especially special Daily Special, Metro Shrimp & Grits Thursdays is now available on the Spreaker Player!
Starting off in the Bistro Cafe, Senator Van Hollen trapped Markwayne Mullin into agreeing to release Kilmar Abrego Garcia to Costa Rica, exactly what Trump does not want his DOJ to do.
Then, on the rest of the menu, Idaho health officials are investigating how nearly sixty people got sick after drinking raw milk in the past two weeks; Texas braces for a ‘full-blown disaster' as flesh-eating ‘flying piranha' arrives, after DOGE cut the monitoring program last year for being ‘woke;' and, Todd Blanche's DOJ rewrote its Southern Poverty Law Center indictment to fix a problem, and exposed an even bigger one.
After the break, we move to the Chef's Table where a British lawmaker is suing Elon Musk for invasion of privacy after fake Grok bikini images of her were posted online; and, acclaimed Iranian-French artist, graphic novelist and filmmaker Marjane Satrapi, a prominent advocate for women's rights, has died at age fifty-six.
All that and more, on West Coast Cookbook & Speakeasy with Chef de Cuisine Justice Putnam.
Bon Appétit!
The Netroots Radio Live Player
Keep Your Resistance Radio Beaming 24/7/365!
“Everyone in this good city enjoys the full right to pursue their own inclinations in all reasonable and, unreasonable ways.” (The Daily Picayune, New Orleans, March 5, 1851)
Become a supporter of this podcast: https://www.spreaker.com/podcast/west-coast-cookbook-speakeasy--2802999/support.
Tim and Ted are busy preparing for a very exciting event, but N remains the M of I and their lack of time to prep the show leads to a first in the history of tep and podcasting: the first ever AI-written podcast! With the help of Grok they put out a seamless episode that is indistinguishable from the real thing. Plus, a new career pivot for Lil PP Shooter. Support Tep Talk: www.Patreon.com/TechTalkPod
We're announcing AIEWF speakers this week! Take the AI Engineering Survey!Today's guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months:He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, realtime, long-horizon world models is to work on LLMs (perhaps Interaction Models as well…)Put it this way: In the near term, the next Sora won't be a better video model, but a video agent.Generative Media may more closely follow the evolution of AI coding which went from focusing on one-shot output performance and cost, to multiturn reasoning and planning models for agents and systems that can plan, edit, test, debug, and submit PRs.At a certain point, coding models got so good that the only significant next step to improve performance was handling the orchestration of these models.Now as the performance of video models increases significantly across realism, consistency, & prompt adherence while becoming more cost efficient, the next evolution of video generation may also be systems that can plan, generate, edit, critique, and iterate across an entire creative task. In this episode, Ethan joins swyx and Vibhu to unpack what it actually takes to build frontier image and video systems: data, VAEs, diffusion transformers, audio-video alignment, inference speedups, and the hidden cost of storing and moving massive video datasets. 
From building NVIDIA's Cosmos world model to joining xAI as Grok Imagine was being built from zero to one, Ethan He has been at the center of some of the most important work in video generation, multimodal models, and real-time world models.We go deep on Grok Imagine, how a small xAI team shipped its first multimodal video model in three months, why iteration speed matters more than almost anything in model development, and why many of the biggest gains come from fixing tiny bugs in data and training pipelines. Flipbook: The future of VideomaxxingVideo agents are almost a sure bet to be the trend in the coming year. We end with a glance at what's beyond video agents:Flipbook caused a minor sensation this year when it was released, but most treat it as a fun demo. Ethan takes it very seriously — with the speed and cost of inference coming down every year, the future of custom video JIT UI is closer than you think. We talked about why videogen models may become the front end of AI, how generative UI could replace traditional HTML/CSS, why world models need to be real-time, interactive, and long-horizon, and why the future of video generation may depend more on language models and agents than on diffusion alone.We discuss:* Why fast iteration mattered more than meetings* Why small training bugs can drive huge model quality gains* Why coding models may make compute the bottleneck again* How image and video models are trained with synthetic captions* The role of VAEs and latent space in frontier video models* Why image models are the foundation for video models* The tradeoff between temporal compression and real-time interactivity* Flipbook, Neural OS, and the future of generative UI* Why future interfaces may go from user intent to pixels* The hidden cost of training video models: storage, egress, and GPU hours* How step distillation and consistency models (like OpenAI sCM) makes video inference orders of magnitude faster* Grok Imagine 0.9 and large-scale audio-video 
generation* Why audio-video alignment is harder than text-video alignment* Ethan's definition of world models* Reference-to-video, video extension, and long-context video generation* Why xAI's research communication undersells Grok Imagine* How xAI culture shaped the speed of development* AI watermarking, SynthID, and detecting generated media* Why prompt rewriting matters for video models* Grok Imagine Agent and the rise of video agents* Why language models may unlock better video generation* Robotics, physical AI, and embodied world models* Why Ethan left xAI and shifted focus toward LLMs* Self-managed context, memory, and the next frontier for language modelsEthan He* LinkedIn: https://www.linkedin.com/in/ethanhe42* X: https://x.com/EthanHe_42Timestamps00:00:00 Introduction00:01:25 From NVIDIA Cosmos to xAI00:03:24 Building Grok Imagine from Zero to One00:10:07 How Image and Video Models Are Trained00:18:53 Video Compression, VAEs, and Real-Time Tradeoffs00:22:10 Generative UI, Flipbook, and Neural OS00:32:10 The Cost of Training Large Video Models00:37:04 Distillation, GANs, and Fast Video Inference00:41:21 Audio-Video Generation and Grok Imagine 0.900:48:34 What Makes a World Model?00:55:51 Reference Videos, Long Context, and Video Memory01:00:11 xAI Culture, Research, and First-Principles Building01:09:45 AI Safety, Watermarking, and Prompt Rewriting01:13:10 Video Agents and AI-Assisted Creation01:27:32 Why Language Models Unlock Better Video01:31:15 Robotics, Physical AI, and Embodied World Models01:32:38 Why Ethan Left xAI01:34:16 Self-Managed Context and the Future of LLMs01:38:43 Ethan's Career Path and Closing ThoughtsTranscriptIntroduction: Ethan He, Latent Space, and the Path to xAISwyx [00:00:00]: We're here in the studio with Ethan He, most recently of xAI. Welcome.Ethan [00:00:10]: Thank you. Glad being here.Swyx [00:00:11]: We're also here with Vibhu. 
You were first coming to us, or joining the Latent Space world, because you were working on Cosmos at NVIDIA, and you did a paper. We loved it. You presented it as well, so thank you for doing that.
Ethan [00:00:23]: Actually, I also presented MoEs twice at Latent Space.
Swyx [00:00:29]: How did you actually hear about us? Did we reach out to you? Is that how it worked?
Ethan [00:00:33]: No, actually, it was the community. I realized, oh, there is this online community where people talk about AI and also learn from each other through papers every week, through the Paper Club. It's very nice.
Ethan [00:00:49]: I learned a lot.
Swyx [00:00:49]: I think three years nonstop. We haven't stopped even on Christmas and New Year's. Many weeks I want to stop, but it keeps going.
Vibhu [00:00:58]: No, that was good. I think you had posted that you worked on a paper, and I was like, “Oh, very cool. We have Paper Club. Present then.”
Vibhu [00:01:04]: But I might have reached out to you after.
Swyx [00:01:05]: Because it's an amateur club, right?
Swyx [00:01:08]: So it's very unusual, but we sometimes have paper authors come by and actually explain the paper. Today we just did the Poolside paper, which was apparently very good.
Vibhu [00:01:18]: Came out yesterday.
Vibhu [00:01:19]: Pretty interesting, right? Fully open. They talk about everything, systems. So it's a good one. We'll recommend people read it.
Swyx [00:01:25]: Bring us up to speed on your transition to xAI, 'cause I actually don't even know when you joined. Just tell the story about the transition.
From NVIDIA Cosmos to xAI: Scaling Video and World Models
Ethan [00:01:34]: Before xAI, I was working on the Cosmos world model at NVIDIA. Cosmos is a giant video foundation model that aims to simulate the world, and it serves as a foundation for all of the roboticists to build on top of. 
There, once I built the Cosmos one, I realized that, as this thing also has a scaling law similar to language models, we need to scale up the video models further. That's why I realized I needed to move somewhere with much more compute resources. That's how I--
Swyx [00:02:13]: Than NVIDIA?
Vibhu [00:02:14]: The GPU rich came themselves.
Vibhu [00:02:19]: And timeline-wise, when was Cosmos? It was pretty early, right? It was an open world model, open paper, everything.
Ethan [00:02:25]: It was the end of 2024.
Vibhu [00:02:28]: End of 2024.
Ethan [00:02:30]: Then in mid-2025, I moved to xAI. I joined about the time when xAI was about to build video models and multimodal models. There was no infra, no data, and no model, and as a few engineers, we built it in three months and released the first model, Grok Imagine 0.9.
Ethan [00:02:55]: And since then, I kept working on video models and moved more from training to post-training of the video models. For example, reference-to-video, kind of like the cameo feature, and video extensions. And before I left, I worked on a world model, leading a small team focused on real-time, long-horizon video generation.
Building Grok Imagine From Scratch in Three Months
Swyx [00:03:24]: Can you give a rough roadmap of, okay, you're on a brand-new team. Grok previously was only text, or they partnered with BFL for their image gen stuff. What are the building blocks, right? You have compute, data you can procure somewhere. What is the sequence of things that people should think about when you're setting up a new team?
Vibhu [00:03:43]: Actually even deeper, not just data you can procure. You guys had to go through getting the data too, right? 
So you shipped it pretty fast, but yeahSwyx [00:03:51]: three months is likeVibhu [00:03:52]: From everythingSwyx [00:03:52]: actually like very surprisingly fast.Ethan [00:03:55]: One thing I say like thanks to my experience at NVIDIA, ‘cause first time when we were building Kosmos together, we built it, for about a year. So this is like the second time I do it. Roughly have an idea, what to do. I say the most important thing is the talent. Everyone were very strong and clever, very close with each other towards a common goal. So that speed up things a lot. So you reduce the communication bandwidth among people, and everyone can work towards the same goal. It's, it's like every day there's not that much meetings on the calendar, like maybe like a, like a sync a day, and after that it's, it's just all building. It was pretty fun at that time.Ethan [00:04:47]: And another thing is that xAI has very strong foundations of like data inference, model inference, and the supporting there can help the model develop a lot. When I look at, training models, I don't so actually the top important thing is like how many, how many iterations can you do, per day? and the more iteration can you do, you can, you can train the model much faster. So if you have very strong infra and you have a lot of compute, you can, you can train these models in very short period of time. That can give you a much larger buffer to, for errors, and it also gives you the opportunity to spot more bugs.Iteration Speed, Compute, and Debugging Model PipelinesSwyx [00:05:46]: What is an iteration? Is it like a few hundred steps or what are youEthan [00:05:50]: Let's say just the train-training the model, like from acquire new data and maybe design new algorithms and train a new model, maybe at smaller scale orSwyx [00:06:01]: So cycle time for like any hyperparam that you're searching.Ethan [00:06:04]: Cycle time and tune to like eval this model. 
Is this model better than my previous iteration?Ethan [00:06:11]: SoSwyx [00:06:11]: So it's like before you, someone had already set this up that you can iterate very quickly.Ethan [00:06:15]: I think the foundation there is extremely good forDeveloping and research models.Ethan [00:06:23]: And often I find is it-- this is kind of boring, but like a lot of the improvements does not come from new algorithms. It comes from finding small bugs here and there in the data pipeline, in the, in the model training pipeline. Those give, those give the biggest boost to the model quality.Vibhu [00:06:46]: It's interesting, right? So you say it's like small team, less communication bandwidth, but also a lot of quality is like find little bugs. It seems counterintuitive, right? You have a lot of people, you can iron out more of those, but it's interesting to see the other side, right?Swyx [00:07:00]: I also wonder, have you-- do you try using LLMs to look for bugs? I don't know.Ethan [00:07:05]: I remember at that time it was mid two thousand and twenty-five, so it's the coding model wasn't quite there yet. I remem- I remember like December two thousand and twenty-five, it was extremely good. Yeah, I've been, I've been using it at that time. It's, it's helpful. sometimes it produce codes that are kind of difficult to maintain, even though like the first time it built something extremely fast. But it gave the, like a spaghetti code, thousands of lines that I couldn't maintain, and the LLM itself couldn't figure out what's, what's wrong and how to improve on top of it. But now I find it much better. Yeah, I want to bring up another point here is now coding models are much more efficient and can help us implement stuff much faster. Compute might become a bottleneck again because previously, like if you want to train a new model, say you want to generate new synthetic data and then or write a new algorithm, it might take a few weeks. 
And during that period of time, you don't-- you might not have experiments to run. But now you can build that thing within a few hours, then you can immediately train a model.Ethan [00:08:24]: Now you have to have enough compute to try all of the ideas. So compute might be the bottleneck of iterating speed again.Swyx [00:08:36]: yeah, I actually, honestly, I think it's like kind of a stressful job because you're “Well, I should be trying everything, and if I'm not, then I'm not doing my job well.”Vibhu [00:08:48]: there's also the stress of you're eating thousands of GPUs per hour, which is very expensive and, compute can go to other researchers.Swyx [00:08:56]: You got the daddy Elon toVibhu [00:08:57]: You got daddy Elon.Ethan [00:08:59]: It wasVibhu [00:09:00]: But there's still finite amount of compute, like you want to use it, you want to use it well, you want more of it.Ethan [00:09:06]: That was quite stressful indeed. Yeah, I think one thing is the-- with coding models now, like a lot of these jobs can be automated, which is much better. A second, it's a, it's a marathon, so you got to maintain good health and, a regular schedule.Vibhu [00:09:28]: It's, it's hard to hear that when you shift from zero to nothing in two months.Swyx [00:09:32]: and, I think obviously the culture at xAI is very famously, people work very hard. one thing I did want to dive into, in our-- in the notes that you, that you sent ahead of time, you had specific comments about the cost of Video Gen training. presumably this is on the Colossus-1, right? the two hundred megawatt cluster. Any whatever you want to just share on that.Vibhu [00:09:54]: I think there's, there's three things we're talking about, right? So there's Video Gen, there's also the Image Gen model that you put out. Do you want to like complete the, okay, so zero to one, you have a few months. 
Just what are the stages of create Image Gen model?Swyx [00:10:06]: Oh, yeah, maybe I got distracted.How Image and Video Models Are Trained: Synthetic Captions, Tokenizers, and VAEsVibhu [00:10:07]: Sorry. and then, from there's Video Gen, there's Audio Gen. Would love to get into those next. But what is that first few months like? So small team, a lot of bugs, iterations, but what does it look like? Do we take something off the shelf? Do we just get data compute? What's, what's the few months like? How do you go to state-art Image Gen model? How do you just start?Ethan [00:10:28]: I cannot comment specifically how xAI did, but it's, it's a quite standard process. I can draw some, examples from Cosmos. So mainly it's building a video model, you actually need to build a image model first. And building these two models, the data you need is a hundred percent synthetic pair of language and image or language to video. Because on the, on the internet, actually, the videos don't naturally associate with text. So you can say, oh, like on YouTube, you have the title and you have the description and the commentsSwyx [00:11:11]: TitleEthan [00:11:11]: of a video, but usually they're not relevant to the video itself. And say maybe like the video is a natural scene of mountains or something, and the title is, I'm so happy today.Ethan [00:11:26]: So they have they have no correlation at all. So the first step is to, you have to generate synthetic pair of language with the videos. So you gather videos from the internet, and you use a VLM to caption the videos. So that part, here's a question, like how do you, how do you gather VLM to begin with? So if there's noSwyx [00:11:55]: You, so you fuse the model, right? LikeEthan [00:11:57]: Say if there's no like VLM exists, like how do you generate the text to the beginning, right? 
It's, it's impossible.Swyx [00:12:04]: I see.Ethan [00:12:05]: In the beginning, it's like you ask human to describe the video as detailed as possible.For example, you ask them to describe everything, like all objects, all characters, and all interaction and dialogues in the, in the videos. So that's in the protocol of Cosmos labeling. We require the objective we give to the labelers was that you have to describe the video as detailed as possible, such that a blind person hears a blob of text can reconstruct what the video is like from their head.Swyx [00:12:43]: Video or image? You're talking about images.Ethan [00:12:44]: Video or image, either one of them.Vibhu [00:12:47]: This was pretty common when we went from clip and DALL-E, right?Vibhu [00:12:51]: It's all training on really detailed captioning of images. So same is applied to video, but insteadEthan [00:12:57]: same appliedVibhu [00:12:57]: of using multimodal model to pass in video images and write rich descriptions, you can alsoSwyx [00:13:04]: I think there's this traditional perspective of supervised, or, very highly human curated thing. I feel like there's a unlock with unsupervised, right? Where like you have enough to bootstrap that you can just throw common corpus on it or, whatever. like unsupervised vision and language pairing, right? Like where you just have, interspersed image and text and it just learns. To me, that is the VLM breakthrough that is different from the clip, different from the LM era.Ethan [00:13:36]: It's interesting to see that you kind of need both data.Ethan [00:13:41]: For example, for theSwyx [00:13:41]: You need it to bootstrap it up. YeahEthan [00:13:43]: for the generative model training, there's also usually like a small percentage of unlabeled data. So the model is instructed to generate a video without any text instruction. That can also help the model generalize. 
So after this stage of generating synthetic pairs, one important common step is to train a compressor, or a tokenizer, of the images or videos. Technically, theoretically, you can train image or video models on pure pixels, but the problem is that it's a lot of tokens. So one image that is a thousand by a thousand is one million pixels, one million tokens. It's impossible to train a transformer on that. So you need to train a tokenizer, which can go from image to latent space and from latent space back to image.
Swyx [00:14:45]: That's why we named the podcast.
Swyx [00:14:48]: But, basically, you're talking about vocabulary size.
Ethan [00:14:50]: So vocab.
Swyx [00:14:51]: And so, like, a million is impossible?
Ethan [00:14:54]: In generative models, the vocab is continuous. It's a continuous space. We can think about it like you map an image to a vector. It's a fixed-length vector. It's sixteen or forty-eight, something like that. And then you map that vector back to the image space. The mapping is patch-based. So you say you have
Ethan [00:15:22]: a sixteen by sixteen patch, and you map that patch of pixels into this latent space.
Swyx [00:15:29]: We've covered this
Vibhu [00:15:30]: This is like the vision transformers
Swyx [00:15:32]: VAEs,
Ethan [00:15:33]: VAEs.
Vibhu [00:15:34]: You basically compress your input, you do your generation, you're reasoning about all that generation in a smaller dimension, and then you project back out.
Swyx [00:15:43]: VAE is a form of compression, but for me, the patching thing is from ViT, right?
Ethan [00:15:48]: You can make those.
Swyx [00:15:49]: Literally, yeah, the paper is titled like, sixteen by sixteen is all you need. Something like that. 
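To make the numbers in this exchange concrete, here is a rough back-of-the-envelope sketch in Python. It assumes one latent token per non-overlapping 16 by 16 patch, as Ethan describes, and pads partial patches at the image border. The function names are hypothetical, and real tokenizers differ in patch size and padding.

```python
import math


def pixel_token_count(height: int, width: int) -> int:
    """Sequence length if every pixel were treated as its own token."""
    return height * width


def patch_token_count(height: int, width: int, patch: int = 16) -> int:
    """Sequence length with one latent token per patch x patch block.

    Assumes non-overlapping patches; partial patches at the border are
    padded, hence the ceiling division.
    """
    return math.ceil(height / patch) * math.ceil(width / patch)


pixels = pixel_token_count(1000, 1000)
patches = patch_token_count(1000, 1000)
print(pixels)                   # 1000000
print(patches)                  # 63 * 63 = 3969
print(round(pixels / patches))  # 252
```

Under these assumptions the sequence shrinks by a factor of roughly 250, which is the gap between a sequence a transformer cannot realistically train on and one it can.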
And then I think also, people make a lot of comparisons between this kind of patching and convolutions.
Swyx [00:16:02]: Which is, you're kind of reconstructing the old paradigm with the new.
Ethan [00:16:05]: Actually, in VAEs, there are both convolution networks and transformers. You can actually do both.
Ethan [00:16:14]: After this VAE, what you've got is latent space tokens and language tokens. So now, the training of the diffusion transformer. Usually generative models use diffusion transformers. It is actually quite standard. It's very similar to how you train a language transformer model. There's not that much difference. It's just visual tokens in, visual tokens out. The only difference is there's a denoising process. So you train the model to remove the noise. You add random noise to the visual tokens, and then you train the model to remove that noise to generate the clean tokens. At inference, the model can iteratively remove noise starting from a hundred percent noise.
Swyx [00:17:12]: And then there's also, to speed things along on the tech tree of diffusion, there's CFG, and then there's also latent diffusion somewhere in there. I think, somewhere along the line, obviously, Stability and all these other guys pioneered a lot of this architecture. I don't know if you want to get into that or just do the video side, up to you.
Bootstrapping Video from Image Models and Temporal Compression
Ethan [00:17:37]: After you train such an image model, the reason it's a foundation for video models is that image models are cheaper to train, and they have a much denser connection between language and text. So, sorry, language and images. For example, you train on a billion images, and there's a mapping from the text to the image. 
And the cost to train the same, like the, a billion, a billion text to a billion videos, that's much more expensive because videosNaturally have more tokens than images. Because the diffusion models, their understanding of, language purely come from this mapping. So if you don't have enough mapping, so if you only train on like a ten million videos or something, there-- you might not see enough language tokens in your training, so your model does not understand human intention enough. So that's why you really-- you train-- you first train this image diffusion models, and then you bootstrap the video model from there.Swyx [00:18:53]: One thing I did want to ask, because I-- actually, I think you're, you're the first per-- video model person I've ever talked to, I think. we've, we've like talked to Luma and all those folks. There's all these tricks in video compression where basically frame by frame there's not that much difference, so actually you don't have to regenerate or save the whole frame, right? but I think MP4 compression or something else like that.Swyx [00:19:16]: is it tempting to use that? Or as far as I can tell, everyone just treats it as, “No, we would just generate every frame.” Is that roughly the state-art?Ethan [00:19:27]: There are a few different approaches. Let's say first, like you want to just directly use MP4 compression and use that as the tokens for the transformers to train, right? So people actually have tried that, but the main challenge is the latent space for the MP4 tokens were not, were not very comprehensible for the models. It's, it's extremely hard to train on that. And there's aEthan [00:20:01]: So that's why they created VAEs, which creates more continuous, latent space, so the models can understand that latent space and learn from it much easier. Even within the VAEs, there are different difficulties of the latent space. 
So you can imagine the simplest, most naive VAE is like, you have an image, and you just shuffle all of the image's pixels into a vector. So you don't need to train any VAEs, right? But that latent space is extremely hard for models to train on top of. That's why there is some debate on how you compress the tokens. So you mentioned you can compress frame by frame. Also, you can compress the temporal dimension.
Ethan [00:20:52]: The difference is if you compress the temporal dimension, you get a much higher compression rate. Because there's temporal redundancy between frames. This frame and the last frame are likely mostly similar, so there's only some small difference. For example, I think in the Wan 2.1 VAE, they have an eight by eight by four compression rate. So four temporal tokens are compressed into one token. That can save a lot of the context length. If you do it frame by frame, you have to do maybe eight by eight by one. Your context length will be four times larger. That being said, the benefit of per-frame compression, and we might come back to this later, is real-timeness and interactivity. 'Cause if you stream the output of the model frame by frame, the model can respond to any user request immediately. So if you have a temporal four-times compression, then
Swyx [00:22:06]: It might be laggy
Ethan [00:22:07]: there's a lag there by nature.
Swyx [00:22:10]: So you're very pilled on this. Let's just go ahead and bring it up, 'cause we have the visual prepared anyway. There are some frontier applications of real-time video gen. So Flipbook is one of the examples that went viral recently, right? What is Flipbook?
Real-Time Generative UI: Flipbook, Neural OS, and Diffusion Front Ends
Ethan [00:22:23]: Flipbook is kind of like a web browser. You can see it has the web browser UI on top. 
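As an aside on the compression trade-off from the previous exchange, the arithmetic can be sketched in Python. The clip size and frame rate below are made-up values for illustration, and the functions are hypothetical. The count covers latent positions only and ignores any extra patchify step inside the transformer.

```python
import math


def latent_tokens(frames: int, height: int, width: int,
                  spatial: int = 8, temporal: int = 4) -> int:
    """Latent positions for a clip under spatial x spatial x temporal compression."""
    return (math.ceil(frames / temporal)
            * math.ceil(height / spatial)
            * math.ceil(width / spatial))


def frames_per_latent_step_ms(temporal: int, fps: float) -> float:
    """Playback time covered by one temporal latent step, in milliseconds.

    A simplification: the model cannot react at a finer granularity than
    one latent step, so this is a floor on interaction lag, not a full
    latency model.
    """
    return 1000.0 * temporal / fps


# Hypothetical clip: 120 frames at 480 x 832.
with_temporal = latent_tokens(120, 480, 832, temporal=4)
per_frame = latent_tokens(120, 480, 832, temporal=1)
print(with_temporal)              # 30 * 60 * 104 = 187200
print(per_frame)                  # 120 * 60 * 104 = 748800
print(per_frame // with_temporal) # 4
```

The factor of four in context length matches the figure quoted in the conversation. At an assumed 24 frames per second, `frames_per_latent_step_ms(4, 24)` is roughly 167 ms per latent step versus about 42 ms per frame, which is one way to read the lag that per-frame compression avoids.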
The difference is all of the UIs are generated by generative image model in real time, and anything here are fake. But you can, you can explore inside this wor- this imaginary world. Say like we-- here we have engineering the Great Pyramid. Like the model generates this for us to understand how it works, and if we want to navigate around and understand further, we can click on some of the, some of the description here, and the model will generate a new page, new subpage describing the details we want to know about.Swyx [00:23:14]: So it's basically kind of we're playing a video, but it's pausing for our next interaction, and then it just plays the next thing based on our interaction.Swyx [00:23:23]: Which is kind of cool.Vibhu [00:23:25]: and you kind of decide your story. So this was, how do you make a pyramid? levering technique seemed interesting, right? It shows how do you take Okay, I want to know what is thisSwyx [00:23:35]: The demo, the demo tweet had more animation between frames.Vibhu [00:23:38]: I think it's just skipping,Swyx [00:23:39]: Oh, it's just skipping a lot of frames.Ethan [00:23:40]: they also have a video modeVibhu [00:23:42]: It takes a lot. There's a lot of peopleEthan [00:23:42]: but, a lot of people are using it.Ethan [00:23:45]: So it's not available.Vibhu [00:23:46]: There's a live video stream. We can try,Swyx [00:23:50]: So this is an example of the kind of future that you see at the extreme. We don't-- we're obviously not in it today.Swyx [00:23:56]: But in a world where inference is completely free this is better than generating code and text?Ethan [00:24:02]: So this is, this is a final state of where Viva will be at for word model, I think. Imagine internet doesn't exist, and then you type in google.com. Like what should, what should, what should a model show you?the model can imagine something, and this is what the model imagine. And these web pages, they completely do not exist. 
So I think as the inference costs come down, we are going to have generative UI for everything. If you think about how the coding model works, they write code for a web page, and the code might be converted into binary, and the binary renders the pixels on the screen. In machine learning, every time we have some breakthrough, it's more end-to-end. So why don't we go from user instruction to the pixels directly? The generative UI will be user intention to the pixels directly. Say with email, everyone has the same interface, but I want it slightly different. I want the email to show to me like TikTok, so I can swipe left and right through the emails. Or maybe you want something else. We can have completely different things. Or I'm looking at Instagram stories, and I don't like the Like button. I always misclick it. And generative UI resolves it. So it's going to be a revolutionary replacement of the interface. So in the future, we might have much more powerful
Ethan [00:25:50]: LLMs and coding models running behind the scenes. And in the front end, the diffusion model will actually be the front end that shows stuff to you. That's how I imagine it.
Swyx [00:26:02]: Diffusion front-end, deterministic back-end.
Swyx [00:26:04]: Something like that. I find that very expensive, but,
Vibhu [00:26:08]: I find it interesting you called LLMs writing code on the back end deterministic, but okay.
Swyx [00:26:14]: You write it once
Vibhu [00:26:15]: Compare it to
Swyx [00:26:16]: And then you execute.
Ethan [00:26:17]: If you think about the cost, let's say an H100 costs $1 per hour, and you use this eight hours a day for thirty days, so every month you're paying $240. You'll actually not wanna pay for that. That's even more expensive than Claude Code Max. 
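The cost estimate here is simple enough to write down. The sketch below, in Python, reproduces the $240 per month figure and then asks how long it takes to fall under a target price if costs halve each year, the rate Ethan goes on to suggest. The $20 per month target is a made-up threshold for illustration, and the function names are hypothetical.

```python
def monthly_cost(rate_per_hour: float, hours_per_day: float,
                 days_per_month: int = 30) -> float:
    """Monthly spend for one GPU used a fixed number of hours per day."""
    return rate_per_hour * hours_per_day * days_per_month


def years_until_below(cost: float, target: float,
                      annual_factor: float = 2.0) -> int:
    """Whole years until cost drops to the target or below,
    assuming it shrinks by annual_factor every year."""
    years = 0
    while cost > target:
        cost /= annual_factor
        years += 1
    return years


cost = monthly_cost(rate_per_hour=1.0, hours_per_day=8)
print(cost)                                               # 240.0
print(years_until_below(cost, target=20.0))               # 4
print(years_until_below(cost, 20.0, annual_factor=100.0)) # 1
```

With a 2x annual decline the hypothetical $20 target is reached after four years; with the 100x rate Swyx argues for in the next exchange, it takes one.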
But if you think about it, compute costs come down like two times every year, and I think the future will likely arrive within a few years.
Vibhu [00:26:49]: It's everything, right? Compute cost comes down, compute gets faster, model gets smarter
Ethan [00:26:54]: More efficient
Vibhu [00:26:54]: model gets smaller.
Swyx [00:26:55]: I don't know why you say two times, 'cause I think it's like 100 times. In language models, it is roughly one hundred to a thousand times every twelve to eighteen months, for the same given level of LMSYS Elo.
Vibhu [00:27:08]: That's a net of everything, right? That's model performance alongside compute. So it's different from just compute costs coming down. But, a very interesting future.
Swyx [00:27:19]: So the web designers will have to shout out that accessibility is an issue, right? How do you deal with screen readers or whatever. But yes, this is higher-bandwidth storytelling than anything you can possibly generate with code, right? So I think that's the rough idea.
Ethan [00:27:34]: And I'd like to add a little bit. Humans naturally have the maximum input bandwidth when we are looking at things, looking at videos, and we also have maximum output bandwidth when we are talking. So in the future, it might be something like we talk to AI models, and the AI model responds back with a generative UI. So that would be the maximum input and output bandwidth to interact with AI models before Neuralink happens.
Vibhu [00:28:06]: And it's also very custom, right? Some people are very visual, some people are not as visual, right? They prefer the text. But the best thing about generative UI, right, it can also be text.
Swyx [00:28:17]: There's another project that we wanted to highlight, which is the Neural OS. Kinda similar idea, but here you're literally simulating an operating system with a video model.
Swyx [00:28:27]: And you can play Doom, you can do Firefox. 
I find this mildly less impressive, obviously, because it's an OS that I can run.
Swyx [00:28:37]: But here everything is imagined.
Vibhu [00:28:40]: I used Command+W to close the Firefox tab. It didn't crash. That's why I said--
Swyx [00:28:45]: It's too immersive.
Vibhu [00:28:46]: It's too immersive for me.
Swyx [00:28:47]: Too immersive.
Vibhu [00:28:48]: I wanted to close the tab.
Vibhu [00:28:49]: But yes, I can play generated diffusion.
Swyx [00:28:51]: This is shockingly fast.
Swyx [00:28:54]: Because I remember there was a demo maybe one to two years ago. Someone tried to do a first-person shooter with an image model. There was no consistency. It was very slow. But here it looks realistic. This is Doom.
Vibhu [00:29:07]: I think there's two sides to that, right? Okay, what is running a game? The heavy part of it is actually the game engine, all the lighting, all that stuff, the graphics. This is just kind of video, right? Like we've solved consistency. This still looks like image generation from a few years ago. There's some temporal consistency, but it's kind of just images stitched together as frames of video. But it's a good visual representation to picture the future you want to see, right? That's more what I see in these.
Ethan [00:29:38]: This reminds me of how the video models get better and better. If you just look at Neural OS, it feels like a crappy version of the Windows we could already have, right? But the difference is that this model is overfitted on the existing operating systems. It can generate nothing different from that. But it's actually also similar to video models. When we train a video model or image model, we train it on the internet. There's no imaginary supernatural stuff on the internet. But once we train the model, you can prompt it to generate something supernatural that has never existed in the data set.
So if you train your Neural OS or neural computer on the standard screen recordings from the entire internet, the model can imagine completely new interfaces for interacting with the computer.
Swyx [00:30:43]: This is one of those things that is magical to me. Usually generalizing out of distribution is bad, but somehow we have learned some kind of internal world model where you say, "this, plus it looks like rainbows and butterflies," and it'll do it, and it will kind of make sense.
Swyx [00:31:03]: So yeah, that's kind of cool. I don't know if there's any more comment on that. I did want to touch a little bit more on the model architecture stuff, which I think you were getting to. It's really fascinating. We don't get a chance to talk about this enough. So one of the papers that we covered: we've covered every annual Segment Anything release. And I don't know if you follow-- you're a computer vision guy, so you--
Ethan [00:31:26]: I know.
Swyx [00:31:27]: So they did memory attention, which is kind of interesting. And I always think anything where you can keep some consistency across the temporal dimension is very fascinating. Basically, the CV side bleeding into the video gen side, I think, is underexplored, right? We talk about it for labeling, but actually you can borrow the architecture itself.
Ethan [00:31:50]: There are also completely different approaches, right? You brought up the term world model, so we went from video model to world model. There is diffusion, but there are also other approaches that people are doing. So maybe we get into those after as well?
Swyx [00:32:03]: He has a whole definition of world models and stuff. I feel like we threw a lot at you.
Whatever you want to comment on.
Why Video Models Are Expensive: Storage, I/O, and Training Scale
Ethan [00:32:10]: I think one thing that we should actually comment back on is, okay, we were talking about the steps to go from training image gen to a video model. One thing we don't see as much of is, okay, you brought up the delta in training data, right? So--
Ethan [00:32:24]: You won't have as much, and a video model might not generalize, but what is the cost of training a large video model? We know roughly for LLMs, okay, even the Poolside thing that came out today, right? It's a Gemma-level model trained on roughly forty trillion tokens on this many H200s over this much time, right? You can see the exact cost of that: how many GPU hours times how much an H200 costs. So how do we do the back-of-the-envelope math for the same thing with video models and image models? How do you break that down? I can share some back-of-the-envelope calculations. Surprisingly, the cost of video models is comparable to language models. Obviously the largest scale is language models, so maybe it's comparable to a medium-scale language model. Just storing the videos alone costs a lot. You can look it up on AWS or something.
Ethan [00:33:20]: Say you have a billion videos, and let's just say each video is five megabytes. Then you need five petabytes just to store those videos. And remember we talked about using a VAE to compress the videos. Typically you also need to store those continuous features in your storage, and that's a comparable size to the videos themselves. So just storing these videos and the features is tens of petabytes alone. And--
Swyx [00:33:58]: I just looked up the calculation.
Five petabytes on S3 Standard is one hundred K per month.
Ethan [00:34:05]: And--
Swyx [00:34:05]: It's comparable--
Ethan [00:34:05]: And you need--
Swyx [00:34:06]: And--
Ethan [00:34:06]: And then tens of petabytes, two hundred K. And even more expensive is the ingress and egress.
Swyx [00:34:13]: Oh, yeah.
Ethan [00:34:14]: Through the internet. Just to download those videos, I believe it's more expensive on AWS than just storing them.
Swyx [00:34:25]: Storing, yeah.
Ethan [00:34:25]: And for each training run, you probably need to pull them once. If you train multiple times, it's even more than that. So just the storage and the network, those costs would be a few million per month to store everything, not to mention the GPU cost.
Ethan [00:34:45]: And--
Swyx [00:34:45]: My side tangent: compute rental, like GPU rental, is very efficient. On one side, okay, you can be xAI and build your data center. Should we not just build our own storage compute as well? Like--
Ethan [00:34:57]: Of course--
Swyx [00:34:57]: Cloud cost compared to just--
Ethan [00:34:59]: You save so much--
Swyx [00:35:00]: Store. Yeah, exactly.
Swyx [00:35:01]: Especially with egress and stuff.
Ethan [00:35:04]: That's a good idea, but it also comes with some challenges of its own.
Swyx [00:35:09]: Of course, of course.
Ethan [00:35:10]: People who build GPU data centers might not expect this much storage. And people who build storage typically just build it somewhere with only CPUs.
Swyx [00:35:23]: I just looked it up. AWS only charges for egress, not ingress. Tier five for five petabytes is two hundred and thirty K.
Ethan [00:35:32]: Even more expensive than the storage.
Swyx [00:35:34]: But storing is per month, right? You check in, then you cannot check out. So it's so cool. It's okay.
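The storage numbers in this exchange can be checked with a short calculation. The sketch below uses only the figures quoted in the conversation (a billion videos, five megabytes each, roughly $100K per month for storage and $230K for one full egress) and backs out the per-gigabyte rates they imply. Those implied rates are derived from the speakers' numbers, not taken from a published price list, and decimal units (1 PB = 10^15 bytes) are an assumption.

```python
MB = 10**6
GB = 10**9
PB = 10**15


def dataset_bytes(num_videos, bytes_per_video):
    """Raw size of the video corpus."""
    return num_videos * bytes_per_video


def implied_rate_per_gb(total_dollars, total_bytes):
    """Per-GB price implied by a quoted total cost for a given volume."""
    return total_dollars / (total_bytes / GB)


raw = dataset_bytes(10**9, 5 * MB)   # a billion videos at 5 MB each
with_latents = 2 * raw               # VAE features are "comparable size" to the videos

print(f"raw videos:         {raw / PB:.1f} PB")
print(f"videos + features:  {with_latents / PB:.1f} PB")

storage_rate = implied_rate_per_gb(100_000, raw)   # quoted: ~$100K per month for 5 PB
egress_rate = implied_rate_per_gb(230_000, raw)    # quoted: ~$230K to pull 5 PB once

print(f"implied storage rate: ${storage_rate:.3f} per GB-month")
print(f"implied egress rate:  ${egress_rate:.3f} per GB")
```

Storage is a recurring monthly charge while egress is paid per pull, so repeated training runs that re-download the corpus multiply the second figure rather than the first.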
So there's that side.
Ethan [00:35:41]: So the TLDR, my back-of-the-envelope math--
Swyx [00:35:42]: Data is larger than you think. Yes.
Ethan [00:35:44]: My back-of-the-envelope math of GPU hours times GPU cost is also very much missing some storage.
Swyx [00:35:49]: You're also basically more IO bound than normal training.
Swyx [00:35:55]: Yes. 'Cause with data loading, caching everything becomes super important.
Ethan [00:36:00]: In Cosmos, we did a lot of optimizations to make it not IO bound. So, speaking of actually training the model, the GPU cost: if you look up the open source models to see how big these video models are, I think LTX has 19B parameters. That's a dense model. And people are also exploring MoEs, so it might be 20B active and hundreds of B in total. So that's a similar size to medium-sized LLMs. And if you look at the number of tokens, we disclosed that in Cosmos. It's also tens of trillions of visual tokens. So putting this together, the cost of training these video models is actually comparable with LLMs. Not to mention, the infra is slightly different from LLMs, so it might be less efficient to train these models.
Inference Speedups: Step Distillation, Consistency Models, and GANs
Swyx [00:37:04]: Do you get the benefits of traditional diffusion speed-ups? For images, there's LCM, LoRAs for fine-tuning. There's a lot of stuff that's been--
Ethan [00:37:15]: Flow matching.
Swyx [00:37:16]: There's flow matching. There's a lot of stuff that's been done. Is there some overlap that applies to diffusion on the inference side and stuff?
Ethan [00:37:23]: So the inference side is a completely different story.
Ethan [00:37:28]: I think on the training side, it might be a little bit hard to reduce that cost. And on the inference side, the biggest gain is from the distillation of these models.
It's called step distillation, slightly different from knowledge distillation in LLMs. Typically, for flow matching models, you need 100 steps or something. A diffusion model needs even more, like 1,000 steps, to generate a good image or video. Step distillation tries to learn to generate in fewer steps from the model itself. It's kind of like: you use the full model to generate in 100 steps, and then you take a model that only generates in 10 steps and let that model learn from the perfect one.
Ethan [00:38:25]: Why this works--
Swyx [00:38:27]: Strong to weak, seemingly.
Ethan [00:38:28]: It is. It's kind of--
Swyx [00:38:29]: Distillation--
Ethan [00:38:29]: Kind of like strong to weak. From the modeling perspective, the strong model, the teacher model, is trying to model the images and videos of the internet, and that distribution is extremely complex. But the step-distilled model is just trying to learn from the teacher. The teacher is a model, and its size is fixed, so the distribution is much simpler than the whole internet. That's the intuition I have for why step distillation can work. So usually when these models are served in production, they only run a few steps. In Cosmos, I believe we have four-step and eight-step versions. If you do some simpler task, like image-to-image translation, it can run in even fewer steps, like one step in Cosmos Transfer.
Swyx [00:39:22]: I think this is the same intuition that guides a lot of the consistency model work. I sent you a link for sCM. I don't know if you covered that. To me, that was actually one of the most impressive papers I've ever seen from OpenAI.
Swyx [00:39:34]: That this is the unifying grand concept of consistency models. I don't know if you have any comments on this.
Ethan [00:39:41]: So there are a few different approaches--
Swyx [00:39:46]: Oh, yeah. Here it is.
Swyx [00:39:47]: Two steps versus twenty or 100 steps, whatever.
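The step-distillation intuition Ethan describes (run the full model for many steps, then train a cheaper model to reproduce its output in far fewer) can be illustrated with a deliberately tiny toy. Everything here is an assumption made for illustration: the "teacher" is a 100-step Euler integration of a hand-written one-dimensional velocity field, and the "student" is a single scalar weight fit by least squares. Real systems distill neural networks over images or video, not a scalar.

```python
import random


def velocity(x):
    """Toy velocity field (assumption): pulls every sample toward zero."""
    return -x


def teacher_sample(x0, steps=100):
    """Many-step 'teacher': Euler integration from t=0 to t=1."""
    dt = 1.0 / steps
    x = x0
    for _ in range(steps):
        x = x + velocity(x) * dt
    return x


def fit_one_step_student(inputs, targets):
    """Least-squares fit of a one-step student x1 = w * x0 to teacher outputs."""
    numerator = sum(a * b for a, b in zip(inputs, targets))
    denominator = sum(a * a for a in inputs)
    return numerator / denominator


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]   # starting points
targets = [teacher_sample(z) for z in noise]         # teacher runs 100 steps each
w = fit_one_step_student(noise, targets)             # student learns from the teacher only


def student_sample(x0):
    """One-step 'student' that mimics the 100-step teacher."""
    return w * x0


print(f"learned weight: {w:.6f}")
print(f"teacher (100 steps): {teacher_sample(2.0):.6f}")
print(f"student (1 step):    {student_sample(2.0):.6f}")
```

The student never sees the underlying data, only the teacher's outputs, which is the sense in which its target distribution is simpler than "the whole internet".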
It's already done.
Ethan [00:39:52]: So there are a few different approaches, for example, consistency models. And actually, we shouldn't forget GANs. GAN was the OG of--
Swyx [00:40:05]: OG--
Ethan [00:40:05]: Step distillation, 'cause it was trained with just one step to begin with. For example, there's distribution matching distillation, which uses a GAN as one of the losses for distillation. A GAN just tells you, "Hey, generate an image," and then--
Ethan [00:40:31]: It has a discriminator to tell whether this image is real or not. So the model only needs to learn part of the distribution, not the full distribution. Because in training, the model is asked to reconstruct the ground truth image from the internet, which is extremely hard. And when you're training a GAN, it's a step process. It's just, "Hey, you generate an image. Does this image look as real as an image from the internet?" Which is a much simpler task. And people typically combine a lot of these approaches, like consistency models and distribution matching and GANs, and we can get these few-step models.
Audio-Video Generation and Time Alignment
Swyx [00:41:21]: Then there's one step I wanted to add, which is audio and video.
Ethan [00:41:26]: So, Grok Imagine 0.9, I believe, is the first audio-video joint model deployed at a large scale. So--
Swyx [00:41:39]: And that was your first model?
Ethan [00:41:40]: That was Grok Imagine's first model. It's audio-video joint generation. I think the hard part is the modality alignment, 'cause before this joint model, we had text-to-video alignment. We have this correspondence between text and video. Typically, most of the VLMs understand images. Video is very rare, and they mostly don't understand audio.
And if you look at audio generation on the LLM side, you can talk to them perfectly fine, but if you ask them to sing a song or something, it typically is not very good. Also, they don't have music either. The hard part is that audio actually has two components: a discrete component and a continuous component. The discrete component is like the language.
Ethan [00:42:44]: So when we speak, it's just some--
Swyx [00:42:47]: It's an ASR issue, yeah.
Ethan [00:42:49]: It's text tokens with some characteristics, I would say.
Ethan [00:42:54]: But music--
Swyx [00:42:56]: I think the speech guys would disagree with this.
Swyx [00:42:57]: Like disfluencies and then--
Vibhu [00:43:00]: There's tones. You can get angry.
Ethan [00:43:01]: Well, I say largely.
Ethan [00:43:03]: But the music is completely different. It's very continuous, and you cannot model it like discrete tokens in language models. This is the hard part for models, not to mention we have to align text, video, and audio together.
Ethan [00:43:26]: So--
Vibhu [00:43:26]: How?
Ethan [00:43:28]: So some significant challenges. First, like we talked about, most of the VLMs cannot understand audio.
Ethan [00:43:39]: So you have to have some way to do synthetic data generation for audio. You have to caption the audio, and that involves a lot of synthetic data and human data effort. And, not surprisingly, most of the LLMs are very bad at recognizing the beat, the tone, and the details of music. They can give some general prediction of which song this is, but it's very hard for them to describe the details of the music. Like we mentioned in image generation, you have to describe the image in as much detail as possible so that someone blind can reconstruct it.
So here it's like someone--
Vibhu [00:44:32]: Deaf--
Ethan [00:44:32]: Someone deaf can reconstruct how the music sounds without actually listening to it. Maybe you can think of it as needing to have, what do they call it, the script.
Vibhu [00:44:49]: Subtitles, yeah.
Ethan [00:44:49]: You've got to have all the details of the music and the dialogue.
Vibhu [00:44:55]: So is the challenge there typically stuff like music and audio? Is there a baseline? Okay, there's enough data that we can understand narration and conversation, but the nuances in audio are where you hit all the data issues? Or is it that from stage zero, you just do it all?
Ethan [00:45:15]: So one important thing is the alignment. The model has to know the video and the audio with a time-based alignment: at which time step the video and the audio tokens correspond to each other. But we actually don't have this kind of alignment for most of the other modalities. If you think about text and image, or text and video, they are loosely aligned. You can have a description of what's going on in the video, but you typically don't have an exact description of, oh, at time step one second, what happened?
Vibhu [00:46:02]: It's very--
Ethan [00:46:03]: At time step two seconds, what happened--
Vibhu [00:46:03]: Coarse. Yeah.
Swyx [00:46:05]: So what was the ideal time step? You have to ablate it, and then it's like four seconds or something.
Ethan [00:46:09]: So that comes down to how you design the model to be aware of time as a modality. So the model is time aware. And that's something pretty unique if you think about LLMs. If you ask an LLM to complete a task, it will say, "Oh, this task will probably take twelve hours to complete," and they come back in one hour.
Say, "I've already spent two days on this and I've exhausted everything."
Ethan [00:46:47]: So the LLMs themselves don't have a sense of time there.
Vibhu [00:46:53]: I actually don't think that's just them not having a sense of time. I think it's somewhat based, right?
Vibhu [00:46:58]: Like you tell someone, "Okay, go work on this feature. Go implement this," and there's a general understanding you would have of how long that would take without LLMs working at LLM speed, right? So think back two years ago: if I tell you to build me a new front end for Latent Space, have a search bar, have all this, you'll estimate that it'll take a few days, right?
Vibhu [00:47:19]: So you tell an LLM, "Go build this." "It'll take me a few days." But I think it's somewhat grounded, as opposed to them not having the best-- Not saying that they have a great understanding, but I think with that example you can see where it comes from, right? You're trained on all of the text.
Swyx [00:47:35]: They're trying to estimate what a human would say.
Vibhu [00:47:37]: Because that's what the data kind of represents. It's not them--
Ethan [00:47:41]: It came from the corpus on the internet. People have an estimate of how much time.
Vibhu [00:47:45]: And not even just in direct training samples, right? Just your world understanding, from tokens, of how long stuff takes, right? Go read a book. It'll take you a while, right?
Vibhu [00:47:56]: Even if you do nothing but read a book, it takes a few days. So yeah, an LLM: "I read it, it took me a few hours."
Vibhu [00:48:01]: "It'll take me a few hours to go through this research." But this is a tangent.
Swyx [00:48:05]: Somewhat, yeah.
Swyx [00:48:06]: This is a train of thought I haven't really expressed until now, which is basically that a full world model must also be recursive, meaning that the participant in the world model must also be aware that they have a world model.
It's this whole recursive thing down the line. But yes, and that the world model can be wrong and that they need to update it and blah. Yeah. We've argued this on the newsletter as well, that there needs to be sort of recursive or adversarial world models.
World Models: Real-Time, Long-Horizon, Interactive Video
Vibhu [00:48:34]: Just to ask, how do you define world model?
Swyx [00:48:38]: Oh, yeah, let's go there.
Ethan [00:48:40]: So--
Vibhu [00:48:40]: So just for context, we talked about video generation, and if you say there's a distinction from world models, what's your definition? How do you see the two?
Ethan [00:48:53]: So disclaimer, I'm not going to debate what a world model is. There are many definitions, so I'll just talk about my definition. I came from the multimodal domain, so I'm mainly talking from video. A world model is real-time, interactive, long-horizon video. So there are three parts. Let's talk about them one by one. First, interaction. We just looked at Facebook and neural computer. For the interaction part, a world model allows you to interact with it through keyboard, mouse, and maybe also voice. These are all modalities. You can interact with the model, and the model should respond reasonably. The second part is real time. Say you move your mouse, and say the world model generates a game: how fast can the game respond? So professional CS:GO players might say, oh, you have to respond-- He's a beginner. --within sub ten milliseconds or-- Yeah, even less. So that's not most of the-- No, sixty FPS. Let's go. Oh, three hundred FPS. Oh, five hundred FPS. Wait. Okay, yeah. I didn't do the math, but yeah, okay. Yeah, three hundred FPS, that's about three milliseconds. So you have to respond-- Oh, s**t. Okay. Yeah.
Ethan [00:50:29]: Within a millisecond. Most of the video models cannot do that.
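The frame-rate figures tossed around here convert to per-frame time budgets with one division. A minimal sketch, using only the frame rates mentioned in the conversation:

```python
def frame_budget_ms(fps):
    """Time available to produce one frame at a given frame rate, in milliseconds."""
    return 1000.0 / fps


# Frame rates mentioned in the conversation.
for fps in (60, 300, 500):
    print(f"{fps:>3} FPS -> {frame_budget_ms(fps):.2f} ms per frame")
```

At 300 FPS the budget works out to roughly 3.3 ms, which matches the "three milliseconds" estimate given on the spot.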
Yeah. But if you have a video model that is, say, a digital human, the response time might be more generous. Typically, for real-time voice interaction, it's like two hundred milliseconds. So that's much more generous. But even two hundred milliseconds is pretty tricky, 'cause remember we mentioned--
Ethan [00:51:01]: You have this temporal compression coming from the VAE. If you don't compress the temporal dimension, your sequence length is going to explode. So if you want to have this real-timeness in your model, you have to deal with this context problem. And the third part is long horizon, 'cause we're not going to just play video games for a few seconds, and most video models only do a few seconds. We're going to play for minutes, hours. The model has to be able to generate long-form content.
Ethan [00:51:42]: So putting these three together, it's real-time, long-horizon, interactive video. I think the final state will be, for example, a video version of Playbook, where you can interact with a neural computer. You move your mouse, and you click on the generative interface, and it will reply to you through pixels generated in real time. But it's a very long way to get there. So one of the first steps at Grok Imagine, where I led a small world model team, was to build video extension. Video extension is the first step of interactivity. Yeah. It's the first step. You have it here, video editing, yeah. So it's the first step because this unlocks long-horizon videos. Typically, for most of the video generation models, you give it a prompt or an image as an initial frame. You generate a video, and that's it. That's just one time, done. And some creators would try to use the last frame as the first frame for the second video.
Sometimes it works, but if you do it a few times, the quality decreases. And-- It doesn't have that context-- Yeah. --over the full video, so the temporal-- Yeah, exactly. Yeah, 'cause you only gave it the last frame, of course, right? Yeah. Exactly. And it's actually a pretty fun hack. If you've seen like-- Oh, no, he's saying something better. Yeah. And for example, I remember Veo 3 has about a second of context from the last video. It is slightly better than using the last frame, but it has a similar problem: the quality decreases. If you extend a few times to one minute, the video quality looks much worse than the first video. Another problem is that the model doesn't have long-range knowledge of what happened before. Say it generates some dialogue, two people speaking: their voices might change over time, especially if the one second of conditioning does not cover the previous context. So these are the core challenges. The Grok Imagine video extension has the historical context of all of the previously generated videos. It has the context of who is speaking and what objects have appeared and everything, and uses that to generate the next video. If we do this naively, you can imagine just putting all of the previous video history tokens into the context. The context length will easily explode. Especially for video models, that can be a few million tokens of context length, I would imagine. Yes. Yeah.
Swyx [00:54:58]: Let's run with that.
Ethan [00:54:59]: For example, in Cosmos, I think just five seconds of video is like fifty K or sixty K tokens. So if you do fifty seconds, that's five hundred K tokens. If you do longer than that, it easily explodes. This long-horizon problem was the first step we were trying to solve for world models. It turns out people love video extension.
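The context-length estimate above scales linearly with duration, so it is easy to extend to longer videos. The sketch below takes the lower figure Ethan quotes for Cosmos (about fifty thousand tokens for five seconds) as given; the one-minute and one-hour rows are extrapolations under that same rate, not numbers from the conversation.

```python
def video_tokens(duration_seconds, tokens_per_clip=50_000, clip_seconds=5):
    """Tokens needed to hold a video in context at a fixed token rate per second."""
    return tokens_per_clip * duration_seconds // clip_seconds


# Durations in seconds: the quoted clip, Ethan's example, one minute, one hour.
for label, seconds in (("5 s", 5), ("50 s", 50), ("1 min", 60), ("1 h", 3600)):
    print(f"{label:>5}: {video_tokens(seconds):>12,} tokens")
```

Naively keeping the whole history in context therefore reaches tens of millions of tokens within an hour of video, which is the explosion being described.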
A lot of the creators love using video extension to create longer-form videos. This is the part I liked: you have an intermediate step toward the final goal instead of just a straight shot to the final version.
Swyx [00:55:48]: But I can see you have a strong vision of where we want to end up.
Long Context, Redundancy, and Efficient Interactive Video
Vibhu [00:55:51]: Does it seem like it's an efficiency issue? Okay, we're at a few million tokens of context. If you draw the parallel to language models, we had very short context, two thousand, eight thousand, then you scale it up to one million, ten million. Sure, there's effective context, but at the end of the day, it's just, what's it worth? Sure, there's a whole training data side. In video, it might be slightly easier 'cause we have a hundred-million-token video, right? Just take a movie with the full context there. Is this an efficiency thing from an inference standpoint, where it's expensive but we know how to solve it? Or why is this not the approach? My broader point was on your second point about world models. You say it needs to be interactive and live, right? You should be able to play a game and see the interaction live. One thing I see with research is that a lot of what you actually serve is different from what you build, right? We talked about distillation. You train a big model, you distill it, you do quantization, speculative decoding. We do all this stuff to serve it efficiently. Should we not just have a solution, like a world model that can interact well, and then do inference optimization, serve it, distill it second, so make it real time after you solve it? Another parallel is, say, continual learning, right? What we need is someone to solve it and show it works inefficiently. Give it a few years, and people will make it efficient. Same thing with regular attention, right? It worked.
Over a few years, people came up with different forms of attention, and we've scaled it to be efficient at long context. So there are kind of two things there, right? One is that it seems like it works. You've scaled it. Can we not just scale it a lot more efficiently over time? Do we need a separate approach if this works? And same thing with interaction, right? If we can get it done, if we can find some way that it works, we can solve making it more efficient from an inference standpoint later.
Ethan [00:57:53]: That's actually a very good point. In videos, there's actually a lot of redundancy. We solve a lot of the pixel redundancy with the VAE, but there's more redundancy in long-range and long-horizon videos. Say a character appears in the first clip and then disappears, and only reappears at the end of the video: you probably don't need that context in the middle of the generation. You only need that character where you need it. So that's why I helped build another feature. It's reference video.
Vibhu [00:58:36]: Is it here?
Swyx [00:58:36]: Is it the same model release or a different one?
Ethan [00:58:39]: It's a different one.
Ethan [00:58:41]: You probably need to search on--
Swyx [00:58:43]: I'll find it--
Ethan [00:58:43]: X, reference to video.
Ethan [00:58:46]: So reference video allows you to upload up to seven images as conditions and generate the video. They can be characters or objects or even scenes. Say I want to condition on Sean's selfie, and holding a blade--
Swyx [00:59:07]: We have a dog--
Ethan [00:59:08]: Or whatever.
Swyx [00:59:08]: We put the dog in the thing.
Ethan [00:59:09]: You can put them there, and the video model will generate the video from them and copy the context over. So that can solve a lot of the problems there, like the long context problem. It doesn't need to have a very long context, but I feel like it's an intermediate solution.
The model--
Swyx [00:59:29]: It's cheating.
Ethan [00:59:30]: The model should be able to selectively know where it should draw the references from. Say I want to generate a movie, and I generate it autoregressively, ten seconds at a time or something. Now this character appears, and I can look back to where it first appeared and bring that back. Yeah, this one, I put in the references. Yeah, that's Optimus, Einstein, myself, Annie.
Vibhu [01:00:02]: Oddly enough, I used Grok Search to find it, and it pulled your LinkedIn post. But yeah, we found it.
Ethan [01:00:08]: Interesting.
Vibhu [01:00:10]: But--
xAI's Underrated Work, Culture, and Watermarking
Swyx [01:00:11]: This is a problem. This is not your fault, but xAI doesn't communicate all this work that you do very well, because they just have the model release and then that's it. But actually, these details are very good.
Swyx [01:00:22]: As far as I understand, everything you just described is state of the art. No one else has done it.
Vibhu [01:00:30]: A lot of-- yeah, I have a lot more--
Swyx [01:00:32]: And then you just put out this blog post with the cookies. I'm like, this is not enough?
Swyx [01:00:37]: But obviously these are the high-level numbers that people want to know. But no, okay, so--
Vibhu [01:00:42]: And I wonder, part of that is also that some labs don't share research into what happens. And if--
Swyx [01:00:50]: No, but this is literally bragging about how good they are, right?
Swyx [01:00:54]: Like, why would you not say that you are capable of extending with full context? This is not secret sauce. This is, "we did the work." Yeah, I don't know.
Ethan [01:01:02]: Different labs have slightly different communication styles.
Swyx [01:01:07]: Anyway, if anyone from xAI is listening, we are always happy to help you tell your story. Yeah, okay, so you did references, and I think the point you're making is that it is sort of a kludge, right?
You can do seven, but what about 100?
Swyx [01:01:23]: Right? Then you need a completely different thing.
Ethan [01:01:26]: So I think this is a mechanism to select the context from the history, and you might not put the entire history into the context. For example, there's a paper called FramePack, which has--
Ethan [01:01:41]: A heuristic that for the latest history, the last one second, I put in the entire history, and the history before that I compress, which makes the video smaller. So they follow this pattern where the maximum sequence length is fixed. The further you are from the current frame, the smaller the image. So this is just a heuristic. I think it can be more automatic, where the model is aware of which part of the history to select. This part of the research is actually being actively worked on by a lot of people. It's also quite interesting. I feel this part of long context is actually a little bit ahead of the LLM part.
Ethan [01:02:31]: For example, in LLMs, contexts keep growing. Let's say you call a tool and the tool call history is extremely long: that's still in context, and it keeps growing and growing. Even if you switch the topic to something else, the whole context is still there. There are some agentic harnesses that help you, say, prune the tool results. Like when you query a file, only show the top 200 lines or something. Those are very heuristic-driven.
Swyx [01:03:08]: For listeners, we did a write-up on the Claude Code leak, where there are eight different kinds of pruning, including pruning the tool results and all that.
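The history-compression heuristic Ethan attributes to FramePack (recent history kept in full, older history shrunk so the total sequence length stays bounded) can be sketched as a token budget. The halving schedule and the 1,024-token frame size below are assumptions chosen for illustration, not details taken from the paper.

```python
def tokens_for_frame(age, full_tokens):
    """Tokens granted to a history frame that is `age` steps behind the current one.

    Assumed schedule: halve the budget with each step back; a frame whose
    budget reaches zero is dropped from the context entirely.
    """
    return full_tokens // (2 ** age)


def compressed_context(num_frames, full_tokens):
    """Total context tokens when older frames are progressively compressed."""
    return sum(tokens_for_frame(age, full_tokens) for age in range(num_frames))


def naive_context(num_frames, full_tokens):
    """Total context tokens when every frame is kept at full size."""
    return num_frames * full_tokens


FULL = 1024  # hypothetical tokens per uncompressed frame
for frames in (5, 50, 1000):
    print(f"{frames:>4} frames: naive {naive_context(frames, FULL):>9,}"
          f"  compressed {compressed_context(frames, FULL):>5,}")
```

Under this schedule the compressed total never reaches twice the size of one full frame, however long the history gets, which is the "maximum sequence length is fixed" property. A learned selection mechanism, as Ethan suggests, would replace the fixed schedule.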
So you can read up on that kind of thing.
Ethan [01:03:17]: I think one breakthrough in continual learning might be a way to automatically manage its own context.
Swyx [01:03:27]: These are all heuristics, and they will be replaced by machine learning.
Ethan [01:03:30]: Interestingly
Vibhu [01:03:32]: The
Ethan [01:03:32]: the same thing is being researched in both LLMs and video models.
Vibhu [01:03:36]: The interesting thing is also, like in the paper you showed, it's actually happening at the model level, right? Compared to language models, sure, we have base attention, but we'll do our own compression, we'll do our own pruning, which is separate from model error.
Vibhu [01:03:49]: Eventually, it all just boils in, hopefully.
Swyx [01:03:52]: I think this is a form of attention, but also, you know, sort of reasoning attention. I feel like that's different than normal attention.
Swyx [01:04:03]: Does that make sense?
Ethan [01:04:04]: It's different in the sense that attention, not to mention, set sparse attention aside,
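Two of the heuristics mentioned in this stretch of the conversation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FramePack's actual method and not any real agent harness: the function names, the 1024-token full-resolution budget, the halving factor, and the omission marker are all hypothetical. The first function gives older video frames geometrically fewer tokens so the total context stays bounded however long the history gets; the second keeps only the first 200 lines of a tool result.

```python
from typing import List


def frame_token_budget(num_history_frames: int,
                       full_res_tokens: int = 1024,
                       full_res_frames: int = 1,
                       shrink: int = 2) -> List[int]:
    """Tokens allotted to each history frame, most recent frame first.

    Hypothetical schedule: the newest `full_res_frames` frames keep the full
    budget, and each older frame gets the budget divided by `shrink` once
    more. Very old frames round down to 0 tokens, i.e. they are dropped.
    """
    budget = []
    for age in range(num_history_frames):
        if age < full_res_frames:
            budget.append(full_res_tokens)
        else:
            steps = age - full_res_frames + 1
            budget.append(full_res_tokens // (shrink ** steps))
    return budget


def prune_tool_result(text: str, max_lines: int = 200) -> str:
    """Keep only the first `max_lines` lines of a tool result.

    A purely heuristic cut-off of the kind described in the conversation;
    the trailing marker format is an assumption for illustration.
    """
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    omitted = len(lines) - max_lines
    return "\n".join(lines[:max_lines] + [f"[... {omitted} more lines omitted ...]"])


if __name__ == "__main__":
    print(frame_token_budget(5))                # per-frame budgets, newest first
    print(sum(frame_token_budget(10_000)))      # stays below 2 * 1024
    long_output = "\n".join(f"line {i}" for i in range(500))
    print(len(prune_tool_result(long_output).splitlines()))
```

With the defaults, `frame_token_budget(5)` returns `[1024, 512, 256, 128, 64]`, and because the per-frame budgets form a geometric series, their sum stays under twice the full-resolution budget no matter how many frames are in the history. Both functions are fixed rules, which is the speakers' point: a learned mechanism would decide what to keep rather than applying a hard-coded schedule or line count.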
Do you believe that Elon Musk can establish a colony on Mars of a million people or launch data centres into space? If you do, you might be thinking of investing in SpaceX, which will go public on the Nasdaq stock market this month. Even if you have your doubts, you might just gamble on Musk anyway for fear of missing out.
Today, Quinn Slobodian, co-author of ‘Muskism: A Guide for the Perplexed’, on Musk's $1.8 trillion valuation.
Featured: Quinn Slobodian, Professor of International History at Boston University and co-author of ‘Muskism: A Guide for the Perplexed’
Grokstream: Predictive and Agentic AI Moves IT Operations Toward Self-Healing
Podcast, by Doug Green
Grokstream says the next generation of IT operations will not be built around more dashboards, more rules, or faster alert routing. It will be built around AI that can learn, reason, remember, recommend and eventually act with governed autonomy. “Agentic AI must be governed by design,” said Josh Kindiger, CEO of Grokstream. “Predictive intelligence is powerful, but safe, explainable autonomy is what drives real adoption.” In this Technology Reseller News podcast, Doug Green speaks with Josh Kindiger, Co-Founder and COO of Grokstream, about how the company is helping MSPs, CSPs and enterprise IT organizations move from reactive operations toward predictive, self-healing IT environments. The conversation comes as Grokstream advances its Grok L1 Agent, a new role-based agent designed for frontline IT operations teams. The L1 Agent is intended to reduce alert noise before incidents reach the queue, provide intelligent summaries, identify likely root causes, recommend next-best actions and trigger approved remediations inside tools such as Slack, Microsoft Teams and existing IT workflows. For service providers and enterprise operations teams, the problem is familiar. More tools often mean more alerts, but not necessarily more clarity. Traditional rules-based AIOps platforms can help with deduplication and routing, but they often stop short of true incident compression, causal reasoning and prevention.
Grokstream is taking a different approach by combining classical machine learning, causal intelligence and generative AI into a single cognitive AI layer. Kindiger explains that Grokstream's platform is designed to operate from signals, not noise. The system fuses telemetry across domains, learns continuously from operational data and human feedback, and creates a unified source of truth for IT operations. That allows teams to move beyond correlation and toward understanding what is happening, why it is happening and what should be done next. A central theme of the podcast is the difference between AI that summarizes and AI that reasons. Grokstream argues that true agentic AI is not simply an LLM attached to a workflow. It requires memory, context, policy guardrails, procedural intelligence and the ability to improve over time. In Grokstream's model, agents begin as assisted tools, then move toward trusted operators and eventually toward predictive autonomous systems. The first practical on-ramp is the L1/NOC environment, where many organizations see the fastest measurable impact. Grokstream says its approach can deliver 2–3x more incident compression beyond traditional deduplication and rules-based correlation, while reducing L1 workload by more than 50% through noise compression, guided resolution and fewer unnecessary escalations. The timing is significant. Grokstream recently announced that Cirion Technologies selected the Cognitive Grok AI platform to support AI-driven predictive operations across Latin America's digital infrastructure. That deployment highlights the growing demand for systems that can detect emerging issues across network, transport and infrastructure layers before customer-facing impact occurs. For MSPs, CSPs and enterprise IT leaders, the message is clear: operational scale cannot be achieved simply by adding more people or more monitoring tools. 
The next step is an intelligence layer that can unify data, predict impact, explain cause and support governed automation. Grokstream is positioning Grok as that layer: a predictive and agentic AI platform that helps operations teams reduce noise, prevent incidents, improve engineer experience and move toward self-healing IT operations. Learn more at https://grokstream.com/ Related Grokstream Stories on Telecom Reseller Grokstream's Cognitive Grok® AI Platform Selected by Cirion Technologies to Power AI-Driven, Predictive Operations Across Latin America's Digital Infrastructure https://telecomreseller.com/2026/05/20/grokstreams-cognitive-grok-ai-platform-selected-by-cirion-technologies-to-power-ai-driven-predictive-operations-across-latin-americas-digital-infrastructure/ Grokstream Announces Grok® L1 Agent to Advance Predictive and Agentic AI for IT Operations https://telecomreseller.com/2026/04/06/grokstream-announces-grok-l1-agent-to-advance-predictive-and-agentic-ai-for-it-operations/ More Grokstream coverage on Telecom Reseller https://telecomreseller.com/?s=grokstream/
Can AI help you understand your PSA, improve prostate cancer detection, and help doctors make better decisions?
In this episode, Dr. Geo sits down with Dr. Jennifer Miles-Thomas, urologist, healthcare executive, Treasurer of the American Urological Association, and Vice Chair of Integration and Innovation at Northwestern Medicine, to break down how AI is changing prostate care.
We cover ChatGPT, PSA interpretation, privacy concerns, prostate MRI, digital pathology, ambient AI, and the future of prostate cancer diagnosis.
Can AI explain an elevated PSA? Which tools are best? Is your medical data private? And how are physicians using AI to improve care while keeping human judgment at the center?
Dr. Miles-Thomas explains how tools like ChatGPT, Perplexity, Claude, Gemini, and Grok can help men ask smarter questions, better understand risk, and prepare for doctor visits, and why AI should never replace medical expertise.
TIMESTAMPS
06:00 - Can AI Help You Understand Your PSA?
08:00 - Privacy & AI Health Searches
10:00 - Best AI Tools for Medical Questions
13:00 - AI for Doctors & Smarter Decisions
14:00 - Ambient AI & The Future of Doctor Visits
21:00 - AI, MRI & Prostate Cancer Detection
26:00 - The Biggest Risks of AI in Medicine
KEY TAKEAWAYS
• AI can help explain an elevated PSA, but context matters
• Better prompts lead to better answers
• Use AI to ask smarter questions, not self-diagnose
• AI may improve MRI, pathology, and cancer detection
• Human oversight still matters
AI is changing prostate care fast, but what does it actually mean for you? Dr. Jennifer Miles-Thomas breaks it all down. Let's get into it.
Brought to you by TogetherLetters & Edgewise!
In this episode:
AI's Money & Power
Anthropic leapfrogs OpenAI as the most valuable AI startup
Anthropic releases new model, Opus 4.8
AI sticker shock hits corporate America
AI model simulation: Claude vs. ChatGPT vs. Grok vs. Gemini
Google hates you
AI Agents Take Action
Your AI agent can now trade for you on Robinhood
Robinhood lets customers use AI to trade stocks, make credit card purchases
The Vatican vs. The Algorithm
I Read the Pope's 240-Page Encyclical. I'm Astounded by What He Wrote.
Did the Pope use AI to write about the dangers of AI?
Big Business Shakeups
Dropbox CEO Drew Houston to step down after 19 years
Introducing Ford Energy
Screens, Streams & Self-Driving
Spotify is narrating magazine articles now
Roku Is Revamping Its Homescreen for the First Time in Over a Decade
Blind Waymo Users Revel in the Joy of Riding Alone
Weird and Wacky
Mark Zuckerberg's mega yacht docks in Seattle in the wake of Meta layoffs
F.B.I. Arrests C.I.A. Official With $40 Million in Gold Bars in His Home
Google employee charged with $1M Polymarket insider trading bet
I vibe-coded a billionaire jet tracker to warn people about a possible apocalypse
Microsoft open-sources "the earliest DOS source code discovered to date"
Tech Rec:
Sanjay - OpenTools / OpenPrinter
Adam - Gibson Guitar App
Find us here: sanjayparekh.com & adamjwalker.com
Tech Talk Y'all is a proud production of Edgewise.Media.
Hell yeah! Nothing says it's summertime like an atmospheric firehose! Hope you had a grand Memorial Weekend and that you let Bret Michaels honor the fallen in peace. Now let's party! NYC hosts half of the top 50 pizza spots in America and holds court in the top 5 spots in the entire world. Now in Charlotte, Keith skips Cook Out to try out America's 41st best: Pizza Baby. Keith also reviews entertainment juggernauts Star Wars: The Mandalorian & Yoda and The Boys series finale. He also dives into AI, featuring: ChatGPT making an animation based on Keith's new book, revisiting Keith trying to get Grok to speak to him as an apologetic ex, and Taxi Driver screenwriter Paul Schrader telling us that his AI girlfriend broke up with him. AD: This episode is brought to you by the new Shit Number smart toilet. “Precision analytics for your daily business.”
This is the Everyday AI episode we probably shoulda done a while ago....
FOLLOW UP: This week, it seems America believes every complicated social problem can be fixed by asking, “Have you tried turning the internet off for the children?” Meanwhile, the Electronic Frontier Foundation quietly notes that the science behind social media bans might not be as clear-cut as cable-news dads screaming about dopamine loops claim. Turns out, teen anxiety may also be linked to pandemics, school shootings, climate dread, and an economy that feels like a Fallout side quest. Meanwhile, Snap Inc. and YouTube settled another lawsuit accusing their apps of turning kids into doomscrolling goblins, Meta continues to insist social media addiction isn't real while losing money in court, and former Google CEO Eric Schmidt was booed at a graduation speech after telling graduates to hop on the AI rocket ship without asking questions, which is exactly what a billionaire says when he already owns the rocket.

In the news, Elon Musk lost another OpenAI lawsuit because apparently even juries have limits. SpaceX's IPO revealed Musk plans to power AI with enough gas turbines to recreate 1890s London smog, and Grok officially became a disclosure liability after the whole “MechaHitler” incident. Tesla robotaxis still clip fences and occasionally require humans to remotely drive the “self-driving” cars. Trump Mobile somehow shipped a gold phone that actually works (a stunning upset) before immediately leaking customer data. LinkedIn finally admitted the platform has become an AI-generated motivational swamp filled with “it's not about X, it's about Y” sludge from people named Brayden. Spotify is handing out podcast verification badges so listeners can tell real creators from algorithmic nightmare fuel. Meta laid off thousands more workers while reportedly using employee surveillance to train AI replacements. And OpenAI is giving everyone in Malta a free year of ChatGPT Plus if they complete an AI literacy course, which honestly makes Malta sound more technologically responsible than Silicon Valley.

APPS & DOODADS reflect classic Gen-X paranoia, as Backblaze highlights California's constant threat of wildfires and the idea that local backups are optimistic. YouTube introduced AI deepfake detection tools, allowing creators to finally see which scam ads are using their faces to promote crypto vitamins, while X limited free users to 50 posts a day unless they pay for a blue check, proving once again that the true free speech was the subscriptions we sold along the way. Retrocodex arrived with a strong “everything your teachers confidently told you in 1987 was wrong” vibe.

MEDIA CANDY opens with the eternal cry of “FUCK THE FIRETV!!!!” before Jason taps out of Good Omens after ten minutes while Brian takes the bullet for the audience. There's also chatter about Mortal Kombat 2, The Devil Wears Prada 2, Billy Corgan talking goth history with David J, and more existential dread courtesy of Dan Carlin's Common Sense.

THE DARK SIDE WITH DAVE welcomes back Dave Bittner for a Mando & Grogu review, Darth Maul, and a stunning but absurdly expensive LEGO Disneyland set. There's also a guy who built a full-size Millennium Falcon “with his wife's permission,” a fan-made Star Tours film, and the Federal Trade Commission discovering that those creepy “your phone is listening to you” ad-tech companies mainly just had PowerPoint decks and confidence. Also: mechanical keyboard simulators now exist, because apparently even fake typing has become a lifestyle brand.

Sponsors:
DeleteMe - Get 20% off your DeleteMe plan when you go to JoinDeleteMe.com/GOG and use promo code GOG at checkout.
Shopify - Sign up for your one-dollar-per-month trial today at Shopify.com/grumpy
Private Internet Access - Go to GOG.Show/vpn and sign up today.
For a limited time only, you can get OUR favorite VPN for as little as $2.03 a month.
SetApp - With a single monthly subscription you get 240+ apps for your Mac. Go to SetApp and get started today!!!
1Password - Get a great deal on the only password manager recommended by Grumpy Old Geeks! gog.show/1password

Show notes at https://gog.show/747
Watch on YouTube at https://youtu.be/eX5jVfewasw

FOLLOW UP
The Science is Not Settled: How Weak Evidence is Fueling a National Push to Ban Social Media for Youth
Snap and YouTube have reportedly settled another major social media addiction lawsuit
Ex-Google CEO Eric Schmidt Fails to Read Room on AI, Gets Booed into Oblivion

IN THE NEWS
Elon Musk took too long to sue OpenAI, jury unanimously agrees
SpaceX IPO Filing Reveals Nearly $3 Billion Investment in Gas Turbines for AI Data Centers
‘MechaHitler’ Is SpaceX's Problem Now
Trump Mobile Phone Beats Expectations by Actually Existing
New crash data highlights the slow progress of Tesla's robotaxis
If You Used Insider Knowledge to Score Big on Polymarket, You May Now Be in Huge Trouble
Minnesota passes prediction markets ban
LinkedIn doesn't want your AI slop anymore
Spotify is launching verification badges for podcasts to help listeners avoid AI slop
Zuckerberg Tells the Tattered Remainder of His Workers That He Won't Conduct Another Mass Firing for at Least Seven Months
OpenAI is offering ChatGPT Plus to citizens of Malta for a year
Massive Crypto ATM Company Bitcoin Depot Is Shutting Down as the Whole Industry Collapses
‘Smoke Weed and Earn Bitcoin’ With This Vape Pen in Our Increasingly Dystopian Nightmare
‘Unstoppable’ Crypto Exchange Halts Trading After $10 Million Theft
Iran Doubles Down on Bitcoin for Ships Passing Through the Strait of Hormuz
Trump-Linked Crypto Company Notes ‘Substantial Doubt’ It Can Survive Another 12 Months

APPS & DOODADS
Backblaze
YouTube's AI deepfake detection tool is now available to all creators 18 and older
X accounts are limited to 50 posts and 200 replies a day unless they pay for a blue checkmark
Retrocodex

MEDIA CANDY
Good Omens Season 3 - The Finale
The Magnificent Others with Billy Corgan - David J of Bauhaus & Love & Rockets
Common Sense 326 – The Water in Which We Swim

THE DARK SIDE WITH DAVE
Dave Bittner
The CyberWire
Hacking Humans
Caveat
Control Loop
Only Malware in the Building
Maul: Shadow Lord
Rogue One: A Star Wars Story
Not Even Baby Yoda Can Save ‘Star Wars’
Colorado man creates replica Millennium Falcon
Someone made a Star Tours fan film.
Bring Disneyland Home With This Gorgeous New Lego Set
‘Creepy’ Listening Tool for Targeted Ads Didn't Actually Work, FTC Says
Mechanical keyboard sim

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
Can woke leave anything alone? Allie reacts to former members of the Christian music group Avalon re-releasing “Testify to Love” and proclaiming it was an LGBTQ anthem all along. She explores the rise of AI through a biblical lens, asking whether tools like ChatGPT, Grok, and Claude represent exciting new technology or something far more spiritually dangerous. She breaks down what AI actually is (and isn't), warns against treating it as conscious or godlike, and offers clear biblical guidance on practical, moral, and spiritual pitfalls — from AI-generated sermons and worship music to digital idolatry, porn, and outsourcing our relationships and creativity. Finally, Allie unpacks Rededicate 250 and explains why being both proudly American and God-fearing is not Christian nationalism. Share the Arrows 2026 is on October 10 in Dallas, Texas! Tickets are on sale now at: https://sharethearrows.com Share the Arrows is sponsored by: A'del Natural Cosmetics: AdelNaturalCosmetics.com Range Leather: RangeLeather.com/ALLIE We Heart Nutrition: WeHeartNutrition.com Buy Allie's book "Toxic Empathy: How Progressives Exploit Christian Compassion": https://www.toxicempathy.com – Timecodes 0:00 Introduction 3:06 “Testify to Love” Goes Woke 25:00 Is AI De-Sanctifying? 1:03:13 Rededicate 250 – Today's Sponsors: We Heart Nutrition | Check out We Heart Nutrition at WeHeartNutrition.com and use the code ALLIE for 20% off. Good Ranchers | If you go to GoodRanchers.com and subscribe to any box of 100% American meat, you'll save up to $500 a year! Plus, if you use code ALLIE, you'll get an additional $25 off your first order. EveryLife | Visit EveryLife.com and use promo code ALLIE10 to get 10% off your first order today! Fellowship Home Loans | Start with a free consultation at FellowshipHomeLoans.com/Allie and receive a $500 credit at closing. Alliance Defending Freedom | For a limited time, every dollar you give to ADF will be doubled — but only while matching funds remain available. 
Go to JOINADF.com/ALLIE or text ALLIE to 83848 to have your gift for life matched. Episodes You May Like: Ep 1306 | Bigger Than Iran: Trump's Plan to Topple the New World Order | Justin Haskins https://podcasts.apple.com/us/podcast/ep-1306-bigger-than-iran-trumps-plan-to-topple-the/id1359249098?i=1000750673402 Ep 1066 | Taylor Swift, Caitlin Clark, & Why the Normies Go Woke https://podcasts.apple.com/us/podcast/ep-1066-taylor-swift-caitlin-clark-why-the-normies-go-woke/id1359249098?i=1000669346590 Ep 842 | The Elites' Plan to Replace God With AI | Guest: Justin Haskins (Part Two) https://podcasts.apple.com/us/podcast/ep-842-the-elites-plan-to-replace-god-with-ai-guest/id1359249098?i=1000621802685 --- ► Buy Allie's book, "You're Not Enough (& That's Okay): Escaping the Toxic Culture of Self-Love": https://alliebethstuckey.com/book ► Subscribe to the podcast: iTunes: https://apple.co/2UVssnP Spotify: https://spoti.fi/2FwkXxj ► Connect with Allie on Social Media: https://twitter.com/conservmillen https://www.instagram.com/alliebstuckey/ https://facebook.com/allieBlazeTV/ ► Relatable merchandise – use promo code 'ALLIE10' for a discount: https://shop.blazemedia.com/collections/allie-stuckey