Design, construction, operation, and application of robots
Industry 4.0 is moving beyond factory walls and into farms, forests, and fields. David Potere, a senior tech leader in BCG's Industrial Goods and Climate Change and Sustainability practices, explores AI's move into the outdoor world. Robotics and connected systems are changing how farming and other outdoor activities get done. You'll Learn: Outdoor automation requires AI systems that can operate with constant uncertainty. Leaders should rethink long-held operating models as AI and robotics reshape how physical work gets done. The most valuable AI systems may be the ones that simplify complexity rather than add more dashboards. Learn More: David Potere: https://www.linkedin.com/in/davidpotere/ What 1,000 Farmers Told Us About Tech Adoption: https://on.bcg.com/4euA76V Climate-Smart Agriculture Needs a Better Yardstick: https://on.bcg.com/4ejIfH6 David on the Climate Rising Podcast: https://podcasts.apple.com/us/podcast/david-potere-at-bcg-x-using-ai-satellites-in-climate/id1482781075?i=1000767537614 AI Foundation Model for Extreme Weather: https://on.bcg.com/4vKiwyz Chapters: 00:00 – How Will AI Impact Outdoor Industries? 04:26 – The Challenges of Taking Tech Outside 06:11 – What Would a Farm That Thinks for Itself Look Like? 08:27 – Is AI Rescuing Agriculture? 10:55 – Will AI Only Help Big Farms? 14:39 – Who Owns the Data? 16:16 – What Can Leaders Learn from the AI Outdoors? 18:51 – Next Steps to Truly Benefit from AI This podcast uses the following third-party services for analysis: Podtrac - https://analytics.podtrac.com/privacy-policy-gdrp
Robotic technology is expanding what's possible in reconstructive urology, prompting surgeons to rethink traditional approaches and consider new minimally invasive procedures. In this episode of BackTable Urology, Dr. Ziho Lee joins host Dr. George Koch to explore the rapid evolution of robotic reconstructive surgery and its expanding role in complex pelvic and retroperitoneal procedures. They discuss the role of single-port platforms, new strategies for managing ureteral strictures and urinary diversion, and how research, training, and patient-reported outcomes are shaping the future of minimally invasive urologic care. --- Get the BackTable app: https://www.backtable.com/app --- Timestamps: 00:00 - Introduction 03:10 - Bridging Reconstruction and Robotics 07:41 - Where Robots Help Most 14:33 - Ureteral Rest 16:41 - 20% Nephrectomy Rule 19:04 - Single Port Indications 20:55 - Ileal Ureter Tips and Tricks 22:34 - Robotic Reconstruction Course 28:55 - Case: VUAS Repair 39:35 - Final Advice --- Resources: Ureteral Rest is Associated With Improved Outcomes in Patients Undergoing Robotic Ureteral Reconstruction of Proximal and Middle Ureteral Strictures: https://pubmed.ncbi.nlm.nih.gov/33639184/ --- BackTable Urology is the go-to podcast for urologists, urologic oncologists, and urogynecologists. Download the free BackTable app to get early access to new episodes, cases, and courses curated by physicians in your specialty. ► https://www.backtable.com/app
In this laid-back, on-the-road episode of the Side Hustle Squad Podcast, I clipped on the microphones, hit the road for a robotic mower installation, and brought you along for the ride. I'm sharing updates on everything happening behind the scenes, including the progress at the shop, the continued growth of Coastal Fertilization of North Jersey, and the latest developments with ROPED as we bring robotic mowing technology to more customers across New Jersey. I also talk about the recent updates to our mobile showroom trailer, upcoming installs, and some of the exciting events on the horizon, including attending Naylor Taliaferro's Profit Accelerator event and other opportunities to connect with entrepreneurs and contractors throughout the industry. No formal interview this week—just a real-time conversation from the driver's seat about business growth, new opportunities, lessons learned, and what's coming next for Coastal, ROPED, and the Side Hustle Squad community. If you've ever wondered what goes on behind the scenes while building multiple businesses, this episode is for you. Buckle up and come along for the ride.
AI is transforming medical robotics, but not in the ways many people expect. In this episode, KUKA Robotics USA's Silke Wendt, Global Marketing Medical Robotics, and Corey Ryan, Director of Medical Robotics, discuss how labor shortages, increasing quality demands, and growing interest in AI are shaping the future of medical robotics. While AI is becoming more prevalent in areas such as simulation, collision detection, and path planning, they emphasize that regulatory requirements limit its role in autonomous clinical decision-making. They also highlight KUKA's focus on quality, strategic partnerships, global expansion, and the development of new robotic platforms to support a growing range of medical applications. Tune in to hear how KUKA is navigating the opportunities and challenges of AI, automation, and innovation in healthcare robotics! Resources: Connect with and follow Silke Wendt on LinkedIn. Connect with and follow Corey Ryan on LinkedIn. Follow KUKA Robotics USA on LinkedIn and explore their website!
The word “intelligence” was never neutral. It was the sales pitch. This episode argues that the systems sold as artificial intelligence are not minds, thinkers, or neutral judges. They are privately owned prediction and sorting machines trained on human data, wrapped in language that makes people trust them, defer to them, and surrender power without asking who owns the system, who profits from it, or who answers when it is wrong. Name the machine correctly before it names you.
The robotics industry is quietly emerging as one of the most undervalued opportunities for real estate investors today. While mainstream attention focuses heavily on software and AI, physical automation is simultaneously transforming how assets are constructed and operated. The global robotics market currently sits at roughly $70 billion and is projected by McKinsey to cross $260 billion by 2030. This exponential growth mirrors the e-commerce warehouse boom of 2010, offering massive upside for investors positioned ahead of the curve. In this episode, we break down the two primary avenues robotics will impact real estate: significantly lowering hard construction costs and drastically reducing ongoing operational expenses. From 3D-printed homes by ICON cutting building costs by 20% to 30%, to humanoid robots reducing hospitality labor expenses by up to 35%, the financial implications are profound. Listeners will learn exactly how to capitalize on this shift, including specific publicly traded companies, REITs, and upcoming IPOs directly exposed to real estate automation. Key Topics Discussed: The current $70 billion valuation of the robotics industry and projections reaching $260 billion by 2030. How ICON Technology's 3D-printed homes are decreasing traditional stick frame construction costs by 20% to 30%. The impact of autonomous rebar-tying robots reducing structural labor needs by 40%. Keen Robotics and Figure AI streamlining commercial facility management and cutting hospitality labor costs. Why Prologis is capturing a 200 basis point occupancy premium for robotics-enabled industrial facilities. Specific actionable investment vehicles including REITs, automation infrastructure stocks, and upcoming AI IPOs. Key Takeaways: A 30% reduction in labor costs for a standard 200-room hotel can translate to over $11 million in added asset value based on standard cap rates. Investors who target companies building durable competitive advantages through robotics integration will secure a significant economic moat. Industrial REITs are already proving that commercial tenants are willing to pay a premium to occupy tech-forward, automation-ready buildings. The entire global robotics sector is currently valued lower than Home Depot's market cap, highlighting the immense remaining upside. Connect & Take Action: Wealth Intelligence Brief: Text "WIB" to 844-447-1555 to get Matty's free macro data, real estate intel, and crypto signals delivered to your inbox 3 times a week. Imagos Income Fund: Text "INCOME" or "DEALS" to 844-447-1555 to learn more about Matty A's private debt fund targeting 10% fixed returns paid out monthly.
We add another string to our bow by learning about the fiddler crab. We discuss the arc of history bending towards crab, the MogBot 2000, bad dating advice, non-orientable wormholes, and so much more. Works Cited: “The Design of a Beautiful Weapon” - John Christy, Smithsonian Museum of Natural History “On the Other Hand: The Myth of Fiddler Crab Claw Reversal” - Judith S. Weis, BioScience, April 2019 “Sexual selection for structure building by courting male fiddler crabs: an experimental study of behavioral mechanisms” - John H. Christy et al., Behavioral Ecology, May 2002 “Synchronous waving in fiddler crabs: a review” - Patricia Ruth Yvonne Backwell, Current Zoology, July 2018 “Robotic crabs reveal that female fiddler crabs are sensitive to changes in male display rate” - Sophie L. Mowles et al., Biology Letters, January 2018 “Not what it looks like: mate-searching behaviour, mate preferences and clutch production in wandering and territory-holding female fiddler crabs” - M. Peso et al., R. Soc Open Sci.. August 2016 “Dishonest signalling of fighting ability and multiple performance traits in the fiddler crab Uca mjoebergi” - Simon P. Lailvaux et al., Functional Ecology, March 2009 “The effects of neighbor familiarity and size on cooperative defense of fiddler crab territories” - Isobel Booksmythe et al., Behavioral ecology, November 2011 “Beyond Abiotic Decay: Fiddler Crabs Accelerate Plastic Fragmentation in Pollution Hotspots” - Jose M. Riascos et al., Global Change Biology, December 2025 Links: For more information about us & our podcast, head over to our website! Follow Just the Zoo of Us on BlueSky, Facebook, Instagram & Discord! Follow Ellen on Instagram or BlueSky! Help support this show and unlock bonus content! Become a member at https://maximumfun.org/joinjustthezoo
SpaceTime with Stuart Gary | Astronomy, Space & Science News
Sponsor Link: This episode of SpaceTime is brought to you by NordVPN, where your online security starts. To check out our special offer for SpaceTime listeners, visit www.nordvpn.com/stuartgary SpaceTime Series 29 Episode 70 *The Small Magellanic Cloud is being ripped apart A new study reveals that the Small Magellanic Cloud, a satellite galaxy of the Milky Way, is slowly being torn apart by gravitational forces from the Large Magellanic Cloud. Researchers have utilised over a decade of observations to uncover the galaxy's dynamic state, challenging previous models of coherent rotation. *Blueprint for a lunar base NASA's plans for a lunar base at the Moon's South Pole are sparking innovative proposals for construction using local lunar materials. The Texas A&M Space Institute is leading research into using lunar regolith, a challenging construction material, to develop habitats for future lunar missions. *Meteor rocks New England A recent explosion over New England has been confirmed as a sonic boom from a meteor entering the Earth's atmosphere, sending shockwaves across Massachusetts and Rhode Island. The meteor, travelling at 121,000 kilometres per hour, likely fragmented before falling into the North Atlantic Ocean. *The Science Report Increased wildfire risks are predicted across parts of Australia, while a study reveals that Iceman Otzi's microbiome remains active even after 5,300 years. Additionally, video technology may allow for heart rate monitoring through facial recognition. Become a supporter of this podcast: https://www.spreaker.com/podcast/spacetime-with-stuart-gary--2458531/support.
In this episode of the Crazy Wisdom Podcast, host Stewart Alsop sits down with client strategist Amadeus Huff to cover a wide range of topics that wind their way from the nuts and bolts of recruiting and payment models to the rapidly shifting landscape of AI adoption in business. The two dig into how AI tools are reshaping client success roles, the murky territory of recording laws and privacy in a globalized world, the geopolitical implications of oil supply chains, sanctions, and the rise of domestic tech ecosystems in countries like Russia and Argentina, and what all of this means for the future of human connection and the nation-state. Amadeus closes on an optimistic note, arguing that as AI takes over bureaucratic busywork and erodes trust online, people will increasingly hunger for genuine human relationships and third spaces. You can connect with Amadeus Huff on LinkedIn. Timestamps: 00:00 - Stewart introduces Amadeus Huff, diving into recruiting as building connections between job seekers and employers with minimal variance. 05:00 - Amadeus discusses AI adoption pitfalls, comparing aggressive growth strategies to Amazon's early model, questioning whether tools deliver promised results. 10:00 - Conversation shifts to AI notetaking versus human perception, exploring probabilistic interpretation differences between humans and machines. 15:00 - Recording consent laws debated across states, touching on Waymo surveillance, Uber data collection, and public versus private space definitions. 20:00 - Global privacy landscape examined, covering Swiss banking secrecy erosion, ProtonMail's departure, and RISC-V semiconductor development escaping US jurisdiction. 25:00 - Sanctions creating domestic innovation ecosystems discussed through Russia's example, paralleling Argentina's emerging commerce evolution. 29:00 - Closing reflections on AI replacing bureaucracy while preserving human purpose, optimism about meaningful work and deeper personal connections emerging. Key Insights:
1. Recruiting is fundamentally about reducing variance between what job seekers want and what employers offer. The most ethical payment models in recruiting are tied to proven success, such as waiting three months to confirm a hire is working out, rather than collecting fees the moment a contract is signed. 2. Business thinking has shifted from shareholder value to stakeholder value, meaning companies now consider the wellbeing of employees, families, and communities, not just stock price. This shift is accelerating due to AI overpromising and underdelivering, making value-based measurement more important. 3. AI is most useful when it handles administrative tasks that provide no direct value to customers, such as transcribing meetings and populating CRM systems. This frees up workers to focus on meaningful relationship-building and intellectual work rather than bureaucratic busywork. 4. There is an important distinction between recorded and unrecorded conversation in professional settings. Building trust through informal off-the-record dialogue before switching on a transcription tool creates clearer boundaries and stronger relationships with clients. 5. Sanctions tend to follow a bell curve of effectiveness. Over time they force sanctioned countries to build domestic alternatives, which gain adoption and loyalty, ultimately reducing the influence of the original foreign companies once sanctions lift. 6. AI is degrading trust in online information to the point where people will increasingly crave authentic human connection, physical gathering spaces, live experiences, and real relationships rather than algorithmically generated content. 7. AI is quietly improving intergenerational relationships by removing codependency. When elderly parents learn to use AI for technical help, their calls to family members shift from problem-solving to genuine connection, which strengthens the relationship.
Andrew Kang, CEO and co-founder of RoboStrategy (BOT), explains how his company is helping other private robotics companies thrive as AI builds use cases for evolving tech. He makes the case that everyone in the U.S. can have their own robot in the not-too-distant future. Andrew points to the U.S. pushing for a manufacturing uptick as another bullish indicator for the robotics industry gaining steam. ======== Schwab Network ======== Empowering every investor and trader, every market day. Subscribe to the Market Minute newsletter - https://schwabnetwork.com/subscribe Download the iOS app - https://apps.apple.com/us/app/schwab-network/id1460719185 Download the Amazon Fire Tv App - https://www.amazon.com/TD-Ameritrade-Network/dp/B07KRD76C7 Watch on Sling - https://watch.sling.com/1/asset/191928615bd8d47686f94682aefaa007/watch Watch on Vizio - https://www.vizio.com/en/watchfreeplus-explore Watch on DistroTV - https://www.distro.tv/live/schwab-network/ Follow us on X – https://twitter.com/schwabnetwork Follow us on Facebook – https://www.facebook.com/schwabnetwork Follow us on LinkedIn - https://www.linkedin.com/company/schwab-network/ About Schwab Network - https://schwabnetwork.com/about
Our guest on this week's episode is Andrei Quinn-Barabanov, supply chain practice lead at Moody's. New inflation reports came out this week showing that last month we reached the highest inflation rates of the past three years. Inflation is even higher when it comes to transportation cost increases. To help us understand how such inflation affects our supply chains, our guest joins DC Velocity's Senior News Editor Ben Ames. The market outlook for collaborative robots remains strong as the equipment advances to accommodate heavier duty use around the world. Senior Editor Victoria Kickham reports on new research from Interact Analysis predicting that shipments of these cobots, which are designed to work with and alongside humans, will grow at an average annual rate of more than 17% between 2025 and 2030. Ben Ames reports this week that a change is coming to robotic last mile fulfillment. Starship Technologies is an Estonian tech startup that makes autonomous, self-driving bots. If you've been on any large university campuses in the last few years, you've probably seen them driving along pathways and college quads, delivering small items like e-commerce orders for snacks and burritos. But now Starship says it plans to wind down its operations on U.S. university campuses and shift its focus to retail grocery chains and hot food delivery in cities across Europe and the U.S. Ben shares why the company has shifted its strategy. Articles and resources mentioned in this episode: Moody's; Cobot shipments to rise more than 17% by 2030. China maintains market dominance.; Starship steers delivery robots off college campuses and toward grocery sector; Visit DC Velocity; Visit Supply Chain Xchange. Send feedback about this podcast to podcast@agilebme.com. This podcast episode is sponsored by: ID Label
Episode #6 of Empathy in the Age of AI, a special 25-part series: https://tinyurl.com/exyw2nua AI is impacting every part of our lives, and we need to start paying more attention. Listen to this grounded conversation with Dr. Sean Welsh, a philosopher and computer programmer whose work sits at the intersection of robotics, machine ethics, and moral decision-making. We discuss: If robots and AI systems can make moral and ethical decisions; The risks of anthropomorphism and emotional intimacy with AI; Whether AI will create mass unemployment or unlock new possibilities; How machines simulate empathy and what that means for the future of relationships. If you're concerned about sharing a planet with billions of humanoid robots, as tech CEOs are predicting, you don't want to miss this conversation. 00:00 Preview 01:00 Episode Introduction 03:05 About Dr. Sean Welsh 05:15 Sean's backstory 11:47 What are the ethical considerations of robotics? 17:30 AI warfare and the risk of “slaughterbots” 19:52 What happens when AI carries out military strikes 24:35 Can robots make moral decisions and who is responsible when they fail? 36:12 Anthropomorphism and the psychology behind human-like robots 41:00 The growing market for sex robots and what it could mean for society 47:58 The hidden psychological risks of emotionally convincing AI relationships 54:35 What happens when children interact more with chatbots than people? 01:06:40 Could AI relationships reshape human intimacy and social connection? 01:10:24 The hidden energy cost of ChatGPT and large AI models 01:15:37 Will AI and robots actually cause mass unemployment? 01:21:09 Sean Welsh's Purposeful Empathy story CONNECT WITH SEAN ✩ LinkedIn https://www.linkedin.com/in/seanwelsh77/ ✩ Website https://engineno2.com/ CONNECT WITH ANITA ✩ Email purposefulempathy@gmail.com ✩ Website https://www.anitanowak.com ✩ Buy a copy of Purposeful Empathy http://tiny.cc/PurposefulEmpathyCA ✩ LinkedIn https://www.linkedin.com/in/anitanowak/ ✩ Instagram https://tinyurl.com/anitanowakinstagram ✩ Podcast Audio https://tinyurl.com/PurposefulEmpathyPodcast ✩ Bluesky https://bsky.app/profile/anitanowak.bsky.social
SEAN'S WORK ✩ An Introduction to Ethics and Robotics in AI https://link.springer.com/book/10.1007/978-3-030-51110-4 ✩ Ethics and Security Automata: Policy and Technical Challenges of the Robotic Use of Force https://www.routledge.com/Ethics-and-Security-Automata-Policy-and-Technical-Challenges-of-the-Robotic-Use-of-Force/Welsh/p/book/9781032096117 ✩ The drive towards ethical AI and responsible robots has begun https://theconversation.com/the-drive-towards-ethical-ai-and-responsible-robots-has-begun-52300 SHOW NOTES ✩ Humans https://www.imdb.com/title/tt4122068/ ✩ Ex Machina https://www.imdb.com/title/tt0470752/ ✩ “Slaughterbots” video by the Campaign to Stop Killer Robots https://www.imdb.com/title/tt7659054/ ✩ Soul Machines https://www.soulmachines.com/ Video edited by Jad Misri, Green Horizon Studio
This week on the Talking Pools Podcast, Wayne Ivusich and Steve Sherwood take listeners on a journey through some of the strangest, funniest, and most unforgettable experiences pool professionals encounter in the field. What begins as a discussion about a pool overrun with frogs quickly evolves into a collection of stories that highlight the reality of working around water every day. Wayne and Steve invite listeners to share the weirdest things they have ever discovered in skimmer baskets and pool systems, leading to stories involving snakes, squirrels, possums, underwear, rodents nesting beneath winter covers, and even a horse that found its way through a safety cover and into a swimming pool. The conversation is both humorous and educational, reminding listeners that no two days in the pool industry are ever the same. The episode then shifts to a more serious discussion about water clarity and swimmer safety. Wayne recounts a tragic real-world drowning incident in a cloudy public pool, emphasizing why clear water is not simply an aesthetic goal but a critical life-safety requirement. The hosts discuss why operators should never compromise visibility standards and why maintaining proper filtration and water chemistry remains one of the most important responsibilities in aquatic operations. Steve also addresses the growing trend of misleading social media pool "miracle fixes" and viral videos that promise instant water recovery through tablets or additives. The hosts explain why proper pool chemistry does not work that way and encourage listeners to be skeptical of products that appear too good to be true. In this week's insurance segment, Steve is joined by Pat from California Pool Association Insurance Services to continue their discussion about a unique consulting project involving pools at a doggy daycare facility. 
The conversation explores liability concerns, insurance requirements, hold-harmless agreements, commercial pool responsibilities, and the challenges of maintaining aquatic facilities that are operated by people whose primary focus is animal care rather than water management. The discussion provides valuable insight for service companies considering unusual or high-liability clients. The second half of the episode dives deep into robotic pool cleaners, filtration systems, and service efficiency. Steve explains why robotic cleaners have become essential tools for modern pool professionals, discusses the pros and cons of suction-side, pressure-side, corded, and cordless cleaners, and shares how automation can dramatically improve service quality while reducing labor hours. The hosts also discuss customer expectations, communication, and the importance of establishing clear responsibilities between pool professionals and facility operators. Finally, Wayne and Steve discuss professional education, the value of Certified Pool Operator (CPO) training, and opportunities for experienced professionals to become CPO instructors themselves. The conversation highlights how education improves safety, builds confidence, creates additional revenue opportunities, and helps elevate professionalism throughout the industry. Topics Covered Weirdest things ever found in skimmer baskets Wildlife encounters in swimming pools Pool safety and water clarity Real-world drowning prevention lessons Social media pool chemistry myths Doggy daycare pool liability concerns Insurance and hold-harmless agreements Commercial pool management challenges Robotic pool cleaners and automation Sand filters vs. cartridge filters Customer expectations and communication CPO certification and instructor training Building a stronger pool service business Connect With Talking Pools
This week, the boys grab some beers and head to post-WWII America to watch nobody give AF about our war heroes in William Wyler's “The Best Years of Our Lives”. The highest-grossing movie since “Gone With The Wind”, this moving account follows several soldiers re-acclimating to civilian life in a world that has moved on without them. Thankless bastards. This movie rules. It's long, but it's awesome. John also talks about “Backrooms”. Grab a beer and join us! linktr.ee/theloveofcinema - Check out our YouTube page! Our phone number is 646-484-9298. It accepts texts or voice messages. 0:00 Intro; 7:15 “Backrooms” mini-review; 16:39 1946 Year in Review; 36:06 “The Best Years of Our Lives”: Films of 1946; 01:24:42 What You Been Watching?; 1:40:49 Next Week's Episode Teaser Additional Cast/Crew: Robert E Sherwood, MacKinlay Kantor, Hugo Friedhofer, Dana Andrews, Gregg Toland, Sharaff, Fredric March, Myrna Loy, Teresa Wright, Virginia Mayo, Cathy O'Donnell, Hoagy Carmichael, Harold Rossell, Gladys George, Roman Bohnen, Kan Parsons, Chiwetel Ejiofor, Will Soodik, Renate Reinsve, Mark Duplass. Hosts: Dave Green, Jeff Ostermueller, John Say Edited & Produced by Dave Green. 
Beer Sponsor: Carlos Barrozo Music Sponsor: Dasein Dasein on Spotify: https://open.spotify.com/artist/77H3GPgYigeKNlZKGx11KZ Dasein on Apple Music: https://music.apple.com/us/artist/dasein/1637517407 Recommendations: Widow's Bay, The Lord of The Flies, The Boroughs, The Cloverfield Paradox, Spider Noir, Everybody Wants Some, Bernie, Last Flags Flying, The Worst Person In The World, Oslo October 31st, Out of the Past, Is This Thing On, Song Sung Blue, John Adams Mini Series, NY Knicks, Casablanca, Additional Tags: Bryan Cranston, Kate Hudson, Bradley Cooper, Will Arnett, Jack Black, Joachim Trier, Richard Linklater, The Duffer Brothers, Focus Features, A24, Curry Barker, Robert Duvall, Sports Documentary, Bowling, Bette Davis, SZA, Keke Palmer, Amazon Studios, Warner Discovery, Paramount Skydance, Conan O'Brien, Weapons, Sinners, One Battle After Another, Frankenstein, Annapurna Films, Old Man Marley, Home Alone, Shawshank Redemption, Gordon Ramsay, Thelma Schoonmaker, Stephen King's It, The Tenant, Rosemary's Baby, The Pianist, Cul-de-Sac, AI, The New York City Marathon, Apartments, Tenants, Rent Prices, Zohran Mamdani, Andrew Cuomo, Curtis Sliwa, Amazon, Robotics, AMC, IMAX Issues, Tron, The Dallas Cowboys, Short-term memory loss, Warner Brothers, Paramount, Netflix, AMC Times Square, Tom Cruise, George Clooney, MGM, Amazon Prime, Marvel, Sony, Conclave, Here, Venom: The Last Dance, Casablanca, The Wizard of Oz, Oscars 2026, Academy Awards, BFI, BAFTA, BAFTAS, British Cinema. 
England, Vienna, Leopoldstadt, The Golden Globes, Past Lives, Apple Podcasts, West Side Story, Adelaide, Australia, Queensland, New South Wales, Melbourne, The British, England, The SEC, Ronald Reagan, Stock Buybacks, Marvel, MCU, DCEU, Film, Movies, Southeast Asia, plague, HBO Max, Amazon Prime, casket maker, Seven Samurai, Rashomon, Sergio Leone, Clint Eastwood, Stellan Skarsgard, the matt and mark movie show. The Southern District's Waratah Championship, Night of a Thousand Stars, The Pan Pacific Grand Prix (The Pan Pacifics), Jeff Bezos, Rupert Murdoch, Larry Ellison, David Ellison, Elon Musk, Mark Zuckerberg.
The ECB's interest-rate hike is meeting resistance from mid-sized businesses (the Mittelstand) and trade unions. Also: why the DAX is performing worse than some instant-access savings accounts.
What if airports had self-driving mobility pods that could safely navigate through crowds, just like something out of The Jetsons? Or the Pixar movie Wall-E? In this episode, John Koetsier sits down with Matthew Anderson, CEO of A&K Robotics, to explore the future of autonomous mobility. A&K Robotics is building AI-powered self-driving pods designed to help people navigate airports independently without relying on wheelchairs or staff assistance. But the real breakthrough isn't just autonomy. It's crowd navigation. Matthew explains why navigating dense, unpredictable crowds is one of the hardest problems in robotics, and how A&K's “crowd-centric AI” could become foundational technology for airports, stadiums, smart cities, conferences, and even humanoid robots in the future. They also discuss: * Why airports are the perfect proving ground for robotics * The AI and sensor stack powering autonomous mobility * Directional sound systems inspired by The Sphere in Las Vegas * Scaling robotics startups from prototype to deployment * Raising an $8M Series A round * The personal story that inspired Matthew to build the company * Why the future of robotics depends on moving safely through human environments Guest: Matthew Anderson, CEO, A&K Robotics Company: A&K Robotics If you enjoy conversations about AI, robotics, startups, and the future of technology, subscribe for more interviews with founders and innovators shaping what's next. Subscribe here: https://techfirst.substack.com 00:00 – Intro 00:30 – Meet A&K Robotics and the Vision for Autonomous Airport Mobility 01:20 – Why Crowd Navigation AI Is the Hardest Problem in Robotics 02:40 – Navigating Dense Airport Crowds and Passenger Flow 04:05 – Directional Sound and Designing a Better Airport Experience 05:50 – Building an “iPhone Experience” for Mobility Robots 06:30 – Sensors, LIDAR, and Operating Without GPS 07:20 – Fleet Management and Autonomous Operations in Airports 08:00 – Mapping Airports and Optimizing Routes Through Crowds
09:00 – Scaling the Business and Solving Systems Integration 10:00 – Charging, Docking Stations, and the Future Airport Network 10:45 – Raising an $8 Million Series A Round 11:20 – Customers: Vancouver International Airport and Aena 12:10 – Building a Polished Robotics Platform on Seed Funding 12:50 – Matthew Anderson's Background in Robotics and Drones 14:00 – The Bigger Vision: Crowd Navigation for All Robots 14:40 – The Personal Story Behind the Company Mission 15:40 – Licensing Opportunities and the $5 Billion Airport Mobility Market 16:45 – Hiring, Scaling the Team, and Expanding Production 18:00 – Growing Up Hacking Robots and the AC/DC Story 19:10 – Why Building Robots Is Fun, and Why Accounting Wasn't 20:40 – Final Thoughts and the Future of Autonomous Mobility
What happens when a lifelong passion for science, innovation, and helping others comes together in one remarkable career? In this episode of The She Believed She Could™ Podcast, Allison Walsh sits down with Dr. Erica Stockwell, an advanced gynecologic surgeon with AdventHealth for Women, to discuss her groundbreaking work in women's healthcare, minimally invasive surgery, and medical innovation. Dr. Stockwell shares how her background in biomedical engineering, medicine, and business led her to become a pioneer in robotic surgery and surgical technology. From holding medical device patents to helping shape the future of AI-assisted healthcare, she offers a fascinating look at where women's health is headed and why innovation matters more than ever. But beyond her impressive accomplishments, Dr. Stockwell also reveals the deeply personal challenges that shaped her journey. During medical residency, she became a new mother while simultaneously caring for her infant daughter battling cancer. Her powerful story of perseverance, faith, and community support serves as a reminder that even the most successful women face valleys—and that resilience is built by continuing forward through them. Together, Allison and Dr. Stockwell explore leadership, confidence, endometriosis care, women's health advocacy, entrepreneurship, motherhood, and the courage it takes to keep believing in yourself when life gets hard. If you're looking for inspiration, practical wisdom, and a glimpse into the future of healthcare, this conversation is one you won't want to miss. 
What You'll Learn in This Episode:
How innovation is reshaping women's healthcare
The benefits of minimally invasive gynecologic surgery
Emerging trends in robotic surgery and AI-assisted medicine
Why endometriosis requires comprehensive, multidisciplinary care
How to build resilience during life's hardest seasons
The role of mentorship and support systems in success
Why confidence is created through action
Lessons on leadership, entrepreneurship, and impact
How to navigate motherhood while pursuing ambitious goals
The future of women's health technology

This episode is sponsored by AdventHealth for Women. Learn more about their Women's Health Navigation Team and how they're making healthcare simpler for women and their families at AdventHealthForWomen.com.

Positioned for Partnerships™ Mini Course - Turn your platform into a revenue-generating brand opportunity—without needing a massive following. Learn how to position your brand, create a high-converting media kit, and confidently pitch partnerships so brands instantly understand your value.
In this fascinating investor panel clip, top investors discuss where the next wave of wealth creation may come from after crypto, cannabis, sports betting, AI, and robotics.

They break down emerging opportunities in nuclear energy, modular infrastructure, fractionalized investing, tokenization, and future financial structures — plus why many “hot trends” fail before reaching mass adoption.

If you've ever felt like you're always late to the next big thing, this conversation explains how smart investors think ahead of the crowd.

Topics Covered:
✅ The next wealth boom after AI & crypto
✅ Nuclear energy investment opportunities
✅ Why many reactor startups may fail
✅ Fractionalization of assets explained
✅ Tokenization vs real-world investing
✅ Why liquidity matters in new markets
✅ How investors spot trends early

If you're an investor, entrepreneur, founder, or future trends watcher, this is a must-watch.
Key topics
AI in physical supply chains
Testing and building in fast-paced tech environments
The impact of GLP-1 on health and consumer behavior

Chapters
00:00 Navigating the Information Overload in Venture Capital
02:17 The Importance of Testing and Building in Tech
04:52 The Shift Towards a Builder Culture
07:36 Evaluating Build vs. Buy in Technology Investments
10:10 The Role of Interns in Modern Organizations
13:01 The Future of AI in Supply Chain and Logistics
15:42 The Ripple Effects of AI on the Physical World
18:12 Challenges in Supply Chain Automation
20:55 Complexities of Robotics in Warehousing
25:30 Automation and Workforce Dynamics
26:49 The Impact of GLP-1 on Consumer Behavior
31:03 Healthcare Innovations and Continuous Care
33:49 Investing in a Changing Landscape
37:33 The Future of Shopping and Agentic Commerce
41:06 Lightning Round Insights
OpenAI prepping major ChatGPT overhaul ahead of IPO, Meta admits 20k+ Instagram accounts were hacked, Google reportedly orders more than 3M AI chips from Intel. "NVIDIA and LG Expand Robotics, AI, Mobility Partnership – DTH"
In this episode of the Crazy Wisdom Podcast, host Stewart Alsop sits down with software engineer and entrepreneur Arowolo Muritadhor for a wide-ranging conversation that moves from agriculture and manufacturing in Nigeria to the evolving role of crypto in the country's economy. They touch on how hyperinflation, particularly the naira's dramatic drop in 2023, pushed Nigerians toward stablecoins as a practical savings tool, and how informal kiosk networks have stepped in where traditional banking infrastructure falls short. The conversation also covers the tension between government regulation and the permissionless nature of blockchain technology, comparisons between the decline of the Roman Empire and current shifts in US economic dominance, the role of mobile payments in Africa, language learning, and whether AI agents have any real utility in crypto infrastructure yet. You can connect with Arowolo on LinkedIn and X at @armolas_06.

Timestamps
00:00 - Host welcomes Arowolo Muritadhor, introducing topics of software engineering and animal food production in Nigeria.
05:00 - Discussion shifts to manufacturing, components assembly, and China's dominance in low-cost production globally.
10:00 - Conversation explores crypto adoption in Nigeria as a network state phenomenon, separating informed users from mainstream population.
15:00 - Mobile payments and kiosk ATM replacements emerge as critical financial infrastructure bridging unbanked Nigerians.
20:00 - Roman Empire parallels drawn to modern crypto taxation, government control, and inevitable death-and-taxes reality.
25:00 - Bitcoin and Ethereum permissionless nature debated against government wallet-level censorship vulnerabilities.
30:00 - AI agents examined as crypto infrastructure tools, revealing mostly trading bots rather than foundational builders.
35:00 - Nigeria's 2023 naira collapse compared to Argentina's hyperinflation, driving citizens toward stablecoin dollar savings.
40:00 - US Treasury history unpacked through FDR gold confiscation and Nixon ending convertibility, paralleling empire decline.
45:00 - Crypto reframed as anti-bank rather than purely anti-government, enabling freedom through immutable accountability.
50:00 - Transparent blockchain ledgers discussed as potential government accountability tools across democracy, republic, and oligarchy structures.

Key Insights
1. Nigeria has a significant divide between its northern and southern regions in terms of economic activity. The north, centered around Abuja, is more agricultural with substantial cattle production, while Lagos in the south functions as a dense urban and commercial hub. This geographic and economic split shapes how different financial tools and technologies are adopted across the country.
2. China's dominance in low-cost manufacturing has made it nearly impossible for countries like Nigeria, the United States, or Argentina to compete on price alone. The more realistic path for developing economies is to import components and focus on local assembly and creativity, which is where meaningful economic participation becomes possible.
3. Crypto adoption in Nigeria accelerated dramatically around 2023 when the naira experienced a sharp devaluation against the US dollar. Before that point, saving in dollars was difficult for many Nigerians, especially those without formal bank accounts, making stablecoins like USDT an attractive and practical alternative for preserving wealth.
4. Informal kiosk operators in Nigeria have organically become a substitute for ATMs, giving communities access to basic financial services where traditional banking infrastructure does not reach. This grassroots financial layer is now a key entry point for integrating crypto and stablecoin payments into everyday commerce.
5. Governments are increasingly trying to regulate crypto at the wallet and centralized exchange level, using tax compliance as a primary mechanism. While Bitcoin and Ethereum remain largely permissionless, the practical chokepoints for most users remain centralized platforms where identity and transactions can be monitored.
6. The historical parallel between the fall of the Roman Empire and current shifts in US economic and geopolitical power offers a useful frame for understanding why crypto matters. Just as Rome debased its currency and struggled to sustain imperial costs, the US faces mounting debt and a financialized economy that may accelerate dollar instability and push more people toward alternative stores of value.
7. One genuinely constructive use case for blockchain beyond speculation is immutable accountability, particularly for public institutions and prediction markets. A transparent ledger that governments or officials voluntarily adopt could create verifiable records of decisions and promises, reducing corruption and increasing trust in ways that traditional governance structures have struggled to achieve.
In this episode of Tank Talks, Matt Cohen sits down with Aidan Madigan-Curtis, Partner at Eclipse, for a sharp conversation on physical AI, frontier tech, robotics, manufacturing, and the future of building in the real world. Aidan shares her unlikely path from the small mountain town of Penticton to Harvard, Bridgewater, Apple, Samsara, and now Eclipse, where she invests at the intersection of atoms and bits.

She breaks down what factory floors taught her that most software-first founders miss, why physical AI is becoming one of the biggest venture capital opportunities of the next decade, and what the U.S. and Canada must understand about China's manufacturing advantage. From launching the first Apple Watch manufacturing lines to scaling Samsara's hardware operations and investing in autonomous excavation, robotics, energy, defense, and supply chain technology, Aidan brings a rare operator-investor perspective to one of the most important shifts happening in tech today.

Buckle up to understand why the next wave of AI won't just live in software; it will reshape factories, robots, infrastructure, and the physical world around us.

The Unlikely Path from Penticton to Harvard (00:04:25)
Aidan shares the wild story of growing up in a tiny Canadian mountain town, applying to Harvard almost by accident, and nearly missing her acceptance letter because it sat undelivered in a PO box. She reflects on how community support, risk-taking, and a willingness to swing big shaped the rest of her career.

Bridgewater, Systems Thinking, and Conviction Investing (00:09:00)
Aidan explains how Bridgewater's fundamental, systematic approach to markets shaped how she evaluates venture opportunities today. She breaks down why Eclipse starts with deep theses, pressure-tests industries, and backs founders before the market fully understands where the world is going.

The Factory Floor Lesson Every Founder Needs (00:17:27)
Drawing from her time launching Apple Watch manufacturing lines, Aidan explains why the best founders must balance brutal honesty with extreme optimism. She argues that founders who get “high on their own supply” lose touch with reality, while founders without belief cannot rally a team to do the impossible.

Why Physical AI Was the Bet Before It Was Cool (00:20:34)
Aidan walks through her career pattern of choosing the “unsexy” path before it becomes obvious: Bridgewater before it was famous, Apple supply chain when software was eating the world, Samsara before industrial IoT was hot, and Eclipse before physical AI became a major venture category.

China's “Vibe Manufacturing” Advantage (00:28:37)
Aidan unpacks Eclipse's China Field Notes and explains what “vibe manufacturing” really means: a deeply layered, highly competitive, fast-moving manufacturing ecosystem that can turn ideas into physical products at extraordinary speed. She discusses China's compounding advantage in tooling, suppliers, human capital, robotics, and government-backed industrial competition.

Where the U.S. Is Ahead and Behind in Robotics (00:37:18)
Aidan breaks down the robotics race between the U.S. and China. She says the U.S. remains highly competitive in embodied AI, autonomy, and goal-oriented machine intelligence, but lags badly in manufacturing depth, actuators, magnets, physical iteration speed, and lower-level robotic control.

The Robotics Data Problem (00:41:14)
Aidan explains why video data alone is not enough to build general-purpose robotics. She discusses the need for proprioception, haptics, physics data, and real-world interaction data, plus why China's robotic data farms could become a major strategic advantage.

Canada's Opportunity in AI, Energy, and Deep Tech (00:44:47)
As a Canadian-born investor, Aidan lays out where Canada can win: talent attraction, smart immigration policy, abundant clean energy, AI infrastructure, university research, biotech, quantum, defense, and strategic government offtake. She argues Canada has the raw ingredients to become a major player if it moves with urgency.

Eclipse's Interest in Canadian Founders (00:49:20)
Aidan shares that Eclipse is already investing in Canada, including companies in Toronto and Vancouver, and is actively interested in deep tech and physical AI founders coming out of Canada's strongest ecosystems.

About Aidan Madigan-Curtis
Aidan Madigan-Curtis is a Partner at Eclipse, where she invests in physical AI, robotics, manufacturing, energy, defense, supply chain, and frontier technology companies. Before Eclipse, she was an early executive at Samsara, helping scale the industrial IoT company from pre-product to public company. She previously worked on Apple's manufacturing team for the first Apple Watch and began her career at Bridgewater, where she developed a systems-thinking approach to markets and complex industries.

Connect with Aidan Madigan-Curtis on LinkedIn: https://www.linkedin.com/in/aidan-madigan-curtis/
Visit the Eclipse website: https://eclipse.capital/
Connect with Matt Cohen on LinkedIn: https://ca.linkedin.com/in/matt-cohen1
Visit the Ripple Ventures website: https://www.rippleventures.com/

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit tanktalks.substack.com
The computer science educator is based at ACG Sunderland and was shortlisted in this year's Cambridge Dedicated Teacher Awards, out of 12,000 nominations.
In this episode of It Takes Balls, testicular cancer survivor Tony Salas shares how his cancer journey led him into one of the newest and most rapidly evolving areas of cancer surveillance: MicroRNA-371 testing and Signatera ctDNA monitoring.

Tony opens up about the symptoms and diagnosis that changed his life, including the whirlwind of appointments, surgery, and learning how quickly testicular cancer treatment decisions have to be made. But what makes his story especially unique is what came after treatment: navigating surveillance while trying to avoid chemotherapy and gain greater confidence in whether his cancer was truly gone.

Throughout the episode, Tony discusses using MicroRNA-371, an emerging blood test showing major promise in detecting active germ cell tumors, along with Signatera, a personalized circulating tumor DNA (ctDNA) test designed to monitor for residual disease and recurrence. He shares what it was like balancing traditional surveillance methods like CT scans and tumor markers with these newer technologies, and how the uncertainty of recurrence can weigh heavily on survivors mentally and emotionally.

The conversation also explores the emotional side of survivorship, including scanxiety, the fear of recurrence, and the challenge of moving forward after treatment while still feeling tied to constant monitoring. Tony speaks candidly about the importance of self-advocacy, staying informed about evolving cancer research, and finding ways to regain a sense of control during survivorship.

This episode is not only a powerful survivor story, but also an insightful look at the future of testicular cancer surveillance and personalized cancer monitoring.
Whether you're navigating a diagnosis, currently in surveillance, or interested in the latest advances in testicular cancer treatment and recurrence detection, Tony's experience offers both hope and perspective.

Provide your feedback on the podcast: https://www.testicularcancerawarenessfoundation.org/itbsurvey
Join The Ball Room: https://www.testicularcancerawarenessfoundation.org/theballroom
Want to be a guest? Apply here: https://www.testicularcancerawarenessfoundation.org/it-takes-balls-submissions

Connect with Tony:
https://www.instagram.com/tone_loc17/
https://www.facebook.com/coachtony05

Follow Testicular Cancer Awareness Foundation:
https://www.testescancer.org
https://www.x.com/testescancer
https://www.instagram.com/testescancer
https://www.facebook.com/tca.org

Follow Steven Crocker:
https://www.instagram.com/stevencrocker
https://www.facebook.com/steven.crocker2

Theme song: No Time Like Now - Tom Willner www.tomwillner.com
In 1942, a 22-year-old chemistry student and part-time writer set down three short rules for how a fictional robot ought to behave. His aim was to kill off the lazy "robot-as-Frankenstein-monster" cliché. More than eighty years later, real engineers, real ethicists and real lawmakers are still arguing about them. This is the first of two episodes on Isaac Asimov — one of the "big dogs" of science fiction whose output ran to some five hundred books. John Helmer and Ezri Carlebach take on the most enduring part of that legacy: the Three Laws of Robotics. The Laws went on to power nine linked short stories in I, Robot, several films, hundreds of academic papers, and an argument about AI safety that shows no sign of ending any time soon.

In this episode:
The man and the output
Robots before Asimov
I, Robot as nine thought experiments
Susan Calvin — one of SF's first great female scientists
The Three Laws and the trolley problem
Coming next: Foundation

Links and resources:
Website: techimaginarium.co.uk
Instagram: @tech.imaginarium
Watch on YouTube: https://www.youtube.com/@JohnHelmerConsulting

Music by Nick Dwyer recording as Flintet. The Tech Imaginarium is a Learning Hack podcast, produced and hosted by John Helmer and written by John Helmer and Ezri Carlebach.
Secure your spot for the MULTI-AGENT ORCHESTRATION AI COURSE: https://multiplai.ai/multi-agent-orchestration-course/

Are AI jobs disappearing faster than they're being created—or are we asking the wrong question?

For months, AI leaders warned of massive job losses. Now, some of the same voices are changing their tune. But while Sam Altman and Dario Amodei are sounding more optimistic, tech layoffs continue to climb, AI anxiety is spreading across the workforce, and business leaders are facing a difficult question: what's actually happening beneath the headlines? In this week's AI News episode, Isar Meitis breaks down the conflicting signals shaping the future of work and explains why the next few years may be far more turbulent than many expect. He also explores a major shift in the AI race: why model intelligence is no longer the primary battleground, and why speed, cost, accessibility, and business value are becoming the metrics that matter most. If you're a business leader trying to understand where AI is headed—and what it means for your workforce, strategy, and competitive advantage—this episode provides a practical framework for separating hype from reality.

In this session, you'll discover:
Why Sam Altman says he may have been wrong about the pace of AI-driven job disruption.
The Jevons Paradox and how increased AI productivity could create new demand.
The reality behind recent tech layoffs and whether AI is truly responsible.
Why employee anxiety around AI may matter more than the statistics themselves.
The growing gap between jobs being eliminated and new AI-related roles being created.
Why reskilling workers may be harder than most organizations expect.
How the AI race is shifting from model quality to business value.
Why open-source models are rapidly closing the gap with frontier AI systems.
The hidden cost explosion companies are experiencing with AI tokens and agents.
What Microsoft's latest Build announcements reveal about the future of enterprise AI.
Why robots cleaning homes may be an early sign of AI's impact on blue-collar work.

About Leveraging AI
The Ultimate AI Course for Business People: https://multiplai.ai/ai-course/
YouTube Full Episodes: https://www.youtube.com/@Multiplai_AI/
Connect with Isar Meitis: https://www.linkedin.com/in/isarmeitis/
Join our Live Sessions, AI Hangouts and newsletter: https://services.multiplai.ai/events

If you've enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!
Human moral judgment emerges from emotion, empathy, lived experience, social development, and our embodied understanding of the world. AI has none of those things. So, can artificial intelligence be taught right from wrong?

If we're going to rely on AI (the way the tech bros want us to), we're going to need to trust it, which means we're going to need to believe it has a trustworthy moral sense. Is that reasonable? Or even possible? Pigweed and Crowhill recall Google's Gemini image-generation fiasco (where "give me an image of a pope" created anything but an image of a pope), which resulted from a ham-handed attempt to paste moral rules on top of AI. It was comically stupid, but entirely predictable. Many people assume morality is simply a matter of following a set of rules, but no set of rules can create a proper moral sense. The boys discuss hallucinated legal citations, content moderation, reinforcement learning, the limits of rule-based ethics, Isaac Asimov's famous Three Laws of Robotics, and Pope Leo's recent call for AI guardrails. The conversation also explores autonomous weapons, the global AI arms race, and the uncomfortable reality that even the engineers building these systems do not always understand how they arrive at their conclusions.

Their conclusion is both simple and unsettling: AI may become useful, powerful, and even trustworthy in certain contexts, but that is not the same thing as being moral. Machines may imitate moral reasoning, yet human beings must remain skeptical, vigilant, and ultimately responsible for the decisions AI helps make.

Can a machine have a conscience? Or are we fooling ourselves when we talk about "moral AI" at all?
After a long wait, Canada's AI strategy has arrived — a document that encourages people to learn and adopt the technology in the hopes of creating 250,000 new jobs. Host Catherine Cullen speaks with AI experts and skeptics Jake Hirsch-Allen, Kristen Thomasen and Hamish van der Ven about what it means for employment, children's safety and the environment. Then, Minister of AI Evan Solomon joins the program to explain why Canadians need to understand this technology despite their low trust in it.

Plus, there seemed to be a little bit of movement in trade negotiations with the United States this week – despite more trolling from President Trump about Canada becoming the 51st state. Lisa Raitt is on the advisory committee on Canada-U.S. economic relations and tells The House what progress has been made as the July 1st deadline inches closer. And, in a wide-ranging exit interview at Rideau Hall, outgoing Governor General Mary Simon tells Catherine Cullen why she wasn't sure she would be able to finish her five years in the role and reflects on how Canada is doing on reconciliation and national unity.

This episode features the voices of:
Sumaiya Ahmed, librarian at the Toronto Public Library
Prachi Salvi, director and marketing consultant
Jake Hirsch-Allen, director of partnerships at The Dais
Hamish van der Ven, associate professor at the University of British Columbia
Kristen Thomasen, chair in Law, Robotics, and Society at the University of Windsor
Evan Solomon, Minister of AI
Lisa Raitt, member of the Advisory Committee on Canada–U.S. Economic Relations
Mary Simon, Governor General of Canada
Robotic grooves mirroring dystopian presents and futures
This episode explores the rapid evolution of AI and its impact on the job market, particularly for 20-somethings entering the workforce. AI experts Darius Mirshahzadeh of AIifyIt and Jerome Stewart of White Feather Group discuss how artificial intelligence is transforming knowledge work, creating new opportunities for those who adapt quickly while threatening traditional career paths. The conversation covers practical AI tools, automation platforms, and strategic advice for navigating the AI-driven economy.

In this episode, Darius will discuss:
(00:00) The AI Revolution: A New Era for Knowledge Workers
(02:34) Chronological Evolution of AI: Past, Present, and Future
(05:41) The Impact of AI on Knowledge Work and Job Security
(08:52) Skills for the Future: Preparing for an AI-Driven Economy
(11:27) Navigating the Job Market: Opportunities and Challenges for Young Professionals
(14:44) The Role of Human Connection in an AI World
(17:49) Becoming the Solution: Embracing AI Skills for Career Success
(26:24) The Role of AI in Manual Tasks
(28:27) The Future of Robotics and AI Integration
(30:24) Reassessing Your Tech Stack
(34:27) Getting Started with AI for Young Professionals
(42:49) The Urgency of Adopting AI in Business

Connect with Darius:
Website: https://therealdarius.com/
LinkedIn: https://www.linkedin.com/in/dariusmirshahzadeh/
Instagram: https://www.instagram.com/imthedarius/
YouTube: https://www.youtube.com/@Thegreatnessmachine
Book: The Core Value Equation https://www.amazon.com/Core-Value-Equation-Framework-Limitless/dp/1544506708

Write a review for The Greatness Machine using this link: https://ratethispodcast.com/spreadinggreatness.

Learn more about your ad choices. Visit megaphone.fm/adchoices
Winston Leung explains that QNX is enhancing safety for physical AI-based robots through its innovative microkernel architecture, which is designed for safety-critical applications. QNX architecture provides a reliable and deterministic platform, crucial for real-time control and decision-making in robotics. By partnering with industry leaders such as NVIDIA and Intel, QNX ensures its operating system is optimized for high-performance computing platforms, thereby enabling robust safety and security measures. Additionally, QNX's focus on integrating cybersecurity with functional safety standards underscores its commitment to protecting robots operating in human environments. Winston Leung is Senior Strategic Alliances Manager at QNX, where he manages key strategic partner relationships and programs to expand the company's product portfolio and ecosystem. He delivers strategies and thought leadership in functional safety, real-time performance, and reliability for embedded systems across robotics, medical, and transportation sectors. Download QNX's Inside the Robot: https://qnx.software/en/reports/inside-the-robot?utm_medium=podcast&utm_source=the-robot-report&utm_campaign=fy27-q2-inside-the-robot ### – SPONSORS – This episode is brought to you by GreyOrange If you're running a warehouse, your robots, people, and systems are only as powerful as their ability to work together. GreyMatter by GreyOrange is the AI-powered warehouse orchestration platform that coordinates every agent on your floor in real time, with over a million optimizations per minute, and delivering up to 4x productivity gains. GreyMatter works with the robots you already have, or with the ones you want. Ready to go beyond your WMS? LEARN MORE AT: https://www.greyorange.com/TheRobotReport/
Xen Baynham-Herd, Head of Growth at Base, joined us to discuss the growing adoption of Coinbase's Base network.

Topics:
- Visa added Base to its global stablecoin settlement network
- Tokenization and stablecoins on Base
- AI Agents and Robotics on Base
- Will Base launch its own native token?
Are robots still a futuristic novelty, or have they officially become critical urban infrastructure? In this episode of The Edge of Show, host Josh Kriger sits down with Judah Longgrear, Co-Founder and President of Robot.com. Based in the AI epicenter of San Francisco, Robot.com is moving autonomous machines out of the laboratory and directly into real-world deployments across cities, events, and campuses worldwide.

Discover how the company rebranded from Kiwi to Robot.com after securing one of the most powerful domains in tech history. Judah breaks down the massive, untapped opportunity of Robotic Media: transforming friendly, smiling autonomous delivery units into localized, high-engagement branding platforms. He also explains how they blend street-level mobile fleets with programmatic Digital Out-Of-Home (DOOH) advertising boards to build a fully unified, multi-touch ad network.

Tune in to learn about their latest agentic speaking robots and why seeing 5 to 10 robots a day will be completely normal within the next few years.

Support us through our Sponsors!
☕ Want to make content like ours? Sign up with Castmagic to make your creative process easy: https://bit.ly/CastmagicReferral
Work smarter, grow faster. Automate your SEO, get AI insights, and manage all your clients in one place with Helm. Start today with 50% off your first month at helmseo.com
Rochester City School District (RCSD) students are gearing up for a weekend of competition. The second annual RCSD Flower City Frenzy Robotics Competition will be held on Saturday at East High School. In recent years, NPR has referred to robotics as a sport that builds the next generation of engineers. We talk with the students about the robots they've built, the skills they've learned, and how they hope to transfer their experiences beyond school walls, especially in the age of AI.

Our guests:
Sheldon Cox, executive director of career and technical education at the Rochester City School District
Vicki Robertson, FIRST Robotics mentor for the X-Cats at Wilson Magnet High School
Daniel Newland, senior at Padilla High School and member of the electrical/programming team for XQ Robotics
Charimar Colon, sophomore at Padilla High School and member of the mechanical/build team for XQ Robotics
Noor Hussein, senior at Joseph C. Wilson High School and a robot driver/software lead for the X-Cats
Angel Rios, sophomore at Joseph C. Wilson High School and a drive team coach and electrical lead for the X-Cats
Izaya Sandsan, sophomore at East High School and a robot builder and controller for the Crimson Jewels

---
Connections is supported by listeners like you. Head to our donation page to become a WXXI member today, support the show, and help us close the gap created by the rescission of federal funding.
---
Connections airs every weekday from noon-2 p.m. Join the conversation with questions or comments by phone at 1-844-295-TALK (8255) or 585-263-9994, email, Facebook or Twitter. Connections is also livestreamed on the WXXI News YouTube channel each day. You can watch live or access previous episodes here.
---
Do you have a story that needs to be shared? Pitch your story to Connections.
Summary In this episode, the Robot Report editorial team reunites to recap the major highlights, central themes, and networking events from the recent Robotics Summit and Expo in Boston. Steve Crowe breaks down key presentations, including Brian Gerkey's insights on Open Robotics AI integration as well as the Open Source Robotics Alliance (OSRA). The team also recaps Mikell Taylor's practical framework for deploying reliable, "worthy" robots. Finally, they share highlights from the closing keynote interview with Noland Arbaugh, the world's first Neuralink patient, who discussed his groundbreaking journey with brain-computer interface technology. ### – SPONSORS – This episode is brought to you by Yamaha Robotics Group (YRG) — driving the future of smart automation. Yamaha's Linear Conveyor Modules and Advanced Operator Interfaces are helping engineers push efficiency and flexibility further than ever. And let's face it: the PLC isn't going anywhere — it's evolving. LEARN MORE AT: https://hs.yrginc.com/therobotreport This episode is brought to you by maxon USA. If you're designing robots beyond controlled factory cells, mobile manipulators, quadrupeds, or humanoids maxon is worth a stop at the Robotics Summit in Boston. At the show, maxon is exhibiting its High Efficiency Joint (HEJ) portfolio: fully integrated robotic joints that combine motor, gearing, electronics, and sensing in a compact unit. Built for cyclic loads, impacts, and continuous operation, HEJ joints are designed for real‑world robotics. LEARN MORE AT: https://www.maxongroup.com/en-us
In this exclusive investor panel clip, a frontier tech investor explains where smart money may flow after AI giants like OpenAI and SpaceX reached massive valuations. If trillion-dollar AI plays feel crowded, where is the next wave? His answer: humanoid robotics, plus emerging opportunities in robotics cybersecurity hardware and AI-powered infrastructure. He breaks down why many investors wait until markets show traction but before full institutional saturation, the sweet spot between early risk and late-stage pricing.
Topics Covered:
✅ How to invest before institutions pile in
✅ Why trillion-dollar AI names may be too crowded
✅ The next $10B–$50B opportunity sectors
✅ Why humanoid robotics is still early
✅ Robotics cybersecurity hardware plays
✅ Quantum computing & nuclear trends ahead
✅ Smart investor timing strategies explained
If you invest in AI, venture capital, private equity, robotics, or future technology, this is a must-watch.
It uses a nearly 6-foot tall humanoid chassis and tactile five finger hands.
In this episode of the Crazy Wisdom Podcast, host Stewart Alsop sits down with returning guest Ekue Kpodar for their third conversation together, covering a wide range of topics at the intersection of technology, geopolitics, and the evolving information age. They dig into Ekue's unconventional setup of running local AI models across roughly 15 computers, the growing case for open source models over closed ones from companies like OpenAI and Anthropic, and how Chinese open source models may be positioned to outcompete Western alternatives on a global scale. The conversation also touches on vibe coding and the democratization of software development, the strategic use of small models for IoT and enterprise applications, the role of Israel and China as dominant players in the information age, and how smaller nations and even individuals may wield outsized power as AI continues to collapse the cost of knowledge work. You can find Ekue Kpodar on X @ekpodar and LinkedIn.Timestamps00:00 Stewart welcomes Ekue for their third episode, diving into vibe coding and AI-driven development changes.05:00 Ekue explains using Claude on Chrome to auto-reply on Skool, burning tokens through screenshots, and Playwright as a more efficient alternative.10:00 Stewart describes his Claude-dependent planning and coding agent system breaking after a model update, prompting him to build his own chatbot.15:00 Small models discussed as critical for IoT, defense, and privacy-focused enterprises building internal APIs instead of routing traffic to OpenAI.20:00 Open source versus closed source debated, with Chinese models gaining global traction while US foundational labs remain expensive and restrictive.25:00 SaaS apocalypse explored as AI commoditizes knowledge work, with Linux and Terraform cited as proof open source still generates wealth.30:00 OpenAI's sci-fi terminator fears explained as the reason they stayed closed source, ultimately handing China a strategic open source advantage.35:00 
China's economic dumping strategy applied to AI, potentially displacing US model dominance globally the same way manufacturing was disrupted.40:00 Israel's signals intelligence dominance discussed alongside asymmetric warfare, drones defeating tanks, and information control replacing military muscle.45:00 Global information age rankings debated, Israel leading, US and China tied, France and Poland emerging as sovereign tech players.50:00 Qatar, NVIDIA, and Iran cited as proof that rare resources and technology matter more than population size in the 21st century power landscape.Key Insights1. Running local AI models on a network of affordable computers can be more cost-effective than relying entirely on third-party APIs. By using compressed or smaller open source models locally, developers can handle repetitive or lower-stakes tasks without burning through expensive tokens from providers like Anthropic or OpenAI.2. Small AI models are becoming increasingly important for IoT, defense applications, and companies that do not want to send sensitive data to external providers. Organizations can download open source models, run them on internal servers, and build proprietary APIs around them, creating something like an intranet of specialized small models.3. The value created by AI tools is being redistributed away from traditional SaaS companies toward foundational model providers and individual builders. People are canceling subscriptions to software they once paid hundreds per month for, because AI now allows a single person to build comparable tools themselves.4. Open source technology does not eliminate the ability to profit. Linux and Terraform are both open source yet made their creators wealthy. People will still pay for installation, setup, troubleshooting, and customization even when the underlying software is free.5. 
China is applying its longstanding manufacturing dumping strategy to artificial intelligence by releasing cheap open source models globally, which threatens to erode US dominance in AI the same way Chinese manufacturing undercut other countries for decades.6. In the information age, the size of a country or institution matters far less than its access to rare resources or advanced technology. Qatar, Israel, and NVIDIA each demonstrate that small populations or headcounts can wield enormous global negotiating power through concentrated technological or resource advantages.7. Asymmetric warfare is redefining military power, with inexpensive drones defeating tanks that cost millions to build. This shifts the advantage toward nations that excel at signals intelligence and information management rather than those with the largest conventional military forces.
We're announcing AIEWF speakers this week! Take the AI Engineering Survey!Today's guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months:He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, realtime, long-horizon world models is to work on LLMs (perhaps Interaction Models as well…)Put it this way: In the near term, the next Sora won't be a better video model, but a video agent.Generative Media may more closely follow the evolution of AI coding which went from focusing on one-shot output performance and cost, to multiturn reasoning and planning models for agents and systems that can plan, edit, test, debug, and submit PRs.At a certain point, coding models got so good that the only significant next step to improve performance was handling the orchestration of these models.Now as the performance of video models increases significantly across realism, consistency, & prompt adherence while becoming more cost efficient, the next evolution of video generation may also be systems that can plan, generate, edit, critique, and iterate across an entire creative task. In this episode, Ethan joins swyx and Vibhu to unpack what it actually takes to build frontier image and video systems: data, VAEs, diffusion transformers, audio-video alignment, inference speedups, and the hidden cost of storing and moving massive video datasets. 
From building NVIDIA's Cosmos world model to joining xAI as Grok Imagine was being built from zero to one, Ethan He has been at the center of some of the most important work in video generation, multimodal models, and real-time world models.We go deep on Grok Imagine, how a small xAI team shipped its first multimodal video model in three months, why iteration speed matters more than almost anything in model development, and why many of the biggest gains come from fixing tiny bugs in data and training pipelines. Flipbook: The future of VideomaxxingVideo agents are almost a sure bet to be the trend in the coming year. We end with a glance at what's beyond video agents:Flipbook caused a minor sensation this year when it was released, but most treat it as a fun demo. Ethan takes it very seriously — with the speed and cost of inference coming down every year, the future of custom video JIT UI is closer than you think. We talked about why videogen models may become the front end of AI, how generative UI could replace traditional HTML/CSS, why world models need to be real-time, interactive, and long-horizon, and why the future of video generation may depend more on language models and agents than on diffusion alone.We discuss:* Why fast iteration mattered more than meetings* Why small training bugs can drive huge model quality gains* Why coding models may make compute the bottleneck again* How image and video models are trained with synthetic captions* The role of VAEs and latent space in frontier video models* Why image models are the foundation for video models* The tradeoff between temporal compression and real-time interactivity* Flipbook, Neural OS, and the future of generative UI* Why future interfaces may go from user intent to pixels* The hidden cost of training video models: storage, egress, and GPU hours* How step distillation and consistency models (like OpenAI sCM) makes video inference orders of magnitude faster* Grok Imagine 0.9 and large-scale audio-video 
generation* Why audio-video alignment is harder than text-video alignment* Ethan's definition of world models* Reference-to-video, video extension, and long-context video generation* Why xAI's research communication undersells Grok Imagine* How xAI culture shaped the speed of development* AI watermarking, SynthID, and detecting generated media* Why prompt rewriting matters for video models* Grok Imagine Agent and the rise of video agents* Why language models may unlock better video generation* Robotics, physical AI, and embodied world models* Why Ethan left xAI and shifted focus toward LLMs* Self-managed context, memory, and the next frontier for language modelsEthan He* LinkedIn: https://www.linkedin.com/in/ethanhe42* X: https://x.com/EthanHe_42Timestamps00:00:00 Introduction00:01:25 From NVIDIA Cosmos to xAI00:03:24 Building Grok Imagine from Zero to One00:10:07 How Image and Video Models Are Trained00:18:53 Video Compression, VAEs, and Real-Time Tradeoffs00:22:10 Generative UI, Flipbook, and Neural OS00:32:10 The Cost of Training Large Video Models00:37:04 Distillation, GANs, and Fast Video Inference00:41:21 Audio-Video Generation and Grok Imagine 0.900:48:34 What Makes a World Model?00:55:51 Reference Videos, Long Context, and Video Memory01:00:11 xAI Culture, Research, and First-Principles Building01:09:45 AI Safety, Watermarking, and Prompt Rewriting01:13:10 Video Agents and AI-Assisted Creation01:27:32 Why Language Models Unlock Better Video01:31:15 Robotics, Physical AI, and Embodied World Models01:32:38 Why Ethan Left xAI01:34:16 Self-Managed Context and the Future of LLMs01:38:43 Ethan's Career Path and Closing ThoughtsTranscriptIntroduction: Ethan He, Latent Space, and the Path to xAISwyx [00:00:00]: We're here in the studio with Ethan He, most recently of xAI. Welcome.Ethan [00:00:10]: Thank you. Glad being here.Swyx [00:00:11]: We're also here with Vibhu. 
You were first coming to us, or joining the Latent Space world, because you were working on Cosmos at NVIDIA, and you did a paper. We loved it. You presented it as well, so thank you for doing that.
Ethan [00:00:23]: Actually, I also presented the MoEs twice at Latent Space.
Swyx [00:00:29]: How did you actually hear about us? Did we reach out to you? Is that how it worked?
Ethan [00:00:33]: No, actually, it was the community. I realized, oh, there is this online community where people talk about AI and also learn from each other through papers every week through the Paper Club. It's very nice.
Ethan [00:00:49]: I learned a lot.
Swyx [00:00:49]: I think three years nonstop. We haven't stopped even on Christmas and New Year's. Many weeks I want to stop, but it keeps going.
Vibhu [00:00:58]: No, that was good. I think you had posted that you worked on a paper, and I was like, “Oh, very cool. We have Paper Club. Present then.”
Vibhu [00:01:04]: But I might have reached out to you after.
Swyx [00:01:05]: You-- because it's an amateur club, right?
Swyx [00:01:08]: So it's very unusual, but we sometimes have paper authors come by and actually explain the paper. Today we just did the Poolside paper, which was apparently very good.
Vibhu [00:01:18]: Came out yesterday.
Vibhu [00:01:19]: Pretty interesting, right? Fully open. They talk about everything, systems. So it's a good one. We'll recommend people to read it.
Swyx [00:01:25]: Bring us up to speed on your transition to xAI, 'cause I actually don't even know when you joined. Just tell the story about the sort of transition.
From NVIDIA Cosmos to xAI: Scaling Video and World Models
Ethan [00:01:34]: Before xAI, I was working on the Cosmos world model at NVIDIA. So Cosmos is a giant video foundation model that aims to simulate the world, and it serves as a foundation for all of the roboticists to build on top of. 
There, once I built the Cosmos one, I realized that since this thing also has a scaling law similar to language models, we need to scale up the video models further. That's why I realized I need to move to somewhere with much more compute resources. That's how I--
Swyx [00:02:13]: Than NVIDIA?
Vibhu [00:02:14]: The GPU rich came themselves.
Vibhu [00:02:19]: And timeline-wise, when was Cosmos? It was pretty early, right? It was open world model, open paper, everything.
Ethan [00:02:25]: It was end of 2024.
Vibhu [00:02:28]: End of 2024.
Ethan [00:02:30]: Then in mid-2025, I moved to xAI. I joined about the time when xAI was about to build video models and multimodal models. There was no infra, no data, and no model, and as a few engineers, we built it in three months and released the first model, Grok Imagine 0.9.
Ethan [00:02:55]: Since then, I kept working on video models and moved more from training to post-training of the video models. For example, reference-to-video, kind of like the cameo feature, and video extensions. And before I left, I worked on a world model, leading a small team to focus on real-time, long-horizon video generation.
Building Grok Imagine From Scratch in Three Months
Swyx [00:03:24]: Can you give like a rough roadmap of, okay, you're on a brand-new team. Grok previously was only text, or they partnered with BFL for their image gen stuff. What are the building blocks, right? You have compute, data you can procure somewhere. Just what is the sequence of things that people should think about when you're setting up a new team?
Vibhu [00:03:43]: Actually even deeper, not just data you can procure. You guys had to go through getting the data too, right? 
So you shipped it pretty fast, but yeah--
Swyx [00:03:51]: Three months is like--
Vibhu [00:03:52]: From everything--
Swyx [00:03:52]: actually very surprisingly fast.
Ethan [00:03:55]: One thing I'd say is thanks to my experience at NVIDIA, 'cause the first time, when we were building Cosmos together, we built it for about a year. So this is like the second time I do it. I roughly have an idea what to do. I'd say the most important thing is the talent. Everyone was very strong and clever, very close with each other, working towards a common goal. So that speeds things up a lot. You reduce the communication bandwidth among people, and everyone can work towards the same goal. Every day there's not that many meetings on the calendar, maybe a sync a day, and after that it's just all building. It was pretty fun at that time.
Ethan [00:04:47]: And another thing is that xAI has very strong foundations of data infra and model infra, and the support there can help the model develop a lot. When I look at training models, actually the most important thing is how many iterations you can do per day. The more iterations you can do, the faster you can train the model. So if you have very strong infra and you have a lot of compute, you can train these models in a very short period of time. That can give you a much larger buffer for errors, and it also gives you the opportunity to spot more bugs.
Iteration Speed, Compute, and Debugging Model Pipelines
Swyx [00:05:46]: What is an iteration? Is it like a few hundred steps or what are you--
Ethan [00:05:50]: Let's say just training the model, from acquiring new data and maybe designing new algorithms to training a new model, maybe at smaller scale or--
Swyx [00:06:01]: So cycle time for any hyperparam that you're searching.
Ethan [00:06:04]: Cycle time, and time to eval this model. 
Is this model better than my previous iteration?
Ethan [00:06:11]: So--
Swyx [00:06:11]: So it's like before you, someone had already set this up so that you can iterate very quickly.
Ethan [00:06:15]: I think the foundation there is extremely good for developing and researching models.
Ethan [00:06:23]: And often I find, and this is kind of boring, that a lot of the improvements do not come from new algorithms. They come from finding small bugs here and there in the data pipeline, in the model training pipeline. Those give the biggest boost to the model quality.
Vibhu [00:06:46]: It's interesting, right? So you say it's like small team, less communication bandwidth, but also a lot of quality is finding little bugs. It seems counterintuitive, right? You have a lot of people, you can iron out more of those, but it's interesting to see the other side, right?
Swyx [00:07:00]: I also wonder, do you try using LLMs to look for bugs? I don't know.
Ethan [00:07:05]: I remember at that time it was mid-2025, so the coding model wasn't quite there yet. I remember by December 2025, it was extremely good. Yeah, I've been using it since that time. It's helpful. Sometimes it produces code that is kind of difficult to maintain, even though the first time it built something extremely fast. But it gave me spaghetti code, thousands of lines that I couldn't maintain, and the LLM itself couldn't figure out what's wrong and how to improve on top of it. But now I find it much better. Yeah, another point I want to bring up here is that now coding models are much more efficient and can help us implement stuff much faster. Compute might become a bottleneck again because previously, if you wanted to train a new model, say you wanted to generate new synthetic data or write a new algorithm, it might take a few weeks. 
And during that period of time, you might not have experiments to run. But now you can build that thing within a few hours, then you can immediately train a model.
Ethan [00:08:24]: Now you have to have enough compute to try all of the ideas. So compute might be the bottleneck of iteration speed again.
Swyx [00:08:36]: Yeah, honestly, I think it's kind of a stressful job because you're like, “Well, I should be trying everything, and if I'm not, then I'm not doing my job well.”
Vibhu [00:08:48]: There's also the stress that you're eating thousands of GPUs per hour, which is very expensive, and compute can go to other researchers.
Swyx [00:08:56]: You got the daddy Elon to--
Vibhu [00:08:57]: You got daddy Elon.
Ethan [00:08:59]: It was--
Vibhu [00:09:00]: But there's still a finite amount of compute. You want to use it, you want to use it well, you want more of it.
Ethan [00:09:06]: That was quite stressful indeed. Yeah, I think one thing is, with coding models now, a lot of these jobs can be automated, which is much better. Second, it's a marathon, so you've got to maintain good health and a regular schedule.
Vibhu [00:09:28]: It's hard to hear that when you ship from zero to nothing in two months.
Swyx [00:09:32]: And I think obviously the culture at xAI is very famous; people work very hard. One thing I did want to dive into: in the notes that you sent ahead of time, you had specific comments about the cost of Video Gen training. Presumably this is on Colossus-1, right? The two hundred megawatt cluster. Whatever you want to share on that.
Vibhu [00:09:54]: I think there's three things we're talking about, right? So there's Video Gen, there's also the Image Gen model that you put out. Do you want to complete the, okay, so zero to one, you have a few months. 
Just what are the stages of creating an Image Gen model?
Swyx [00:10:06]: Oh, yeah, maybe I got distracted.
How Image and Video Models Are Trained: Synthetic Captions, Tokenizers, and VAEs
Vibhu [00:10:07]: Sorry. And then from there, there's Video Gen, there's Audio Gen. Would love to get into those next. But what is that first few months like? So small team, a lot of bugs, iterations, but what does it look like? Do we take something off the shelf? Do we just get data, compute? What's the few months like? How do you get to a state-of-the-art Image Gen model? How do you just start?
Ethan [00:10:28]: I cannot comment specifically on how xAI did it, but it's a quite standard process. I can draw some examples from Cosmos. So mainly, in building a video model, you actually need to build an image model first. And for building these two models, the data you need is a hundred percent synthetic pairs of language and image, or language and video. Because on the internet, actually, the videos don't naturally associate with text. So you can say, oh, on YouTube, you have the title and you have the description and the comments--
Swyx [00:11:11]: Title--
Ethan [00:11:11]: of a video, but usually they're not relevant to the video itself. Say maybe the video is a natural scene of mountains or something, and the title is, “I'm so happy today.”
Ethan [00:11:26]: So they have no correlation at all. So the first step is, you have to generate synthetic pairs of language with the videos. So you gather videos from the internet, and you use a VLM to caption the videos. On that part, here's a question: how do you get a VLM to begin with? So if there's no--
Swyx [00:11:55]: You, so you fuse the model, right? Like--
Ethan [00:11:57]: Say if no VLM exists, how do you generate the text in the beginning, right? 
It's impossible.
Swyx [00:12:04]: I see.
Ethan [00:12:05]: In the beginning, you ask humans to describe the video in as much detail as possible. For example, you ask them to describe everything: all objects, all characters, and all interactions and dialogues in the videos. So that's in the protocol of Cosmos labeling. The objective we gave to the labelers was that you have to describe the video in as much detail as possible, such that a blind person who hears the blob of text can reconstruct what the video is like in their head.
Swyx [00:12:43]: Video or image? You're talking about images.
Ethan [00:12:44]: Video or image, either one of them.
Vibhu [00:12:47]: This was pretty common when we went from CLIP and DALL-E, right?
Vibhu [00:12:51]: It's all training on really detailed captioning of images. So the same is applied to video, but instead--
Ethan [00:12:57]: Same applied--
Vibhu [00:12:57]: of using a multimodal model to pass in video images and write rich descriptions, you can also--
Swyx [00:13:04]: I think there's this traditional perspective of a supervised, or very highly human-curated, thing. I feel like there's an unlock with unsupervised, right? Where you have enough to bootstrap that you can just throw common corpus on it or whatever. Like unsupervised vision and language pairing, right? Where you just have interspersed image and text and it just learns. To me, that is the VLM breakthrough that is different from CLIP, different from the LM era.
Ethan [00:13:36]: It's interesting to see that you kind of need both kinds of data.
Ethan [00:13:41]: For example, for the--
Swyx [00:13:41]: You need it to bootstrap it up. Yeah--
Ethan [00:13:43]: for the generative model training, there's also usually a small percentage of unlabeled data. So the model is instructed to generate a video without any text instruction. That can also help the model generalize. 
So after this stage of generating synthetic pairs, one important common step is to train a compressor, or a tokenizer, of the images or videos. You can technically, theoretically, train image or video models on pure pixels, but the problem is that it's a lot of tokens. So one image, a thousand by a thousand, is one million tokens, one million pixels. It's impossible to train a transformer on that. So you need to train a tokenizer, which can go from image to latent space and from latent space back to image.
Swyx [00:14:45]: That's why we named the podcast.
Swyx [00:14:48]: But, basically, you're talking about vocabulary size.
Ethan [00:14:50]: So vocab.
Swyx [00:14:51]: And so, like, a million is impossible?
Ethan [00:14:54]: In generative models, the vocab is continuous. It's a continuous space. You can think about it like you map an image to a vector. It's a fixed-length vector. It's sixteen or forty-eight, something like that. And then you map that vector back to the image space. The mapping is patch-based. So say you have--
Ethan [00:15:22]: a sixteen by sixteen patch, and you map that patch of pixels into this latent space.
Swyx [00:15:29]: We've covered this--
Vibhu [00:15:30]: This is like the vision transformers--
Swyx [00:15:32]: VAEs,
Ethan [00:15:33]: VAEs.
Vibhu [00:15:34]: You basically compress your input, you do your generation, you're reasoning all that generation in a smaller dimension, and then you project back out.
Swyx [00:15:43]: VAE is a form of compression, but I think, for me, the patching thing is from ViT, right?
Ethan [00:15:48]: You can make those.
Swyx [00:15:49]: Literally, yeah, the paper is titled like sixteen by sixteen is all you need. Something like that. 
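As a back-of-the-envelope check on the numbers in this exchange, the sketch below compares one token per pixel with one token per 16-by-16 patch for a thousand-by-thousand image. The function names are hypothetical, the padding of edge patches is an assumption, and a real tokenizer (a trained VAE plus whatever patching the transformer applies) will produce different counts; this only illustrates why pixel-space training is impractical.

```python
# Token-count arithmetic for pixel-space vs. patch-based tokens.
# Illustrative only: the helper names are made up for this sketch, and the
# 16x16 patch size is simply the figure mentioned in the conversation.


def pixel_tokens(height: int, width: int) -> int:
    """One token per pixel."""
    return height * width


def patch_tokens(height: int, width: int, patch: int = 16) -> int:
    """One token per non-overlapping patch x patch tile.

    Edge tiles are assumed to be padded up to a full patch, so the tile
    counts along each axis are rounded up.
    """
    rows = -(-height // patch)  # ceiling division
    cols = -(-width // patch)
    return rows * cols


if __name__ == "__main__":
    h = w = 1000
    print("pixel tokens:", pixel_tokens(h, w))      # 1000000
    print("patch tokens:", patch_tokens(h, w, 16))  # 3969
```

Under these assumptions the sequence shrinks by a factor of roughly 250, and each remaining token is a short continuous vector rather than an entry in a discrete vocabulary.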
And then I think also, people make a lot of comparisons between this kind of patching and convolutions.
Swyx [00:16:02]: Which is, you're kind of reconstructing the old paradigm with the new.
Ethan [00:16:05]: Actually, in VAEs, there are both convolution networks and transformers. You can actually do both.
Ethan [00:16:14]: After this VAE, what you've got is latent space tokens and language tokens. So now the training of the diffusion transformer; usually generative models use diffusion transformers. It is actually quite standard. It's very similar to how you train a language transformer model. There's not that much difference. It's just visual tokens in, visual tokens out. The only difference is there's a denoising process. So you add random noise to the visual tokens, and then you train the model to remove that noise to generate the clean tokens. At inference, the model can iteratively remove noise starting from a hundred percent noise.
Swyx [00:17:12]: And then there's also, to speed things along on the tech tree of diffusion, there's CFG, and then there's also latent diffusion that's somewhere in there. I think, somewhere along the line, obviously, Stability and all these other guys pioneered a lot of this architecture. I don't know if you want to get into that or just do the video side. Up to you.
Bootstrapping Video from Image Models and Temporal Compression
Ethan [00:17:37]: After you train such an image model, the reason it's a foundation for video models is that image models are cheaper to train, and they have a much denser connection between language and text. So, sorry, language and images. For example, you train on a billion images, and there's a mapping from the text to the image. 
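The denoising process Ethan outlines a few turns earlier (add random noise to the visual tokens, train the model to recover the clean tokens, then at inference start from pure noise and remove it step by step) can be sketched in plain Python. This is a toy under stated assumptions, not how any production model is trained: the linear blend between clean tokens and noise is one simple noising schedule chosen here for illustration, `model` is a hypothetical callable that predicts clean tokens from noisy ones, and text conditioning is omitted.

```python
import random


def add_noise(clean, noise, t):
    """Blend clean tokens with noise: t=0 is fully clean, t=1 is pure noise.
    (A linear schedule is an assumption made for this sketch.)"""
    return [(1.0 - t) * c + t * n for c, n in zip(clean, noise)]


def mse(pred, target):
    """Mean squared error between two equal-length token lists."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def training_loss(model, clean, rng):
    """One training example: noise the tokens at a random level, then score
    how well the model recovers the clean tokens."""
    t = rng.uniform(0.0, 1.0)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    noisy = add_noise(clean, noise, t)
    return mse(model(noisy, t), clean)


def sample(model, length, steps, rng):
    """Inference: start from 100% noise and lower the noise level in `steps`
    equal decrements until it reaches zero."""
    x = [rng.gauss(0.0, 1.0) for _ in range(length)]
    for i in range(steps, 0, -1):
        t, t_next = i / steps, (i - 1) / steps
        clean_hat = model(x, t)
        # Noise implied by the current sample and the predicted clean tokens.
        noise_hat = [(xi - (1.0 - t) * ci) / t for xi, ci in zip(x, clean_hat)]
        # Re-blend at the lower noise level for the next step.
        x = add_noise(clean_hat, noise_hat, t_next)
    return x
```

With a perfect denoiser the training loss is zero and `sample` lands exactly on the clean tokens; a real model only approximates this, which is why sampling takes multiple steps.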
And the cost to train the same, a billion texts to a billion videos, is much more expensive because videos naturally have more tokens than images. For the diffusion models, their understanding of language comes purely from this mapping. So if you don't have enough mapping, if you only train on ten million videos or something, you might not see enough language tokens in your training, so your model does not understand human intention enough. So that's why you first train this image diffusion model, and then you bootstrap the video model from there.
Swyx [00:18:53]: One thing I did want to ask, because actually, I think you're the first video model person I've ever talked to, I think. We've talked to Luma and all those folks. There's all these tricks in video compression where basically, frame by frame, there's not that much difference, so actually you don't have to regenerate or save the whole frame, right? I think MP4 compression or something else like that.
Swyx [00:19:16]: Is it tempting to use that? Or as far as I can tell, everyone just treats it as, “No, we would just generate every frame.” Is that roughly the state of the art?
Ethan [00:19:27]: There are a few different approaches. Let's say first, you want to just directly use MP4 compression and use that as the tokens for the transformers to train on, right? People actually have tried that, but the main challenge is the latent space for the MP4 tokens was not very comprehensible for the models. It's extremely hard to train on that. And there's a--
Ethan [00:20:01]: So that's why they created VAEs, which create a more continuous latent space, so the models can understand that latent space and learn from it much more easily. Even within the VAEs, there are different difficulties of the latent space. 
So you can imagine the simplest, the most naive VAE: you have an image, and you just shuffle all of the pixels into a vector. So you don't need to train any VAEs, right? But that latent space is extremely hard for models to train on top of. That's why there is some debate on how you compress the tokens. So you mentioned you can compress frame by frame. Also, you can compress the temporal dimension.
Ethan [00:20:52]: The difference is if you compress the temporal dimension, you get a much higher compression rate. Because there's temporal redundancy between frames: this frame and the last frame are likely mostly similar, so there's only some small difference. For example, I think in the Wan 2.1 VAE, they have an eight by eight by four compression rate. So four temporal tokens are compressed into one token. That can save a lot of the context length. If you do it frame by frame, you have to do maybe eight by eight by one. Your context length will be four times larger. That being said, the benefit of per-frame compression, and we might come back to this later, is real-timeness and interactivity. 'Cause if you stream the output of the model frame by frame, the model can respond to any user request immediately. So if you have a temporal four times compression, then--
Swyx [00:22:06]: It might be laggy--
Ethan [00:22:07]: there's a lag there by nature.
Swyx [00:22:10]: So you're very pilled on this. Let's just go ahead and bring it up 'cause we have the visual prepared anyway. There's some frontier applications of real-time video gen. So Flipbook is one of the examples that went viral recently, right? What is Flipbook?
Real-Time Generative UI: Flipbook, Neural OS, and Diffusion Front Ends
Ethan [00:22:23]: Flipbook is kind of like a web browser. You can see it has the web browser UI on top. 
The difference is all of the UIs are generated by generative image model in real time, and anything here are fake. But you can, you can explore inside this wor- this imaginary world. Say like we-- here we have engineering the Great Pyramid. Like the model generates this for us to understand how it works, and if we want to navigate around and understand further, we can click on some of the, some of the description here, and the model will generate a new page, new subpage describing the details we want to know about.Swyx [00:23:14]: So it's basically kind of we're playing a video, but it's pausing for our next interaction, and then it just plays the next thing based on our interaction.Swyx [00:23:23]: Which is kind of cool.Vibhu [00:23:25]: and you kind of decide your story. So this was, how do you make a pyramid? levering technique seemed interesting, right? It shows how do you take Okay, I want to know what is thisSwyx [00:23:35]: The demo, the demo tweet had more animation between frames.Vibhu [00:23:38]: I think it's just skipping,Swyx [00:23:39]: Oh, it's just skipping a lot of frames.Ethan [00:23:40]: they also have a video modeVibhu [00:23:42]: It takes a lot. There's a lot of peopleEthan [00:23:42]: but, a lot of people are using it.Ethan [00:23:45]: So it's not available.Vibhu [00:23:46]: There's a live video stream. We can try,Swyx [00:23:50]: So this is an example of the kind of future that you see at the extreme. We don't-- we're obviously not in it today.Swyx [00:23:56]: But in a world where inference is completely free this is better than generating code and text?Ethan [00:24:02]: So this is, this is a final state of where Viva will be at for word model, I think. Imagine internet doesn't exist, and then you type in google.com. Like what should, what should, what should a model show you?the model can imagine something, and this is what the model imagine. And these web pages, they completely do not exist. 
So I think as inference costs come down, we are going to have generative UI for everything. Think about how a coding model works: it writes code for a web page, the code might be converted into binary, and the binary renders the pixels on the screen. In machine learning, every time we have some breakthrough, it's obviously something more end-to-end. So why don't we go from user instruction to the pixels directly? Generative UI will be user intention to pixels directly. Take email: let's say everyone has the same interface, but I want it slightly different. I want my email shown to me like TikTok, so I can swipe left and right through the emails. Or maybe you want something else. We can have completely different things. Or I'm looking at Instagram stories, and I don't like the Like button. I always misclick it. Generative UI resolves that. So it's going to be a revolutionary replacement of the interface. So in the future, we might have much more powerful
Ethan [00:25:50]: LLMs and coding models running behind the scenes. And in the front end, the diffusion model will actually be the front end that shows stuff to you. That's how I imagine it.
Swyx [00:26:02]: Diffusion front-end, deterministic back-end.
Swyx [00:26:04]: Something like that. I find that very expensive, but,
Vibhu [00:26:08]: I find it interesting you called LLMs writing code on the back end deterministic, but okay.
Swyx [00:26:14]: You write it once
Vibhu [00:26:15]: Compare it to
Swyx [00:26:16]: And then you execute.
Ethan [00:26:17]: If you think about the cost, let's say an H100 costs $1 per hour, and you use this eight hours a day, thirty days a month. Every month you're paying $240 for this, and you'll actually not wanna pay for that. That's even more expensive than Claude Code Max.
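The $240 figure follows directly from the numbers quoted in the conversation. A minimal sketch of that arithmetic is below; the function name is made up for illustration, and the $1 per hour rate is the round number used in the discussion, not a quoted price.

```python
def monthly_gpu_cost(rate_per_hour: float, hours_per_day: float, days_per_month: int) -> float:
    """Cost of keeping one rented GPU busy for a month of interactive use."""
    return rate_per_hour * hours_per_day * days_per_month


# Figures from the conversation: $1/hour, 8 hours/day, 30 days/month.
cost = monthly_gpu_cost(rate_per_hour=1.0, hours_per_day=8, days_per_month=30)
print(f"${cost:.0f} per month")  # prints "$240 per month"
```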
But if you think about the compute costs come down like two times every year, and I think the future will likely arrive like within few years.Vibhu [00:26:49]: It's everything, right? compute cost comes down, compute gets faster, model gets smarterEthan [00:26:54]: More efficientVibhu [00:26:54]: model gets smaller.Swyx [00:26:55]: I don't know why you say two times, ‘cause I think it's like 100 times. In language models, it is roughly one hundred to a thousand times every twelve to eighteen months, for the same given level of LMSys, ELO.Vibhu [00:27:08]: That's a net of everything, right? That's model performance alongside compute. So different than just compute costs come down. But, a very interesting future.Swyx [00:27:19]: So the web designers will have to shout out that accessibility is an issue, right? how do you deal with screen readers or whatever. But yes, this is higher bandwidth storytelling than anything you can possibly generate with code, right? So I think that's the rough idea.Ethan [00:27:34]: And I'd like to add a little bit that so human naturally have the maximum bandwidth when we are looking at things, look at videos, and we also have maximum output bandwidth when we are talking. So in the future, it might be something like we talk to AI models, and the AI model responds back with a generative UI. So that would be the maximum input and output bandwidth to interact with AI models before neural link happens.Vibhu [00:28:06]: And it's also very custom, right? Some people are very visual, some people are not as visual, right? They prefer the text. But the best thing about generative UI, right, it can also be text.Swyx [00:28:17]: There's another project that we wanted to highlight, which is the Neural OS. Kinda similar idea, but here you're literally operating, simulating an operating system with a video model.Swyx [00:28:27]: and you can play Doom, you can do Firefox. 
I find this like mildly less impressive, obviously, because it's an OS that I can run.Swyx [00:28:37]: But here everything is imagined.Vibhu [00:28:40]: I was, used to the Command+W to close the Firefox tab. It didn't crash. That's why I saidSwyx [00:28:45]: It's too immersive.Vibhu [00:28:46]: It's, it's too immersive for me.Swyx [00:28:47]: Too immersive.Vibhu [00:28:48]: I wanted to close the tab.Vibhu [00:28:49]: But yes, I can play generated diffusion.Swyx [00:28:51]: this is shockingly fast.Swyx [00:28:54]: Because I remember there was a demo about like maybe one to two years ago. Someone tried to do the first-person shooter with a image model. There was no consistency. It was very slow. But here it looks like realistically it's-- this is Doom.Vibhu [00:29:07]: I think there's two sides to that, right? There's okay, what is running a game? The heavy part of it is actually the game engine, all the lighting, all that stuff, the graphics. This is just kind of video, right? Like we've solved consistency. This is still, it looks like a few years old image generation. There's some temporal consistency, but it's, it's kind of just images stitched together as frame video. But it's a good visual representation to pi- to picture the future you wanna see, right? that's, that's what I see in these more so.Ethan [00:29:38]: This reminds me of how the video models gets better and better. So Neural OS is kinda if you just look at it feels like it's just a crappy version of the, like the Windows we could have, right? And, but the difference is, so the model, this model is overfitted on the existing operating systems. It can generate nothing different than that. But it's actually also similar to video models. So when we are training these video model, image model, we train them on internet. There's no imaginary supernatural stuff on the internet. But once we train this model, you can prompt the model to generate something supernatural that have never existed in the data set. 
So if you train your Neural OS or neural computer on the standard screen recordings on the entire internet. The model can imagine completely new interface to interact with the computer.Swyx [00:30:43]: This is one of those things that is magical to me. usually generalizing out of distribution is bad, but somehow we have learned some kind of internal world model that you say, this plus, but it looks like rainbows and butterflies, it'll do it and it will kind of make sense.Swyx [00:31:03]: So yeah, that's kind of cool. Yeah, I don't know if there's any comment more on there. I do, I do wanted to, I did wanted to touch a little bit more on the model architecture stuff, which I think you were getting. It's, really fascinating. We don't get a chance to talk about this enough. So one of the papers that we covered, we've covered every annual, segment anything release. and I don't know if you follow-- you're a computer vision guy, so youEthan [00:31:26]: I knowSwyx [00:31:27]: . So they did memory attention, which is kind of interesting. And I always think, anything where you can, across the temporal dimension, keep some consistency, I think it's, very fascinating, and I don't know if Basically, does that-- the CV side bleeding into video gen side, I think is underexplored, right? we talk about it for labeling, but actually you can borrow the architecture itself.Ethan [00:31:50]: There's, there's also complete different approaches, right? you brought up the term world model, so we went from video model to world model. There is diffusion, but there's also other approaches that people are doing. So maybe we get into those after as well,?Swyx [00:32:03]: He has a whole definition of world models and stuff. I feel like we threw a lot at you. 
Whatever you want to comment on.Why Video Models Are Expensive: Storage, I/O, and Training ScaleEthan [00:32:10]: I think one thing that we should actually comment back on is okay, so we were talking about the steps to train image gen to video model. One thing we don't see as much of is okay, you brought up the delta in training data, right? SoEthan [00:32:24]: you won't have as much a video model might not generalize, but what is the cost of training a large video model? So we know for LLMs roughly, okay, even like the poolside thing that came out today, right? It's a Gemma level model trained on roughly forty trillion tokens at this many H200s over this much time, right? You can see what is the exact cost of that. So how many GPU hours over how much H200 costs? So how do we do the back-end math of, same thing for video models, image models. How do you, how do you kind of break that down? I can share some back-envelope calculation. So surprisingly, video models is-- the cost is very-- is comparable to language models and obviously the largest scale is language model, maybe like a medium scale to language models. I said just storing the videos alone, it costs a lot. You can, you can maybe look up on AWS or something.Ethan [00:33:20]: You really, say if you have a billion videos and let's say, let's just say like each video, like five megabyte, then you need five petabyte to just store those videos. And also remember we talk about you use a VAE to compress the videos, and you also need to store, typically you need to store those continuous feature, in-- also in your storage. That's also comparable size with the videos themselves. So just storing these videos and the features is tens of petabytes alone. And,Swyx [00:33:58]: I just, I just looked up the calculation. 
Five petabytes on S3 Standard is one hundred K per month.Ethan [00:34:05]: AndSwyx [00:34:05]: It's comparableEthan [00:34:05]: and you needSwyx [00:34:06]: AndEthan [00:34:06]: And then like tens of petabytes, two hundred K. And even more expensive is you have the ingress and egress.Swyx [00:34:13]: Oh, yeah.Ethan [00:34:14]: Like you-- through the internet. You have to just to download those videos, I believe it's, it's more expensive on AWS than just storing those videos.Swyx [00:34:25]: Storing, yeah.Ethan [00:34:25]: And each training runs, you probably need to pull them once. If you train multiple times, it's, it's even more than that. So it's like just storing the network, those costs is just, it would be a few, a few millions per month to just storing everything, not to mention the GPU cost.Ethan [00:34:45]: AndSwyx [00:34:45]: my side tangent, the compute rental, like GPU rental is very efficient. There's one side, okay, you can be XAI and build your data center. Should we not just build our, storage compute as well? LikeEthan [00:34:57]: Of courseSwyx [00:34:57]: cloud cost compared to just,Ethan [00:34:59]: You save so muchSwyx [00:35:00]: store. Yeah, exactly.Swyx [00:35:01]: Especially with like egress and stuff. So.Ethan [00:35:04]: That's a good idea, but it also comes to-- there are some of its own challenges.Swyx [00:35:09]: Of course, of course.Ethan [00:35:10]: like people who build the GPU data centers, they might not expect this much, storage. And yeah, people build storage, typically they just build it somewhere with just CPUs.Swyx [00:35:23]: I just looked it up. Five-- AWS only charges for egress, not ingress. Tier five for five petabytes is two hundred and thirty K.Ethan [00:35:32]: Even more expensive than the storage.Swyx [00:35:34]: But storing is per month, right? You check in, then you cannot check out. so it's so cool. It's okay. 
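The storage and egress numbers above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the function names are hypothetical, the dollar totals are the ones quoted in the conversation rather than current AWS prices, and decimal units (1 PB = 1,000,000 GB) are assumed.

```python
def dataset_size_pb(num_videos: int, mb_per_video: float) -> float:
    """Raw dataset size in petabytes (decimal units: 1 PB = 1e9 MB)."""
    return num_videos * mb_per_video / 1e9


def implied_rate_per_gb(total_dollars: float, size_pb: float) -> float:
    """Per-gigabyte rate implied by a quoted total (1 PB = 1e6 GB)."""
    return total_dollars / (size_pb * 1e6)


size_pb = dataset_size_pb(num_videos=1_000_000_000, mb_per_video=5)
storage_rate = implied_rate_per_gb(100_000, size_pb)  # quoted: about $100K per month for 5 PB
egress_rate = implied_rate_per_gb(230_000, size_pb)   # quoted: about $230K to move 5 PB out

print(f"raw videos: {size_pb:.1f} PB")
print(f"implied storage rate: ${storage_rate:.3f} per GB-month")
print(f"implied egress rate: ${egress_rate:.3f} per GB")
```

Storing the VAE features alongside the raw videos, as described earlier, would roughly double the first number.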
So there's that side.
Ethan [00:35:41]: So the TLDR, my back-of-envelope math
Swyx [00:35:42]: Data is larger than you think. Yes.
Ethan [00:35:44]: my back-of-envelope math of GPU hours times GPU cost is very much missing storage.
Swyx [00:35:49]: You're basically also more IO bound than normal training.
Swyx [00:35:55]: Yes. 'Cause of data loading, so caching everything becomes super important.
Ethan [00:36:00]: So in Cosmos, we did a lot of optimizations to make it not IO bound. Speaking of actually training the model, the GPU cost: if you look up how big the open source video models are, I think LTX has nineteen B parameters. That's a dense model. People are also exploring MoEs, so it might be twenty B active and hundreds of B total. That's a similar size to medium-sized LLMs. And if you look at the number of tokens, we disclosed that in Cosmos. It's also tens of trillions of visual tokens. So putting this together, the cost of training these video models is actually comparable with LLMs. Not to mention, the infra is slightly different from LLMs, so it might be less efficient to train these models.
Inference Speedups: Step Distillation, Consistency Models, and GANs
Swyx [00:37:04]: Do you get the benefits of traditional diffusion speed-ups? For images, there's LCM, LoRAs for fine-tuning. There's a lot of stuff that's been
Ethan [00:37:15]: Flow matching.
Swyx [00:37:16]: There's flow matching. There's a lot of stuff that's been done. Is there some overlap that applies to diffusion on the inference side and stuff, or?
Ethan [00:37:23]: The inference side is a completely different story.
Ethan [00:37:28]: I think for the training side, it might be a little bit hard to reduce that cost. And for the inference side, the biggest gain is from the distillation of these models.
It's called step distillation, which is slightly different from knowledge distillation in LLMs. Typically, flow matching models need something like 100 steps. A diffusion model needs even more, like 1,000 steps, to generate a good image or video. Step distillation tries to learn to generate in fewer steps from the model itself. You use the full model to generate in 100 steps, and then you take a model that only generates in 10 steps and let that model learn from the perfect one.
Ethan [00:38:25]: Why this works
Swyx [00:38:27]: Strong to weak, seemingly.
Ethan [00:38:28]: It is. It's kind of
Swyx [00:38:29]: Distillation
Ethan [00:38:29]: kind of like strong to weak. From the modeling perspective, the strong model, the teacher model, is trying to model the images and videos of the internet, and that distribution is extremely complex. But the step-distilled model is just trying to learn from the teacher. The teacher is a model, and its size is fixed, so its distribution is much simpler than the whole internet. That's my intuition for why step distillation can work. So usually, when these models are served in production, they only run a few steps. In Cosmos, I believe we have four-step and eight-step versions. If you do a simpler task, like image-to-image translation, it can run in even fewer steps, like one step in Cosmos Transfer.
Swyx [00:39:22]: I think this is the same intuition that guides a lot of the consistency model work. I sent you a link for sCM. I don't know if you covered that. To me, that was actually one of the most impressive papers I've ever seen from OpenAI.
Swyx [00:39:34]: This is the unifying grand concept of consistency models. I don't know if you have any comments on this.
Ethan [00:39:41]: So there are a few different approaches,
Swyx [00:39:46]: Oh, yeah. Here it is.
Swyx [00:39:47]: Two steps versus twenty or 100 steps, whatever.
It's already done.
Ethan [00:39:52]: So there are a few different approaches, for example, the consistency model, and there are also... Actually, we shouldn't forget GANs. The GAN was the OG of
Swyx [00:40:05]: OG
Ethan [00:40:05]: step distillation, 'cause it trained with just one step to begin with. For example, there's distribution matching distillation, which uses a GAN as one of the losses for distillation. A GAN just tells you, “Hey, generate an image,” and then
Ethan [00:40:31]: it has a discriminator to tell whether this image is real or not. So the model just needs to learn one part of the distribution, not the full distribution. In normal training, the model is asked to reconstruct the ground truth image from the internet, which is extremely hard. When you're training a GAN, it's a one-step process. It's just, “Hey, you generate an image. Does this image look as real as an image from the internet?” That is a much simpler task. And people typically combine a lot of these approaches, like consistency models, distribution matching, and GANs, and that's how we get these few-step models.
Audio-Video Generation and Time Alignment
Swyx [00:41:21]: Then there's one step I wanted to add, which is audio and video.
Ethan [00:41:26]: So, Grok Imagine zero point nine, I believe it's the first audio-video joint model deployed at a large scale. So
Swyx [00:41:39]: And that was your first model?
Ethan [00:41:40]: That was Grok Imagine's first model. It's audio-video joint generation. I think the hard part is the modality alignment, 'cause before this joint model, we had text-to-video alignment. We have this correspondence between text and video. Typically, most of the VLMs understand images and videos. Video is rarer, and they mostly don't understand audio.
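The step distillation idea described above, a few-step student learning from a many-step teacher, can be illustrated with a deliberately tiny toy. Everything here is an assumption made for illustration: the one-dimensional `velocity` field, the Euler-integration teacher, and the affine one-step student are not how Cosmos or any production model is built. The student only ever sees the teacher's (noise, output) pairs, never the underlying data.

```python
import random


def velocity(x: float) -> float:
    """Toy velocity field: pushes every sample toward the value 2.0."""
    return 2.0 - x


def teacher_sample(x0: float, steps: int = 100) -> float:
    """Teacher: integrate dx/dt = velocity(x) from t=0 to t=1 in many small Euler steps."""
    h = 1.0 / steps
    x = x0
    for _ in range(steps):
        x += h * velocity(x)
    return x


def fit_one_step_student(noise: list[float], targets: list[float]) -> tuple[float, float]:
    """Student: least-squares fit of a single jump x1 = w * x0 + c to the teacher's outputs."""
    n = len(noise)
    mean_x = sum(noise) / n
    mean_y = sum(targets) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(noise, targets))
    var = sum((x - mean_x) ** 2 for x in noise)
    w = cov / var
    c = mean_y - w * mean_x
    return w, c


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
targets = [teacher_sample(x0) for x0 in noise]  # 100 teacher steps per sample
w, c = fit_one_step_student(noise, targets)

x0 = 0.5
print("teacher (100 steps):", teacher_sample(x0))
print("student (1 step):   ", w * x0 + c)
```

In this toy the student matches the teacher almost exactly because the teacher's map happens to be affine. Real teachers are far more complex, which is why practical few-step models combine distillation with other losses.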
And if you look at the audio generation on the LLM side, you can talk to them perfectly fine, but if you ask them to sing a song or something, it typically is not very good. Also, they don't have, they don't have music either. The hard part is thatUh, actually audio has two component. It has like a discrete component, a continuous component. The discrete component is like the language.Ethan [00:42:44]: So when we speak, it's just, someSwyx [00:42:47]: It's an ASR issue, yeah.Ethan [00:42:49]: It's, it's text token with some characteristics, I would say.Ethan [00:42:54]: But musicSwyx [00:42:56]: I think the speech guys would disagree with this.Swyx [00:42:57]: Like disfluencies and then,Vibhu [00:43:00]: There's tones you can get angry.Ethan [00:43:01]: Well, I say largely.Ethan [00:43:03]: the mu- but the music is completely different. It's, it's very continuous, and you cannot model them like discrete tokens in language models. this is like the hard part for models is, not to mention we have to align text, video, and audio together.Ethan [00:43:26]: SoVibhu [00:43:26]: How?Ethan [00:43:28]: So significant-- some significant challenges are like-- So first, like we talk about as the VLMs, they cannot understand most of them cannot understand audio.Ethan [00:43:39]: So you have to have some way to do the synthetic data generation for audio. You have to caption the model, and that involve, that involve synthetic data and human data effort a lot. And not just surprisingly, most of the LLMs are very bad at recognizing, like the beat, tone, and the details of the of music. They can, they can give some general prediction of which song is this, but it's very hard to describe the details of the music. like we mentioned in image generation, like you have to describe image as detailed as possible so that someone blind can reconstruct that. 
So here is like someoneVibhu [00:44:32]: DeafEthan [00:44:32]: someone deaf can reconstruct how the music sounds like without actually listening to it. Maybe you can think of it need to have the-- or they call the script.Vibhu [00:44:49]: Subtitles, yeah.Ethan [00:44:49]: You gotta have all the details of the music, and the dialogue.Vibhu [00:44:55]: So is the challenge there typically stuff like music and audio, or is it just Like is there a baseline? Okay, there's enough data where we can understand, narration, conversation, but there's nuances in audio that's where you hit all the data issues or is it just from stage zero, you just do it all right?Ethan [00:45:15]: So one important thing is like the alignment. So the model, the model has to know like the video and audio, the, uh-- it has to have a time-based alignment, like at which time step the video and the audio token correspond to each other. But we actually don't have this kind of alignment for most of the other modalities. If you think about like text and image, text and video, they are loosely aligned. So you can, you can have a description of what's going on in the video, but you don't have to exactly, You typically don't have exact description, oh, at, time step one second like what happened?Vibhu [00:46:02]: It's veryEthan [00:46:03]: At time step two second what happenedVibhu [00:46:03]: coarse. Yeah.Swyx [00:46:05]: So what was the ideal time step? You have to oblate it, and then it's like four seconds or something.Ethan [00:46:09]: So that comes down to how you design the model to, for the model to be aware of as a time, as a time modality. So the model is like a time aware. And that's something pretty unique if you think about LLMs. So if you ask LLM to complete a task, say they, uh-- you ask them and they will say, “Oh, this task will probably take twelve hours to complete,” and they come back in one hour. 
Say “I've already spent two days on this and I've exhausted everything.”Ethan [00:46:47]: So the LLMs them-themselves, they don't have a sense of time there.Vibhu [00:46:53]: I actually don't think that's just them not having a sense of time. I think it's somewhat based, right?Vibhu [00:46:58]: Like you tell someone, “Okay, go work on this feature. Go implement this,” there's a general understanding you would have of how long that would take without LLMs working at LLM speed, right? So you think back like two years ago, if I tell you to like build me like a new front end for latent space, have a search bar, have all this, you'll estimate that it'll take a few days, right?Vibhu [00:47:19]: So you tell an LLM, “Go build this.” It'll take me a few days. But I think it's somewhat grounded as opposed to them not having the best-- Not saying that they have a great understanding, but I think that example is like you can see where it comes from, right? You're trained on all over the text.Swyx [00:47:35]: They're, they're trying to estimate what a human would say.Vibhu [00:47:37]: because that's what the, that's what the data kind of represents. It's not themEthan [00:47:41]: It came from the corpus on the internet. People have a estimate of how much time.Vibhu [00:47:45]: And not even just in direct like training samples, right? Just your world understanding of tokens of how long stuff takes, right? Go read a book. It'll take you a while, right?Vibhu [00:47:56]: Even if you do nothing but read a book, it takes a few days. So yeah, LLM, I read it took me a few hours.Vibhu [00:48:01]: It'll take me a few hours to go through this research. But this is a tangent.Swyx [00:48:05]: Somewhat, yeah.Swyx [00:48:06]: This is a train of thought I haven't really expressed until now is, which is basically like a full world model must also be recursive, meaning that the participant in the world model must also be aware that they have a world model. 
Which is like this whole recursive thing down the line. But yes, and that the world model can be wrong and that they need to update it and blah. Yeah. We've argued this on the newsletter as well, that there need to be sort of recursive or adversarial world models.
World Models: Real-Time, Long-Horizon, Interactive Video
Vibhu [00:48:34]: Just to ask, how do you define world model?
Swyx [00:48:38]: Oh, yeah, let's go there.
Ethan [00:48:40]: So
Vibhu [00:48:40]: So just for context, we talked about video generation, and if you say there's a distinction between that and world models, what's your definition? How do you see the two?
Ethan [00:48:53]: So disclaimer, I'm not going to debate what a world model is. There are many definitions, so I'll just talk about my definition. I came from the multimodal domain, so I'm mainly talking from video. A world model is real-time, interactive, long-horizon video. So there are three parts. Let's talk about them one by one. First, interaction. We just looked at Flipbook and the neural computer. For the interaction part, a world model allows you to interact with it through keyboard, mouse, and maybe also voice. These are all modalities. You can interact with the model, and the model should respond reasonably. The second part is real time. Say you move your mouse, and the world model is generating a game: how fast can the game respond? So if you're a professional CS:GO player, they might say, oh, you have to respond- He's a beginner- within sub ten milliseconds or- Yeah- even less. So that's not most of the- No, sixty FPS. Let's go. Oh, three hundred FPS. Oh, five hundred FPS. Wait. Okay, yeah. I didn't do the math, but yeah, okay. Yeah, three hundred FPS, that's about three milliseconds. So you have to respond- Oh, s**t. Okay. Yeah
Ethan [00:50:29]: within a millisecond. Most of the video models cannot do that.
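The frame rates thrown around above translate into per-frame time budgets as follows. This is a minimal sketch; the function name is made up for illustration.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


for fps in (60, 300, 500):
    print(f"{fps} FPS -> {frame_budget_ms(fps):.2f} ms per frame")
# 60 FPS -> 16.67 ms per frame
# 300 FPS -> 3.33 ms per frame
# 500 FPS -> 2.00 ms per frame
```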
Yeah. And, but if you, say, if you have a video model that is, say, like a digital human, the response time might be more generous. Maybe typically, for real-time voice interaction, it's like two hundred millisecond. So that's, that's much more generous. But even two hundred millisecond is pretty, it is pretty tricky, ‘cause remember we mentionedEthan [00:51:01]: you have this, temporal compression coming from the VAE. So if you, if you don't compress the temporal dimension, your sequence length is going to explode. So if you want to have this real-time, real-timeness in your model, you have to do is one context problem. And the third part is long horizon, ‘cause we-- if you're not going to just play with, video games just, a few seconds, most video models only a few seconds. We're going to play with minutes, hours. The model have to be able to generate long-form content.Ethan [00:51:42]: So putting these three together, it's, real-time, long horizon interactive videos. I think the final state will be, for example, like a video, a video version of Playbook, where you can, you can interact with, a neural computer. You move your mouse, and you click on the generative interface, and it will reply to you through pixels- generating in real time. But getting there, it's, it's a very long way to get there. So one of the first step, at Grok Imagine, where I led a small world model team there, was to build video extension. So, video extension- it's the first step of interactivity. Yeah. It's, it's the first step. Yeah. So it's the first step- You have it here, video editing, yeah. Yeah. Yeah. So the first step is because, this unlocks long horizon videos. Typically, for most of the video generation models, you give it a prompt or an image as an initial frame. You generate video, that's it. That's just, one time, done. And some creators would try to, use the last frame as a first frame for the second video. 
Sometimes it works, but if you do it a few times, the quality decreases. And- It doesn't have that context- Yeah- over the full video, so the temporal- Yeah, exactly. Yeah, 'cause you only gave it the last frame, of course, right? Yeah. Exactly. And- it's actually a pretty fun hack. If you've seen like- Oh, no, he's saying something better. Yeah. And for example, I remember Veo 3 has like a second of context from the last video. It is slightly better than using the last frame, but it has a similar problem: the quality decreases. If you extend a few times to one minute, the video quality looks much worse than the first video. Another problem is that the model doesn't have long-range knowledge of what happened before. Say it generates some dialogue, two people speaking: their voices might change over time, especially if the one second of conditioning does not cover the previous context. So these are the core challenges. The Grok Imagine video extension has the historical context of all of the previously generated videos. It has the context of who is speaking, what objects have appeared, and everything, and it uses that to generate the next video. If we do this naively, you can imagine just putting all of the previous history's video tokens into the context. The context length will easily explode. Especially for video models, that can be a few million tokens of context length, I would imagine. Yes. Yeah.
Swyx [00:54:58]: Let's run with that.
Ethan [00:54:59]: For example, in Cosmos, I think just five seconds of video is something like fifty K or sixty K tokens. So if you do fifty seconds, that's five hundred K tokens. If you go longer than that, it easily explodes. This long-horizon problem was the first step we tried to solve toward a world model. It turns out people love video extension.
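To see how quickly the context grows, the quoted figure of roughly fifty to sixty thousand tokens per five seconds can be scaled to longer clips. This is a rough sketch only: it assumes token count grows linearly with duration, and the function name is made up for illustration.

```python
def video_tokens(seconds: float, tokens_per_5s: int) -> int:
    """Scale the quoted 'tokens per five seconds' figure linearly with clip length."""
    return round(seconds / 5 * tokens_per_5s)


for label, seconds in (("50 seconds", 50), ("1 minute", 60), ("1 hour", 3600)):
    low = video_tokens(seconds, 50_000)
    high = video_tokens(seconds, 60_000)
    print(f"{label}: {low:,} to {high:,} tokens")
```

Under that assumption, an hour of video lands in the tens of millions of tokens, which is why simply keeping every previous token in context does not scale.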
Like a lot, a lot of the creators love using video extension to create longer form videos. This is the part I liked that you have a, you have an intermediate step toward the final goal instead of just a straight shot to the final version very much.Swyx [00:55:48]: But I can see you have a strong vision of where we want to end up.Long Context, Redundancy, and Efficient Interactive VideoVibhu [00:55:51]: Does it seem like it's an efficiency issue? okay, we're at a few million tokens context,. If you draw the parallel to language models, we had very short context, two thousand, eight thousand, then, you scale it up one million, ten million. sure, there's effective context, but at the end of the day, it's just what's it worth? sure, there's a whole training data side. In video, it might be slightly easier ‘cause we have a hundred million token video, right? Just take a movie with the full context there. Like is this efficiency from an inference standpoint that like it's expensive, but we know how to solve it? Or like why is this not the approach? So like my broader point was on your second point of world models, you say it needs to be interactive and live, right? You should be able to play a game and see the interaction live. So one thing I see with research is a lot of what you actually serve is different than what you build, right? So we talked about distillation. You train big model, you distill it, you do quantization, speculative decoding. We do all this stuff to serve it efficiently. Should we not just have a solution, like a world model that can interact well, do inference optimization, serve it, distill it secondary, so make it real time after you solve it? So like a-- another parallel is say, continual learning, right? What we need is someone to solve it and show it works inefficiently. Give it a few years, people will make it efficient. Same thing with regular attention, right? It worked. 
Over a few years, people have developed different forms of attention, and we've scaled it to be efficient at long context, right? So kind of two things there, right? One is it seems like it works. You've scaled it. Can we not just scale it a lot more efficiently over time? Do we need a separate approach if this works? And same thing with interaction, right? If we can get it done, if we can find some way that it works, we can solve making it more efficient from an inference standpoint later.
Ethan [00:57:53]: That's actually a very good point. In videos, there's actually a lot of redundancy. We solve a lot of the pixel redundancy with the VAE, but there's more redundancy in long-range, long-horizon videos. Say a character appears in the first clip and then disappears, and only reappears at the end of the video: you probably don't need that context in the middle of the generation. You only need that character where you need it. So that's why I helped build another feature. It's reference video.
Vibhu [00:58:36]: Is it here?
Swyx [00:58:36]: Is it the same model release or a different one?
Ethan [00:58:39]: It's a different one.
Ethan [00:58:41]: You probably need to search on
Swyx [00:58:43]: I'll find it
Ethan [00:58:43]: X, reference to video.
Ethan [00:58:46]: So reference video allows you to upload up to seven images as conditioning and generate the video. It can be characters or objects or even scenes. Say I want to condition on Sean's selfie and holding a blade
Swyx [00:59:07]: We have a dog
Ethan [00:59:08]: or whatever.
Swyx [00:59:08]: We put the dog in the thing.
Ethan [00:59:09]: You can put them there, and the video model will generate the video from them and copy the context over. So that can solve a lot of the problems there, like the long-context problem. It doesn't need to have a very long context, but I feel like it's an intermediate solution.
The model--
Swyx [00:59:29]: It's cheating.
Ethan [00:59:30]: The model should be able to selectively know where it should draw the references from. Say I want to generate a movie, and I generate it autoregressively, ten seconds at a time or something. Now this character appears, and I can look back to where it first appeared and bring that back. Yeah, this one, I put the references. Yeah, that's Optimus, Einstein, myself, Annie.
Vibhu [01:00:02]: Oddly enough, I used Grok Search to find it, and it pulled your LinkedIn post. But yeah, we found it.
Ethan [01:00:08]: Interesting.
Vibhu [01:00:10]: But
xAI's Underrated Work, Culture, and Watermarking
Swyx [01:00:11]: This is a problem. This is not your fault, but xAI doesn't communicate all this work that you do very well, because they just have the model release and then that's it. But actually, these details are very good.
Swyx [01:00:22]: As far as I understand, everything you just described is state-of-the-art, like no one else has done it.
Vibhu [01:00:30]: A lot of-- yeah, I have a lot more
Swyx [01:00:32]: And then you just put out this blog post with the cookies. I'm like, this is not enough?
Swyx [01:00:37]: But obviously these are the high-level numbers that people want to know. But no, okay, so
Vibhu [01:00:42]: And I wonder, part of that is also that some labs don't share research into what happens. And if
Swyx [01:00:50]: No, but this is literally bragging about how good they are, right?
Swyx [01:00:54]: Like, why would you not say that you are capable of extending with full context? This is not a secret sauce. This is like, we did the work. Yeah, I don't know.
Ethan [01:01:02]: Different labs have slightly different communication styles.
Swyx [01:01:07]: Anyway, if anyone from xAI is listening, we are always happy to help you tell your story. Yeah, okay, so you did references, and I think the point you're making is that it is sort of a kludge, right?
This is-- you can do seven, but what about 100?
Swyx [01:01:23]: Right? Then you need a completely different thing.
Ethan [01:01:26]: So I think this is a mechanism to select the context from the history, and you might not put the entire history into the context. For example, there's a paper called FramePack, which has a heuristic: for the latest history, the last one second, you put in the entire history, and the history before that you compress, making the video smaller. So they follow this overall pattern where the maximum sequence length is fixed. The further you are from the current frame, the smaller the image. So this is just a heuristic. I think it can be more automatic: the model could be aware of which part of the history to select. This part of the research is actually being actively worked on by a lot of people. It's also quite interesting. I feel this part of long context is a little bit ahead of the LLM part.
Ethan [01:02:31]: For example, in LLMs, contexts keep growing. Let's say you call a tool and the tool call history is extremely long: that's still in context, and it keeps growing and growing. Even if you switch the topic to something else, the whole context is still there. There are some agentic harnesses that help you, say, prune the tool results. When you query a file, only show the top 200 lines or something. Those were very heuristic-driven.
Swyx [01:03:08]: For listeners, we did a write-up on the Claude Code leak, where there are eight different kinds of pruning, including pruning the tool results and all that.
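The fixed-budget idea behind the FramePack-style heuristic described above can be sketched in a few lines: if each older history frame gets a fraction of the token budget of the frame after it, the total context is bounded by a geometric series no matter how long the history grows. The halving schedule, the 1536-token full-frame budget, and the function name are assumptions chosen for illustration, not the paper's actual compression schedule, and the sketch simplifies "the last one second in full" to "the newest frame in full".

```python
def frame_token_budgets(num_history_frames: int,
                        full_tokens: int = 1536,
                        shrink: int = 2) -> list[int]:
    """Token budget per history frame, newest first.

    Assumption: the newest frame keeps full_tokens, and each older frame
    gets 1/shrink of the budget of the frame after it (integer division).
    """
    return [full_tokens // (shrink ** age) for age in range(num_history_frames)]


# The total stays bounded even as the history grows.
for n in (1, 4, 16, 1000):
    total = sum(frame_token_budgets(n))
    print(f"{n:>5} history frames -> {total} context tokens")
```

With `shrink=2` the total never reaches `2 * full_tokens`, so very old frames contribute almost nothing. That is the trade-off raised in the conversation: the schedule is fixed in advance rather than chosen by the model based on which parts of the history matter.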
So you can read up on that kind of thing.
Ethan [01:03:17]: I think one breakthrough in continual learning might be a way for the model to automatically manage its own context.
Swyx [01:03:27]: These are all heuristics, and they will be replaced by machine learning.
Ethan [01:03:30]: Interestingly
Vibhu [01:03:32]: The
Ethan [01:03:32]: the same thing is being researched in both LLMs and video models.
Vibhu [01:03:36]: The interesting thing is also that in the paper you showed, it's actually happening at the model level, right? Compared to language models: sure, we have base attention, but we'll do our own compression, we'll do our own pruning, which is separate from the model layer.
Vibhu [01:03:49]: Eventually, it all just boils in, hopefully.
Swyx [01:03:52]: I think this is a form of attention, but also, you know, sort of reasoning attention. I feel like that's different than normal attention.
Swyx [01:04:03]: Does that make sense?
Ethan [01:04:04]: It's different in the sense that attention, setting sparse attention aside,
Poetic Pictures: Camera Creates Captured Couplets. Parcel Panic: Digital Arrests and Deceptive Delivery Drama. Robo-Revival: T-Shirt Tech Taking Tailoring to the Top. Wolf Warning: Japan's Bear-Battling Bot Beasts Bite Back. Hedgehog Horizons: Satellites, Sensors and Saving Britain's Spiky Survivors. Tappy Tones: Boox Brings Bold Bluetooth Book Browsing. Hunting Hacks or High-Tech Hype? When Gadgets Game the Great Outdoors. Discordant Decisions: AI's Job-Judging Jumble. Sense and Surveillance: When Smart Security Cameras Go Spectacularly Silly.
Marketing a GSA schedule the right way can completely change how a small business grows inside federal contracting, especially when AI and cybersecurity are reshaping every RFI and RFP hitting the street. In this episode Zack Golden and the GovCon Giants team break down how active contractors, aspiring consultants, and entrepreneurs can position themselves around the hottest demand signals in the federal market right now, including AI governance, autonomous systems, and the massive Air Force Research Lab opportunity coming out of contract. Here is what you will learn in this episode: How to market a GSA schedule by expanding divisions, leaning into industry days, and using local site visits to win agency attention Why AI governance and AI security layers are showing up on nearly every federal RFI and RFP and how to position a client or your own company to capture that demand How to become a govcon consultant for AI and tech companies without burning out by managing one client to maintenance mode before signing the next How to build complementary teams by pairing AI companies with audit trail, robotics, and aerospace firms to deliver true turnkey solutions to federal buyers How to chase the $10 billion Air Force Research Lab AI IDIQ and other large vehicles by partnering with primes, IDIQ holders, and GSA schedule holders already inside the door EPISODE CHAPTERS: 0:00 - Meet Mindy your federal opportunity AI assistant 0:30 - Welcome to the Federal Help Center podcast 0:52 - Marketing a GSA schedule for small business 1:21 - Expanding divisions into cybersecurity and maintenance services 1:50 - Industry day strategy for GSA schedule holders 2:20 - Inside the OpenCube IQ AI tools and CRM 2:49 - Using AI agents for capability statements and FAR research 3:19 - Consulting opportunities with AI governance companies 3:48 - Why AI is the federal buzzword every agency wants 4:16 - Managing multiple consulting clients without burning out 4:44 - Building complementary teams 
around AI governance and audit trails 5:41 - Finding partner companies that complement your AI offering 6:37 - Chasing the $10 billion Air Force Research Lab AI IDIQ 7:07 - Robotics autonomous systems and secured AI document platforms 8:03 - Adding AI offerings to existing GSA schedules and IDIQs Mindy gives you the federal opportunities, agency signals, recompete intel, and pursuit briefs that tell you not just what contracts exist, but which ones to chase and how to win them. Sign up for free Daily Alerts and get opportunities delivered to your inbox before the day starts.
Thanks for considering supporting on Patreon. PATREON - patreon.com/nodumbquestions NDQ EMAIL LIST - https://www.nodumbquestions.fm/email-list STUFF IN THIS EPISODE: Husqvarna Robotic Lawn Mowers Differential GPS Sandy Beach (Break-neck Beach), Hawaii Ben Schmanke - AuthenTech VSLAM RTK Positioning Mammotion DJI CONNECT WITH NO DUMB QUESTIONS: Thank you for considering supporting No Dumb Questions on Patreon. Discuss this episode here NDQ Subreddit Our podcast YouTube channel Our website is nodumbquestions.fm No Dumb Questions Twitter Matt's Twitter Destin's Twitter SUBSCRIBE LINKS: Subscribe on iTunes Subscribe on Android OUR YOUTUBE CHANNELS ARE ALSO FUN: Matt's YouTube Channel (The Ten Minute Bible Hour) Destin's YouTube Channel (Smarter Every Day)
MacroVoices Erik Townsend & Patrick Ceresna welcome, Dr. Pippa Malmgren & Jim Bianco. They'll discuss whether AI and Robotics are going to take our jobs, Nuclear Fusion, Disappearing Scientists, and much more. https://bit.ly/4tXAlJr
Preview for Later Today: Doug Messier describes NASA's innovative mission using robotic hoppers to survey the lunar South Pole, seeking water and potential sites for a future moon base through high-resolution imaging in the moon's environment.
In 1990, Marc Rowan walked out of Drexel with his belongings in a cardboard box. Within a year, Apollo was managing $6 billion. David Haber speaks with Marc Rowan, Cofounder, CEO, and Chair of Apollo Global Management, about building Apollo into one of the world's largest alternative asset managers and how private capital is reshaping the global economy. The conversation covers the rise of private credit, and why Rowan believes private markets are becoming increasingly central to financing the real economy. They also discuss AI, data centers, robotics, and the growing intersection between venture-backed technology companies and large-scale private financing. Along the way, they reflect on leadership, institutional culture, and why enduring organizations must adapt rather than protect the status quo. Resources: Follow David Haber on X: https://x.com/dhaber Learn more about Apollo Global Management: https://www.apollo.com Stay Updated: Find a16z on YouTube. Find a16z on X. Find a16z on LinkedIn. Listen to the a16z Show on Spotify. Listen to the a16z Show on Apple Podcasts. Follow our host: https://twitter.com/eriktorenberg Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Join Jim and Greg for this special 3 Martini Lunch as they look at some important stories that did not rise to martini status in recent weeks but deserve attention. Today, Jim looks at three different stories that leave him optimistic about young Americans. Greg also spotlights a story involving young people but devotes his other two choices to horrible actions by Democrats. After Greg shares some thoughts about Memorial Day, they shift to their discussions for the day. Jim cheers data showing very positive results from schools banning cell phone use by students for all or part of the school day. And while usage goes down, we're seeing significant progress in other areas. Meanwhile, Greg sounds the alarm on new research showing what impact hours of screen time have on young brains. Next, Jim applauds the low teen birth rate, which has plummeted over the past few decades. Greg focuses on the recent Supreme Court decision on racial gerrymandering to point out how Democrats claim to be defending norms and institutions, but want to abolish or radically alter institutions that don't do what they want them to do. Finally, Jim takes us inside his very positive experience with high school robotics competitions and the great lessons those students learn. Greg talks about a recent Justice Department report explaining just how much the Biden administration discriminated against pro-life Christians and other conservatives of faith. Please visit our great sponsors: OneSkin: For a limited time, try OneSkin with 15% off using code 3ML at https://oneskin.co/3ML Pocket Hose: For a limited time, get two free gifts (a 360° rotating pocket pivot and a thumb drive nozzle) when you buy the Pocket Hose Ballistic; just text MARTINI to 64000, message and data rates may apply. New episodes every weekday.
Douglas Messier discusses a new partnership to develop asteroid mining technology. Key innovations like optical mining and solar thermal engines could eventually allow for large-scale robotic construction in space. (16/16)