00:00-15:00: Hurricanes win Stanley Cup. ML breaks down how it happened. Thanks to Marz Motors and Batavia Downs Gaming. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.
Had Ada Palmer back on – this time to talk about Machiavelli, perhaps the most misunderstood thinker of all time.
Machiavelli cut his teeth as a high-level diplomat for Florence, a position from which he got to closely observe the most important rulers in Europe at the time, including the ones who were on the path to destroying his dearly beloved Florence.
In 1513 the Medici retook control of Florence and, wrongly suspecting Machiavelli of participating in a coup attempt, fired, tortured, and exiled him.
Machiavelli could have left exile and worked for any number of different principalities that would have been eager to make use of his talents.
Instead, he decided to rot in the countryside and compile his career's lessons about power, politics, and human nature into a book he dedicated to the very man whose new regime had tortured and exiled him, Lorenzo di Piero de' Medici.
But at least the Medici were in a position to use his insights to defend Florence. Machiavelli the patriot did not want any other hands to touch these books, because those hands, armed further with these lessons, might pose an existential danger to Florence.
The closest modern analogy, at least as Machiavelli would have seen it, would be Szilard's letter warning FDR about the possibility of a nuclear fission bomb.
What were those insights? And how were they inspired by Machiavelli's dangerous diplomatic missions all across Europe, and his extensive reading of antiquity? Watch this episode with Ada Palmer to find out!
By the way, Ada is launching a new podcast which I'm very excited about. The first season will be about Machiavelli - a perfect way to dive deeper into the topics we discussed in this episode.
Subscribe at Beforecast's website to be notified of the first episode, subscribe on YouTube, follow her on Patreon, and if you want even more Ada, check out her FixTheNews Podcast episode, and check out her books and more.
Watch on YouTube; read the transcript.
Sponsors
* Cursor recently saved one of my podcast recordings. When a video file from a shoot came out corrupted, I pointed Cursor at it: it recovered the footage on its own, tracking down the right reference file from the file's metadata and realigning the out-of-sync audio. My whole team now uses Cursor for everyday tasks, not just coding. Get started at cursor.com/dwarkesh
* Jane Street's hiring process has been going viral on Twitter lately. The memes are pretty funny, but I wanted to see what their interviews were actually like. So I had Ricson, one of Jane Street's ML researchers, walk me through a retired puzzle: he gave me an image dataset where 50% of the files had been corrupted – I had to figure out how to recover them. If you're interested in these sorts of puzzles, you can find Jane Street's open roles at janestreet.com/dwarkesh
* Crusoe is turning the AI datacenter buildout into an industrial process. At their massive Colorado factory, they assemble Spark units, modular datacenters with power, cooling, and fire suppression built in. They also manufacture specific components in-house to skip the longest lead times. Crusoe has experience running these Spark units on a range of energy sources, including solar and used EV batteries, ensuring they don't get bottlenecked by grid availability. Learn more at crusoe.ai/dwarkesh
Timestamps
(00:00:00) – How Florence bargained with Cesare Borgia for survival
(00:15:08) – Machiavelli's analytical innovations
(00:23:58) – Why popes became warlords
(00:36:13) – Why the common people demanded nepotism
(00:47:57) – Cesare Borgia brought terror to rulers and justice to the people
(00:57:55) – Art as a proxy for war
(01:06:41) – Florence, a city famous in hell
(01:15:57) – The Prince was a job application to Machiavelli's torturers
(01:41:39) – During the Renaissance, original ideas had to be couched in antiquity
(01:50:44) – Why copyright began with the Inquisition
(02:02:12) – Machiavelli wasn't Machiavellian
Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe
Aisha Francis has built a career as a performer, choreographer, teacher, and one of the dance industry's most respected heels educators. In this conversation, she shares the unexpected story of how she ended up helping Beyoncé learn to dance in heels, along with the lessons she's learned from decades of working in the industry. We discuss confidence as a trainable skill, the physical and psychological foundations of performance, what dancers often misunderstand about building a career, and why training with intention matters. Aisha also opens up about burnout, losing her love for dance, finding it again through teaching, and the realities of navigating a constantly changing industry. From unforgettable stories on stage to practical insights on artistry, professionalism, and longevity, this episode offers a candid look at what it takes to grow not only as a dancer, but as a performer and person. Follow Galit: Instagram - https://www.instagram.com/gogalit Website - https://www.gogalit.com/ Fit From Home - https://galit-s-school-0397.thinkific.com/courses/fit-from-home You can connect with Aisha on Instagram https://www.instagram.com/iamaishafrancis and through her website https://aishafrancis.com/ Listen to DanceSpeak on Apple Podcasts and Spotify.
In this episode of Alexa's Input (AI), I sit down with David Aronchick, co-founder and CEO of Expanso and former product lead for Kubernetes at Google.
Data is growing everywhere outside your data center. Solar panels at remote sites across a country. Security cameras at retail stores. IoT sensors across factory floors. And moving that data to the cloud for processing? It's expensive, slow, and often restricted by compliance.
David is an expert when it comes to solving distribution problems. He led Kubernetes product at Google, co-founded Kubeflow to bring ML to production, and now he's building Expanso to tackle a difficult constraint: when your data can't move, how do you process it where it lives?
We discuss:
- The need for distributed data orchestration
- Upstream data control: filtering and transforming at the source
- Three forces making edge computing inevitable (physics, regulations, economics)
- How to build successful open source infrastructure projects
- Customer discovery and finding real pain points
- His transition from Protocol Labs to founding Expanso
- ETL pipelines: moving the first four steps closer to the data
- Context loss and lineage in distributed systems
- Processing 400,000 signals per second with 150MB agents
- AI observability: attaching source metadata to training data
- Running ML pipelines at the edge
- Real-world deployment challenges (bandwidth, regulations, cost)
Expanso is rethinking how we process data in an AI-native world, moving compute to data instead of data to compute. If you want to understand where distributed systems and edge computing are heading, this is a deep dive into the infrastructure layer beneath modern AI applications.
General Podcast Links
Watch: https://www.youtube.com/@alexa_griffith
Read: https://alexasinput.substack.com/
Listen: https://creators.spotify.com/pod/profile/alexagriffith/
More: https://linktr.ee/alexagriffith
Learn more about the host at
Website: https://alexagriffith.com/
LinkedIn: https://www.linkedin.com/in/alexa-griffith/
Find out more about the guest at
LinkedIn: https://www.linkedin.com/in/aronchick/
Twitter/X: https://x.com/aronchick
GitHub: https://github.com/aronchick
Expanso Website: https://expanso.io/
Resources
Expanso Website: https://expanso.io/
Kubernetes: https://kubernetes.io/
Kubeflow: https://www.kubeflow.org/
CNCF (Cloud Native Computing Foundation): https://www.cncf.io/
Protocol Labs: https://protocol.ai/
Keywords
David Aronchick, Expanso, Kubernetes, Kubeflow, distributed systems, edge computing, data pipelines, ETL, upstream data control, Google Kubernetes Engine, open source, CNCF, observability, log processing, data lineage, provenance, schema enforcement, IoT, edge AI, distributed data, machine learning infrastructure, Protocol Labs, IPFS, Filecoin, data governance, compliance, GDPR, bandwidth optimization, data aggregation, AI infrastructure, multi-cloud, hybrid cloud, real-time processing
nFactorial Intelligence - a weekly roundup of news from the world of startups and AI. This week we cover: the most powerful AI model in history, Claude Fable 5. Jeff Bezos's angel portfolio and Elon Musk's mental models. Fable 5 rewrote 50 million lines of Stripe's code in a day and refuses to help if it considers your ML research too interesting. Zepto and Bending Spoons are going public, while China's Unitree is preparing to take over the humanoid robot market. YC turned down Mercor and lost more than $1 billion; the war between open and closed models. Silicon Valley executives play Mafia on camera.
Recommendations:
- The next nFactorial Club meetup is on June 21 at 9:00, online - https://hi.nfactorial.club/
- nFactorial Teens: a 2-week summer vibe-coding camp for schoolchildren in Almaty (ages 11-16). Goal: build your own original web app or web game - https://courses.nfactorial.school/teens
We start our series of 8 episodes focused on seiyuu who have improved their singing skills to a level that they are unrecognizable now (yes, because they now sing amazingly!).
In this episode we cover Ryohei Kimura, one of the most popular seiyuu in the past decade and one of the most sought after seiyuu when it comes to portraying unique and wacky characters in 2D music projects.
Music suggestions:
- Lil Happy "Playlist"
- NSFW "Crazy Nutrient"
- TRIGNAL (Ryohei Kimura) "Siren Fairlady"
Thanks to M L for inspiring this series of episodes!
Hey folks, Alex here, and welcome to a BIG MODEL week! We finally got Mythos (well almost)! Let me catch you up! This week started with WWDC26 from Apple, and Max Weinbach, who was in the room at Apple Park and actually has access to some of the new features including an all new SIRI AI, joined us to break down what could be the most used AI in the world very soon. At first I was skeptical, but he convinced me that the new Siri is actually good! Then, we saw the ultimate model drop: Anthropic finally shipped Mythos (X, my system card thread, benchmarks). Same weights, two names: Mythos 5 is the unrestricted version that only Project Glasswing partners get, Fable 5 is what the rest of us get, wrapped in the heaviest guardrails I've ever seen ship on a frontier model. It's state of the art on nearly every benchmark. The model that was “too dangerous to release” is now... well, released, but with the heaviest guardrails we've seen. More on this later. Peter Gostev from Arena.ai joined us to break down the new model. Last but definitely not least, Google released a real-time translation model that our friend Thor Schaeff from DeepMind demoed live, while we all spoke in different languages and it translated us in REAL TIME. It was really cool, definitely check that out. There are quite a few more things: Loop Engineering Alpha, Swyx came by to talk about FrontierCode, OpenAI confirmed our suspicions that the anti-datacenter social media posts could be a concerted effort by groups linked to the Chinese government, and much more. Let's dive in! ThursdAI - Let me catch you up, every week!
In this special Onward and Upward segment episode of Mission Matters, Adam Torres interviews ML Bruin, Author of The Noah Series of Books. ML shares insights into his creative process, discusses the evolution of The Noah Series, and reflects on the lessons learned from writing, publishing, and connecting with young readers through positive storytelling. Follow Adam on Instagram at https://www.instagram.com/askadamtorres/ for up to date information on book releases and tour schedule. Apply to be a guest on our podcast: https://missionmatters.lpages.co/podcastguest/ Visit our website: https://missionmatters.com/ More FREE content from Mission Matters here: https://linktr.ee/missionmattersmedia Learn more about your ad choices. Visit podcastchoices.com/adchoices
STRONGER BONES LIFESTYLE: REVERSING THE COURSE OF OSTEOPOROSIS NATURALLY
In this eye-opening conversation, Debi Robinson and Dr. John Neustadt expose a fundamental flaw in how we approach bone health: we've been focusing on bone density instead of actual fracture risk.
Drawing from 20+ years of research and clinical practice, Dr. Neustadt reveals that only four nutrients have been proven in clinical trials to reduce fractures—calcium, vitamin D, vitamin K2 (MK-4 specifically), and magnesium. He challenges the one-size-fits-all approach to supplementation and explains why popular supplements like MK-7 and strontium fall short of their marketing claims.
The episode deep-dives into why bone density tests are poor predictors of fracture risk, how supplement companies mislead consumers with marketing claims that don't align with clinical data, and the critical role of gut health, sleep, hormones, and lifestyle in fracture prevention.
Most importantly, Debi and Dr. Neustadt provide actionable, evidence-based strategies that women can implement immediately to actually protect their bones—without fear-based messaging.
WHAT YOU'LL LEARN
✓ Why bone density scores are not reliable predictors of fracture risk
✓ The 4 nutrients with clinical trial evidence for fracture reduction (and the doses that actually work)
✓ Why MK-7 vitamin K2 doesn't improve bone strength (and why MK-4 does)
✓ How to assess YOUR individual calcium needs (most women are over-supplementing)
✓ The vitamin D target range for optimal fracture protection
✓ Why strontium supplements mislead consumers (and the hidden risks)
✓ The role of melatonin receptors in bone health and sleep deprivation's link to fractures
✓ How gut health directly impacts bone strength
✓ The importance of serotonin, melatonin, and the gut-bone axis
✓ HRT and testosterone replacement as part of a comprehensive bone health strategy
✓ How to evaluate supplement companies and ensure they have fracture outcome data
✓ Red flags when choosing bone health supplements
✓ The gap between conventional medicine's approach (DEXA + medication) and integrative bone health
✓ Why doctors are confused about osteoporosis (and how to advocate for yourself)
ACTION STEPS
Get your vitamin D tested. Aim for 30–44 ng/mL for optimal fracture protection (different from immune health recommendations).
Assess your dietary calcium intake before adding supplements. If you're eating well, you may only need 400 mg as a supplement, not the standard 1,200 mg recommendation.
Switch MK-7 supplements to MK-4. If you're taking a vitamin K2 supplement, verify it's MK-4 at 45 mg per day in divided doses. MK-7 doesn't reduce fractures.
Check your supplement labels for strontium. If it's there, especially if the company markets it as "proven to improve bone density," consider switching to a formula without it.
Prioritize gut health. Work with a practitioner to run stool tests if you have bloating, constipation, postnasal drip, or other GI symptoms. Gut inflammation accelerates bone loss.
Track your sleep quality. Sleep deprivation is linked to 17% of fractures. If you're sleeping less than 6 hours nightly, prioritize this.
Ask supplement companies the right questions:
"Do you have fracture outcome data from clinical trials?"
"Will you provide a certificate of analysis showing purity and potency?"
"What guarantee do you offer?"
Evaluate your medications. Check with your doctor: Are any of your current prescriptions contributing to bone loss? (SSRIs, certain blood pressure meds, proton pump inhibitors, corticosteroids, etc.)
Consider HRT or bioidentical hormone replacement, especially if you're post-menopausal. Research shows a 40% reduction in osteoporotic fracture risk with appropriate hormone therapy.
Build lifestyle foundations: Prioritize whole-food nutrition, strength training, stress management, and community connection. Oxytocin (released through physical contact) supports bone health.
RESOURCES & LINKS
Dr. John Neustadt's Website: nbihealth.com and book Fracture-Proof Your Bones: A Comprehensive Guide to Osteoporosis
Debi's website: https://debirobinson.com
Healthy Gut Healthy Bones Program: https://debirobinson.com/healthy-gut-healthy-bones-program-v2/
Join the Community: https://debirobinson.com/the-stronger-bones-lifestyle-community/
Yoga Therapy MasterClass: https://debirobinson.com/yoga-therapy-for-bones-health-mc/
28-Day Stronger Bones Method: https://debirobinson.com/28-day-stronger-bonesmorning-method/
Instagram: https://www.instagram.com/debirobinsonwellness/
Youtube Channel: https://www.youtube.com/@debirobinsonwellness/
DEBI'S TAKEAWAY
"Fracture-proofing your bones isn't about chasing a higher DEXA score. It's about building the internal biochemical balance that actually prevents fractures. You have the research, you have the tools, and you have the power to take control of your bone health naturally. Use that power."
The guest of our fifth episode is Mikhail Gulshin, head of protection technologies at T-Mobile (Т-Мобайл).
What are we chatting about?
We discuss what calls from scammers look like and how Neuroshield (Нейрощит) counters them. We find out how ML models are trained for Neuroshield and why the team vouches for the quality of its scammer detection. We learn what customers usually complain about and explain how another T-Bank product, Fraud Roulette (Фродрулетка), addresses their needs.
Timecodes:
00:36 What are we chatting about?
3:58 Our guest: Mikhail Gulshin
4:44 How Misha got into anti-fraud
5:57 How Neuroshield came about
7:07 How Neuroshield is connected to Fraud Roulette
7:51 What Fraud Roulette is
8:47 Who scammers pretend to be
9:20 Neuroshield under the hood
10:30 How to identify scammers without mistakes
12:44 What Neuroshield's work looks like to a subscriber
15:56 Popular scam schemes
18:27 The Neuroshield team
19:02 How ML models are tested
20:19 Variability in ML model responses
20:48 How the ML dataset is collected
23:28 Feedback from users
24:13 Non-functional parameters of the models
25:27 Can Neuroshield be fooled?
26:39 What didn't work right away
28:40 What customers complain about
29:22 Is offense the best defense?
30:10 Fraud is an industry
31:34 How ML models are drilled
32:40 QA teams in Neuroshield
34:51 Integrations with other products
36:48 How to persuade a person over the phone
39:01 Blitz
40:25 The future of Neuroshield
Links:
T-Bank QA team's Telegram channel: https://t.me/+zv2oER8b4Lg2MDli
More about development and technology at T-Bank: https://l.tbank.ru/kod_zheltyi
About team life and the latest IT job openings: https://l.tbank.ru/t_crew
This morning we caught our reporter Zuzana Boučková on her way to Špindlerův Mlýn. Now Zuzana is standing right at the exposed bed of the Labská reservoir. And in the morning she was dealing with rain and rubber boots.
00:00-15:00: ML loves the way Milwaukee plays, loves the roster and says they are easy to cheer for. Thanks to CH Insurance and Marz Motors.
At this year's PEGS Boston, industry experts gathered on a panel to explore how AI and machine learning are deployed in biologics R&D today. Moderated by Peter M. Tessier, Ph.D., Albert M. Mattocks professor of pharmaceutical sciences and chemical engineering at University of Michigan, the panel consisted of Andrew Buchanan, Ph.D., head of discovery at a stealth-mode biotech company; Norbert Furtmann, Ph.D., head of biologics AI and design of large molecules research at Sanofi; Konrad S. Krawczyk, Ph.D., founder and CSO at NaturalAntibody SA; Andrew C.R. Martin, Ph.D., emeritus professor of bioinformatics and computational biology at University College London; Melody Shahsavarian, Ph.D., senior director of data strategy and digital transformation of biotherapeutics discovery research at Eli Lilly & Company; and Bernhardt L. Trout, Ph.D., professor of chemical engineering at Massachusetts Institute of Technology.
Links from this episode: Pharmaceutical Sciences & Chemical Engineering, University of Michigan; University of Michigan; Sanofi; NaturalAntibody SA; Bioinformatics, UCL Biosciences; Computational Biology, UCL; University College London; Eli Lilly & Company
Hanácká mozeka Litovel celebrates its 25th anniversary this year. Leading it from the cimbalom is Iveta Navrátilová, who, besides directing the band, has for years looked after the Mládí children's choir in Litovel and the mixed choir in Příkazy.
Privileged Access Management has outgrown the vault. In this episode, Matthias sits down with lead analyst Alejandro Leal, author of KuppingerCole's newly released PAM Leadership Compass, to explore how the definition of privilege itself has changed, what NHIs and agentic AI mean for PAM, and why deployment sovereignty is now a boardroom conversation.
Key Topics:
✅ How the definition of "privilege" has shifted from admin accounts to dynamic runtime identity capabilities
✅ PAM convergence with IGA, CIEM, ITDR, SIEM, and SOAR — the end of the standalone PAM product
✅ Non-Human Identities (NHIs) and agentic AI: the silent accumulation of machine privilege
✅ Just-in-time access: the gap between concept and operational reality
✅ Deployment sovereignty: who controls the keys to the kingdom — SaaS, on-prem, or hybrid?
✅ AI and ML in PAM: separating genuine innovation from marketing inflation
"Most enterprises can tell you the number of employees they have — very few can tell you the number of machine identities." If that sounds familiar, this episode is for you.
Welcome back to our weekend Cabral HouseCall shows! This is where we answer our community's wellness, weight loss, and anti-aging questions to help people get back on track! Check out today's questions:
Kay: Hi Dr. Cabral, Thanks for your very informative and interesting podcasts. How would you advise a post-menopausal 60 y.o. family member if they tested low in ferritin (39.4 ng/mL)? I've read that this biomarker shows how much energy your body's cells have and low levels would result in symptoms like fatigue, low energy/easily tired and excessive hair shedding. This family member suffers from these symptoms. Other biomarkers revealed low AM cortisol and low LDL-C/ApoB ratio (1.1) and low basal metabolic rate of 1143 kcals/day. Although her TSH tested normal (1.3 uIU/mL), she's been on levothyroxine 75 mcg to manage hypothyroid. Her high-sensitivity CRP was not optimal at 1.49 mg/L and she has a family history of heart disease. What would you recommend for this family member? Thanks
Earl: I am currently on 20 mg of lisinopril daily. Also, my GFR is 62. Would either of these be a concern when considering creatine?
Alesi: Dr. Cabral, can you please explain Alpha-gal syndrome? Why does it happen, how to confirm it by testing and how would you approach it? Is it treatable? Thank you
Peter: Hello, Dr. Cabral. I am an integrative health practitioner and would like to thank you for helping me understand the underlying causes of human imbalances. There is one thing that makes no sense to me though…regarding IgG testing, why would you recommend to test every year? Why doesn't it suffice to test once and simply stay away from intolerant food items? Why would these intolerances change? Also, in my country there are IgG4 vs IgG1-3 testing options, what are the differences? Thank you very much for your time and knowledge you share with us.
Dipali: Hi I want to start 7 days detox plan, I already did your minerals and heavy metal test, I got my results back.
My question is I am taking berberine, oregano oil and magnesium citrate (I am prediabetic, my HbA1c is 6.2). Do I need to stop before starting the detox method? Thanks
Thank you for tuning into today's Cabral HouseCall and be sure to check back tomorrow where we answer more of our community's questions! - - - Show Notes and Resources: StephenCabral.com/3774 - - - Get a FREE Copy of Dr. Cabral's Book: The Rain Barrel Effect - - - Join the Community & Get Your Questions Answered: CabralSupportGroup.com - - - Dr. Cabral's Most Popular At-Home Lab Tests: > Complete Minerals & Metals Test (Test for mineral imbalances & heavy metal toxicity) - - - > Complete Candida, Metabolic & Vitamins Test (Test for 75 biomarkers including yeast & bacterial gut overgrowth, as well as vitamin levels) - - - > Complete Stress, Mood & Metabolism Test (Discover your complete thyroid, adrenal, hormone, vitamin D & insulin levels) - - - > Complete Food Sensitivity Test (Find out your hidden food sensitivities) - - - > Complete Omega-3 & Inflammation Test (Discover your levels of inflammation related to your omega-6 to omega-3 levels) - - - Get Your Question Answered On An Upcoming HouseCall: StephenCabral.com/askcabral - - - Would You Take 30 Seconds To Rate & Review The Cabral Concept? The best way to help me spread our mission of true natural health is to pass on the good word, and I read and appreciate every review!
In this talk, Nikita, Senior Applied Data Scientist at the AWS Generative AI Innovation Center, shares his expertise in bringing enterprise artificial intelligence out of the sandbox—from his early days optimizing traditional machine learning models like gradient boosting to deploying advanced production-grade GenAI pipelines. We explore what it really takes to move generative AI systems from pilot prototypes to production environments.
Links:
- AWS Generative AI Innovation Center: https://aws.amazon.com/ai/generative-ai/innovation-center/
You'll learn about:
- Deploying multi-layered defenses independent of backend LLMs.
- Evaluating parameter-efficient methods like LoRA and QLoRA for small models.
- Balancing long-term domain expertise with real-time documentation retrieval.
- Utilizing multi-agent orchestration for search and anomaly explanation.
- Setting up robust LLM-as-a-judge frameworks verified by human metrics.
- Leveraging Amazon Bedrock components for memory and runtime scalability.
TIMECODES:
05:52 Shifting from traditional ML to generative AI
07:49 Hybrid pipelines blending classical ML and LLMs
11:25 Production guardrails and multi-layered system defense
16:15 Prompt bypasses, input attacks, and AI red teaming
20:49 Newsletter localization and translation with Zalando
27:24 Evaluation frameworks and human-in-the-loop metrics
33:07 Aligning LLM-as-a-judge with few-shot prompts
34:49 Fine-tuning small language models versus prompting
41:18 Complementary mechanics of RAG and fine-tuning
43:00 Agentic web search tools for anomaly explanation
47:01 Automated text generation from real-time sports sensors
49:58 AWS project scoping and proof of concept timelines
54:58 Interview requirements and career skills for AWS roles
57:59 Enterprise architecture patterns and system observability
01:00:42 Reusable infrastructure blocks on Amazon Bedrock
This session is designed for machine learning engineers, data scientists, and technical product managers looking to architect reliable, production-ready GenAI workflows. It is highly valuable for teams aiming to bridge the gap between experimental AI prototypes and secure enterprise software.
Connect with DataTalks.Club:
- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub: https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/
Connect with Nikita
- Linkedin - https://www.linkedin.com/in/kozodoi/
- Github - https://github.com/kozodoi
- Website and blog - https://www.kozodoi.me/
Thiago Cardoso is the Director of Data & AI at iFood and the architect behind iFood Pago's AI agent platform. This fintech system serves millions of restaurants across Brazil through WhatsApp and the iFood app. In this episode, he breaks down what it actually takes to ship agentic AI in production at scale.
The Control-vs-Magic Spectrum Building Agents // MLOps Podcast #382 with Thiago Cardoso, Director of Data & AI at iFood
I've been reviewing music and following the careers of male seiyuu for 16 years, and over time, I've noticed some seiyuu going from being a disaster as singers to becoming some of the best singers we now have in the industry.
In this episode, inspired by regular viewer M L, let's talk about some of the male seiyuu who have improved a lot over time and what could be behind those improvements.
This is the first of several episodes covering male seiyuu who have improved their singing skills over time.
Brought to you by In the Money Plus - get a free month when you sign up BEFORE the Belmont - details at inthemoneypodcast.com/plus
Peter Thomas Fornatale hosts Jonathan Kinchen and Belmont Stakes Day specialist Michael Adolphson for ITM's definitive free horse racing tips and analysis show ahead of the 158th Belmont Stakes. The crew delivers their final picks for Saturday's $2 million Grade 1 at Saratoga and breaks down the complete all-stakes Pick 6 sequence — six races, five graded stakes, one of the richest single-day betting cards in American horse racing.
RACE 8 — TRUE NORTH STAKES (G3) | $400,000 | 6½ Furlongs
PP1 Acoustic Ave — Trainer: Linda Rice — Jockey: Jose Lezcano
PP2 Imagination — Trainer: Bob Baffert — Jockey: Flavien Prat
PP3 Bentornato — Trainer: J.F. D'Angelo — Jockey: Irad Ortiz Jr.
PP4 Listenupshance — Trainer: Doug O'Neill — Jockey: Antonio Fresu
PP5 Faust — Trainer: Steve Asmussen — Jockey: Jose L. Ortiz
PP6 Book'em Danno — Trainer: Danny Ryan — Jockey: Pedro Lopez
PP7 Be You — Trainer: Todd Pletcher — Jockey: John Velazquez
PP8 Illuminare — Trainer: Todd Pletcher — Jockey: Manuel Franco
PP9 Pentathlon — Trainer: Shug McGaughey — Jockey: Dylan Davis
RACE 9 — JAIPUR STAKES (G1) | $500,000 | 5½ Furlongs Turf
PP1 Governor Sam — Trainer: George Weaver — Jockey: Pedro Lopez
PP2 Bold Journey — Trainer: Bill Mott — Jockey: Junior Alvarado
PP3 Litigation — Trainer: B.A. Lynch — Jockey: Florent Geroux
PP4 Works for Me — Trainer: J.R. Lee — Jockey: Flavien Prat
PP5 Reef Runner — Trainer: D. Fawkes — Jockey: Irad Ortiz Jr.
PP6 Ag Bullet — Trainer: Rafael Baltas — Jockey: John Velazquez
PP7 Clock Tower — Trainer: Wesley Ward — Jockey: Dylan Davis
PP8 John the Beer Man — Trainer: R. Atras — Jockey: Kendrick Carmouche
PP9 Twenty Six Black — Trainer: H. De Paz — Jockey: Manuel Franco
PP10 My Boy Prince — Trainer: Mark Casse — Jockey: Jose L. Ortiz
RACE 10 — WOODY STEPHENS STAKES (G1) | $500,000 | 7 Furlongs
PP1 Gilded Bandit — Trainer: Bill Mott — Jockey: Junior Alvarado
PP2 Obliteration — Trainer: Steve Asmussen — Jockey: John Velazquez
PP3 Six Speed — Trainer: George Weaver — Jockey: Irad Ortiz Jr.
PP4 Stradale — Trainer: Steve Asmussen — Jockey: Ricardo Santana Jr.
PP5 Solitude Dude — Trainer: Saffie Joseph Jr. — Jockey: Flavien Prat
PP6 Crude Velocity — Trainer: Bob Baffert — Jockey: Florent Geroux
PP7 Englishman — Trainer: Cherie DeVaux — Jockey: Jose L. Ortiz
PP8 Civil Liberty — Trainer: Doug O'Neill — Jockey: Antonio Fresu
PP9 Taj Mahal — Trainer: B.T. Russell — Jockey: Manuel Franco
RACE 11 — METROPOLITAN HANDICAP (G1) | $1,000,000 | 1 Mile
PP1 Nysos — Trainer: Bob Baffert — Jockey: Flavien Prat
PP2 Vibe — Trainer: Todd Pletcher — Jockey: Luis Saez
PP3 Antiquarian — Trainer: Todd Pletcher — Jockey: John Velazquez
PP4 Saudi Crown — Trainer: Brad Cox — Jockey: Irad Ortiz Jr.
PP5 Rated by Merit — Trainer: Chad Brown — Jockey: Dylan Davis
PP6 Knightsbridge — Trainer: Bill Mott — Jockey: Junior Alvarado
PP7 Journalism — Trainer: Michael McCarthy — Jockey: Jose L. Ortiz
RACE 12 — MANHATTAN STAKES (G1) | $1,000,000 | 1-3/16 Miles Turf
PP1 Tiz Dashing — Trainer: Barclay Tagg — Jockey: Javier Castellano
PP2 Test Score — Trainer: Graham Motion — Jockey: Manuel Franco
PP3 Make Me King — Trainer: H. Al Jehani — Jockey: Jose L. Ortiz
PP4 Integration — Trainer: Shug McGaughey — Jockey: John Velazquez
PP5 Deterministic — Trainer: Michael Clement — Jockey: Kendrick Carmouche
PP6 Bright Picture — Trainer: Andre Fabre — Jockey: Flavien Prat
PP7 Rhetorical — Trainer: W. Walden — Jockey: Irad Ortiz Jr.
PP8 One Stripe — Trainer: Graham Motion — Jockey: G. Lerena
PP9 Battle of Normandy — Trainer: Shug McGaughey — Jockey: Dylan Davis
RACE 13 — BELMONT STAKES (G1) | $2,000,000 | 1¼ Miles
PP1 Vitruvian Man — Trainer: Doug O'Neill — Jockey: Antonio Fresu — ML: 30-1
PP2 Powershift — Trainer: Todd Pletcher — Jockey: Luis Saez — ML: 12-1
PP3 Chief Wallabee — Trainer: Bill Mott — Jockey: Junior Alvarado — ML: 3-1
PP4 Renegade — Trainer: Todd Pletcher — Jockey: Irad Ortiz Jr. — ML: 2-1
PP5 Ottinho — Trainer: Chad Brown — Jockey: Dylan Davis — ML: 20-1
PP6 Growth Equity — Trainer: Chad Brown — Jockey: Manuel Franco — ML: 12-1
PP7 Commandment — Trainer: Brad Cox — Jockey: John Velazquez — ML: 6-1
PP8 Emerging Market — Trainer: Chad Brown — Jockey: Flavien Prat — ML: 6-1
PP9 Golden Tempo — Trainer: Cherie DeVaux — Jockey: Jose Ortiz — ML: 9-2
Free horse racing picks, pace analysis, wagering strategy, and best bets for Belmont Stakes Day at Saratoga. Whether you're playing the Belmont Stakes win bet, the Pick 6, building exactas, trifectas, or superfectas across the full card, this is your free horse racing analysis hub for June 6, 2026.
00:00-15:00: More on Ben Rice from ML, the MVP of the Yanks. Thanks to CH Insurance and Marz Motors.
How do you know if an AI system is trustworthy, compliant, ethical, and fit for purpose? In this episode of the FIT4Privacy Podcast, Punit Bhatia is joined by Stella Liu, an AI evaluation expert and founder of AI Evals & Analytics, to unpack one of the most practical and overlooked challenges in AI today: how to evaluate AI systems before and after deployment.

KEY MOMENTS
02:09 - AI Definition
03:02 - AI Evaluations
10:31 - Why AI Testing Is Hard
14:06 - Evals Plus Analytics
18:15 - Synthetic Data
23:47 - Protecting Privacy Ethical
29:05 - AI Evals as a Company
29:52 - How to reach Stella Liu

Stella explains why AI behaves differently from traditional software, why testing code alone is no longer enough, and how AI evaluations (AI evals) help organizations assess real-world behavior, risk, and performance. From evaluation driven development to continuous monitoring in production, the conversation explores how teams can move beyond guesswork and hype toward repeatable, measurable AI governance.

⸻

ABOUT THE GUEST
Stella Liu is the Co-founder of AI Evals & Analytics (Maven), where she created the AI Evals & Analytics Playbook and teaches top-rated courses on LLM evaluation, monitoring, and product alignment. She's also the Head of AI Applied Science at ASU, leading evals and analytics across university-wide AI products and building higher-ed's first formal AI evaluation framework, and she previously led data science at Shopify and Carvana with 12+ years shipping large-scale ML systems.

ABOUT THE HOST
Punit Bhatia is one of the leading privacy experts who works independently and has worked with professionals in over 30 countries. Punit works with business and privacy leaders to create an organization culture with high privacy awareness and compliance as a business priority. Selectively, Punit is open to mentor and coach privacy professionals.
⸻

Resources & Links

Guest Links: Stella Liu
• Website: https://maven.com/
• LinkedIn: https://www.linkedin.com/in/wenxingl/

Grow Skills (Privacy Courses & Insights)
• Courses: https://growskills.store/courses/
• Insights: https://growskills.store/insights/
• Website: https://growskills.store/

FIT4Privacy
• Website: https://www.fit4privacy.com
• Podcast: https://www.fit4privacy.com/podcast
• Blog: https://www.fit4privacy.com/blog
• YouTube: http://youtube.com/fit4privacy

Punit Bhatia
• Website: https://www.punitbhatia.com

Books
• Be Ready for GDPR
• AI & Privacy – How to Find Balance
• Intro to GDPR
• Be an Effective DPO
Christopher Romero, MD, a pediatric endocrinologist at Mount Sinai Medical Center, New York City, and Associate Professor of Pediatrics at the Icahn School of Medicine at Mount Sinai discusses arginine vasopressin deficiency. The name of the rare disease central diabetes insipidus was changed in 2024 to better reflect its etiology.

Central diabetes insipidus, a rare disease, is unrelated to the common medical problem diabetes mellitus, other than they are both problems related to endocrinologic dysfunction. Whereas diabetes mellitus involves pancreatic function and the production of the hormone insulin, central diabetes insipidus involves the pituitary gland and regulation of the hormone vasopressin. Dr. Romero stated that a new name for central diabetes insipidus was introduced in 2024—arginine vasopressin deficiency (AVP-D) to reflect the difference and relieve misconceptions caused by the traditional naming. The central issue with AVP-D is the function of antidiuretic hormone, which regulates water concentrations in the body. Pediatric and adult patients with this vasopressin deficiency (which mediates antidiuretic hormone levels) excrete more urine than patients without the deficiency. “It causes these patients to drink more, to make up for the water loss,” said Dr. Romero, “resulting in kids being thirstier and having to use the bathroom more often.” As a result, AVP-D can lead to weight loss and loss of appetite, dehydration, and electrolyte abnormalities. He also pointed out that the abnormal cycle of drinking and urination in children interferes with school work and performance. “Unless you're aware of [AVP-D], you may miss the diagnosis,” said Dr. Romero. The pituitary gland is involved with so many functions, and symptoms only slowly evolve. Issues with the onset of puberty and growth may hint at the pituitary source of the problem. Historically, treatment was managed with an oral formulation of vasopressin, which was first available in the 1970s. 
An intravenous form was available in inpatient settings. A nasal spray formulation was subsequently developed, and is particularly useful with older children. Dr. Romero pointed out that figuring out the correct dosage for an individual pediatric patient is key; every child with AVP-D is different in terms of how much water they lose during the drinking–urination cycle. “Even though the oral form was effective, only two dosages were available. You have to titrate the dose to balance the water loss,” he emphasized.

The introduction of Desmoda in February 2026, an oral solution of desmopressin acetate 0.05 mg/mL, allows for easier titration. The solution may be easier to take than the pills for young children, and caregivers may have a better idea of precisely how much medication the patient is getting. For those reasons, Dr. Romero believes this formulation may be the best option for young pediatric patients with AVP-D.
60 Minutes staff and new boss clash, Meghan Markle expert Mark Dolan joins us, Kristin Cavallari victimized by nude move, Pride Month, Erika Kirk's dating life, Black Crowes v. USA chants, J-Lo insists her new movie won't suck, and The Orgy Dome launches a GoFundMe. MMA Fighter Sean Strickland vs Dylan Mulvaney. This again? Erika Kirk was allegedly drunk and all over some dude at a bar, but she denies the report and says she'll never date again. Bret Michaels can't win as he's now catching crap for canceling the Freedom 250 Concert. MARK DOLAN IS HERE! We go deep on Meghan Markle, her narcissism, her failures, and how she's tainted the British Royal Family. Jill Biden has been ALL OVER the place trying to do damage control about her new book. Everything J Lo touches turns to crap...Including this new movie with Brett Goldstein. Some people are saying they're dating, but don't dare ask her about it. The Black Crowes are making news again. Some people in Tampa are no longer Black Crowes fans. Chet Hanks is a douche, but he's oddly likeable. Brad Pitt was recently in France doing some cool things. But back home, Maddox is still hating his dad. Karmelo Anthony, the alleged murderer, not the basketball player, is still getting donations in his Give Send Go. Marc's favorite new GoFundMe is for the Orgy Dome at Burning Man. Stuttering John is back online! 60 Minutes is under fire as Scott Pelley dresses down the new boss in a private meeting. Brad Galli was on ML's Soul of Detroit today. Give it a listen. Jason Carr is still out there getting free food. Kristin Cavallari was a victim of the nude move. Who was it? Marlon Jackson took a huge spill on stage. Kelsey Grammer was not around for comment. We might have some merch left. Click here to check what's available. If you'd like to help support the show… consider subscribing to our YouTube Channel, Facebook, Instagram and Twitter (Drew Lane, Marc Fellhauer, Trudi Daniels, Jim Bentley, BranDon, and Roberto).
TestTalks | Automation Awesomeness | Helping YOU Succeed with Test Automation
AI coding tools promised to make development faster, and they delivered. But here's the problem nobody talks about enough: when you speed up coding, you don't eliminate the bottleneck in the SDLC. You just move it. And for most teams, it lands squarely in QA. In this episode, Joe sits down with Vilhelm von Ehrenheim, Co-founder and Chief AI Officer of QA.tech, to dig into how agentic AI is reshaping software testing from the ground up. Vilhelm brings serious ML credibility: he helped build Motherbrain, one of the earliest production LLM systems in venture capital, and he's now applying that experience to one of the hardest problems in software delivery: testing at AI development velocity. You'll learn how QA.tech's behavioral knowledge graph gives AI agents the context they need to actually understand your application, why validating user intent beats checking element identifiers every time, how autonomous agents can review PRs, reproduce bugs from Slack messages, and generate targeted tests without a single line of test code, and what the tester's role actually looks like when agents do the heavy lifting. If you're wondering whether your QA practice can survive the pace of AI-driven development, this one's required listening.
Innovation occurs across many areas, and compliance professionals need not only to be ready for it but also to embrace it. Join Tom Fox, the Voice of Compliance, as he visits with top innovative minds, thinkers, and creators in the award-winning Innovation in Compliance podcast. In this episode, host Tom visits with Nikunj Bajaj, Co-founder & CEO at TrueFoundry, about enterprise agentic AI infrastructure, governance, and hidden costs most organizations are not accounting for. Nikunj describes TrueFoundry's platform as a single control plane for enterprises to build, ship, and govern agentic AI applications, inspired by Meta's internal ML stack, which he says is about a decade ahead of the rest of the industry. He argues enterprises over-focus on model and tool selection when problem definition and effective use are the real constraints. On governance, he identifies two failure modes: avoiding meaningful use cases entirely to sidestep governance risk, or trying to solve all governance problems up front and never reaching ROI. Successful teams implement application-specific controls iteratively, starting with a few high-value use cases rather than hundreds of low-value ones. He highlights that model inference accounts for only about 20% of total generative AI spend, with the majority of spend concentrated in infrastructure, engineering, and debugging, creating cost-allocation and budget-control challenges for compliance teams. For auditability, he argues that an agent without full decision traces is “a liability with an API key,” and walks through how end-to-end tracing enables audit readiness, faster debugging, and proactive attack detection. He closes by advocating centralized control via a unified AI gateway while enabling federated development and tailoring guardrails to whether your exposure surface is external or internal. 
Key highlights:
Stop Chasing Tools
Governance vs Speed
Hidden AI Costs
Agent Auditability
Board Level Priorities

Resources:
Connect with Nikunj Bajaj
LinkedIn – Nikunj Bajaj
Learn More About TrueFoundry
TrueFoundry Website
TrueFoundry on LinkedIn
We're announcing AIEWF speakers this week! Take the AI Engineering Survey!

Today's guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months.

He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, realtime, long-horizon world models is to work on LLMs (perhaps Interaction Models as well…)

Put it this way: In the near term, the next Sora won't be a better video model, but a video agent.

Generative Media may more closely follow the evolution of AI coding which went from focusing on one-shot output performance and cost, to multiturn reasoning and planning models for agents and systems that can plan, edit, test, debug, and submit PRs.

At a certain point, coding models got so good that the only significant next step to improve performance was handling the orchestration of these models.

Now as the performance of video models increases significantly across realism, consistency, & prompt adherence while becoming more cost efficient, the next evolution of video generation may also be systems that can plan, generate, edit, critique, and iterate across an entire creative task. In this episode, Ethan joins swyx and Vibhu to unpack what it actually takes to build frontier image and video systems: data, VAEs, diffusion transformers, audio-video alignment, inference speedups, and the hidden cost of storing and moving massive video datasets. 
From building NVIDIA's Cosmos world model to joining xAI as Grok Imagine was being built from zero to one, Ethan He has been at the center of some of the most important work in video generation, multimodal models, and real-time world models.

We go deep on Grok Imagine, how a small xAI team shipped its first multimodal video model in three months, why iteration speed matters more than almost anything in model development, and why many of the biggest gains come from fixing tiny bugs in data and training pipelines.

Flipbook: The future of Videomaxxing

Video agents are almost a sure bet to be the trend in the coming year. We end with a glance at what's beyond video agents:

Flipbook caused a minor sensation this year when it was released, but most treat it as a fun demo. Ethan takes it very seriously: with the speed and cost of inference coming down every year, the future of custom video JIT UI is closer than you think. We talked about why videogen models may become the front end of AI, how generative UI could replace traditional HTML/CSS, why world models need to be real-time, interactive, and long-horizon, and why the future of video generation may depend more on language models and agents than on diffusion alone.

We discuss:
* Why fast iteration mattered more than meetings
* Why small training bugs can drive huge model quality gains
* Why coding models may make compute the bottleneck again
* How image and video models are trained with synthetic captions
* The role of VAEs and latent space in frontier video models
* Why image models are the foundation for video models
* The tradeoff between temporal compression and real-time interactivity
* Flipbook, Neural OS, and the future of generative UI
* Why future interfaces may go from user intent to pixels
* The hidden cost of training video models: storage, egress, and GPU hours
* How step distillation and consistency models (like OpenAI sCM) make video inference orders of magnitude faster
* Grok Imagine 0.9 and large-scale audio-video generation
* Why audio-video alignment is harder than text-video alignment
* Ethan's definition of world models
* Reference-to-video, video extension, and long-context video generation
* Why xAI's research communication undersells Grok Imagine
* How xAI culture shaped the speed of development
* AI watermarking, SynthID, and detecting generated media
* Why prompt rewriting matters for video models
* Grok Imagine Agent and the rise of video agents
* Why language models may unlock better video generation
* Robotics, physical AI, and embodied world models
* Why Ethan left xAI and shifted focus toward LLMs
* Self-managed context, memory, and the next frontier for language models

Ethan He
* LinkedIn: https://www.linkedin.com/in/ethanhe42
* X: https://x.com/EthanHe_42

Timestamps
00:00:00 Introduction
00:01:25 From NVIDIA Cosmos to xAI
00:03:24 Building Grok Imagine from Zero to One
00:10:07 How Image and Video Models Are Trained
00:18:53 Video Compression, VAEs, and Real-Time Tradeoffs
00:22:10 Generative UI, Flipbook, and Neural OS
00:32:10 The Cost of Training Large Video Models
00:37:04 Distillation, GANs, and Fast Video Inference
00:41:21 Audio-Video Generation and Grok Imagine 0.9
00:48:34 What Makes a World Model?
00:55:51 Reference Videos, Long Context, and Video Memory
01:00:11 xAI Culture, Research, and First-Principles Building
01:09:45 AI Safety, Watermarking, and Prompt Rewriting
01:13:10 Video Agents and AI-Assisted Creation
01:27:32 Why Language Models Unlock Better Video
01:31:15 Robotics, Physical AI, and Embodied World Models
01:32:38 Why Ethan Left xAI
01:34:16 Self-Managed Context and the Future of LLMs
01:38:43 Ethan's Career Path and Closing Thoughts

Transcript

Introduction: Ethan He, Latent Space, and the Path to xAI
Swyx [00:00:00]: We're here in the studio with Ethan He, most recently of xAI. Welcome.
Ethan [00:00:10]: Thank you. Glad being here.
Swyx [00:00:11]: We're also here with Vibhu. 
You were first coming to us, or joining the Latent Space world, because you were working on Cosmos at NVIDIA, and you did a paper. We loved it. You presented it as well, so thank you for doing that.
Ethan [00:00:23]: Actually, I also presented the MoEs twice at Latent Space.
Swyx [00:00:29]: How did you actually hear about us? Did we reach out to you? Is that how it worked?
Ethan [00:00:33]: No, actually, the community. I realized, oh, there is this online community where people talk about AI and also learn from each other through papers every week, through the Paper Club. It's very nice.
Ethan [00:00:49]: I learned a lot.
Swyx [00:00:49]: I think three years nonstop. We haven't stopped even on Christmas and New Year's. Many weeks I want to stop, but it keeps going.
Vibhu [00:00:58]: No, that was good. I think you had posted that you worked on a paper, and I was like, “Oh, very cool. We have Paper Club. Present then.”
Vibhu [00:01:04]: But I might have reached out to you after.
Swyx [00:01:05]: Because it's an amateur club, right?
Swyx [00:01:08]: So it's very unusual, but we sometimes have paper authors come by and actually explain the paper. Today we just did the Poolside paper, which was apparently very good.
Vibhu [00:01:18]: Came out yesterday.
Vibhu [00:01:19]: Pretty interesting, right? Fully open. They talk about everything, systems. So it's a good one. We'll recommend people read it.
Swyx [00:01:25]: Bring us up to speed on your transition to xAI, 'cause I actually don't even know when you joined. Just tell the story about the transition.
From NVIDIA Cosmos to xAI: Scaling Video and World Models
Ethan [00:01:34]: Before xAI, I was working on the Cosmos world model at NVIDIA. Cosmos is a giant video foundation model that aims to simulate the world, and it serves as a foundation for all of the roboticists to build on top of. 
There, once I built the Cosmos one, I realized that this thing also has a scaling law similar to language models, so we need to scale up the video models further. That's why I realized I need to move to somewhere with much more compute resources. That's how I--
Swyx [00:02:13]: Than NVIDIA?
Vibhu [00:02:14]: The GPU rich came themselves.
Vibhu [00:02:19]: And timeline-wise, when was Cosmos? It was pretty early, right? It was open world model, open paper, everything.
Ethan [00:02:25]: It was end of twenty-four.
Vibhu [00:02:28]: End of twenty-four.
Ethan [00:02:30]: Then at mid twenty-five, I moved to xAI. I joined about the time when xAI was about to build video models and multimodal models. There was no infra, no data, and no model, and as just a few engineers, we built it in three months and released the first model, Grok Imagine zero point nine.
Ethan [00:02:55]: And since then, I kept working on video models and moved more from training to post-training of the video models. For example, reference-to-video, kind of like the cameo feature, and video extensions. And before I left, I worked on a world model, leading a small team focused on real-time, long-horizon video generation.
Building Grok Imagine From Scratch in Three Months
Swyx [00:03:24]: Can you give a rough roadmap of, okay, you're on a brand-new team. Grok previously was only text, or they partnered with BFL for their image gen stuff. What are the building blocks, right? You have compute, data you can procure somewhere. Just what is the sequence of things that people should think about when you're setting up a new team?
Vibhu [00:03:43]: Actually even deeper, not just data you can procure. You guys had to go through getting the data too, right? 
So you shipped it pretty fast, but yeah
Swyx [00:03:51]: Three months is like
Vibhu [00:03:52]: From everything
Swyx [00:03:52]: Actually very surprisingly fast.
Ethan [00:03:55]: One thing, I'd say, is thanks to my experience at NVIDIA, 'cause the first time, when we were building Cosmos together, we built it for about a year. So this is like the second time I do it. I roughly have an idea what to do. I'd say the most important thing is the talent. Everyone was very strong and clever, working very closely with each other towards a common goal. So that speeds things up a lot. You reduce the communication bandwidth among people, and everyone can work towards the same goal. Every day there's not that many meetings on the calendar, maybe a sync a day, and after that it's just all building. It was pretty fun at that time.
Ethan [00:04:47]: And another thing is that xAI has very strong foundations of data inference, model inference, and the support there can help model development a lot. When I look at training models, the top important thing is how many iterations you can do per day. The more iterations you can do, the faster you can train the model. So if you have very strong infra and a lot of compute, you can train these models in a very short period of time. That gives you a much larger buffer for errors, and it also gives you the opportunity to spot more bugs.
Iteration Speed, Compute, and Debugging Model Pipelines
Swyx [00:05:46]: What is an iteration? Is it like a few hundred steps or what are you
Ethan [00:05:50]: Let's say just training the model: acquire new data, maybe design new algorithms, and train a new model, maybe at smaller scale or
Swyx [00:06:01]: So cycle time for any hyperparam that you're searching.
Ethan [00:06:04]: Cycle time and tune to eval this model. 
Is this model better than my previous iteration?
Ethan [00:06:11]: So
Swyx [00:06:11]: So it's like before you, someone had already set this up so that you can iterate very quickly.
Ethan [00:06:15]: I think the foundation there is extremely good for developing and researching models.
Ethan [00:06:23]: And often I find, this is kind of boring, but a lot of the improvements do not come from new algorithms. They come from finding small bugs here and there in the data pipeline, in the model training pipeline. Those give the biggest boost to the model quality.
Vibhu [00:06:46]: It's interesting, right? So you say it's a small team, less communication bandwidth, but also a lot of quality is finding little bugs. It seems counterintuitive, right? You have a lot of people, you can iron out more of those, but it's interesting to see the other side, right?
Swyx [00:07:00]: I also wonder, do you try using LLMs to look for bugs? I don't know.
Ethan [00:07:05]: I remember at that time it was mid two thousand and twenty-five, so the coding model wasn't quite there yet. I remember by December two thousand and twenty-five, it was extremely good. Yeah, I've been using it at that time. It's helpful. Sometimes it produces code that is kind of difficult to maintain, even though the first time it built something extremely fast. But it gave spaghetti code, thousands of lines that I couldn't maintain, and the LLM itself couldn't figure out what's wrong and how to improve on top of it. But now I find it much better. Yeah, I want to bring up another point here: now coding models are much more efficient and can help us implement stuff much faster. Compute might become a bottleneck again, because previously, if you wanted to train a new model, say you want to generate new synthetic data or write a new algorithm, it might take a few weeks. 
And during that period of time, you might not have experiments to run. But now you can build that thing within a few hours, then you can immediately train a model.
Ethan [00:08:24]: Now you have to have enough compute to try all of the ideas. So compute might be the bottleneck of iteration speed again.
Swyx [00:08:36]: Yeah, honestly, I think it's kind of a stressful job because you're like, “Well, I should be trying everything, and if I'm not, then I'm not doing my job well.”
Vibhu [00:08:48]: There's also the stress that you're eating thousands of GPUs per hour, which is very expensive, and compute can go to other researchers.
Swyx [00:08:56]: You got the daddy Elon to
Vibhu [00:08:57]: You got daddy Elon.
Ethan [00:08:59]: It was
Vibhu [00:09:00]: But there's still a finite amount of compute. You want to use it, you want to use it well, you want more of it.
Ethan [00:09:06]: That was quite stressful indeed. Yeah, I think one thing is that with coding models now, a lot of these jobs can be automated, which is much better. Second, it's a marathon, so you've got to maintain good health and a regular schedule.
Vibhu [00:09:28]: It's hard to hear that when you ship from zero to nothing in two months.
Swyx [00:09:32]: And I think obviously the culture at xAI is very famously, people work very hard. One thing I did want to dive into: in the notes that you sent ahead of time, you had specific comments about the cost of Video Gen training. Presumably this is on Colossus-1, right? The two hundred megawatt cluster. Whatever you want to share on that.
Vibhu [00:09:54]: I think there's three things we're talking about, right? So there's Video Gen, there's also the Image Gen model that you put out. Do you want to complete the, okay, so zero to one, you have a few months. 
Just what are the stages of creating an Image Gen model?
Swyx [00:10:06]: Oh, yeah, maybe I got distracted.
How Image and Video Models Are Trained: Synthetic Captions, Tokenizers, and VAEs
Vibhu [00:10:07]: Sorry. And then from there, there's Video Gen, there's Audio Gen. Would love to get into those next. But what is that first few months like? So small team, a lot of bugs, iterations, but what does it look like? Do we take something off the shelf? Do we just get data and compute? What's the few months like? How do you get to a state-of-the-art Image Gen model? How do you just start?
Ethan [00:10:28]: I cannot comment specifically on how xAI did it, but it's quite a standard process. I can draw some examples from Cosmos. Mainly, to build a video model, you actually need to build an image model first. And for building these two models, the data you need is a hundred percent synthetic pairs of language and image, or language and video. Because on the internet, videos don't naturally associate with text. You can say, oh, on YouTube you have the title and the description and the comments
Swyx [00:11:11]: Title
Ethan [00:11:11]: of a video, but usually they're not relevant to the video itself. Say the video is a natural scene of mountains or something, and the title is “I'm so happy today.”
Ethan [00:11:26]: So they have no correlation at all. So the first step is you have to generate synthetic pairs of language with the videos. You gather videos from the internet, and you use a VLM to caption the videos. For that part, here's a question: how do you get a VLM to begin with? So if there's no
Swyx [00:11:55]: So you fuse the model, right? Like
Ethan [00:11:57]: Say if no VLM exists, how do you generate the text in the beginning, right? 
It's, it's impossible.Swyx [00:12:04]: I see.Ethan [00:12:05]: In the beginning, it's like you ask human to describe the video as detailed as possible.For example, you ask them to describe everything, like all objects, all characters, and all interaction and dialogues in the, in the videos. So that's in the protocol of Cosmos labeling. We require the objective we give to the labelers was that you have to describe the video as detailed as possible, such that a blind person hears a blob of text can reconstruct what the video is like from their head.Swyx [00:12:43]: Video or image? You're talking about images.Ethan [00:12:44]: Video or image, either one of them.Vibhu [00:12:47]: This was pretty common when we went from clip and DALL-E, right?Vibhu [00:12:51]: It's all training on really detailed captioning of images. So same is applied to video, but insteadEthan [00:12:57]: same appliedVibhu [00:12:57]: of using multimodal model to pass in video images and write rich descriptions, you can alsoSwyx [00:13:04]: I think there's this traditional perspective of supervised, or, very highly human curated thing. I feel like there's a unlock with unsupervised, right? Where like you have enough to bootstrap that you can just throw common corpus on it or, whatever. like unsupervised vision and language pairing, right? Like where you just have, interspersed image and text and it just learns. To me, that is the VLM breakthrough that is different from the clip, different from the LM era.Ethan [00:13:36]: It's interesting to see that you kind of need both data.Ethan [00:13:41]: For example, for theSwyx [00:13:41]: You need it to bootstrap it up. YeahEthan [00:13:43]: for the generative model training, there's also usually like a small percentage of unlabeled data. So the model is instructed to generate a video without any text instruction. That can also help the model generalize. 
So after this stage of generating synthetic pairs, one important common step is to train a compressor, or a tokenizer, for the images or videos. Theoretically you can train image or video models on pure pixels, but the problem is that it's a lot of tokens. One image that is a thousand by a thousand is one million pixels, so one million tokens. It's impossible to train a transformer on that. So you need to train a tokenizer, which can go from image to latent space and from latent space back to image.
Swyx [00:14:45]: That's why we named the podcast.
Swyx [00:14:48]: But, basically, you're talking about vocabulary size.
Ethan [00:14:50]: So vocab.
Swyx [00:14:51]: And so, what is, like a million is impossible?
Ethan [00:14:54]: In generative models, the vocab is continuous. It's a continuous space. You can think about it like you map an image to a vector. It's a fixed-length vector. It's sixteen or forty-eight, something like that. And then you map that vector back to the image space. And the mapping is patch-based. So say you have
Ethan [00:15:22]: a sixteen by sixteen patch, and you map that patch of pixels into this latent space.
Swyx [00:15:29]: We've covered this
Vibhu [00:15:30]: This is like the vision transformers
Swyx [00:15:32]: VAEs,
Ethan [00:15:33]: VAEs.
Vibhu [00:15:34]: You basically compress your input, you do your generation, you do all the reasoning and generation in a smaller dimension, and then you project back out.
Swyx [00:15:43]: A VAE is a form of compression, but I think, for me, the patching thing is from ViT, right?
Ethan [00:15:48]: You can make those.
Swyx [00:15:49]: Literally, yeah, the paper is titled like sixteen by sixteen is all you need, something like that.
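The token arithmetic in this exchange can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, and the assumption that each sixteen by sixteen patch becomes exactly one latent token is a simplification of what a real tokenizer does.

```python
def pixel_count(height: int, width: int) -> int:
    """Number of pixels, i.e. tokens if the model were trained on raw pixels."""
    return height * width


def patch_token_count(height: int, width: int, patch: int = 16) -> int:
    """Number of latent tokens if every patch x patch block maps to one token.

    Assumption for illustration: partial patches at the border are padded,
    so the division rounds up.
    """
    rows = -(-height // patch)  # ceiling division
    cols = -(-width // patch)
    return rows * cols


raw = pixel_count(1000, 1000)            # one million pixels
latent = patch_token_count(1000, 1000)   # 63 * 63 patches
print(raw, latent, raw / latent)
```

Under these assumptions the sequence the transformer has to attend over shrinks by roughly two orders of magnitude, which is the point being made about why a tokenizer is needed.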
And then I think also, people make a lot of comparisons between this kind of patching and convolutions.
Swyx [00:16:02]: Which is, you're kind of reconstructing the old paradigm with the new.
Ethan [00:16:05]: Actually, in VAEs, there are both convolutional networks and transformers. You can actually do both.
Ethan [00:16:14]: After this VAE, what you've got is latent space tokens and language tokens. So now, the training of the diffusion transformer (usually generative models use diffusion transformers) is actually quite standard. It's very similar to how you train a language transformer model. There's not that much difference. It's just visual tokens in, visual tokens out. The only difference is there's a denoising process. So you add random noise to the visual tokens, and then you train the model to remove that noise to generate the clean tokens. At inference, the model can iteratively remove noise, starting from one hundred percent noise.
Swyx [00:17:12]: And then there's also, to speed things along on the tech tree of diffusion, there's CFG, and then there's also latent diffusion somewhere in there. I think, somewhere along the line, obviously, Stability and all these other guys pioneered a lot of this architecture. I don't know if you want to get into that or just do the video side, up to you.
Bootstrapping Video from Image Models and Temporal Compression
Ethan [00:17:37]: After you train such an image model, the reason it's a foundation for video models is that image models are cheaper to train, and they have a much denser connection between language and images. For example, you train on a billion images, and there's a mapping from the text to the image.
And the cost to train the same thing, a billion texts to a billion videos, is much more expensive, because videos naturally have more tokens than images. And for diffusion models, their understanding of language comes purely from this mapping. So if you don't have enough of the mapping, if you only train on ten million videos or something, you might not see enough language tokens in your training, so your model does not understand human intention well enough. So that's why you first train these image diffusion models, and then you bootstrap the video model from there.
Swyx [00:18:53]: One thing I did want to ask, because actually, I think you're the first video model person I've ever talked to, I think. We've talked to Luma and all those folks. There are all these tricks in video compression where basically, frame by frame, there's not that much difference, so you don't actually have to regenerate or save the whole frame, right? I think MP4 compression or something else like that.
Swyx [00:19:16]: Is it tempting to use that? As far as I can tell, everyone just treats it as, “No, we would just generate every frame.” Is that roughly the state of the art?
Ethan [00:19:27]: There are a few different approaches. Let's say, first, you want to directly use MP4 compression and use that as the tokens for the transformer to train on, right? So people actually have tried that, but the main challenge is that the latent space of the MP4 tokens is not very comprehensible for the models. It's extremely hard to train on that. And there's a
Ethan [00:20:01]: So that's why they created VAEs, which create a more continuous latent space, so the models can understand that latent space and learn from it much more easily. Even within VAEs, latent spaces come with different levels of difficulty.
So you can imagine the simplest, most naive VAE: you have an image, and you just shuffle all of the pixels into a vector. So you don't need to train any VAE, right? But that latent space is extremely hard for models to train on top of. That's why there is some debate on how you compress the tokens. So you mentioned you can compress frame by frame. You can also compress the temporal dimension.
Ethan [00:20:52]: The difference is, if you compress the temporal dimension, you get a much higher compression rate. Because there's temporal redundancy between frames: this frame and the last frame are likely mostly similar, so there's only some small difference. For example, I think in the Wan 2.1 VAE, they have an eight by eight by four compression rate. So four temporal tokens are compressed into one token. That can save a lot of context length. If you do it frame by frame, you have to do maybe eight by eight by one. Your context length will be four times larger. That being said, the benefit of per-frame compression, and we might come back to this later, is real-timeness and interactivity. Because if you stream the output of the model frame by frame, the model can respond to any user request immediately. So if you have a four-times temporal compression, then
Swyx [00:22:06]: It might be laggy
Ethan [00:22:07]: there's a lag there by nature.
Swyx [00:22:10]: So you're very pilled on this. Let's just go ahead and bring it up, because we have the visual prepared anyway. There are some frontier applications of real-time video gen. So Flipbook is one of the examples that went viral recently, right? What is Flipbook?
Real-Time Generative UI: Flipbook, Neural OS, and Diffusion Front Ends
Ethan [00:22:23]: Flipbook is kind of like a web browser. You can see it has the web browser UI on top.
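As an aside on the compression rates quoted in the temporal compression discussion, the context-length trade-off can be worked out with a small Python sketch. The clip size and the function name are hypothetical, and the sketch assumes one latent token per compressed block, with no extra patching inside the transformer.

```python
def latent_tokens(frames: int, height: int, width: int,
                  spatial: int = 8, temporal: int = 4) -> int:
    """Latent token count for a clip under a spatial x spatial x temporal compression.

    Assumption for illustration: dimensions that do not divide evenly are
    padded, so each division rounds up.
    """
    t = -(-frames // temporal)   # ceiling division
    h = -(-height // spatial)
    w = -(-width // spatial)
    return t * h * w


# Hypothetical clip: 64 frames at 512 x 512.
with_temporal = latent_tokens(64, 512, 512, spatial=8, temporal=4)  # 8 x 8 x 4
per_frame = latent_tokens(64, 512, 512, spatial=8, temporal=1)      # 8 x 8 x 1
print(with_temporal, per_frame, per_frame / with_temporal)
```

For this clip the per-frame tokenizer needs four times the context length, matching the figure given in the conversation. The price of the temporal version is that the model can only emit output in groups of four frames, which is where the lag comes from.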
The difference is all of the UIs are generated by generative image model in real time, and anything here are fake. But you can, you can explore inside this wor- this imaginary world. Say like we-- here we have engineering the Great Pyramid. Like the model generates this for us to understand how it works, and if we want to navigate around and understand further, we can click on some of the, some of the description here, and the model will generate a new page, new subpage describing the details we want to know about.Swyx [00:23:14]: So it's basically kind of we're playing a video, but it's pausing for our next interaction, and then it just plays the next thing based on our interaction.Swyx [00:23:23]: Which is kind of cool.Vibhu [00:23:25]: and you kind of decide your story. So this was, how do you make a pyramid? levering technique seemed interesting, right? It shows how do you take Okay, I want to know what is thisSwyx [00:23:35]: The demo, the demo tweet had more animation between frames.Vibhu [00:23:38]: I think it's just skipping,Swyx [00:23:39]: Oh, it's just skipping a lot of frames.Ethan [00:23:40]: they also have a video modeVibhu [00:23:42]: It takes a lot. There's a lot of peopleEthan [00:23:42]: but, a lot of people are using it.Ethan [00:23:45]: So it's not available.Vibhu [00:23:46]: There's a live video stream. We can try,Swyx [00:23:50]: So this is an example of the kind of future that you see at the extreme. We don't-- we're obviously not in it today.Swyx [00:23:56]: But in a world where inference is completely free this is better than generating code and text?Ethan [00:24:02]: So this is, this is a final state of where Viva will be at for word model, I think. Imagine internet doesn't exist, and then you type in google.com. Like what should, what should, what should a model show you?the model can imagine something, and this is what the model imagine. And these web pages, they completely do not exist. 
So I think as inference costs come down, we are going to have generative UI for everything. If you think about how coding models work, they write code for a web page, the code might be converted into binary, and the binary renders the pixels on the screen. In machine learning, every time we have some breakthrough, it's more end-to-end. So why don't we go from user instruction to the pixels directly? Generative UI will be user intention to pixels directly. And say, even for email, everyone has the same interface, but I want it slightly different. I want the email to show up for me like TikTok, so I can swipe left and right through the emails. Or maybe you want something else. We can have completely different things. Or I'm looking at Instagram stories, and I don't like the Like button. I always misclick it. And generative UI resolves it. So it's going to be a revolutionary replacement of the interface. So in the future, we might have much more powerful
Ethan [00:25:50]: LLMs and coding models running behind the scenes. And in the front end, the diffusion model will actually be the front end that shows stuff to you. That's how I imagine it.
Swyx [00:26:02]: Diffusion front end, deterministic back end.
Swyx [00:26:04]: Something like that. I find that very expensive, but,
Vibhu [00:26:08]: I find it interesting you called LLMs writing code on the back end deterministic, but okay.
Swyx [00:26:14]: You write it once
Vibhu [00:26:15]: Compared to
Swyx [00:26:16]: And then you execute.
Ethan [00:26:17]: If you think about the cost, let's say an H100 costs $1 per hour, and you use this eight hours a day for thirty days. So every month you're paying $240 for this, and you'll actually not want to pay for that. That's even more expensive than Claude Code Max.
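The monthly figure quoted here is simple arithmetic, sketched below in Python. The $1 per hour price is the speaker's illustrative number, not a quoted market rate. The `projected_cost` helper and its `yearly_drop` parameter are hypothetical, included only to show how a steady yearly cost decline would compound.

```python
def monthly_gpu_cost(price_per_hour: float, hours_per_day: float, days: int) -> float:
    """Cost of keeping one GPU busy for a given number of hours per day."""
    return price_per_hour * hours_per_day * days


def projected_cost(cost_today: float, years: int, yearly_drop: float = 2.0) -> float:
    """Cost after `years` if it falls by a constant factor each year (assumption)."""
    return cost_today / (yearly_drop ** years)


today = monthly_gpu_cost(price_per_hour=1.0, hours_per_day=8, days=30)
print(today)                     # 240.0 dollars per month
print(projected_cost(today, 3))  # 30.0 dollars per month after three halvings
```

With a faster decline factor, the same helper shows how quickly a per-user GPU bill could fall into ordinary subscription territory.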
But if you think about the compute costs come down like two times every year, and I think the future will likely arrive like within few years.Vibhu [00:26:49]: It's everything, right? compute cost comes down, compute gets faster, model gets smarterEthan [00:26:54]: More efficientVibhu [00:26:54]: model gets smaller.Swyx [00:26:55]: I don't know why you say two times, ‘cause I think it's like 100 times. In language models, it is roughly one hundred to a thousand times every twelve to eighteen months, for the same given level of LMSys, ELO.Vibhu [00:27:08]: That's a net of everything, right? That's model performance alongside compute. So different than just compute costs come down. But, a very interesting future.Swyx [00:27:19]: So the web designers will have to shout out that accessibility is an issue, right? how do you deal with screen readers or whatever. But yes, this is higher bandwidth storytelling than anything you can possibly generate with code, right? So I think that's the rough idea.Ethan [00:27:34]: And I'd like to add a little bit that so human naturally have the maximum bandwidth when we are looking at things, look at videos, and we also have maximum output bandwidth when we are talking. So in the future, it might be something like we talk to AI models, and the AI model responds back with a generative UI. So that would be the maximum input and output bandwidth to interact with AI models before neural link happens.Vibhu [00:28:06]: And it's also very custom, right? Some people are very visual, some people are not as visual, right? They prefer the text. But the best thing about generative UI, right, it can also be text.Swyx [00:28:17]: There's another project that we wanted to highlight, which is the Neural OS. Kinda similar idea, but here you're literally operating, simulating an operating system with a video model.Swyx [00:28:27]: and you can play Doom, you can do Firefox. 
I find this like mildly less impressive, obviously, because it's an OS that I can run.Swyx [00:28:37]: But here everything is imagined.Vibhu [00:28:40]: I was, used to the Command+W to close the Firefox tab. It didn't crash. That's why I saidSwyx [00:28:45]: It's too immersive.Vibhu [00:28:46]: It's, it's too immersive for me.Swyx [00:28:47]: Too immersive.Vibhu [00:28:48]: I wanted to close the tab.Vibhu [00:28:49]: But yes, I can play generated diffusion.Swyx [00:28:51]: this is shockingly fast.Swyx [00:28:54]: Because I remember there was a demo about like maybe one to two years ago. Someone tried to do the first-person shooter with a image model. There was no consistency. It was very slow. But here it looks like realistically it's-- this is Doom.Vibhu [00:29:07]: I think there's two sides to that, right? There's okay, what is running a game? The heavy part of it is actually the game engine, all the lighting, all that stuff, the graphics. This is just kind of video, right? Like we've solved consistency. This is still, it looks like a few years old image generation. There's some temporal consistency, but it's, it's kind of just images stitched together as frame video. But it's a good visual representation to pi- to picture the future you wanna see, right? that's, that's what I see in these more so.Ethan [00:29:38]: This reminds me of how the video models gets better and better. So Neural OS is kinda if you just look at it feels like it's just a crappy version of the, like the Windows we could have, right? And, but the difference is, so the model, this model is overfitted on the existing operating systems. It can generate nothing different than that. But it's actually also similar to video models. So when we are training these video model, image model, we train them on internet. There's no imaginary supernatural stuff on the internet. But once we train this model, you can prompt the model to generate something supernatural that have never existed in the data set. 
So if you train your Neural OS or neural computer on the standard screen recordings on the entire internet. The model can imagine completely new interface to interact with the computer.Swyx [00:30:43]: This is one of those things that is magical to me. usually generalizing out of distribution is bad, but somehow we have learned some kind of internal world model that you say, this plus, but it looks like rainbows and butterflies, it'll do it and it will kind of make sense.Swyx [00:31:03]: So yeah, that's kind of cool. Yeah, I don't know if there's any comment more on there. I do, I do wanted to, I did wanted to touch a little bit more on the model architecture stuff, which I think you were getting. It's, really fascinating. We don't get a chance to talk about this enough. So one of the papers that we covered, we've covered every annual, segment anything release. and I don't know if you follow-- you're a computer vision guy, so youEthan [00:31:26]: I knowSwyx [00:31:27]: . So they did memory attention, which is kind of interesting. And I always think, anything where you can, across the temporal dimension, keep some consistency, I think it's, very fascinating, and I don't know if Basically, does that-- the CV side bleeding into video gen side, I think is underexplored, right? we talk about it for labeling, but actually you can borrow the architecture itself.Ethan [00:31:50]: There's, there's also complete different approaches, right? you brought up the term world model, so we went from video model to world model. There is diffusion, but there's also other approaches that people are doing. So maybe we get into those after as well,?Swyx [00:32:03]: He has a whole definition of world models and stuff. I feel like we threw a lot at you. 
Whatever you want to comment on.Why Video Models Are Expensive: Storage, I/O, and Training ScaleEthan [00:32:10]: I think one thing that we should actually comment back on is okay, so we were talking about the steps to train image gen to video model. One thing we don't see as much of is okay, you brought up the delta in training data, right? SoEthan [00:32:24]: you won't have as much a video model might not generalize, but what is the cost of training a large video model? So we know for LLMs roughly, okay, even like the poolside thing that came out today, right? It's a Gemma level model trained on roughly forty trillion tokens at this many H200s over this much time, right? You can see what is the exact cost of that. So how many GPU hours over how much H200 costs? So how do we do the back-end math of, same thing for video models, image models. How do you, how do you kind of break that down? I can share some back-envelope calculation. So surprisingly, video models is-- the cost is very-- is comparable to language models and obviously the largest scale is language model, maybe like a medium scale to language models. I said just storing the videos alone, it costs a lot. You can, you can maybe look up on AWS or something.Ethan [00:33:20]: You really, say if you have a billion videos and let's say, let's just say like each video, like five megabyte, then you need five petabyte to just store those videos. And also remember we talk about you use a VAE to compress the videos, and you also need to store, typically you need to store those continuous feature, in-- also in your storage. That's also comparable size with the videos themselves. So just storing these videos and the features is tens of petabytes alone. And,Swyx [00:33:58]: I just, I just looked up the calculation. 
Five petabytes on S3 Standard is one hundred K per month.Ethan [00:34:05]: AndSwyx [00:34:05]: It's comparableEthan [00:34:05]: and you needSwyx [00:34:06]: AndEthan [00:34:06]: And then like tens of petabytes, two hundred K. And even more expensive is you have the ingress and egress.Swyx [00:34:13]: Oh, yeah.Ethan [00:34:14]: Like you-- through the internet. You have to just to download those videos, I believe it's, it's more expensive on AWS than just storing those videos.Swyx [00:34:25]: Storing, yeah.Ethan [00:34:25]: And each training runs, you probably need to pull them once. If you train multiple times, it's, it's even more than that. So it's like just storing the network, those costs is just, it would be a few, a few millions per month to just storing everything, not to mention the GPU cost.Ethan [00:34:45]: AndSwyx [00:34:45]: my side tangent, the compute rental, like GPU rental is very efficient. There's one side, okay, you can be XAI and build your data center. Should we not just build our, storage compute as well? LikeEthan [00:34:57]: Of courseSwyx [00:34:57]: cloud cost compared to just,Ethan [00:34:59]: You save so muchSwyx [00:35:00]: store. Yeah, exactly.Swyx [00:35:01]: Especially with like egress and stuff. So.Ethan [00:35:04]: That's a good idea, but it also comes to-- there are some of its own challenges.Swyx [00:35:09]: Of course, of course.Ethan [00:35:10]: like people who build the GPU data centers, they might not expect this much, storage. And yeah, people build storage, typically they just build it somewhere with just CPUs.Swyx [00:35:23]: I just looked it up. Five-- AWS only charges for egress, not ingress. Tier five for five petabytes is two hundred and thirty K.Ethan [00:35:32]: Even more expensive than the storage.Swyx [00:35:34]: But storing is per month, right? You check in, then you cannot check out. so it's so cool. It's okay. 
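The storage estimate in this exchange can be reproduced with a short Python sketch. The video count, the per-video size, and the monthly bill are the figures quoted in the conversation. The `latent_ratio` parameter is an assumption standing in for the remark that the stored VAE features are comparable in size to the videos, and decimal units (1 PB = 10^15 bytes) are assumed throughout.

```python
BYTES_PER_MB = 10**6
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15


def dataset_petabytes(num_videos: int, mb_per_video: float,
                      latent_ratio: float = 0.0) -> float:
    """Raw video size in PB, plus cached latents at `latent_ratio` times that size."""
    raw_bytes = num_videos * mb_per_video * BYTES_PER_MB
    return raw_bytes * (1.0 + latent_ratio) / BYTES_PER_PB


def implied_price_per_gb_month(monthly_usd: float, petabytes: float) -> float:
    """Per-GB monthly price implied by a quoted bill; not a quoted price list."""
    gigabytes = petabytes * BYTES_PER_PB / BYTES_PER_GB
    return monthly_usd / gigabytes


videos_only = dataset_petabytes(10**9, 5)                      # 5.0 PB
with_latents = dataset_petabytes(10**9, 5, latent_ratio=1.0)   # 10.0 PB
print(videos_only, with_latents)
print(implied_price_per_gb_month(100_000, videos_only))        # about 0.02 USD
```

The quoted bill works out to roughly two cents per gigabyte per month, and doubling the footprint for cached latents is what pushes the total into the tens of petabytes mentioned above.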
So there's that side.
Ethan [00:35:41]: So the TLDR, my back-of-the-envelope math
Swyx [00:35:42]: Data is larger than you think. Yes.
Ethan [00:35:44]: my back-of-the-envelope math of GPU hours times GPU cost is also very much missing some storage.
Swyx [00:35:49]: You're basically also more IO bound than normal training.
Swyx [00:35:55]: Yes. Because of data loading, caching everything becomes super important.
Ethan [00:36:00]: So in Cosmos, we did a lot of optimizations to make it not IO bound. So, speaking of actually training the model, the GPU cost: if you look at the open source models and how big these video models are, I think LTX has nineteen B parameters. That's a dense model. And people are also exploring MoEs, so it might be twenty B active and hundreds of B total. So that's a similar size to medium-sized LLMs. And if you look at the number of tokens, we disclosed that in Cosmos. It's also tens of trillions of visual tokens. So putting this together, the cost of training these video models is actually comparable with LLMs. Not to mention, the infra is slightly different from LLM infra, so it might be less efficient to train these models.
Inference Speedups: Step Distillation, Consistency Models, and GANs
Swyx [00:37:04]: Do you get the benefits of traditional diffusion speed-ups? So for images, there's LCM, LoRAs for fine-tuning. There's a lot of stuff that's been
Ethan [00:37:15]: Flow matching.
Swyx [00:37:16]: There's flow matching. There's a lot of stuff that's been done. Is there some overlap that applies to diffusion on the inference side and stuff, or?
Ethan [00:37:23]: So the inference side is a completely different story.
Ethan [00:37:28]: I think for the training side, it might be a little bit hard to reduce that cost. And for the inference side, the biggest gain is from the distillation of these models.
It's called step distillation, which is slightly different from knowledge distillation in LLMs. Typically, for flow matching models, you need like 100 steps or something. A diffusion model needs even more, like 1,000 steps, to generate a good image or video. Step distillation tries to learn to generate in fewer steps from the model itself. It's kind of like: you use the full model to generate in 100 steps, and then you take a model that only generates in 10 steps and let that model learn from the perfect one.
Ethan [00:38:25]: Why this works
Swyx [00:38:27]: Strong to weak, seemingly.
Ethan [00:38:28]: It is. It's kind of
Swyx [00:38:29]: Distillation
Ethan [00:38:29]: kind of like strong to weak. From the modeling perspective, the strong model, the teacher model, is trying to model the images and videos of the internet, and that distribution is extremely complex. But the step-distilled model is just trying to learn from the teacher. The teacher is a model, and its size is fixed, so its distribution is much simpler than the whole internet. That's the intuition I have for why step distillation can work. So usually, when these models are served in production, they only run for a few steps. In Cosmos, I believe we have four-step and eight-step versions. If you do some simpler task, like image-to-image translation, it can run in even fewer steps, like one step in Cosmos Transfer.
Swyx [00:39:22]: I think this is the same intuition that guides a lot of the consistency model work. I sent you a link for SCM. I don't know if you covered that. To me, that was actually one of the most impressive papers I've ever seen from OpenAI.
Swyx [00:39:34]: That this is the unifying grand concept of consistency models. I don't know if you have any comments on this.
Ethan [00:39:41]: So there are a few different approaches,
Swyx [00:39:46]: Oh, yeah. Here it is.
Swyx [00:39:47]: Two steps versus twenty or 100 steps, whatever.
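To make the step distillation idea concrete, here is a deliberately tiny Python sketch, not how any production model is trained. Everything in it is a stand-in. The "teacher" is a hand-written velocity field integrated with 100 Euler steps rather than a trained network, the "student" is a one-step linear map fitted by least squares, and the constants `MU` and `RATE` are hypothetical.

```python
import random

MU, RATE = 3.0, 4.0  # hypothetical toy "data" location and pull strength


def velocity(x: float, t: float) -> float:
    """Toy velocity field pulling samples toward MU (stands in for a trained network)."""
    return RATE * (MU - x)


def teacher_sample(x0: float, steps: int = 100) -> float:
    """Teacher: integrate the velocity field from noise with many small Euler steps."""
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        x += velocity(x, i * dt) * dt
    return x


def fit_one_step_student(noise, targets):
    """Student: fit x1 = w * x0 + b to the teacher's outputs by least squares."""
    n = len(noise)
    mean_x = sum(noise) / n
    mean_y = sum(targets) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(noise, targets))
    var = sum((a - mean_x) ** 2 for a in noise)
    w = cov / var
    return w, mean_y - w * mean_x


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
targets = [teacher_sample(z) for z in noise]   # 100 steps per sample
w, b = fit_one_step_student(noise, targets)


def student_sample(x0: float) -> float:
    """One-step generation with the distilled student."""
    return w * x0 + b


print(teacher_sample(0.5), student_sample(0.5))
```

The student never sees the "data", only the teacher's input and output pairs, which mirrors the intuition given in the conversation: the mapping defined by a fixed teacher is a simpler target than the full data distribution. In this toy case that mapping is exactly linear, so the one-step student reproduces it. A real distilled model only approximates its teacher.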
It's already done.
Ethan [00:39:52]: So there are a few different approaches, for example, consistency models, and there are also... Actually, we shouldn't forget GANs. So the GAN, actually, that was the OG of
Swyx [00:40:05]: OG
Ethan [00:40:05]: step distillation, because it trained with just one step to begin with. So actually, for example, there's distribution matching distillation, which uses a GAN as one of the losses for distillation. A GAN just tells you, “Hey, generate an image,” and then
Ethan [00:40:31]: it has a discriminator to tell, is this image real or not? So the model just needs to learn one mode of the distribution, not the full distribution. Because in training, the model is asked to reconstruct the ground truth image from the internet, which is extremely hard. And when you're training a GAN, it's a one-step process. It's just, “Hey, you generate an image. Does this image look as real as an image from the internet?” Which is a much simpler task. And, yeah, people typically combine a lot of these approaches together, like consistency models and distribution matching and GANs, and that's how we get these few-step models.
Audio-Video Generation and Time Alignment
Swyx [00:41:21]: Then there's one step I wanted to add, which is audio and video.
Ethan [00:41:26]: So, Grok Imagine 0.9, I believe it's the first audio-video joint model deployed at a large scale. So
Swyx [00:41:39]: And that was your first model?
Ethan [00:41:40]: That was Grok Imagine's first model. It's audio-video joint generation. I think the hard part is the modality alignment, because before this joint model, we had text-to-video alignment. We had this correspondence between text and video. Typically, most VLMs understand images, video understanding is very rare, and they mostly don't understand audio.
And if you look at the audio generation on the LLM side, you can talk to them perfectly fine, but if you ask them to sing a song or something, it typically is not very good. Also, they don't have, they don't have music either. The hard part is thatUh, actually audio has two component. It has like a discrete component, a continuous component. The discrete component is like the language.Ethan [00:42:44]: So when we speak, it's just, someSwyx [00:42:47]: It's an ASR issue, yeah.Ethan [00:42:49]: It's, it's text token with some characteristics, I would say.Ethan [00:42:54]: But musicSwyx [00:42:56]: I think the speech guys would disagree with this.Swyx [00:42:57]: Like disfluencies and then,Vibhu [00:43:00]: There's tones you can get angry.Ethan [00:43:01]: Well, I say largely.Ethan [00:43:03]: the mu- but the music is completely different. It's, it's very continuous, and you cannot model them like discrete tokens in language models. this is like the hard part for models is, not to mention we have to align text, video, and audio together.Ethan [00:43:26]: SoVibhu [00:43:26]: How?Ethan [00:43:28]: So significant-- some significant challenges are like-- So first, like we talk about as the VLMs, they cannot understand most of them cannot understand audio.Ethan [00:43:39]: So you have to have some way to do the synthetic data generation for audio. You have to caption the model, and that involve, that involve synthetic data and human data effort a lot. And not just surprisingly, most of the LLMs are very bad at recognizing, like the beat, tone, and the details of the of music. They can, they can give some general prediction of which song is this, but it's very hard to describe the details of the music. like we mentioned in image generation, like you have to describe image as detailed as possible so that someone blind can reconstruct that. 
So here it's like someone
Vibhu [00:44:32]: Deaf
Ethan [00:44:32]: someone deaf can reconstruct how the music sounds without actually listening to it. Maybe you can think of it as needing to have what they call the script.
Vibhu [00:44:49]: Subtitles, yeah.
Ethan [00:44:49]: You've got to have all the details of the music and the dialogue.
Vibhu [00:44:55]: So is the challenge there typically stuff like music and audio, or is there a baseline? Okay, there's enough data where we can understand narration and conversation, but the nuances in audio are where you hit all the data issues? Or is it just that from stage zero, you do it all, right?
Ethan [00:45:15]: So one important thing is the alignment. The model has to have a time-based alignment between the video and the audio: at which time step the video and audio tokens correspond to each other. But we actually don't have this kind of alignment for most of the other modalities. If you think about text and image, or text and video, they are loosely aligned. So you can have a description of what's going on in the video, but you typically don't have an exact description of, oh, at time step one second, what happened?
Vibhu [00:46:02]: It's very
Ethan [00:46:03]: At time step two seconds, what happened?
Vibhu [00:46:03]: coarse. Yeah.
Swyx [00:46:05]: So what was the ideal time step? You have to ablate it, and then it's like four seconds or something.
Ethan [00:46:09]: So that comes down to how you design the model so that it is aware of time as a modality. So the model is time aware. And that's something pretty unique if you think about LLMs. If you ask an LLM to complete a task, it will say, “Oh, this task will probably take twelve hours to complete,” and it comes back in one hour.
Say, “I've already spent two days on this and I've exhausted everything.”
Ethan [00:46:47]: So the LLMs themselves, they don't have a sense of time there.
Vibhu [00:46:53]: I actually don't think that's just them not having a sense of time. I think it's somewhat based, right?
Vibhu [00:46:58]: Like you tell someone, “Okay, go work on this feature. Go implement this,” there's a general understanding you would have of how long that would take without LLMs working at LLM speed, right? So you think back like two years ago, if I tell you to build me a new front end for Latent Space, have a search bar, have all this, you'll estimate that it'll take a few days, right?
Vibhu [00:47:19]: So you tell an LLM, “Go build this.” “It'll take me a few days.” But I think it's somewhat grounded as opposed to them not having the best -- not saying that they have a great understanding, but I think with that example you can see where it comes from, right? You're trained on all of the text.
Swyx [00:47:35]: They're trying to estimate what a human would say.
Vibhu [00:47:37]: Because that's what the data kind of represents. It's not them
Ethan [00:47:41]: It came from the corpus on the internet. People have an estimate of how much time.
Vibhu [00:47:45]: And not even just in direct training samples, right? Just your world understanding from tokens of how long stuff takes, right? Go read a book. It'll take you a while, right?
Vibhu [00:47:56]: Even if you do nothing but read a book, it takes a few days. So yeah, LLM, I read it, took me a few hours.
Vibhu [00:48:01]: It'll take me a few hours to go through this research. But this is a tangent.
Swyx [00:48:05]: Somewhat, yeah.
Swyx [00:48:06]: This is a train of thought I haven't really expressed until now, which is basically that a full world model must also be recursive, meaning that the participant in the world model must also be aware that they have a world model,
which is like this whole recursive thing down the line. But yes, and that the world model can be wrong and that they need to update it and blah. Yeah. We've argued this on the newsletter as well, that there needs to be sort of recursive or adversarial world models.
World Models: Real-Time, Long-Horizon, Interactive Video
Vibhu [00:48:34]: Just to ask, how do you define world model?
Swyx [00:48:38]: Oh, yeah, let's go there.
Ethan [00:48:40]: So
Vibhu [00:48:40]: So just for context, we talked about video generation, and then if you say there's a distinction between that and world models, what's your definition? How do you see the two?
Ethan [00:48:53]: So disclaimer, I'm not going to debate what is a world model. Yeah. There are many definitions, so I'll just talk about my definition. Since I came from the multimodal domain, I'm mainly talking from video. So a world model is real-time, interactive, long-horizon video. So there are three parts. Let's talk about them one by one. So, interaction: we just look at Facebook and neural computer. For the interaction part, a world model can allow you to interact with it through keyboard, mouse, and maybe also voice. So these are all modalities. You can interact with the model, and the model should respond reasonably. Second part is real time. So once, say, you move your mouse, if the world model generates a game, how fast can the game respond? So if you're like professional CS:GO players- -my say, oh, you have to respond- He's beginner within sub ten milliseconds or- Yeah even less. So that's not most of the- No, sixty FPS. Let's go. Oh, three hundred FPS. Oh, five hundred FPS. Wait. Okay, yeah. I didn't do the math, but yeah, okay. Uh- Yeah, three hundred FPS, that's three milliseconds. So you have to respond- Oh, s**t. Okay. Yeah
Ethan [00:50:29]: within a millisecond. Most of the video models cannot do that.
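As a rough check on the frame-rate numbers in this exchange: the time available per frame is 1000 / FPS milliseconds, so 300 FPS leaves roughly 3.3 ms. The sketch below is only that arithmetic; the function name `frame_budget_ms` is illustrative and not something from the episode.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    # Frame rates mentioned in the conversation.
    for fps in (60, 300, 500):
        print(f"{fps} FPS -> {frame_budget_ms(fps):.2f} ms per frame")
```

By the same arithmetic, the two-hundred-millisecond target quoted for real-time voice interaction corresponds to about five responses per second, which is why it is described as far more generous.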
Yeah. But if you have a video model that is, say, like a digital human, the response time might be more generous. Typically, for real-time voice interaction, it's like two hundred milliseconds. So that's much more generous. But even two hundred milliseconds is pretty tricky, ‘cause remember we mentioned
Ethan [00:51:01]: you have this temporal compression coming from the VAE. If you don't compress the temporal dimension, your sequence length is going to explode. So if you want to have this real-timeness in your model, you have to deal with this long context problem. And the third part is long horizon, ‘cause you're not going to just play video games for a few seconds. Most video models only do a few seconds. We're going to play for minutes, hours. The model has to be able to generate long-form content.
Ethan [00:51:42]: So putting these three together, it's real-time, long-horizon, interactive videos. I think the final state will be, for example, like a video version of Playbook, where you can interact with a neural computer. You move your mouse, and you click on the generative interface, and it will reply to you through pixels generated in real time. But it's a very long way to get there. So one of the first steps at Grok Imagine, where I led a small world model team, was to build video extension. So, video extension- it's the first step of interactivity. Yeah. It's the first step. Yeah. So it's the first step- You have it here, video editing, yeah. Yeah. Yeah. So it's the first step because this unlocks long-horizon videos. Typically, for most of the video generation models, you give it a prompt or an image as an initial frame. You generate a video, and that's it. That's just one time, done. And some creators would try to use the last frame as the first frame for the second video.
Sometimes it works, but if you do it a few times, the quality would decrease. And- It doesn't have that context- Yeah over the full video, so the temporal- Yeah, exactly. Yeah, ‘cause you only gave it the last frame, of course, right? Yeah. Exactly. And- it's actually a pretty fun hack. If you've seen like- Oh, no, he's saying something better. Yeah. And for example, Veo, I remember Veo 3 has like a second of context from the last video. It is slightly better than using the last frame, but it has a similar problem, that the quality would decrease. If you extend a few times to one minute, the video quality would look much worse than the first video. Another problem is that the model doesn't have long-range knowledge of what happened before. Say it generates some dialogue, two people speaking, and their voices might change over time, especially if the one second of conditioning does not cover the previous context. So these are the core challenges. The Grok Imagine video extension has historical context of all of the previously generated videos. It has the context of who is speaking and what objects have appeared and everything, and uses that to generate the next video. If we naively do this, you can imagine, just put all of the previous history video tokens into the context, the context length will easily explode. Especially for video models, that can be like a few million tokens of context, I would imagine- context length. Yes. Yeah.
Swyx [00:54:58]: Let's run with that.
Ethan [00:54:59]: For example, in Cosmos, I think just five seconds of video is like fifty K or sixty K tokens. So if you do fifty seconds, that's five hundred K tokens. If you do longer than that, it easily explodes. This long-horizon problem was the first step in trying to solve world models. It turns out people love video extension.
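The token arithmetic here can be sketched in a few lines. This assumes token count grows linearly with clip length, using Ethan's rough recollection of fifty to sixty thousand tokens per five seconds; the constants and the function name are illustrative, not measured values for Cosmos or any other model.

```python
# Rough figures quoted in the conversation (tokens per 5 seconds of video).
TOKENS_PER_5S_LOW = 50_000
TOKENS_PER_5S_HIGH = 60_000


def video_context_tokens(seconds: int, tokens_per_5s: int) -> int:
    """Estimate context size for a clip, assuming linear growth with duration."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return seconds * tokens_per_5s // 5


if __name__ == "__main__":
    # 5 s and 50 s are the durations from the conversation; 600 s is an extra data point.
    for seconds in (5, 50, 600):
        low = video_context_tokens(seconds, TOKENS_PER_5S_LOW)
        high = video_context_tokens(seconds, TOKENS_PER_5S_HIGH)
        print(f"{seconds:>4} s: {low:,} to {high:,} tokens")
```

Under this linear assumption, a ten-minute history is already several million tokens, which matches the "few million" context mentioned for naive video extension.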
A lot of the creators love using video extension to create longer-form videos. This is the part I liked very much, that you have an intermediate step toward the final goal instead of just a straight shot to the final version.
Swyx [00:55:48]: But I can see you have a strong vision of where we want to end up.
Long Context, Redundancy, and Efficient Interactive Video
Vibhu [00:55:51]: Does it seem like it's an efficiency issue? Okay, we're at a few million tokens of context. If you draw the parallel to language models, we had very short context, two thousand, eight thousand, then you scale it up to one million, ten million. Sure, there's effective context, but at the end of the day, it's just what's it worth? Sure, there's a whole training data side. In video, it might be slightly easier ‘cause we have a hundred million token video, right? Just take a movie with the full context there. Is this efficiency from an inference standpoint, that it's expensive, but we know how to solve it? Or why is this not the approach? So my broader point was on your second point of world models. You say it needs to be interactive and live, right? You should be able to play a game and see the interaction live. So one thing I see with research is a lot of what you actually serve is different than what you build, right? We talked about distillation. You train a big model, you distill it, you do quantization, speculative decoding. We do all this stuff to serve it efficiently. Should we not just have a solution, like a world model that can interact well, and do inference optimization, serve it, distill it second, so make it real time after you solve it? Another parallel is, say, continual learning, right? What we need is someone to solve it and show it works inefficiently. Give it a few years, people will make it efficient. Same thing with regular attention, right? It worked.
Over a few years, people have developed different forms of attention, and we've scaled it to be efficient at long context. So kind of two things there, right? One is it seems like it works. You've scaled it. Can we not just scale it a lot more efficiently over time? Do we need a separate approach if this works? And same thing with interaction, right? If we can get it done, if we can find some way that it works, we can solve making it more efficient from an inference standpoint later.
Ethan [00:57:53]: That's actually a very good point. In videos, there's actually a lot of redundancy. We solve a lot of the pixel redundancy with the VAE, but there's more redundancy in long-range, long-horizon videos. Say a character appears in the first clip and then disappears, and only reappears at the end of the video: you probably don't need that context in the middle of the generation. You only need that character where you need it. So that's why I helped build another feature. It's reference video.
Vibhu [00:58:36]: Is it here?
Swyx [00:58:36]: Is it the same model release or a different one?
Ethan [00:58:39]: It's a different one.
Ethan [00:58:41]: You probably need to search on
Swyx [00:58:43]: I'll find it
Ethan [00:58:43]: X, reference to video.
Ethan [00:58:46]: So reference video allows you to upload up to seven images as conditions and generate the video. They can be characters or objects or even scenes. Say I want to condition on Sean's selfie and holding a blade
Swyx [00:59:07]: We have a dog
Ethan [00:59:08]: or whatever.
Swyx [00:59:08]: We put the dog in the thing.
Ethan [00:59:09]: You can put them there and the video model will generate the video from them and copy the context over. So that can solve a lot of the problems there, like the long context problem. It doesn't need to have a very long context, but I feel like it's an intermediate solution.
The model
Swyx [00:59:29]: It's cheating.
Ethan [00:59:30]: the model should be able to selectively know where it should draw the references. Say I want to generate a movie, and I generate it autoregressively, like ten seconds at a time or something. And now this character appears, I can look back to where it first appeared and bring that back. Yeah, this one, I put the references. Yeah, that's Optimus, Einstein, myself, Annie.
Vibhu [01:00:02]: Oddly enough, I used Grok Search to find it, and it pulled your LinkedIn post. But yeah, we found it.
Ethan [01:00:08]: Interesting.
Vibhu [01:00:10]: But
xAI's Underrated Work, Culture, and Watermarking
Swyx [01:00:11]: This is a problem. This is not your fault, but xAI doesn't communicate all this work that you do very well because they just have the model release and then that's it. But actually, these details are very good.
Swyx [01:00:22]: As far as I understand, everything you just described is state-of-the-art, like no one else has done it.
Vibhu [01:00:30]: A lot of -- yeah, I have a lot more
Swyx [01:00:32]: And then you just put out this blog post with the cookies. I mean, this is not enough.
Swyx [01:00:37]: But obviously these are the high-level numbers that people want to know. But no, okay, so
Vibhu [01:00:42]: And I wonder, part of that is also some labs don't share research into what happens. And if
Swyx [01:00:50]: No, but this is literally bragging about how good they are, right?
Swyx [01:00:54]: Like, why would you not say that you are capable of extending with full context? This is not a secret sauce. This is like, we did the work. Yeah, I don't know.
Ethan [01:01:02]: Different labs have slightly different communication styles.
Swyx [01:01:07]: Anyway, if anyone from xAI is listening, we are always happy to help you tell your story. Yeah, okay, so you did references, and I think the point you're making is it is sort of like a kludge, right?
You can do seven, but what about 100?
Swyx [01:01:23]: Right? Then you need a completely different thing.
Ethan [01:01:26]: So I think this is a mechanism to select the context from the history, and you might not put the entire history into the context. For example, there's a paper called FramePack, which has
Ethan [01:01:41]: a heuristic that for the latest history, the last one second, I put in the entire history, and the history before that, I would compress it and make the video smaller. So they follow this overall pattern in which the maximum sequence length is fixed. The further you are from the current frame, the smaller the image. So this is just a heuristic. I think it can be more automatic: the model is aware of which part of the history can be selected. This part of the research is actually being actively worked on by a lot of people. It's also quite interesting. I feel this part of long context is a little bit ahead of the LLM part.
Ethan [01:02:31]: So for example, in LLMs, contexts keep growing. Let's say you call a tool and the tool call history is extremely long: that's still in context, and it keeps growing. Even if you switch the topic to something else, the whole context is still there. There are some agentic harnesses that help you to, say, prune the tool results, like when you query a file, only show the top 200 lines or something. Those are very heuristic-driven.
Swyx [01:03:08]: For listeners, we did a write-up on the Claude Code leak, where there are eight different kinds of pruning, including like you prune the tool results and all that.
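A minimal sketch of the fixed-budget idea described for FramePack: the most recent frame keeps its full token count and each older frame gets a geometrically smaller share, so the total stays bounded however long the history is. The halving ratio, the token counts, and the function name are assumptions for illustration; this is not FramePack's actual schedule or code.

```python
def history_token_budgets(num_frames: int, full_tokens: int, ratio: float = 0.5) -> list[int]:
    """Tokens allotted to each history frame, index 0 being the most recent.

    Each step back in time multiplies the allotment by `ratio` (rounded down),
    so frames far enough in the past get 0 tokens and drop out of the context.
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be between 0 and 1")
    budgets = []
    tokens = float(full_tokens)
    for _ in range(num_frames):
        budgets.append(int(tokens))
        tokens *= ratio
    return budgets


if __name__ == "__main__":
    full = 1536  # hypothetical token count for one uncompressed frame
    for n in (4, 16, 1000):
        budgets = history_token_budgets(n, full)
        # The geometric series bounds the total by full / (1 - ratio) = 2 * full.
        print(n, sum(budgets), "<=", 2 * full)
```

The design point is that the bound depends only on the per-frame size and the ratio, not on how many frames of history exist. A learned selection mechanism, as suggested in the conversation, would replace the fixed ratio with something the model decides.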
So you can read up on that kind of thing.
Ethan [01:03:17]: I think one breakthrough in continual learning might be a way for the model to automatically manage its own context.
Swyx [01:03:27]: These are all heuristics, and they will be replaced by machine learning.
Ethan [01:03:30]: Interestingly
Vibhu [01:03:32]: The
Ethan [01:03:32]: the same thing is being researched in both LLMs and video models.
Vibhu [01:03:36]: The interesting thing is also, in the paper you showed, it's actually happening at the model level, right? Compared to language models, sure, we have base attention, but we'll do our own compression, we'll do our own pruning, which is separate from model error.
Vibhu [01:03:49]: Eventually, it all just boils in, hopefully.
Swyx [01:03:52]: I think this is a form of attention, but also sort of reasoning attention. I feel like that's different than normal attention.
Swyx [01:04:03]: Does that make sense?
Ethan [01:04:04]: It's different in the sense that attention, not to mention, set sparse attention aside,
Artificial intelligence is rapidly reshaping the landscape of rare disease research -- but how close are we to realizing its full potential? In this episode, Tanya Binette (Director of Therapeutic Expertise, Rare Disease at ICON plc) and Roz Round (SVP of Operational Strategy, Patient and Site Engagement at Precision for Medicine) explore how AI is being applied across the rare disease clinical trial lifecycle, from drug discovery and protocol design to patient identification and engagement.The discussion highlights both the promise and complexity of using AI in a research space defined by small patient populations, fragmented data, and unique ethical considerations. Guests emphasize the importance of patient trust, regulatory alignment, and responsible innovation, while also identifying opportunities to accelerate trials, improve access, and empower patients through AI-enabled tools.
The crew reviews the Jack Daniel's Special Release Small Batch Rye, which consists of five high-proof batches of Tennessee Rye Whiskey sourced from Coy Hill, Boiler Hill, and Fire Brigade Fields barrelhouses. Presented at barrel proof in 375 mL bottles and proof points ranging from 142.7 to 146.1, the special release represents some of the most unparalleled and limited Tennessee Rye Whiskey to emerge from the Lynchburg distillery. A total of 129 Tennessee Rye Whiskey barrels with entry dates spanning 18 months (from February 2016 to August 2017) were matured for an average of 10 years across three barrelhouse groups. The well-known Coy Hill produced two of the five batches while two new locations, Boiler Hill and Fire Brigade Fields are where three additional batches were matured. Forged in the heart of Lynchburg, Coy Hill, Boiler Hill and Fire Brigade Fields showcase three unique landscapes at the Jack Daniel Distillery.
Today, we are dropping another episode in our "chats" series, specifically on the founder side, hearing from those scaling the companies themselves.
In this episode, we are talking with Daulet Amirkhanov, Founding Engineer of Bead AI. Daulet is going to take us through his years at Meta and Cognee, leading into how he is building Bead AI to take on compliance audits and AI automation.
Questions
* Tell me and my audience a little bit about you. You've gone from three years on high-throughput reliability infrastructure at Meta, to engineering the GraphRAG engine and semantic memory systems at Cognee, and you're now Founding Engineer at Bead AI, an a16z-backed startup building autonomous agent infrastructure for compliance audits. How did that journey shape the way you think about engineering for the age of autonomous systems?
* Let's zoom into the Meta years. For listeners who haven't worked at that scale: what was the exact piece of logging and reliability infrastructure you owned, what does "high-throughput" actually mean in numbers there, and what's one specific architectural decision from those years that still shapes how you build today?
* A lot of infra engineers stay in infra. You made a deliberate move from human-scale systems at Meta to agent-scale systems at Cognee. What did you see in that moment that convinced you AI agent infrastructure was the next distributed systems frontier, and not just the current hype cycle?
* Cognee is a GraphRAG and semantic memory company, and your work there was on the agent infrastructure side. Your biggest design call was decoupling the MCP architecture so multiple agentic systems can share unified memory through a standalone process, rather than each one coupling to its own Python runtime. Walk us through what problem that was solving and the key design decision you made.
* Give us a concrete example: an agent task that breaks when each agent has its own vector store, but works once they share unified state through the decoupled MCP architecture you built. What's the actual mechanism that makes the difference?
* Most engineers in this space come from an ML or applications background. You're coming at agent infrastructure from a pure distributed systems lens. What does that lens let you see that the ML-native crowd is missing?
* Bead is a16z-backed and going after compliance audits, which isn't the obvious first market for autonomous agents. You joined as Founding Engineer in January and are shaping the technical core now. From your seat: what makes compliance audits the right wedge for agent infrastructure, and what are the foundational decisions you're making today that will define what the product can do two years from now?
* Make a technical claim about agent infrastructure that most people in this space would push back on, and defend it. Where are you the dissenting voice?
* Without breaking anything confidential, what's the hardest unsolved problem on your plate at Bead AI right now, and how are you approaching it?
* Two years from now, what's the piece of agent infrastructure that we'll consider "obviously necessary" but doesn't exist yet? Who builds it, and what does it look like?
Sponsors
* Unblocked
* Braingrid.ai
* .TECH Domains
* Mezmo
Links
* https://usebead.ai/
* https://www.linkedin.com/in/amirdnur/
Our Sponsors:
* Check out Cash App and use my code CASHAPP10 for a great deal: https://click.cash.app/ui6m/mt82fpxl #CashAppPod. Cash App is a financial services platform, not a bank. Banking services provided by Cash App's bank partner(s). Prepaid debit cards issued by Sutton Bank, Member FDIC. See terms and conditions at https://cash.app/legal/us/en-us/card-agreement. Cash App Green, overdraft coverage, borrow, cash back offers and promotions provided by Cash App, a Block, Inc. brand. Visit http://cash.app/legal/podcast for full disclosures.
* Check out Plaud AI and use my code CODESTORY for a great deal: https://plaud.ai
Advertising Inquiries: https://redcircle.com/brands
Privacy & Opt-Out: https://redcircle.com/privacy
00:00-15:00: Vegas in Stanley Cup Finals. ML breaks down the journey. Thanks to Batavia Downs Gaming and Arc of Onondaga. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.
Joe Maionchi (Co-founder & COO) and Rod Christensen (Co-founder & Chief Architect) of RocketRide join the MLOps Community to walk through AIDE, the AI Integrated Development Environment. RocketRide is an open-source AI pipeline platform that lets developers build, debug, and run production-grade agentic AI workflows directly from their IDE, with support for 13+ LLM providers, 8+ vector databases, and full multi-agent orchestration.
AI Is Fast. AI Projects Are Slow. Let's Fix That. // MLOps Podcast #378 with RocketRide's Joe Maionchi (Co-founder & COO) and Rod Christensen (Co-founder & Chief Architect)
A huge shout-out to RocketRide for this collaboration!
Join Steven Walchek, Co-Founder and CEO of Liminal, for a deep dive into the "adoption paradox" facing the modern enterprise. Despite billions in AI investment, most organizations remain trapped in perpetual pilots. A serial entrepreneur with over $1.1B in exit value and a former CINO at FIS, Steven argues that the failure isn't technical—it's strategic. In this episode, we explore why forcing standardization kills impact and how the industry is shifting toward "Secure AI Enablement" that learns from actual user behavior to autonomously deploy capabilities where they matter most.
00:00-15:00: ML says the league and hockey community are now talking about the Sabres because of their great season, and that can only be a good thing. Draft, trades, team, more. Thanks to Batavia Downs Gaming and Marz Motors. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.
How do you know if an AI system is trustworthy, compliant, ethical, and fit for purpose? In this episode of the FIT4Privacy Podcast, Punit Bhatia is joined by Stella Liu, an AI evaluation expert and founder of AI Evals & Analytics, to unpack one of the most practical (and overlooked) challenges in AI today: how to evaluate AI systems before and after deployment.
KEY MOMENTS
02:09 - AI Definition
03:02 - AI Evaluations
10:31 - Why AI Testing Is Hard
14:06 - Evals Plus Analytics
18:15 - Synthetic Data
23:47 - Protecting Privacy Ethical
29:05 - AI Evals as a Company
29:52 - How to reach Stella Liu
Stella explains why AI behaves differently from traditional software, why testing code alone is no longer enough, and how AI evaluations (AI evals) help organizations assess real‑world behavior, risk, and performance. From evaluation‑driven development to continuous monitoring in production, the conversation explores how teams can move beyond guesswork and hype toward repeatable, measurable AI governance.
ABOUT THE GUEST
Stella Liu is the Co-founder of AI Evals & Analytics (Maven), where she created the AI Evals & Analytics Playbook and teaches top-rated courses on LLM evaluation, monitoring, and product alignment. She's also the Head of AI Applied Science at ASU, leading evals and analytics across university-wide AI products and building higher-ed's first formal AI evaluation framework, and she previously led data science at Shopify and Carvana with 12+ years shipping large-scale ML systems.
ABOUT THE HOST
Punit Bhatia is one of the leading privacy experts who works independently and has worked with professionals in over 30 countries. Punit works with business and privacy leaders to create an organization culture with high privacy awareness and compliance as a business priority. Selectively, Punit is open to mentor and coach privacy professionals.
Resources & Links
Guest Links: Stella Liu • Website: https://maven.com/ • LinkedIn: https://www.linkedin.com/in/wenxingl/
Grow Skills (Privacy Courses & Insights) • Courses: https://growskills.store/courses/ • Insights: https://growskills.store/insights/ • Website: https://growskills.store/
FIT4Privacy • Website: https://www.fit4privacy.com • Podcast: https://www.fit4privacy.com/podcast • Blog: https://www.fit4privacy.com/blog • YouTube: http://youtube.com/fit4privacy
Punit Bhatia • Website: https://www.punitbhatia.com
Books • Be Ready for GDPR • AI & Privacy – How to Find Balance • Intro to GDPR • Be an Effective DPO
In this episode of ACM ByteCast, Rashmi Mohan hosts 2025 ACM Fellow Cynthia Rudin, the Gilbert, Louis, and Edward Lehrman Distinguished Professor of Computer Science, Electrical and Computer Engineering, Statistical Science, Mathematics, and Biostatistics and Bioinformatics at Duke University, where she leads the Interpretable Machine Learning Lab. Her lab, which seeks to design predictive ML models that people can understand, focuses on areas including healthcare, criminal justice, and energy reliability. Among her honors, she has received the Squirrel Award for Artificial Intelligence from the Association for the Advancement of Artificial Intelligence (AAAI), as well as the IJCAI John McCarthy Award. Rudin was recently named an ACM Fellow for contributions to and leadership in interpretable machine learning and societal applications. In the interview, Cynthia clarifies the crucial distinction between "interpretable" and "explainable" AI and makes the argument that true interpretability is foundational to trustworthy, ethical AI. She shares her extensive field experience collaborating with Con Edison engineers on power grid maintenance, neurologists on medical diagnostics, and the Cambridge Police Department on crime series detection, countering the widespread industry myth that AI performance must be sacrificed for transparency. She describes an innovative paradigm her lab developed to solve the "interaction bottleneck" between data scientists and domain experts, leveraging "Rashomon sets" to generate millions of equally accurate models simultaneously, using human-computer interaction (HCI) tools to create visual, encyclopedia-like interfaces.
BuzzHPC Roundtable episode: Architecting Modern AI Systems: Platforms, Agents, and Integration
Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter
MLOps GPU Guide: https://go.mlops.community/gpuguide
Big shout-out to BuzzHPC for the collaboration!
// Abstract
As AI systems evolve into more autonomous, agent-driven architectures, the way we design platforms, tools, and infrastructure is rapidly changing. In this session with BuzzHPC, we explore the shifting boundary between platforms and tools, what developers expect platform providers to handle versus what they want to control and build themselves. We unpack what modern agentic stacks look like today, how teams are structuring them in production, and where these architectures are heading as systems become more complex and distributed. A key focus will also be on agent interoperability, how different agents communicate, coordinate, and operate within shared environments. Finally, we share insights and lessons from a recent AI hackathon delivered in partnership with Bell, Buzz, Mila, and KHP, highlighting how these concepts are being tested and applied by builders in real-world scenarios.
// Bio
Allen Roush
Allen has held senior technical and AI leadership roles at companies like Oracle and Intel. He's very active in the AI research space and open source communities. He's passionate about improving the creativity and coherence of AI systems.
Frédéric Bénard
Frédéric is Senior Director of AI Applications Development at Mila (Quebec AI Institute), where he leads a team focused on building the engineering foundations for applied AI systems. His work centers on translating cutting-edge research into scalable applications, including AI-driven platforms and agent-based systems used across research and industry collaborations.
Shuo Wang
Shuo leads the Responsible AI Office for Bell Canada, where all AI use cases are reviewed and assessed for potential harm and bias. Previously, he led a team of data scientists to expand a large-scale ML program to improve customer support effectiveness.
// Related Links
Website: https://www.buzzhpc.ai/
~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community [https://go.mlops.community/slack]
Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
Sign up for the next meetup: [https://go.mlops.community/register]
MLOps Swag/Merch: [https://shop.mlops.community/]
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Allen on LinkedIn: /allen-roush-27721011b/
Connect with Frédéric on LinkedIn: /benard/
Connect with Shuo on LinkedIn: /shuow/
00:00-15:00: TB Rays are rolling. ML explains why. Thanks to CH Insurance and Welch & Company Jewelers. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.
Olga Goriunova rejects digital abstractions as mirror images of ourselves and reflects on why we concern ourselves with representations that aren't concerned about us. Olga and Kimberly discuss how cultural imagination is shaped by technology; digital subjects as unnatural constructs; the distance between individuals and their digital profiles; banal categorization and subjective truth; how statistics and ML changed the concept of the ideal; the limits of digital subjects; extreme individuation and aspiring to become our digital reflections; how current predictions create future realities; why the ideal digital subject isn't concerned with you; and thinking critically about what we desire and why. Olga Goriunova is a cultural theorist working at the intersection of technology, philosophy, and aesthetics. A Professor of Media Arts at Royal Holloway, University of London, Olga is the author of the critically acclaimed book Ideal Subjects: The Abstract People of AI. Additional Resources: Aksioma: Institute for Contemporary Art Book Lecture Olga Goriunova Academic Profile A transcript of this episode is here.
Could putting a few drops of breast milk in a preterm infant's nose actually improve cerebral oxygenation? In this episode of Journal Club, Daphna reviews a randomized controlled trial from the European Journal of Pediatrics investigating the physiologic effects of intranasal expressed breast milk (EBM) administration in preterm infants. The study found that infants receiving 0.2 mL of fresh breast milk intranasally three times daily showed significantly higher cerebral oxygenation levels, along with more favorable trends in heart rate and respiratory rate, compared to controls. While time to full oral feeding and length of hospital stay were unchanged, the safety data is reassuring. Ben and Daphna discuss what outcomes we should even be measuring, and whether the evidence is already good enough to just do it. ---- Effect of intranasal breast milk administration on cerebral oxygenation, vital signs, and transition time to full oral feeding in preterm infants: a randomized controlled study. Yücel A, Küçükoğlu S, Konak M. Eur J Pediatr. 2026 Apr 16;185(5):272. doi: 10.1007/s00431-026-06922-6. PMID: 41986747 As always, feel free to send us questions, comments, or suggestions to our email: nicupodcast@gmail.com. You can also contact the show through Instagram or Twitter, @nicupodcast. Or contact Ben and Daphna directly via their Twitter profiles: @drnicu and @doctordaphnamd. The papers discussed in today's episode are listed and timestamped on the webpage linked below. Enjoy!
0:00 - What do you expect to see from the Avalanche tonight in Game 4? What do you want/need to see from the Avs? Do you want to see them win a game for the sake of pride? Or would you rather they just lose, head to Cancun, and start getting healthy? 15:37 - For everyone who wants Bednar fired, what do you want him to do differently? WHY do you want him fired? Brett is sick of people hammering the "Fire Bednar" button. Give a specific reason why you want him out. 32:36 - First of all, the song we were jamming to was "Waffle House" by the Jonas Brothers. Longtime listeners of ML&K know that banger inside and out. After that, tonight we're gonna party like it's 1999! The Knicks FINALLY made it back to the NBA Finals for the first time in 27 years. They're not just winning playoff games. They're DEMOLISHING their opponents. This is a reign of terror on the Eastern Conference.
Join us as Brian Hough (CEO & Founder of Tech Stack Playbook, AWS Hero) gets brutally honest about the state of tech hiring and what skills developers actually need to survive - and thrive - in the AI era. Brian walks through his frontline perspective on why tech layoffs aren't about skills - they're about market economics - and what that means for engineers trying to stay relevant. You'll learn which roles are actually hot right now (ML engineer, AI engineer, cloud architect, full stack dev), why companies want utility players who can build end to end, how to use social media and building in public to get quietly hired, and why the engineers who thrive will be those who can go from vision to deployed system. Brian also covers practical strategies for positioning yourself before the next wave hits, including using roadmaps as a personal curriculum and leveraging AI as a career accelerator rather than a threat. Timestamps 0:00 Cold Open 0:11 Welcome & Introduction 2:16 Taking Vibe Code to Production-Grade Systems 3:01 Brian's Update: Dog Feeding & Building Internal Tools 8:05 Mac Maximus: Building on AWS EC2 Mac 9:49 Let's Get Into the Presentation 10:10 Agenda Overview 11:11 Is Anyone Actually Working Less Because of AI? 12:52 What Happens When You Don't Understand What You Built 20:10 AWS Root Account Horror Story 23:24 The Skills You Need in 2026 24:09 Tech Scene Overview & Job Posting Divergence 26:19 What Companies Actually Want: Utility Players 28:00 Hot Roles: ML Engineer, AI Engineer, Cloud Architect 32:00 The Layoff Reality: It's Market Economics, Not Skills 40:49 Now Is the Best Time to Start a Startup 42:31 Roles & Salaries Breakdown 43:55 This Advice Is for Everyone - Not Just Job Seekers 48:01 What's Getting Replaced vs. 
What's Irreplaceable 49:14 How to Become an Irreplaceable Engineer 52:42 Maximum Viable Product 53:02 Building in Public & Social Media Strategy 55:32 Positioning Yourself Before the Next Wave 56:19 Brian's Closing Thoughts 57:03 AI on Your Resume = Getting Hired Fast 58:12 Using Brian's 30-Day Plan as a Claude Curriculum 59:55 Platform Engineering Hot Take 1:03:05 Wrap-up & See You in Seattle How to find Brian: https://brianhhough.com/techstackplaybook Links from the show: https://roadmap.sh/python https://roadmap.sh/ai-engineer https://roadmap.sh/machine-learning https://roadmap.sh/ai-agents
In this special Onward and Upward segment episode of Mission Matters, Adam Torres interviews ML Bruin, author of The Noah Series of Books. ML shares updates on his upcoming children's book, discusses his nonfiction guide A Guide for Navigating Career Success From Day One, and explains how mentorship, storytelling, and positivity continue to shape his mission of helping others grow personally and professionally. Follow Adam on Instagram at https://www.instagram.com/askadamtorres/ for up to date information on book releases and tour schedule. Apply to be a guest on our podcast: https://missionmatters.lpages.co/podcastguest/ Visit our website: https://missionmatters.com/ More FREE content from Mission Matters here: https://linktr.ee/missionmattersmedia Learn more about your ad choices. Visit podcastchoices.com/adchoices
Guthrie Cooper (Senior Group Product Manager, AI & Robotics) and Nidhi Sharma (Global Head of Engineering AI & Incubation) from Just Eat Takeaway.com join the MLOps.community to pull back the curtain on how one of Europe's largest food delivery platforms is running an internal innovation engine. From autonomous delivery robots to agentic AI voice assistants, they share what it actually takes to build like a startup inside a 40,000-person company. Inside Just Eat's AI Lab: Voice Agents & Agentic Commerce // MLOps Podcast #377 with Just Eat Takeaway.com's Guthrie Cooper (Senior Group Product Manager, AI & Robotics) and Nidhi Sharma (Global Head of Engineering AI & Incubation)
Paige Bailey is the AI Developer Relations Engineering Lead at Google DeepMind. Prior to returning to Google DeepMind, Paige spent just over a year at Microsoft as a director of machine learning and MLOps at GitHub, working on projects like GitHub Codespaces, VS Code, and Copilot. As a former applied ML engineer (in Azure Research, Chevron, and on NASA projects), Paige can't imagine a more exciting charter than accelerating developer productivity and creativity with machine learning. You can find Paige on the following sites: Blog, X, GitHub, LinkedIn. PLEASE SUBSCRIBE TO THE PODCAST: Spotify, Apple Podcasts, YouTube Music, Amazon Music, RSS Feed. You can check out more episodes of Coffee and Open Source on https://www.coffeeandopensource.com. Coffee and Open Source is hosted by Isaac Levin
00:00-15:00: ML breaks down why the Buffalo Sabres should or shouldn't sign Alex Tuch. It's all about the cash and years, then you can justify the production both ways. Thanks to Batavia Downs Gaming and Ken's Auto Detailing. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.
Oriol Vinyals, VP of Research at Google DeepMind and co-lead of the Gemini program, joins Jacob the day after Google I/O to unpack the research underpinning Google's latest announcements and where frontier AI is heading. The conversation moves from world models (why Google has uniquely bet on them as a path to AGI, what the "GPT moment" for video and images would look like, and how they connect to robotics and simulation) to agents (the Spark release, why the system and model need to be optimized jointly, and why scaffolding will eventually be written by models themselves). Oriol gets into the mechanics of memory in models, drawing on his cognitive neuroscience background to argue that file-system-style non-parametric memory is more practical than baking memory into weights at serving scale. He shares his views on the limits of RL today (LLMs are data-limited in a way that game-playing RL never was), why training on narrow domains like math and code generalizes surprisingly well, and what a true "Move 37" moment for science or ML research would look like. Throughout, he reflects on the unique advantages of being inside Google (TPU co-design, end-to-end revenue stability, the merger of Brain and DeepMind), the trade-offs between focus and exploration in research orgs, and why he believes AGI in some meaningful sense may already be here, even if the goalposts keep moving. (0:00) Intro (1:36) Why World Models (4:21) The GPT Moment for Video (7:51) What Makes Omni a World Model (10:04) World Models & Robotics (12:37) Evaluating Physics in AI (14:51) Consumer Agents & Spark (18:39) Scaffolding & the Bitter Lesson (22:06) Memory & Continual Learning (26:54) Research Bets Inside Big Labs (32:30) Post-Training RL is Greenfield (35:57) What Real Intelligence Looks Like (39:11) RL Generalization (43:00) Advice for Founders (46:40) Can AI Truly Innovate? (49:48) Recursive Self-Improvement (52:14) Quickfire With your host: @jacobeffron - Managing Director at Redpoint
It is getting hot in California, which has us thinking about the massive carbon footprint of healthcare. The emergency department is famously resource-heavy, but can we save lives and reduce waste? Dr. David Barnes joins us to explain how going green isn’t just about being a “tree hugger”—it's about saving money, cutting waste, and making our hospitals resilient against supply chain chaos. Defining Healthcare Sustainability Balancing Safety and Footprint: Sustainability in healthcare means delivering efficient, affordable care that minimizes resource waste while remaining clinically safe and meaningful. The Power of Resiliency: A sustainable healthcare system is inherently a resilient one. Reducing reliance on single-use items and utilizing local renewable energy sources (like microgrids) protects hospitals from supply chain disruptions caused by geopolitical conflicts or weather-driven power grid failures. The Three Scopes of Emissions Scope 1 (Direct): Emissions directly produced by hospital operations, such as idling fleet vehicles and leaking anesthetic gases. Scope 2 (Indirect): Purchased energy used to power and heat the facilities (e.g., local electricity and steam lines). Scope 3 (Supply Chain): The largest bucket, making up 60% to 80% of healthcare emissions. This includes employee commutes, medical waste incineration, manufacturing of disposable devices, and food production. Clinical Traps: Where We Waste the Most Pre-packaged Kits: Studies show 75% to 80% of items inside specialized kits (like central lines) go completely unused and are thrown away. Over-Preparation: Opening multiple single-use items (like various ET tube sizes) or donning full trauma PPE for minor injuries creates an immediate, unnecessary trash stream. Pharmaceutical Waste: Standard packaging size leads to heavy drug wasting (e.g., using 5 mL from a 100 mL propofol bottle). This regulated medical waste is costly and energy-intensive to incinerate. 
The Glove Epidemic: Glove overuse skyrocketed during COVID-19 and became a habit. Most routine encounters carry no contamination risk, making glove use clinically unnecessary. Shifting the Culture “Take What You Need, Leave What You Don’t”: Avoid opening supplies you may not need or bringing extra gauze or syringes into a room. Due to infection safety protocols, these often end up in the trash. Watch Where You Toss: Keep coffee cups and paper out of the red biohazard bins. Regulated medical waste costs six times more to process and must be incinerated, creating massive greenhouse gas emissions. Embrace Reprocessing & Reusables: Support partnerships with companies that safely clean and reuse devices historically labeled “single-use” (like EKG leads or waffle mattresses). Swap disposable plastic gowns for reusable cloth gowns that survive 90 washes. Model the Behavior: Culture change takes patience and persistence. Instead of finger-wagging or shaming colleagues, visibly adopt sustainable habits to drive grassroots practice changes. Key Takeaways for the ED Clinician Speak up on bad design: Clinicians are on the front lines of waste. Advocate for local sustainability initiatives to grab the attention of hospital executives who handle major purchasing contracts. Normalize virtual alternatives: Protect staff well-being and slash commuting emissions by offering Zoom or Teams options for short, solitary administrative meetings. Keep it in perspective: Healthcare sustainability is about finding the sweet spot where clinical safety, resource utilization, and environmental impact meet. Hosts: Dr. Julia Magaña, Professor of Pediatric Emergency Medicine at UC Davis Dr. Sarah Medeiros, Professor of Emergency Medicine at UC Davis Guest: Dr. 
David Barnes, Professor of Emergency Medicine, Director of ED Sustainability, and Member of the Sustainability Committee at UC Davis Health Resources: Practice Greenhealth Health Care Without Harm Green ED (Royal College of Emergency Medicine) *** Thank you to the UC Davis Department of Emergency Medicine for supporting this podcast and to Orlando Magaña at OM Productions for audio production services.