Anthropic pulled the plug on its Mythos / Fable 5 model after the U.S. government raised concerns, and IREN has completed its acquisition of Nostrum for 490 MW of capacity in Spain. Welcome back to The Blockspace Podcast! Anthropic and Uncle Sam are trading blows again, with the frontier LLM company pulling its recently released Mythos / Fable 5 model after whistleblowers said the model's guardrails were bypassed. Lygos Finance's CEO Jay Patel joins us for his reaction to the news and to the market rally on reports of an imminent peace deal in the Iran War this week. In other news, we cover IREN closing its acquisition of Nostrum, which will give it a 490 MW foothold in Spain for AI data center development, and the EPA's stance that it won't regulate AI data centers. Check out Dimetrics, the AI industry's Bloomberg terminal. Track financial metrics and news for AI stocks, GPU rental prices, state-by-state data center pushback, and more with the compute industry's most powerful dashboard. Subscribe to our newsletter to receive updates for all of our shows and content.
The race to build superintelligence is producing models that keep getting better at objective problems, but not at behaving like actual people. Joon Sung Park, founder and CEO of Simile and creator of Stanford's "Smallville" generative agents study, argues that simulating human society requires a fundamentally different kind of model. He frames today's frontier models as the "CPU of intelligence"—rational, superhuman at problems with right answers—and Simile as creating the "GPU of intelligence," built to encode the diversity of people's values, preferences, and tastes. It simulated 1,000 Americans and predicted their behavior 85% as accurately as people reproduce their own answers. CVS uses it for concept testing; some customers simulate their own earnings calls. Joon's larger bet: a "CERN of human society" that could one day model bank runs, climate cooperation, or the early signals of a collapsing democracy. Hosted by Sonya Huang, Sequoia Capital
SpaceX debuted today after a $75B IPO raise, closing with a $2.11 trillion market cap, and Anthropic is searching for 1 GW to host its own GPU clusters. Welcome back to The Blockspace Podcast! SpaceX's historic IPO came and went today, marking a day of firsts that saw the company close the largest IPO ever at a $2 trillion valuation, making its founder Elon Musk the world's first trillionaire. Nakamoto's Brandon Bailey joins us to discuss the IPO and the current state of the AI stock market and bitcoin, plus his project Dimetrics, a Bloomberg-esque terminal for the data center space. In other big news, Anthropic has reportedly entered into 12 letters of intent to rent 1 GW+ of data center space for its first-ever self-owned GPU clusters. Check out our latest report, “What's a Megawatt Worth?” where we quantify the trillion dollar opportunity for bitcoin miners venturing into the AI sector. Subscribe to our newsletter to receive updates for all of our shows and content.
IPO price: $135. Retail: process, allocation, confirmation. Gavin Baker on the 4th largest cloud, ahead of Oracle. Jensen likes to give GPUs to people that can use them. Brad Gerstner on how “smart” people lose money. Price target lower by $20.
There are two great forces reshaping the world of energy today. The AI boom and the wave of investment in new data centres have sent power producers scrambling for generation capacity to meet soaring electricity demand. At the same time, the severe disruption to shipping traffic through the Strait of Hormuz has put security of supply at the top of every importer's agenda. In this special episode, recorded at Wood Mackenzie's Gas, LNG and the Future of Energy Conference in London, host Ed Crooks speaks with three guests about what these twin pressures mean for gas. They discuss demand for gas for power, the sources of supply that could provide energy security in volatile times, and plans for tackling the increased greenhouse gas emissions that could result from increased consumption.
First, Ed sits down with Neal Kalita, senior director of global energy management at NTT Global Data Centers, one of the world's largest data center developers. Neal explains why "speed to power" is a priority, and why gas plays such a key role in providing the reliable 24/7 firm capacity hyperscaler clients require.
Relying on gas as a key component of the power generation mix means managing a complex set of issues around supply security, demand management and long-term investment. Neal explains how NTT thinks about commodity risk, the trade-offs involved in power supply agreements, and why on-site gas generation may be not just a bridge solution but long-term infrastructure for the electricity system. He highlights the key drivers that are changing the data centre industry, including rising GPU power density, AI-driven volatility in load, and climate-related grid reliability concerns.
He also discusses NTT's participation in a demand response programme run by Voltus, which helped stabilise the grid when Winter Storm Fern hit Virginia in January.
Next, Ed hears from Keith Shoemaker, Chief Commercial Officer at Coastal Bend, which is developing a new LNG liquefaction project at Corpus Christi, Texas. Coastal Bend is aiming to have the first project in the US to integrate carbon capture and sequestration into its design. Combined with the procurement of upstream gas with low methane leakage and flaring, that should make for the lowest carbon-intensity LNG in the world, Keith says. Crucially, the project can match competitor prices without charging a green premium. The US 45Q tax credit will cover the operational spending (Opex) for the transport and sequestration of the carbon, and costs will be kept down by using brownfield maritime infrastructure that is already in place. Regulation will still be essential in creating a market for lower-emissions LNG. Keith sets out an idea for making that work in the EU: linking the new Methane Emissions Regulation with the Carbon Border Adjustment Mechanism to create an "avoided carbon" currency that LNG importers could use to offset CBAM fees on other products such as cement, steel and fertiliser. That way, the methane regulation would change from a stick to a carrot for the LNG industry.
Kristy Kramer, Head of LNG at Wood Mackenzie, closes the episode by assessing how the three trends of AI demand, energy security and decarbonisation fit together. She discusses the big question: has the conflict in the Middle East changed the world completely, forever? It may play out like the Covid pandemic. Huge changes were predicted, and although there were some permanent impacts, in other areas the world has gone back to the way it was before. Politics will change from week to week, or even from hour to hour, but geology and economics don't, and over time the fundamentals will reassert themselves.
Kristy and Ed reflect on what that means for the future of energy. See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
Hey folks, Alex here, and welcome to a BIG MODEL week! We finally got Mythos (well almost)! Let me catch you up! This week started with WWDC26 from Apple, and Max Weinbach, who was in the room at Apple Park and actually has access to some of the new features including an all new SIRI AI, joined us to break down what could be the most used AI in the world very soon. At first I was skeptical, but he convinced me that the new Siri is actually good! Then, we saw the ultimate model drop: Anthropic finally shipped Mythos (X, my system card thread, benchmarks). Same weights, two names: Mythos 5 is the unrestricted version that only Project Glasswing partners get, Fable 5 is what the rest of us get, wrapped in the heaviest guardrails I've ever seen ship on a frontier model. It's state of the art on nearly every benchmark. The model that was “too dangerous to release” is now... well, released, but with the heaviest guardrails we've seen. More on this later. Peter Gostev from Arena.ai joined us to break down the new model. Last but definitely not least, Google released a real-time translation model that our friend Thor Schaeff from DeepMind demoed live, while we all spoke in different languages and it translated us in REAL TIME. It was really cool, definitely check that out. There's quite a few more things, like Loop Engineering Alpha, Swyx came by to talk about FrontierCode, OpenAI confirmed our suspicions that the anti-datacenter social media posts could be a concerted effort by groups linked to the Chinese government, and much more. Let's dive in! ThursdAI - Let me catch you up, every week!
PHP Podcast – June 11, 2026 Guest Hosts: Sara Golemon, Elizabeth Barron & Holly Schilling Eric and John are out this week — Sara, Elizabeth, and Holly take over. Here’s what they covered: PHPVerse Recap PHPVerse just wrapped up, and Elizabeth was there in Amsterdam. The format is unusual — all speakers are flown to one location, but the audience is entirely virtual. It was a class act: professional TV crew, studio lighting, and a makeup and hair team on site. Around 2,500–3,000 people watched the live stream. Everything was broadcast as one long block; individual talk segments and possibly the documentary trailer will be cut and released separately. The full stream is available now — the PHP documentary trailer (produced by Jet Breeze, covering 30+ years of PHP history) appears around the 2:24:30 mark. PHP Foundation 2026 Strategy Document Elizabeth and the PHP Foundation released their 2026 strategy document the same day as this recording. The foundation gathered community input across numerous conversations and conferences, synthesized it into findings, and has now published a plan for the rest of the year. Key themes: repositioning PHP’s public perception (which Elizabeth calls a solvable problem), creating six special interest groups, and launching an Onboarding Initiative to build a real on-ramp for new PHP developers. Elizabeth’s view is that the two things giving her the most hope for PHP’s future are the passion and expertise of the community, and how good the language itself has gotten. Visit thephp.foundation to read the full document. The Onboarding Initiative One of the six special interest groups the foundation is launching is specifically focused on bringing new developers into PHP. Goals include creating a true learning path (not just a reference manual that assumes existing knowledge), improving educational resources, and potentially working with the php.net website to improve the first-time experience. 
Holly made the point that PHP’s barrier to entry is genuinely lower than almost any other language — the Hello World program is 11 characters — but that story isn’t being told outside the PHP bubble. New developers are turning to JavaScript as a first language and running into minified spaghetti instead of something approachable. AI Writing PHP — And PHP as a Second Language Holly built the entire PHP Tek conference app backend in Laravel without writing a single line of code herself — AI-generated throughout, which she reviewed and approved. The code held up to peer review at the conference with only minor style nits. She ran it on PHP 8.3 and used modern standards throughout (one piece of feedback: stop using empty()). The consensus: AI models write good modern PHP because of the vast amount of open source PHP they were trained on. The caveat Sara raised is worth thinking about — how much of that training data is PHP 4-era code and WordPress 3 repositories? Either way, Holly’s case for PHP as a second language is strong: low ceremony, low boilerplate, readable syntax, and it’s a language where you can do something useful in minutes. PHP’s Reputation Problem (and Why It’s Fixable) The group dug into PHP’s perception gap — the mismatch between how good the language actually is and how it’s perceived outside the community. Holly’s experience as a mobile developer who recommends PHP to others: the pushback is immediate (“isn’t that slow?”, “isn’t that dead?”). The benchmarks don’t support that reputation — PHP outperforms Python on most comparable workloads — but data alone doesn’t shift perception. Elizabeth’s point is that this is primarily a storytelling and coordination problem, not a language problem, and that the foundation’s repositioning work is exactly aimed at closing that gap. The community has the passion. It just needs to tell the story outside its own bubble. 
PHP Polling API RFC Sara walked through the RFC for a new Polling API in PHP (wiki.php.net/rfc/poll_API). The short version: PHP currently has five or six different ways to do I/O multiplexing (watching multiple streams and acting on whichever one is ready first), and which one works depends on the OS, available extensions, and PHP version. The Polling API proposal creates a single, unified interface that abstracts all of that. The immediate beneficiaries are async frameworks like Amp PHP, ReactPHP, and Revolt, which currently have to maintain multiple backend implementations to cover different environments. The bigger picture: this is a building block on the path toward true async PHP, likely contributing to something more complete in PHP 9.0. Most app developers won’t use it directly — but the libraries they depend on will. RFCs are all listed at wiki.php.net/rfc. PHP.net: Do As We Say, Not As We Do Sara, who has contributed to php.net, copped to the state of the codebase: some of it dates to the PHP 3 era, there are functions.inc files, and it is very much “do as we say, not as we do.” The historical reason is that php.net used to rely on community-administered mirrors (r-synced servers running everything from PHP 5.1 to 5.6 simultaneously), so modernizing the code was impossible without controlling the runtime. That’s changed with CDN-based load balancing — they can now control what PHP version runs on php.net — and the code has been getting better. But it’s a slow process. PHP Podcasts Past, Present, and Future Holly asked about the PHP Town Hall podcast (Ben Edmonds and Phil Sturgeon), and the group did a quick tour of PHP podcast history. The PHP Roundtable — originally started by Sammy, taken over by Eric — has produced about three episodes. Sara and producer Joe are planning to take it off Eric’s hands and actually do it properly. 
And Elizabeth announced that the PHP Foundation is launching a new podcast: tentatively called PHP at Scale, hosted by Ben Marx, focused on telling the stories of organizations pushing PHP to its limits. No launch date yet, but there’s already a queue of interested guests. Next Week’s Show — Moved to Wednesday Sara will be on a boat off the coast of Galicia on Thursday, so next week’s episode is moving to Wednesday. Guests will include Paul Reinheimer and (hopefully) Sean Coase — two veterans from PHP’s podcasting past. Elizabeth is going to try to make it work around the Canadian Grand Prix. Mac Mini M4 for Local LLMs Holly picked up a refurbished Mac Mini M4 (16GB RAM, 512GB storage) specifically to run LLM models locally via Ollama. Apple Silicon is a solid choice for this because the unified memory architecture gives the neural cores access to far more RAM than a discrete GPU setup. Sara is waiting for the M5, which is reportedly not coming until fall — and is already resigned to spending too much on it when it lands. Links from the show: PHP Foundation — 2026 Strategy Document PHP RFC: Polling API PHP RFC Wiki — All RFCs Under Discussion Amp PHP — Async framework ReactPHP — Event-driven async PHP Revolt — Event loop for PHP php.net website source code (github.com/php/web-php) PHP Architect Discord Guest Hosts: Sara Golemon Based in Lisbon, Portugal PHP core contributor; code contributor via the Curl project (which means she technically has code on Mars) Elizabeth Barron Executive Director, PHP Foundation Based in Germany Holly Schilling Primary mobile developer; built the PHP Tek 2026 conference app Based near Chicago, IL Streams: Youtube Channel Twitch Connect & Hire PHP Architect Website Twitter/X Mastodon Hire PHP Developers Looking to hire PHP developers? Email support@phparch.com – Joe and the team are available for consulting, infrastructure work, Ansible playbooks, and code review. 
Partner This podcast is made a little better thanks to our partners Displace Infrastructure Management, Simplified Automate Kubernetes deployments across any cloud provider or bare metal with a single command. Deploy, manage, and scale your infrastructure with ease. https://displace.tech/ PHPScore Put Your Technical Debt on Autopay with PHPScore Music Provided by Epidemic Sound https://www.epidemicsound.com/ Join Us Live Next Week Note: Next week’s show is on Wednesday (not Thursday) with guests Paul Reinheimer and Sean Coase. Youtube Channel Got feedback? Join us on Discord at discord.phparch.com The post The PHP Podcast 2026.06.11 appeared first on PHP Architect.
Computex 2026 brought a series of important announcements for the technology market, but few drew as much attention as RTX Spark, NVIDIA's new platform for AI-accelerated computing on local devices. In this episode of Diocast, we discuss what exactly RTX Spark is, which problems it aims to solve, and how it is positioned in a market that already includes solutions such as Snapdragon X Elite, Ryzen AI, and the new Intel processors with dedicated acceleration for artificial intelligence.
More than simply launching a new chip, NVIDIA appears to be expanding its presence beyond traditional graphics cards. RTX Spark combines an ARM-based CPU, a GPU with technologies derived from the RTX ecosystem, and dedicated resources for artificial intelligence workloads. In practice, this could make room for more efficient computers capable of running AI models locally, reducing dependence on cloud services and improving aspects such as privacy, latency, and availability.
---
https://diolinux.com.br/podcast/lancamento-da-nvidia-rtx-spark.html
The current surge in RAM prices has not come out of nowhere. It is directly tied to the explosion of artificial intelligence. AI accelerators, notably the GPUs used in data centers, consume considerable quantities of very fast memory. The result: demand outstrips supply, prices climb, and part of the technology industry finds itself caught off guard.
But in this strained landscape, one player claims to have seen the crisis coming: NVIDIA. According to Colette Kress, the group's chief financial officer, the company had anticipated the shortage. In an interview reported by Wccftech, she explains that NVIDIA "knew this was going to happen," unlike other companies surprised by the scale of the phenomenon. In her view, the squeeze was predictable, provided one looked far enough along the supply chain.
To understand what is at stake, we need to go back to HBM, or High Bandwidth Memory. This is very high-bandwidth memory, designed to transfer enormous amounts of data very quickly between the chips and the AI models. It is indispensable for training and running large modern models. Each accelerator can carry tens, or even hundreds, of gigabytes of this ultra-fast memory.
The problem is that HBM production ties up industrial resources close to those used to make other memory, such as the DDR found in consumer computers. When AI absorbs a growing share of that capacity, the rest of the market mechanically tightens. Smartphones, PCs, consoles, and consumer components can then see price increases. NVIDIA says it limited this risk by placing orders very early. But the group did not settle for buying what already existed.
Colette Kress explains that the company works directly with the three major memory suppliers, presenting them with its future needs and its upcoming architectures. In other words, NVIDIA does not merely endure the supply chain: it tries to shape it upstream. It is a strategy that illustrates the company's current power. In the AI race, the winner is not only whoever designs the best chips, but also whoever secures the memory needed to run them. Hosted by Acast. Visit acast.com/privacy for more information.
This Week In Startups is made possible by:
NetSuite - Netsuite.com/TWiST
Deel - Deel.com/TWiST
Squarespace - Squarespace.com/TWiST
Two days before SpaceX launches the largest IPO in history at a flat $135/share, our VC roundtable drops a scorcher: The top 1% of seed deals might actually be underpriced. Plus: the "Sequoia scam" dual-tranche controversy, tokens-for-equity deals, and whether Claude Fable 5 is a true step function.
Tomasz Tunguz (Theory Ventures), Michael Downing (Castalia Capital), and Paige Doherty (Behind Genius Ventures) join Alex to go deep on Seed investing, startup economics, AI spend, and the impact of smarter AI on the founder journey.
Guest Links:
Tomasz Tunguz: https://x.com/ttunguz
Theory Ventures: https://theoryvc.com/
Michael Downing: https://www.linkedin.com/in/michaeldowning/
Castalia Capital: https://castalia.capital/
Paige Doherty: https://x.com/paigefinnn
Behind Genius Ventures: https://www.behindgeniusventures.com
Show Links:
Anthropic's IPO announcement: https://www.anthropic.com/news/confidential-draft-s1-sec
OpenAI's IPO announcement: https://openai.com/index/openai-submits-confidential-s-1/
Bending Spoons F-1 filing: https://www.sec.gov/Archives/edgar/data/2004711/000110465926071170/tm2613674-7_f1.htm
SpaceX IPO filing: https://www.sec.gov/Archives/edgar/data/1181412/000162828026040364/spaceexplorationtechnologib.htm
Brendan Foody's post on Sequoia: https://x.com/BrendanFoody/status/2063470286515683759
Claude Fable 5: https://www.anthropic.com/news/claude-fable-5-mythos-5
OpenRouter data on Chinese models: https://openrouter.ai/rankings?view=day
Saronic: https://www.saronic.com/
MotherDuck: https://motherduck.com/
Nox Metals: https://noxmetals.co/
Timestamps:
0:00 Tomasz Tunguz, Michael Downing & Paige Doherty join
2:07 The SpaceX IPO and the IPO window
4:22 Plaud: If your work depends on conversations — interviews, meetings, calls — you need a Plaud NotePin. You can check it out at https://Plaud.ai/twist and use code TWIST for 10% off!
6:30 The new bar: 10x growth (not 3x) to raise a great Series A
8:46 Net-new AI budgets
9:46 Squarespace: Turn your idea into a beautiful website! Go to https://www.squarespace.com/twist for a free trial. When you're ready to launch, use offer code TWIST to save 10% off your first purchase of a website or domain.
11:09 How some founders are outgrowing venture capital
11:44 The power pendulum swings back to founders
12:46 SpaceX vs. OpenAI vs. Anthropic: Which IPO is most enticing?
19:53 Deel - Founders scale faster on Deel. Set up payroll for any country in minutes, hire anyone anywhere, get visas handled fast, and get back to building. Visit https://deel.com/twist to learn more.
26:07 Tokens-for-equity, GPU-hours-for-equity & the financialization of compute
28:35 Founders airing VC dirty laundry (napping VCs included)
29:56 Netsuite - The business landscape is very chaotic right now. That's why you need NetSuite, by Oracle. Get the free business guide Demystifying AI at https://Netsuite.com/TWiST
36:38 Claude Fable 5 first impressions: pricing, benchmarks & orchestration
45:42 Where value accrues: application layer vs. models vs. private data
1:00:06 Nationalization of AI labs: Bernie Sanders, Sam Altman & Trump agree?!
1:01:25 Portfolio spotlights: Saronic, MotherDuck, and Nox Metals
Subscribe to the TWiST500 newsletter: https://ticker.thisweekinstartups.com
Check out the TWIST500: https://www.twist500.com
Subscribe to This Week in Startups on Apple: https://rb.gy/v19fcp
Follow Lon: X: https://x.com/lons
Follow Alex: X: https://x.com/alex LinkedIn: https://www.linkedin.com/in/alexwilhelm
Follow Jason: X: https://twitter.com/Jason LinkedIn: https://www.linkedin.com/in/jasoncalacanis
Great TWIST interviews: Will Guidara, Eoghan McCabe, Steve Huffman, Brian Chesky, Bob Moesta, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarland
Check out Jason's suite of newsletters: https://substack.com/@calacanis
The Information's OpenAI reporter Erin Woo details the confidential IPO filings of OpenAI and Anthropic, highlighting how Anthropic has eclipsed OpenAI in enterprise revenue. Apple reporter Aaron Tilley and Constellation Research CEO Ray Wang break down Apple's WWDC announcements, evaluating its Siri reboot powered by Google's Gemini models and Nvidia GPUs. Finally, AI finance reporter Dakin Campbell joins to discuss how Wall Street titans Goldman Sachs and JPMorgan are exploring derivatives markets to trade the cost of GPU computing power.
Articles discussed on this episode:
https://www.theinformation.com/briefings/openai-confidentially-files-ipo-paperwork-plans-separate-employee-share-sale
https://www.theinformation.com/newsletters/the-briefing/apples-cautious-ai-overhaul-openais-ipo-filing
https://www.theinformation.com/articles/goldman-jpmorgan-explore-new-ways-tame-ai-lending-risks
Subscribe:
YouTube: https://www.youtube.com/@theinformation
The Information: https://www.theinformation.com/subscribe_h
Sign up for the AI Agenda newsletter: https://www.theinformation.com/features/ai-agenda
TITV airs weekdays on YouTube, X and LinkedIn at 10AM PT / 1PM ET. Or check us out wherever you get your podcasts.
Follow us:
X: https://x.com/theinformation
IG: https://www.instagram.com/theinformation/
TikTok: https://www.tiktok.com/@titv.theinformation
LinkedIn: https://www.linkedin.com/company/theinformation/
Networking can be an invisible part of IT infrastructure, but AI is creating demands that make it a critical part of keeping AI applications fed with data. Mike Fratto returns to the podcast to discuss both the long haul and local requirements for AI networking with host Eric Hanselman. It's always been important to link chunks of infrastructure efficiently, but AI's voracious need for data has dramatically increased the scope and scale of the need. The risk that any gap in performance or capacity presents is that precious GPU resources will be idled, an increasingly expensive proposition. The reality of AI application architectures is that infrastructure is ever more hybrid, requiring access to repositories of data both on-premises and in various clouds and models scattered across various providers. The need for dynamic connectivity is driven by the rapid evolution of preferences for new models and the diversifying needs of agents to reach new data sources. It's not only forcing network expansion, but it's also driving M&A activity as network providers look to enhance automation in response to customer demands.
More S&P Global Content:
Compute sovereignty: The strategic importance of digital infrastructure
AI won't solve its own energy problem – and that might be fine
AI in action: unleashing agentic potential
AI infrastructure results in 2025 top expectations, forecast upgraded
For S&P Global subscribers:
MWC 2026: Agentic AI as the next operating model for networks and network operations
AI Infrastructure Market Monitor & Forecast
Service providers race to meet surging enterprise demand for AI infrastructure
In 2026, the telecom network becomes code
Credits: Host/Author: Eric Hanselman Guest: Mike Fratto Producer/Editor: Feranmi Adeoshun Published With Assistance From: Sophie Carr, Kyra Smith, Dylan Scheible
DDN CEO Alex Bouzari discusses the need to maximize return on investment in AI, calling it “ROI maxxing.” He explains how DDN's partnership with Nvidia (NVDA) helps optimize GPU usage, reduce idle infrastructure, and improve enterprise returns. Bouzari emphasizes efficiency as the key to sustainable AI investment.
======== Schwab Network ========
Empowering every investor and trader, every market day.
Subscribe to the Market Minute newsletter - https://schwabnetwork.com/subscribe
Download the iOS app - https://apps.apple.com/us/app/schwab-network/id1460719185
Download the Amazon Fire Tv App - https://www.amazon.com/TD-Ameritrade-Network/dp/B07KRD76C7
Watch on Sling - https://watch.sling.com/1/asset/191928615bd8d47686f94682aefaa007/watch
Watch on Vizio - https://www.vizio.com/en/watchfreeplus-explore
Watch on DistroTV - https://www.distro.tv/live/schwab-network/
Follow us on X – https://twitter.com/schwabnetwork
Follow us on Facebook – https://www.facebook.com/schwabnetwork
Follow us on LinkedIn - https://www.linkedin.com/company/schwab-network/
About Schwab Network - https://schwabnetwork.com/about
Description
The Future of Tech is Here.
Subscribe to our Newsletter: https://theultimatepartner.com/ebook-subscribe/
Check Out UPX: https://theultimatepartner.com/experience/
In this presentation from Ultimate Partner Live, industry analyst Jay McBain breaks down the monumental macroeconomic shifts rewriting the tech sector in 2026. https://youtu.be/r0qTDyw97Gs
As the industry rapidly approaches a $6.07 trillion valuation, driven by massive AI infrastructure investments from Sam Altman and the “Magnificent Seven,” traditional sales and channel models are fundamentally collapsing. McBain reveals how buyer demographics have transformed to an integration-first millennial base, why marketplace ecosystems now command over half of all partner-funded deals, and how a tiny elite of just 1,000 tech service providers controls two-thirds of global tech revenue. Learn the exact mechanics behind how Microsoft out-partnered AWS to win 26 straight quarters of dominant growth and how your business can deploy an algorithmic early warning system to capture massive wallet share before competitors even step into the boardroom.
Key Takeaways
Over half of the Fortune 500 companies vanish every 20 years because their leadership fails to anticipate macroeconomic technological cycles.
The true opportunity in the $6.5 trillion AI boom lies not in single vendor products, but in the hardware, software, services, and telecom ecosystem surrounding them.
Indirect tech sales are undergoing a structural shift toward direct cloud hyperscaler models driven heavily by Nvidia's core infrastructure client base.
Modern business deals are won or lost months before the point of sale based on the average of 6.3 partners surrounding a customer’s environment.
Over 51% of tech buyers are now millennials who prioritize software integration capabilities and digital marketplaces over traditional human sales interactions.
Tech service economics are pivoting aggressively away from upfront margins toward point-based multi-partner funding across subscription cycles.
If you're ready to lead through change, elevate your business, and achieve extraordinary outcomes through the power of partnership, this is your community. At Ultimate Partner® we want leaders like you to join us in the Ultimate Partner Experience, where transformation begins.
Key Tags
Nvidia AI buildout, $7 trillion AI opportunity, cloud ecosystem decade, Microsoft vs AWS growth, multi-partner cloud deals, digital marketplace migration, millennial B2B buyers, B2B tech subscription economics, tokenized micro consumption, tech services wallet share, hybrid cloud infrastructure, 28 customer moments, IT services industry growth, telecom spend breakdown, channel chief strategy, managed service providers MSP, global systems integrators GSI, software integration first, point-based vendor incentives, automated co-selling workflows
Transcript
JAY McBAIN AUDIO PODCAST
[00:00:00] Jay McBain: So to go back to that story about the 53% of companies who are gonna fail, one of us is gonna be asked to write the book, but chapter one is always: you blame the CEO.
[00:00:13] Vince Menzione: We just came back from Ultimate Partner Live in Bellevue, Washington, where we hosted incredible leaders for two amazing days. Come join us for this next session where we explore the tectonic shifts we’ve all been seeing. With that, I am incredibly blessed to invite a friend of mine to the stage. I have a quick little side note, like I found an old LinkedIn post from this gentleman from like many years ago, like 20 years ago.
[00:00:39] Vince Menzione: And I wasn’t really that nice to you on that LinkedIn post. Like, oh, like this is before Jay became the Jay that we all know Jay to be. But he was in the space and I was at Microsoft doing something and he reached out about something. It was kind of rude, Jay. I was like, oh my gosh. I can’t believe.
But Jay has been a great friend.
[00:00:54] Vince Menzione: When we started the podcast back up, uh, during COVID, we started doing podcasts together. When we moved to the studio, Jay was the first person in the studio. He’s always got a spot, uh, at our events. He’s a stalwart, and he’s a great friend and supporter of Ultimate Partner: Jay McBain. For those of you who don’t know him, Jay, welcome.
[00:01:13] Vince Menzione: Thank you, sir.
[00:01:22] Jay McBain: 31 days ago, we landed Artemis II, the furthest humans have ever been away from the planet Earth. 57 years ago, we landed on the moon. In the 56 years between those two moments, the tech industry has been the fastest growing industry in the world, every single year. We moved from the space race to the technology race, and we’re just getting started.
[00:01:46] Jay McBain: If you’re old enough, you’ll recognize the mainframe and mini era for 20 years. You’ll recognize a young, disheveled Bill Gates showing up in Boca Raton, Florida for the, uh, August the 12th, 1981 launch, where Bill thought that every one of us would have a PC in our home, and IBM thought they were gonna sell 10,000 of them to hobbyists.
[00:02:12] Jay McBain: 1999, a small startup from an executive who just left Oracle in San Francisco named Marc Benioff. A couple of years later, Jeff Bezos went into a boardroom and said, listen, we’ve spent a lot of money building infrastructure to our busiest day, Christmas, Black Friday. You’re telling me this stuff sits idle 10 or 20% for the rest of the year?
[00:02:35] Jay McBain: Why don’t we rent that out to others? Got laughed outta that boardroom and then got made fun of on magazine covers. Maybe you should just tend the store, let the adults talk about technology. In March of 2023, our neighbors, our friends, our family saw deepfakes. They saw poetry, they saw music, and they came to us as tech people and said, did we just light up Skynet?
[00:03:03] Jay McBain: Now every one of these 20-year eras, this is the Taylor Swift version of our industry. Every single one of these eras triggers the fastest growing product in history. Today it’s actually ChatGPT, first to a billion users. It triggers a new richest person in the world, Bill Gates to Jeff Bezos. Now, Elon Musk is the first to sign a trillion dollar pay package, and it’s not for car.
[00:03:27] Jay McBain: It’s not for cars. It also triggers a most valuable company in the world change. And today that’s Nvidia. These are monumental changes in our industry and they’re monumental changes in partnering every single time. And it also links to our customers. If you take a 20 year view of business, one era, and, and think about the AI era, you know, at the start of it here, if you’re to grab the Fortune 500 magazine from 20 years ago and start to flip through it, 53% of the companies in there no longer exist.
[00:04:06] Jay McBain: Every 20 year cycle, we lose over half of the biggest companies in the world. These are the companies that have very deep pockets to buy their way outta problems. If you’re not in the Fortune 500, 71% of tech companies don’t make it 10 years. These are the changes that cost industries. These are changes that cost really big companies, and the decisions we make, the trends we’re in right now, in 2026, will be written about in the future.
[00:04:39] Jay McBain: This new era, a lot of big numbers being thrown around. Vince’s best friend talks about a six and a half trillion dollar AI opportunity, but it’s not Microsoft’s TAM. Microsoft is chasing about a trillion dollars of this. And the ecosystem, the hardware, the software, the services, the telecom is gonna make up the rest.
[00:05:04] Jay McBain: It is an ecosystem. Every time these big numbers are thrown, the word ecosystem is always thrown around it. Not to be outdone, Sam Altman’s talking about a $7 trillion build out. The world economy this year, the world GDP will be 126.
These are material numbers to world GDP, but even better, they’re both larger than our entire industry is today.
[00:05:27] Jay McBain: So what took 56 years of the fastest growing industry, this year will be $6.07 trillion. Big numbers, but it’s easier to think about it in terms of a dollar that our customers spend. In that dollar, they’re gonna spend 25 cents on hardware. They’re gonna spend 25 cents on software. So for anyone that read the memo 15 years ago, that software’s gonna eat the world, there’s still a dollar of hardware to run every dollar of that software.
[00:05:57] Jay McBain: And whether you’re thinking humanoid robots or whichever future you’re envisioning, there’s going to be a dollar of hardware to run every dollar of software for the next 20 years. There’s over 25 cents now in IT services, and in many cases, these services are growing faster than the product categories, and just under 25 cents in telecom. That’s how it breaks out today.
[00:06:19] Jay McBain: And this industry, which took 56 years to get to this point, is gonna double in size in the next three to five years. We already have two and a half trillion of that seven raised and being spent. Part of the reason Nvidia is the most valuable company in the world. Now our industry, uh, you talk about ultimate partnerships.
[00:06:40] Jay McBain: Our industry traditionally, and world trade by the way, is 75% indirect. The dealerships, the agencies, the brokers, the resellers, the retailers, the franchisees, the gas stations, the grocery stores, the pharmacies, all 27 industries sell indirect. You gotta think back to the last time you bought something direct.
[00:07:01] Jay McBain: Well, I bought a Dell from that dude in the nineties. Cool. Well, Dell Technologies is now 60% indirect. Well, I bought insurance direct. 15 minutes could save me 15%. Well, Geico last year sold more insurance through agencies and brokers than they did direct. This is the world now.
We used to be 75% indirect four years ago.
[00:07:26] Jay McBain: Then it went to 73.2, then it went to 70.1, and then it went to 66.7. By the way, marketplace is counted in these numbers as indirect. It’s not marketplace causing this change. It’s one company, Nvidia. Nvidia has seven customers. The Magnificent Seven, uh, half of them are in the room right now. Every morning we wake up to a hundred-billion-dollar press release about this $7 trillion buildout.
[00:07:56] Jay McBain: What’s interesting is indirect sales in our industry is growing by revenue. It increases every year, just not at the pace that this AI build out is happening direct with seven companies. But the reason we’re all here, and I think the core reason that Vince is building this community, is this. You know, Microsoft forever has measured and been very vocal
[00:08:21] Jay McBain: about how 96% of their deals have partners in them. Kind of who cares who collects the money. We care about the moments, the 28 moments before the customer makes a purchase. We care about every 30 days forever, because two thirds of our industry, over $4 trillion now, is subscription and consumption based. Winning a customer today is only winning the first 30 days.
[00:08:46] Jay McBain: We care about this cycle. We care about who surrounds our customer. So six years ago, I stood on a big stage and said, you know, we went through a decade of sales. You know, in 1999, you thought you were born to be a salesperson. You’re managing your territory with your gut. Well, a few years later, you were introduced to the science of selling.
[00:09:07] Jay McBain: You know, 10 years later you thought, as a marketer, you sit around a cocktail party joking with your friends, 50% of my marketing dollars are wasted. I just don’t know which 50%. Really funny in 2009, until every 58-year-old CMO got replaced by a 38-year-old growth hacker.
Coming in with Marketo and Eloqua and Pardot and HubSpot, and 15,505, as of yesterday, MarTech and AdTech tools, ninjas in marketing, they wouldn’t let a nickel go through without measuring.
[00:09:43] Jay McBain: Now we understand 96% of deals and the partners that surround them. No deal is gonna be won or lost in this era without partnering effectively. So we had to have this decade of the ecosystem. One of the ways we’re tracking is by outsiders. You know, Salesforce every year publishes the State of Sales, and they’ve got, you know, the number one CRM in the world.
[00:10:05] Jay McBain: So they get to go talk to all the CROs, all the salespeople in the world. And as of this year, a couple months ago, 94% of every salesperson in every industry in the world uses partners every single day. You wanna see what this number was six years ago. Also, 89% of salespeople around the world don’t think they’re going to club this year without partners.
[00:10:29] Jay McBain: So this is a big moment for us, halfway through the decade of the ecosystem, but we’re only halfway through. We’re starting to understand now, at a more granular level, what partnering means. It’s not theory, it’s not flywheels. It’s not really cute McKinsey slides that we keep showing to our board saying how important partnering is.
[00:10:51] Jay McBain: We’re trying to get to the very specific level of the 6.3 partners on average that surround the deal and what they’re doing, how their business model works. And that’s average. If I’m working on a public sector deal, and I was at a Red Hat conference yesterday talking sovereignty, if I’m in an enterprise or a large public sector deal, it’s north of 10 partners in the deal.
[00:11:15] Jay McBain: So we’re starting to understand what used to be this, this, you know, you’ve been the fastest growing industry for 56 straight years. Every single professional services person in every industry has come in to join the fun. Over 90% of accountants are tech services firms.
Over 90% of marketing agencies are tech services agencies.
[00:11:36] Jay McBain: All of this: 250,000 software companies, a million emerging tech companies, the half a million VARs that have been in that traditional channel, the managed service providers, all of these 20 different partner types, millions of companies, tens of millions of people competing for 6.3 spots around the customer.
[00:11:58] Jay McBain: That’s it. Luckily, there’s 141 million global customers to compete for. There’s some open slots that you can go find, and that’s the point. Our industry never had our own Fortune 500. We always talk to, you know, these partners, and GSIs are doing this and SIs are doing that. And we never really had a view of capability and capacity or what our own TAM was inside of that partnering.
[00:12:25] Jay McBain: And so we set out, and we would’ve loved, you know, ChatGPT or Gemini or Claude or any of those tools to do this. But there’s one problem in partnering with AI: it doesn’t know one partner from the next. There’s a big digital sameness problem in our industry, in that every single partner, whether it’s Larry in the white van or Accenture with 786,000 employees, all say they do all things to all people all the time.
[00:12:53] Jay McBain: 98% of them, 99% of them are private companies that don’t share their P&L. You can’t go into Microsoft’s LinkedIn system and find out how many employees, ’cause it’s a blocked system. AI can’t see into it. So it just sees, and it’s great at pattern matching. Google SEO can’t figure out who’s who, nor today can the large language models.
[00:13:14] Jay McBain: ’Cause all the things they’re trying to match, the transformers are trying to match, it all looks the same. Every tweet, every ebook, every website, every digital history looks the same. So this took us thousands of people hours across two years to do, to dig into every P&L, to dig into every dollar of what they’re doing.
[00:13:33] Jay McBain: But what was interesting is only a thousand partners in our industry do two thirds of all tech services. When you get into enterprise, it goes up to 80 to 90%. The partners in the middle, in blue, the 30 of them, do more tech services than the 970 partners in white on the outside. The 970 partners in white do more tech services than the next million combined.
[00:14:03] Jay McBain: This is our industry in a nutshell. Every time we talk to a vendor, every time we talk to a partner, every time we talk to a distributor, we’re now talking names, faces, and places. You wanna talk sovereignty? Yesterday in Atlanta: 90% of sovereign conversations in public sector in the globe is handled by these companies here.
[00:14:26] Jay McBain: Forget about how much you do with these partners today. You wanna chase the next column, which is the wallet share. And I was a channel chief for 17 years. I get the weekly report and I see a million dollar partner, another million dollar partner, sorted top to bottom. You don’t know which partner is which, which of those million dollar partners is doing 1.2 million in your category.
[00:14:46] Jay McBain: They deserve a baseball cap and a front row seat at your event as an MVP. The next partner right next to them is doing 10 million in your category. They’re only doing a million with you, ’cause customers are pulling them into it. Nine times outta 10, they’re leading with your competitor. So I don’t want that list anymore.
[00:15:03] Jay McBain: I want the new list, which is showing me those $9 million opportunities. And I, as a board member, as a CEO, as a CFO, as a CRO, I wanna see this list. And then I want to talk people, processes, programs, technology. What are we gonna do to go get our fair share of that 9 million? Where’s our lowest hanging fruit?
[00:15:24] Jay McBain: How do we double our pipeline? How do we double the size of our company in three years? It’s all right here.
Let’s have very specific conversations and move away from flywheels and move away from force multipliers and, and things like that in partnering. Let’s figure out how this partner community is surrounded.
[00:15:45] Jay McBain: What do 10 million people who have to be smart in front of their customers every single day, what do they read? Where do they go and who do they follow? It’s the law of the few. This is the old Malcolm Gladwell Tipping Point. 10 million people in the broader channel. A hundred percent of our TAM comes down to only a thousand watering holes.
[00:16:08] Jay McBain: 12% of that entire audience (doesn’t sound like a lot, but it’s over a million people) love podcasts. Number one way they learn. The Joe Rogan effect. In our industry, there’s 121 podcasts. These are all public lists. You can go get them on my LinkedIn newsletter, on Canalys, Omdia. But there’s 121 podcasts that drive them forward.
[00:16:28] Jay McBain: Really high up on that list, actually number one on the list, is Ultimate Partner. Vince, that’s how we met, ’cause I asked people, 10 million people, you love this. You walk your dog, you drive to work, you listen to podcasts. I’m not the biggest podcast fan. It’s not number one on my list, but it’s number one on theirs.
[00:16:44] Jay McBain: They say, you know, you gotta meet this guy, Vince. It’s unbelievable how great these podcasts are. They’re ultimate.
[00:16:54] Jay McBain: Then I talked to Vince and said, but Vince, you know, 35% of your community, the 10 million people, love to come to events like this one. The hallway conversations, the hotel lobby bar last night. This is what we love to do, especially post pandemic. It’s the number one way we learn. We learn from our peers, we learn from those around us, and we learn from the conversations we have here.
[00:17:17] Jay McBain: We always remember these moments, you know, years and years later. There’s 352 choices. I’m going to five of them this week in five different cities.
It’s a lot of coverage, but again, it’s a tighter list of how people work. The magazine list, 106 of them. Associations like CompTIA, now the GTIA. Peer groups. There’s 15 different spheres of influence, but only a thousand places.
[00:17:43] Jay McBain: I could walk you through billionaire after billionaire after billionaire in this industry and show you how they did this. How did Arnie Bellini at ConnectWise? How did Austin McChord at Datto? How did Nerdio become a unicorn? How did ThreatLocker and Huntress move away from 6,500 cyber companies and become unicorns? Over and over and over again.
[00:18:05] Jay McBain: It’s only one slide. Unicorns and billionaires are made here, and a lot of people don’t get it. So walking away from Bellevue: a thousand partners, top down, a thousand watering holes, bottoms up. You’ve covered a hundred percent of your TAM. You do it 10% better than your competitors,
[00:18:27] Jay McBain: you win. You carry that on your resume into the next company. You get a bigger job at a bigger pay scale. Let’s just walk through some examples. Cyber: 91.7% of it goes through the channel. Huge channel audience. You know, if you’re in MarTech, it’s only 10%, but this one happens to be all channel. But that’s not the story.
[00:18:48] Jay McBain: For every dollar that the 6,500 cyber companies are trying to close, there’s $2 in services. Plot twist: the products are growing at 11, the services are growing at 12.6. Your partners are growing faster than you are, and they will continue to for the next, at least five years, probably 10. So when I’m here, five years from now, you’ll hear me talk about a three to one split in cyber and then a four to one split in cyber.
[00:19:18] Jay McBain: Now, when we were in Miami a couple days ago with CrowdStrike, they’re talking about a $7.05 multiplier, chasing that two to one up higher. You look at managed services. Here’s a fun story. Managed services.
82% of customers are, uh, outsourcing more this year than last year. 650 billion in size.
[00:19:38] Jay McBain: This is bigger than the entire SaaS industry. Salesforce, ServiceNow, Workday, Marketo, NetSuite, HubSpot, 250,000 others. This is bigger. It’s also bigger than all the hyperscalers combined, not just AWS, Microsoft and Google, but Alibaba and Oracle and everybody down the list. This is a massive market, also growing at double digits.
[00:19:59] Jay McBain: So these are some big things, and obviously we’re watching, you know, week in and week out, quarter in, quarter out, the battle of software and battle of the hyperscalers and things like that, and who’s growing at what pace and, and how partnering is connecting to all of this. You know, we watched a moment really early in the pandemic where Microsoft started growing faster than AWS, and they haven’t stopped since: 26 straight quarters.
[00:20:27] Jay McBain: And you ask customers and say, you know, does Microsoft have a better product? And in most cases they say no. You know, AWS had a five year head start. Well, did they have a better price? Well, no, actually in most cases Microsoft’s more expensive. Well, did they have better promotion? Was their Super Bowl ad better?
[00:20:44] Jay McBain: No, they’re both kind of crap. So you kind of ask the question of what’s the only difference that could create growth above the leader in the market? Well, it’s place. More of the 6.3 partners are walking into those key boardroom meetings and drawing clouds up on the wall and labeling them Microsoft than they are AWS.
[00:21:03] Jay McBain: Very simple. It’s never been about product. The best product in our industry has never won. And now the best way forward is that partnering moment, and this is the moment. So to go back to that story about the 53% of companies who are gonna fail, one of us is gonna be asked to write the book.
And it could be a book like Kodak: they invented the product that ended up killing them.
[00:21:26] Jay McBain: And it’s a woe is me story, but chapter one is always you blame the CEO. How could they not see those trends happening in 2026? How could they, you know, were they blind? Were they stuck in their own, you know, innovation chamber? Innovator’s dilemma, were they stuck in their own boardrooms? Why couldn’t they see?
[00:21:46] Jay McBain: Well, chapter two, you, you blame the board. They have fiduciary responsibility, outsider view, and how could they not see it? But really, this is the future right here. If you take this slide and apply it 10 or 20 years from now to every failure and every success, these are the chapters of the book. Your buyer is now a millennial.
[00:22:05] Jay McBain: As of last year, 51% of our market is bought by people born after 1982. Different psychology, different behavior, different journey, different criteria. They’re integration-first buyers. They’ll buy a product 80% as good as the next one if it works better in their environment. 94% of people won’t buy a car unless it has CarPlay or Android Auto.
[00:22:26] Jay McBain: New buyer. You have to be more integrated than your competitors. That’s a partnering story. The 6.3 partners. If you heard cyber, you need some great channel partnerships, but you need the other 5.3 partners as well: the consultants, the advisors, the designers, the architects, the implementers, the integrators, the managed services, all of the other partners.
[00:22:44] Jay McBain: You need to know more of them than your competitors do, and have them label clouds with your name in them. You need better alliances. Even if you compete, you only compete in the morning. You’re best friends by the afternoon. You have to be tight with the hyperscalers, tight with the big SaaS platforms, tight with cyber, tight with distribution. There are layers, seven layers to every deal.
[00:23:04] Jay McBain: You gotta be tight in and have better alliances than your competitors. And then it all comes to the 28 moments, which I’m gonna end on, but the go to market of all of this, the co-selling, co-marketing, co-innovation, co-development, co-keeping. This is it. Your product has to be good enough that somebody’s gonna renew it.
[00:23:21] Jay McBain: Your Super Bowl, you know, ad has to be good enough that people don’t, you know, shame you on social media. Your pricing has to be somewhere in a country mile of the bell curve of what the customer wants to pay. But success or failure is just here, and platforms are synonymous with partnering.
[00:23:40] Jay McBain: It’s our role now in the decade of the ecosystem to drive our companies forward. Marketplace. It’s probably the most, you know, great prediction we ever made. You know, growing at 82% compounded, it’s hard to predict ’cause it doubles almost every year. We were almost exact to the decimal point. Five years later, now till 2030, we’re watching a second story, which is more interesting.
[00:24:02] Jay McBain: 96% of all deals have partners inside of them, and there’s private offers and multi-partner offers and distributor seller of record, all these funding mechanisms, or services as a product. As of last week, over 50% of all deals in marketplaces now have partner funding. It means that while money changes hands differently, the respect and the recognition of what partners do is in the deal.
[00:24:26] Jay McBain: We think that’s going to 59, but at some point, that’s gonna have to hit 96. ’Cause to run the best programs, whether it’s an indirect sale, whether it’s a direct sale, whether it’s a marketplace deal, it doesn’t matter how money changes hands. What matters is we recognize the 6.3 partners. They’re not only making the deal happen bigger and faster, but renewing and enriching that every 30 days forever.
[00:24:48] Jay McBain: When we watch, you know, billion dollar clubs, and when we read all the press releases and all the hubbub about how fast this is growing and which companies are behind all this, when I’m quoted in some of these press releases, it’s because of this. You know, CrowdStrike, you know, brags about a billion dollars in a single year, but inside of that, they’re showing that 91% growth in marketplaces, which is pretty phenomenal for any company, to almost double in size every single year.
[00:25:17] Jay McBain: What’s more phenomenal is they’re growing the channel piece of it, 3548%. That green part of it is growing. Companies that understand platform and have people and processes and programs and technology to do it are winning. And they’re getting recognition, and partners are starting to join the Billion Dollar Club who don’t sell a product, but are also winning at extreme scale.
[00:25:44] Jay McBain: So talk about those partner 1000 and who are leaning in to win at this level. As well, as everything changes, traditional billing moved into subscription models, moved into consumption models. Now we’re being tokenized to death. It’s in this mode of micro consumption. There was little chance in subscription and consumption that it would be resold.
[00:26:09] Jay McBain: You don’t buy Netflix from the cable guy in the white van. There’s zero chance when you’re buying tokens at a buck a piece that that’s going through any indirect sale. This continues to grow. Now the tectonic shift is what happens when money changes hands differently. These old programs that we used to all write, hundreds of different boxes we checked every day on deal reg and trainings and all the other things, are changing
[00:26:35] Jay McBain: to this. You’ll get these slides, by the way, in high res. Inside of this now is the customer.
For the first time ever, 45 years later, we have the customer in the middle of what we do: the 28 moments in green before they buy, the seven layer stack and the partners inside it, the implementation, the integration, the managed services in a cycle that never ends, and two thirds of our industry.
[00:26:55] Jay McBain: With the customer in the middle, we can now move money around to the different moments. It’s not all landing in front or backend margins or market development funds or new customer bonuses or spiffs. It’s landing where it needs to land. Over 400 companies now, pretty much led by Microsoft, 400 companies are in a point system right now, and 400 more.
[00:27:18] Jay McBain: We’re working kind of behind the scenes to get that announced in the next 12 months. This is a total changeover in terms of how economics work, and partners are yelling, over half of us: I don’t care. Don’t call me a VAR anymore. Don’t call me an MSP. Don’t call me a regional system integrator. I do the consulting over half the time.
[00:27:36] Jay McBain: I do the design, I do the implementations, I do the managed services, and 44% of us are vibe coding on weekends. We’re not happy just on the services side. We wanna join the seven layer tech stack as well. These are partners growing faster than their vendors by understanding this cycle and where to show up and where the money is in AI.
[00:27:56] Jay McBain: And the number one thing they’re asking for is not more leads, which they did for 45 years. The number one thing is now: recognize me for what I do. I’ve never just been a cash register. We’re completely now past this idea of a channel being a channel of distribution, and now a channel being this platform for the future.
[00:28:16] Jay McBain: As we lay that on top of AI, the first couple of years of AI has really been consumer driven. The 95% failure rate that MIT reported last year is now 70%. That’s the failure to get from proof of concept to production.
That 70 will be 50 by the summer. We’re moving now in business; the maturity rates are going up at the end customer, and in 88% of cases, that’s because of the channel.
[00:28:43] Jay McBain: They’re working with partners. They’re not vibe coding themselves and working in little skunkworks groups. They’re working with partners to make it happen, and it now becomes the partner’s number one growth opportunity. I can grow at 11 or 12% in cyber every year, compounded. I can grow at 10% in managed services.
[00:29:03] Jay McBain: You know, those are great double digit growth, ’cause my customers are growing at 2.7% and I can go four x my customer, but I can go 10 x my customer if I have the right services built around AI. And this compounded growth rate and that big number in 2032, 267, is what’s got those top 1000 partners obsessed.
[00:29:25] Jay McBain: And your companies are leading with AI. Now you need to connect to those AI services. You need to get partners on this scale of growth. And they will be adding your name inside every cloud they write on every whiteboard. But 82% of partners around the world (you know, we survey 25,000 of them) aren’t ready, and they’re blaming vendors for not being ready, and they’re telling them exactly the workshops and the training that they need to get ready for this cycle.
[00:29:53] Jay McBain: 82% of our entire partner community, tens of millions of people, aren’t ready to grow at 35%, and they need our help. Last thing I’ll say about AI is it’s the first time, from client server to cloud, edge to cloud, that it’s been segment driven. SMB alone has, you know, six different segments: one to nine, 10 to 24, 25 to 49, et cetera.
[00:30:18] Jay McBain: Mid-market into enterprise. No one that runs a restaurant is calling Jensen to buy a GPU to put next to the stove. No one’s calling Sam or Dario or anyone at Anthropic or OpenAI directly. They’re waiting.
If you run a restaurant with all the people running around with tablets, you’ve invested in Toast or Square or Clover or one of the platforms to run your business.
[00:30:41] Jay McBain: A hundred different things. And you’re gonna wait for Toast to work with a hyperscaler and build out the capabilities agentically. So when they see a spike in Uber Eats orders, they automatically place a food order and automatically change the staffing to deliver on it. That’s what the restaurant’s waiting for, and there’s no one calling and having a big agent conversation.
[00:31:03] Jay McBain: But even if you go into hundreds of people in medium sized business, every one of the vice presidents has their tech stack already built. I talked about the marketing person already, but the HR leader has one, and everybody’s got their seven layer stack. They’re not calling to buy a GPU and they’re not calling to, you know, bring in OpenAI directly or, or Anthropic.
[00:31:22] Jay McBain: They’re waiting for the platform they built to integrate together agentic capabilities. Everybody’s in wait mode up until enterprise and large public sector. So we are looking at this market, and 90% of that AI market is run by those thousand companies, and the rest of the millions of partners are helping in terms of how these businesses are gonna change at that level.
[00:31:46] Jay McBain: Here’s where I end. You know, the 28 moments used to be a theory. It used to be a flywheel. How do we buy a car?
[00:31:55] Vince Menzione: Well, we Google it,
[00:31:57] Jay McBain: 81% of us now, 94% of us use large language models. We find out that there’s 365 brands of car. I’d have to test drive one every day of the year to get through them all. So we start narrowing these things down.
[00:32:09] Jay McBain: We configure it. We put our rims on it, we color it. We download the invoice price.
We download the backend rebates this month, whether I buy it in May or June. We find out what 5,000 people paid for our exact car within 50 miles of us. And then we don’t wanna go to the dealer, because we know more than the salesperson, the manager ever will.
[00:32:26] Jay McBain: We know what we’re gonna pay within, you know, dollars or cents. Just Carvana the car. Hand me the keys. Let’s just forget the whole eight hour back and forth, I’ll get you a deal thing. I’m smarter than you. In technology, our customers are smarter than us, smarter than salespeople. That’s why 75% of millennials don’t wanna talk to a salesperson.
[00:32:48] Jay McBain: They want to end digitally, and by the way, they’re not gonna send a fax after 28 digital moments. They’re gonna end on a digital marketplace. This is all demographics. It’s not hard to see where it’s going, but we’re getting into names, faces, places again. What if every dollar of your TAM, the board, the CEO runs around with their big multi-billion dollar number they’re chasing,
[00:33:09] Jay McBain: what if every single deal looks the exact same? This is a deal with AstraZeneca. A real deal, real customer spending millions of dollars. We know it starts in October, it ends in April. It’s a six month cycle. We see what they read, the MQLs at the beginning. We see the sales demo moments. We see ISV, but we’ve never had the light blue boxes.
[00:33:30] Jay McBain: What if we as a team could overlay the 6.3 partners in this deal? And then you find out a couple things. Here’s where I end. In December, five deals were won, three of them by NTT. The person at NTT probably coaches AstraZeneca’s, you know, kids’ soccer team. They probably have had a cottage together at the lake
[00:33:50] Jay McBain: for the last 20 years. If the person at NTT worked at Deloitte, Deloitte would’ve run this deal.
But SoftwareOne and Yash are both there, so we understand that when they were drawing clouds up on the wall in the boardroom in December, this deal was won and lost there. It was not won and lost at the point of sale. [00:34:09] Jay McBain: So what if you knew more about this and could see every dollar in your TAM? You had an early warning system that this was happening. Two things jump out at this now that we’re in Bellevue. AWS was touched twice in this deal, directly in the marketing cycle and the sales cycle. AWS lost this deal. Here’s an example of Microsoft winning a deal with Microsoft never being touched. [00:34:34] Jay McBain: For some reason, NTT, who won AWS’s partner of the year a couple years ago, led with Microsoft, so did SoftwareOne, Microsoft’s biggest reseller in Europe, and as did Yash. They all led with Microsoft, and without Microsoft knowing, Microsoft took a multimillion dollar deal away from their competitors by winning in December. [00:34:53] Jay McBain: That’s one. Second. These partners didn’t just show up other than soccer and cottages. They didn’t show up in December. It went closed-won in their CRM system. Back in the summer, August, September, we already knew AstraZeneca was in market, spending millions of dollars. We didn’t need them to read an ebook or go to an event to find that out. [00:35:17] Jay McBain: We knew it because it was closed-won. They’re spending hundreds of thousands of dollars times five in December to know what to do at the end. This is an early warning system that’s better than any MQL, better than any SQL. And if you could give your company this level of view into their pipeline, with an early warning system so that I can work with those partners for months before they ever show up at the customer’s boardroom. [00:35:44] Jay McBain: This is it. Talk about 47% winners. This takes you from not only surviving the AI era to being a top five platform winner. Thank you very much. 
[00:36:01] Vince Menzione: Until next time, we’ll see you in person. Hopefully at our next event.
Forget about asking ChatGPT generic questions; today we'll look at how to get real, practical value out of the technology to solve everyday problems and shake off daily decision fatigue.

You probably know the story: post-its on the fridge, spreadsheets that go out of date, and the classic "what's for dinner tonight?" that ends in improvisation or a disorganized shopping trip. To avoid this, I've designed an ecosystem of agents based on four toolboxes that we call MCP (Model Context Protocol). These protocols let the AI not only answer questions but also interact directly with my data and external applications.

Here is a very simple explanation of the pieces that make up this system:

Semantic RAG for the recipes: I have a vector database with about 1,700 recipes loaded into PostgreSQL via pgvector. The key is that I don't search for dishes by exact word match. If I say I want "something quick and light with vegetables", the system runs a semantic search, understands what I'm after, and proposes the best options. All of this is processed cheaply through OpenRouter, with no need for a powerful local GPU.

Skills and SQLite: The "Skills" define the exact processes the model must follow. I've set some simple guidelines: Mediterranean one-dish meals for lunch and light dinners. All this information is managed in a very lightweight SQLite database.

Fuzzy logic in the shopping list: The assistant can group similar ingredients. If two recipes call for tomatoes in different formats (for example, "loose tomatoes" and "100 g of tomatoes"), the fuzzy logic unifies them under a single concept to avoid duplicates in the shopping list, and also organizes the products by aisle or section (such as produce or the butcher's counter).

Typst for exporting to PDF: To view the menu on a tablet or print it for the fridge, I use Typst, a modern alternative to LaTeX that generates impeccable PDF documents in a matter of seconds.

I also explain how you can set all of this up locally for free with Ollama, and I take the opportunity to update you on my return to pure Linux tinkering: from my recent experiences with the Helix editor and "mkdr" (my Markdown renderer for the terminal) to "podcli", a small utility for getting the most out of podcast feeds from the console.

I hope you enjoy this episode as much as I enjoyed putting this whole contraption together. Happy tinkering!

Episode chapters:
00:00:00 AI agents that genuinely make our lives easier
00:01:42 The practical example: automating our weekly menu
00:03:51 Decision fatigue and why human discipline fails
00:05:38 My toolbox: 4 MCPs (Model Context Protocol)
00:06:58 Finding food with AI: the semantic RAG of 1,700 recipes
00:08:45 Hybrid search and cheap embeddings without a local GPU
00:10:00 Simplifying meals: the role of "Skills"
00:11:58 Organizing the database simply with SQLite
00:13:31 Fuzzy logic: avoiding duplicates in the shopping list
00:15:23 Creating good-looking PDFs with Typst (the modern alternative to LaTeX)
00:17:03 Live demo: generating the week's menu
00:19:12 Full automation: automatic menu generation with Cron
00:20:19 Reviewing the menu, the recipes, and the local alternative with Ollama
00:23:12 Back to Linux tinkering: Helix, mkdr, and Podcli
00:24:51 Upcoming episodes: installing Hermes from scratch to production
00:25:38 Farewell and close of the episode

More information and links in the episode notes
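The shopping-list step described in these notes (unifying entries such as "loose tomatoes" and "100 g of tomatoes") can be sketched roughly as follows. This is only an illustration under assumptions: the episode notes do not say how the grouping is implemented, so plain fuzzy string matching from Python's standard library stands in for it, and the noise-word list, the threshold, and the function names are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical noise words: units and packaging terms that do not identify the ingredient.
NOISE = {"g", "kg", "ml", "l", "of", "loose"}


def normalize(item: str) -> str:
    """Lowercase the entry, drop numbers and noise words, keep the core name."""
    text = re.sub(r"\d+(?:[.,]\d+)?", " ", item.lower())
    words = [w for w in re.findall(r"[a-z]+", text) if w not in NOISE]
    return " ".join(words)


def group_ingredients(items, threshold=0.8):
    """Group entries whose normalized names are similar enough.

    Returns a dict mapping the first-seen normalized name to the original entries.
    The 0.8 threshold is an assumed value, not one taken from the episode.
    """
    groups = {}
    for item in items:
        key = normalize(item)
        match = next(
            (k for k in groups if SequenceMatcher(None, k, key).ratio() >= threshold),
            None,
        )
        groups.setdefault(key if match is None else match, []).append(item)
    return groups


if __name__ == "__main__":
    shopping = ["loose tomatoes", "100 g of tomatoes", "1 onion", "tomato", "2 onions"]
    for name, entries in group_ingredients(shopping).items():
        print(name, "<-", entries)
```

A real assistant would also need to merge quantities and assign each group to an aisle; those steps are left out here because the notes give no detail on them.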
AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store
Description: Nate hosts a hardware‑heavy patch day as Wendy upgrades her main workstation from a Ryzen 9 3900X to a 5950X, experiments with 3D‑printed retro ITX cases, and shares updates on her MOVA V50 robot vacuum and UniFi travel router. Matt tunes an HP Omen Transcend 14 with OmenCTL and gives an MSI Trident 3 a GPU transplant, while Nate resurrects a retired Dell R740 into a TrueNAS‑powered “Franken‑NAS” built from leftover 16 TB drives and budget SFP+. Show Links: Wendy Ryzen 9 5950X vs Ryzen 9 3900X https://cpu.userbenchmark.com/Compare/AMD-Ryzen-9-5950X-vs-AMD-Ryzen-9-3900X/4086vs4044 3D artist profile (retro ITX cases) https://www.cgtrader.com/designers/sgw32 Retro-style mini ITX PC case https://www.printables.com/model/1225304-retro-style-mini-itx-pc-case ITX llama retro mini ITX case https://www.printables.com/model/1165579-itx-llama-retro-mini-itx-case Amiga-style mini ITX case https://www.printables.com/model/1351873-amiga-style-mini-itx-case MOVA V50 Ultra Complete Robot Vacuum https://us.mova.tech/products/mova-v50-ultra-complete-robot-vacuum UniFi Travel Router https://store.ui.com/us/en/products/utr Matt OmenCTL (HP Omen control utility) https://github.com/yunusemreyl/OmenCtl Sky Break (delisted game archive) https://archive.org/details/sky-break_delisted Nate TrueNAS https://www.truenas.com/ Rockstor https://rockstor.com/
Google has just launched a tool that could change how we use artificial intelligence: Google AI Edge Gallery, an app that lets you run Gemma models directly on your device, without depending on the cloud. In this episode I test local models on a Mac, look at how they handle images, text, and audio, compare performance on GPU and CPU, and show you something key: this AI can work offline, privately, and for free. But not everything is perfect. Important limitations also show up: inconsistent answers, context problems, differences compared with Gemini in the cloud, limits on audio transcription, and an experience that still seems to be in development. The big question is: are we seeing the future of personal artificial intelligence, running directly on our phones, iPads, and computers? Or is it still just a promising experiment? In this video I show you real tests, concrete results, and my honest opinion on what Google has just put on the table.
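Parts 3-4 [Scene Stealer] Local elections wrap up... who is this week's political scene stealer? - Lee Sang-min, creator - Noh Young-hee, attorney (guests) [Issue High Kick] GPU meeting with Jensen Huang - Bae Kyung-hoon, Deputy Prime Minister and Minister of Science and ICT (guest)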
We talk about current technology topics around infrastructure, open source, and AI. One focus is Sebastian's heavily automated Kubernetes environment on Talos Linux with GitOps and AI agents under human control. We also discuss platform questions, security and supply-chain topics, and various AI developments. To finish, we pick up a few smaller topics from everyday developer life and tools for local LLMs. Blast from the Past Kubernetes cluster is now live! https://www.siderolabs.com/talos-linux https://github.com/kreativmonkey/homelab-gitops payphonetag Froscon Dead of the Week End of De-Mail: why the @ beat the circled e wero End of Ubuntu Pastebin: shutdown at the end of June 2026 feedburner Undead of the Week Stuxnet's Older Brother Revealed After 21 Years (video) fast16 | Mystery Shadow Brokers Reference Reveals High-Precision Software Sabotage 5 Years Before Stuxnet AI of the Week Continue Y/N Torvalds calls AI bug reports “a pure waste of time” … but curl developer “strikes a conciliatory tone” https://hothardware.com/news/new-ai-cyber-worm-thinks-up-its-own-attacks-to-infect-computers Anthropic: worldwide pause in AI development ‘sensible’ Anthropic valuation 965 billion rsync drama rsync analysis Google Chrome silently installs a 4 GB AI model on your device EU AI Act: transparency obligations from August 2026 Jakob wins Gemma4 12B Bonsai 4b News Backblaze has quietly stopped backing up your data Debian must ship reproducible packages Cloudflare buys Vite: open source and vendor-neutral, with a multi-million fund https://arstechnica.com/security/2026/06/dozens-of-red-hat-packages-backdoored-through-its-offical-npm-channel/ https://www.golem.de/news/nur-ein-client-noetig-http-2-bomb-legt-webserver-in-sekunden-lahm-2606-209396.html Blog post topics What if there were no GitHub? Ghostty Is Leaving GitHub Codeberg Gitlab BitBucket (no!) 
Hackergarten 3D Print of the Week Bambu Lab: I’m reposting your code & I dare you to sue me. (video) Bambu Lab 3D printers: Never again (video) baltobu Magic wand for paying World Environment Day “PET Recycling” Gripe of the Week modules C++20 tooling Python click Nix & SELinux Nix: cross-compiling Updates suck! Brother printer with new certificate Cosmic Desktop Nix Logo Reading material I put a datacenter GPU into my PC searchcode.com's SQLite database is probably 6 terabytes bigger than yours How I run multiple $10K MRR companies on a $20/month tech stack Serving a Website on a Raspberry Pi Zero Running Entirely in RAM NixOS on Flint 2 You don’t love systemd timers enough! Picks IPv8 is finally here Internet Protocol Version 8 (IPv8) The Unsolved Mystery of Lorem Ipsum (video) ODROID H5 Mechanical Pencil Environmental costs of vibe coding: tool calculates CO₂ emissions for Claude Code Heise article taken (again)
In this episode of Tank Talks, Matt Cohen and John Ruffolo unpack the latest leaked details around Canada's national AI strategy, including a proposed Canadian Tech Growth Fund that would take direct equity stakes in AI startups and scale-ups. John pushes back on whether creating yet another government-backed fund solves the real problem or simply adds more confusion to an already crowded funding landscape.

The conversation then moves into the AI capital arms race, where Anthropic, OpenAI, SpaceX, and Alphabet appear to be racing toward public markets and massive equity raises at the same time. Matt and John unpack Anthropic's reported path toward a late 2026 IPO, Alphabet's massive $80 billion equity raise to fund AI infrastructure, and why even companies with enormous free cash flow may be rushing to secure capital before debt markets tighten further.

The episode closes with what Matt calls the “fugazi” layer of the AI boom: complex GPU financing structures, off-balance-sheet debt, SPVs, and Michael Burry's criticism of NVIDIA's xAI-related financing arrangement. From Canada's AI strategy to Alphabet's infrastructure spend to opaque AI financing models, the core question is clear: is this the beginning of a new AI-driven market cycle, or are the biggest players trying to raise capital before the music stops?

Canada's New National AI Strategy & Tech Growth Fund (00:52)
Matt introduces leaked details of Canada's expected national AI strategy, including a new Canadian Tech Growth Fund that would take direct equity stakes in AI startups and scale-ups, along with additional funding for the AI Compute Access Fund.

Direct Investment vs. Backing Canadian VC Funds (05:02)
John argues that government capital may be more effective when deployed through BDC, EDC, and Canadian venture funds, rather than direct government selection of startups. The concern is that direct investment could create political complications and distort private capital markets.

Anthropic's $65B Raise and Potential 2026 IPO (09:02)
The conversation shifts to Anthropic's massive fundraising round, reported $900 billion pre-money valuation, and potential late 2026 IPO path. Matt frames it as part of a broader wave of trillion-dollar AI and space-related public market activity.

The IPO Race Between Anthropic, OpenAI, and SpaceX (10:04)
Matt and John discuss whether the IPO window is reopening or whether the biggest private companies are rushing to get out before capital markets become less forgiving. John speculates that Anthropic may want to reach public markets before OpenAI captures investor attention.

Alphabet's $80B AI Infrastructure Raise (12:18)
Matt outlines Alphabet's reported $80 billion equity raise, including a private placement to Berkshire Hathaway, a public offering, and an at-the-market equity program. The raise is positioned as fuel for Alphabet's unprecedented AI infrastructure build-out.

The AI Infrastructure Cold War (14:41)
Matt argues that hyperscalers like Google are proving that frontier AI economics are fundamentally different from prior technology waves. John compares the AI arms race to baseball owners escalating salaries because no one can afford to fall behind.

Michael Burry, NVIDIA, xAI, and “Fugazi” GPU Financing (16:01)
Matt breaks down Michael Burry's critique of NVIDIA's GPU financing structure involving Valor, xAI, Apollo, Athene, and an SPV. The arrangement raises questions about revenue recognition, asset ownership, credit risk, and who ultimately carries the liability.

The Real Question: What Happens When the Music Stops? (17:55)
The episode ends with Matt and John questioning how these layered financing structures will play out as AI CapEx continues to explode. From public markets to SPVs to off-balance-sheet risk, the AI boom is starting to look less like a clean growth story and more like a capital market stress test.

Connect with John Ruffolo on LinkedIn: https://ca.linkedin.com/in/joruffolo
Connect with Matt Cohen on LinkedIn: https://ca.linkedin.com/in/matt-cohen1
Visit the Ripple Ventures website: https://www.rippleventures.com/

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit tanktalks.substack.com
Chief Fixed Income Strategist Vishy Tirupattur takes a look at how credit markets are adapting to fund the new phase of AI capex. Read more insights from Morgan Stanley. ----- Transcript ----- Welcome to Thoughts on the Market. I am Vishy Tirupattur, Morgan Stanley's Chief Fixed Income Strategist. Today – The critical question behind the AI-driven capex cycle that is front and center for markets year to date. How is credit market financing of this ecosystem evolving? It's Wednesday June 3rd at 2 pm in New York. When we first discussed the role of credit markets in financing the AI and data center build-out around the middle of last year, the direction of travel was clear. Realizing the transformative potential of AI requires unprecedented levels of capex. What has really surprised us since is the scale and speed of that spending, both of which have exceeded our expectations by a wide margin. The upward revision to capex expectations has been dramatic. A year ago, we projected the combined capex of the five large hyperscalers at roughly $450 billion in both 2026 and 2027. After the first quarter earnings reports, Morgan Stanley's internet equity analysts, led by Brian Nowak, now expect hyperscaler capex of roughly $800 billion in 2026 and $1.2 trillion in 2027. One data point really captures the surge in the underlying demand for compute. According to OpenRouter, the global weekly token usage, which is a key proxy for compute, has risen by roughly 350 percent since early January, increasing from about 6 trillion tokens to 28 trillion tokens. Credit channels for financing this capex have not only been broader and deeper than we anticipated, spanning public and private markets, but have also seen remarkable structural innovation that is blurring the lines between public and private markets. Over $200bn of public AI-related issuance across the different credit channels has happened just in the first five months of this year. 
We had previously assumed unsecured issuance would be limited by the scale of the largest non-financial issuers, confined to investment grade credit only, and largely USD denominated. Instead, some hyperscaler issuance has now far exceeded even the largest telecom names; funding has expanded well beyond USD into EUR, GBP, CHF, JPY and CAD markets. The issuer base has also broadened to include data center REITs and neoclouds, particularly in the high-yield market. The scope of financing has also widened beyond the data center shells themselves. GPU financing, which we assumed would be funded entirely through equity capital, has begun to migrate into credit markets. Funding is now coming through broadly syndicated loans and asset based financing, with ABS structures not far behind. Structural innovation illustrates how rapidly the credit ecosystem is adapting to the complexities and demands of AI-driven capex. Financings that combine elements of project finance, tranching, and residual value guarantees, along with high-yield issuance backed by hyperscaler guaranteed leases – these are innovations that we have never seen before. These structures have expanded the investor base, reduced the funding frictions, and further blurred traditional boundaries – between both corporate and project finance, and public and private credit markets. At the same time, physical, operational, and political constraints are beginning to shape the pace and the composition of the AI infrastructure build-out – and, by extension, the demand for financing. Grid access, power generation equipment, skilled labor, and permitting delays are emerging as significant constraints. These are compounded by political and regulatory frictions at the local, national, and international level. As power availability becomes a gating factor, the AI build-out is likely to pull energy infrastructure financing more tightly into the orbit of AI infrastructure financing. The clear takeaway is this. 
The capex requirements underpinning AI infrastructure are expanding exponentially, and with them the role of credit markets in financing this build-out. Along the way, there will be winners and losers, periods of adjustment, and a range of physical, financial, and political constraints that shape outcomes on the margin. But the broader trajectory is certain. The scale, duration, and strategic importance of AI infrastructure investment mean that financing of this will remain a defining theme for credit markets and credit investors for years to come. Thanks for listening. If you enjoy the podcast, please leave us a review wherever you listen and share Thoughts on the Market with a friend or colleague today.
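As a quick check on the figures quoted in the transcript, the percentage changes can be recomputed from the rounded numbers given there. This is only back-of-the-envelope arithmetic on the stated approximations ("about 6 trillion", "roughly $450 billion"), not Morgan Stanley's own calculation; the helper name is made up for illustration.

```python
def pct_increase(start: float, end: float) -> float:
    """Percentage increase from start to end."""
    return (end - start) / start * 100


# Weekly token usage: about 6 trillion -> 28 trillion tokens.
tokens = pct_increase(6e12, 28e12)

# Hyperscaler capex: earlier projection of ~$450B vs. revised ~$800B (2026) and ~$1.2T (2027).
capex_2026 = pct_increase(450e9, 800e9)
capex_2027 = pct_increase(450e9, 1.2e12)

print(f"token usage:   +{tokens:.1f}%")      # +366.7%
print(f"capex 2026:    +{capex_2026:.1f}%")  # +77.8%
print(f"capex 2027:    +{capex_2027:.1f}%")  # +166.7%
```

On these rounded inputs the token increase comes out near 367 percent, which is consistent with the "roughly 350 percent" quoted given that both endpoints are approximations.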
Software Engineering Radio - The Podcast for Professional Software Developers
Dave Airlie, a Distinguished Engineer at Red Hat, speaks with host Gregory M. Kapfhammer about Linux kernel maintenance. After giving an overview of the scale and structure of the Linux kernel, they dive deep into the review and validation of kernel patches, drawing on examples from the GPU subsystem. After discussing the features and benefits of the Linux kernel's maintenance model, they also explore kernel maintenance best practices and the supporting tools for these practices. Dave and Gregory also discuss topics such as the integration of Rust code in the Linux kernel and the ways in which AI-driven code review is influencing kernel maintenance.
SUMMARY: After the first successful AI IPO of 2026, we dig into what makes the Cerebras WSE architecture unique in the market for fast inference.
GUEST: Andy Hock, Chief Strategy Officer at Cerebras AI
SHOW: 1033
SHOW TRANSCRIPT: The Enterprise AI Show #1033 Transcript
SHOW VIDEO: https://youtu.be/ed2nVbOtZiA
SHOW SPONSORS:
OutShift - “Scaling Out Superintelligence” The Internet of Cognition architecture
ShareGate - ShareGate Protect. Microsoft 365 Governance, we got this!
Nasuni - Activate your data for AI and request a demo
SHOW NOTES:
OpenAI announces 750MW partnership with Cerebras
Cerebras and AWS partnership
Cerebras announces IPO
Topic 1 - Welcome to the show. Tell us about your background, and what you focus on today.
Topic 2 - For anyone that's not familiar with Cerebras, give us an overview of the company, and especially an overview of the Cerebras technologies (e.g. Wafer-Scale Engine).
Topic 3 - Cerebras' WSE architecture is different from many of the GPU or GPU-like architectures in the market today. Centralized vs. distributed architectures always have their tradeoffs. Walk us through the technical and economic value of the Cerebras architecture.
Topic 4 - Congratulations on the recent IPO (raised $5.55B). Let's use that as a point in time vs the previous planned IPO. How has the market changed in that timeframe, and how has the Cerebras position changed?
Topic 5 - Cerebras (today) offers both WSE hardware and Cerebras Cloud (API), which are very different GTM paths. Can we expect both of those to stay top priorities, or have the market dynamics shifted such that the priorities shift more towards the WSE business, as we're seeing OpenAI, AWS and other engagements announced?
Topic 6 - Is Cerebras a training and inference company, or are the economics of inference significantly different enough that it needs to be the sole focus of the company (for now)?
Topic 7 - How much effort is it for any company to add support for the Cerebras chips if they have previously been using other architectures?
Topic 8 - An IPO is a major milestone for any company, but the markets will now look for your future story. How do you see the AI market evolving over the next 2-5 years, and what are some things that people aren't understanding yet about how it will evolve?
FEEDBACK?
Email: show @ the enterprise ai show dot com
Bluesky: @TheEntAIShow.bsky.social
Twitter/X: @TheEntAIShow
Instagram: @TheEntAIShow
In this episode of Alexa's Input (AI), I sat down with Rob Shaw from Red Hat to talk about how AI inference evolved from a simple model serving problem into a large-scale distributed systems problem.

We explored the infrastructure shifts behind modern LLM serving, including how vLLM and PagedAttention changed the economics and efficiency of inference, why KV cache management became one of the most important bottlenecks in production AI systems, and how orchestration layers like llm-d are emerging to coordinate distributed inference.

We also discuss:
how LLM inference differs from traditional model serving runtimes
KV cache, prefix caching, and cache-aware routing
why throughput and latency became major infrastructure challenges
long-context agents and repeated inference calls
distributed inference on Kubernetes
intelligent routing, flow control, and load balancing
prefill/decode disaggregation
enterprise AI deployment realities

vLLM has become one of the most important open-source projects in AI infrastructure, and llm-d represents a newer shift toward treating inference as a coordinated distributed system rather than just a single runtime problem.

If you want to better understand the systems layer beneath modern AI applications, this episode is a deep dive into where inference infrastructure is heading next.

General Podcast Links
Watch: https://www.youtube.com/@alexa_griffith
Read: https://alexasinput.substack.com/
Listen: https://creators.spotify.com/pod/profile/alexagriffith/
More: https://linktr.ee/alexagriffith

Learn more about the host at
Website: https://alexagriffith.com/
LinkedIn: https://www.linkedin.com/in/alexa-griffith/

Find out more about the guest at:
LinkedIn: https://www.linkedin.com/in/robert-shaw-1a01399a/
Red Hat Articles: https://developers.redhat.com/author/robert-shaw
Github: https://github.com/robertgshaw2-redhat

Resources
vLLM Website: https://vllm.ai/
vLLM GitHub Repository: https://github.com/vllm-project/vllm
llm-d Website: https://llm-d.ai/
llm-d GitHub Repository: https://github.com/llm-d/llm-d

Keywords
AI inference, vLLM, llm-d, distributed inference, GPU optimization, open source AI, Kubernetes, multi-cluster deployment, AI infrastructure, enterprise AI, model optimization, speculative decoding, mixture of experts, AI deployment, performance tuning, AI systems, neural network scaling

Key Topics
Evolution of vLLM and llm-d
Distributed inference and routing
GPU utilization and performance optimization
Open source AI infrastructure
Enterprise deployment challenges and solutions
Standardization in Kubernetes for NIC exposure
Performance optimizations: quantization and speculative decoding
Mixture of experts architecture and parallelism strategies
Flow control and request scheduling in AI systems
Emerging hardware for AI inference, Cerebras processor
Reinforcement learning and AI system support
Modular architecture of vLLM and ecosystem projects
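To make the "prefix caching and cache-aware routing" topic from this episode concrete, here is a toy sketch of the general idea: a router sends each request to the replica that already holds the longest cached prefix of its tokens, and falls back to the least-loaded replica on a tie. Everything here is hypothetical and simplified (the `Replica` class, `route` function, and block size are made up for illustration); it is not the vLLM or llm-d API, and real systems also evict cache blocks and track request completion.

```python
from dataclasses import dataclass, field

BLOCK = 4  # tokens per cache block (illustrative value only)


def block_hashes(tokens):
    """Hash each full block chained with the previous hash, so one hash identifies a whole prefix."""
    hashes, prev = [], 0
    full = len(tokens) - len(tokens) % BLOCK  # ignore a trailing partial block
    for i in range(0, full, BLOCK):
        prev = hash((prev, tuple(tokens[i:i + BLOCK])))
        hashes.append(prev)
    return hashes


@dataclass
class Replica:
    name: str
    cached: set = field(default_factory=set)  # prefix-block hashes this replica holds
    in_flight: int = 0                        # requests currently assigned

    def hit_blocks(self, hashes):
        """Count how many leading blocks of the request are already cached here."""
        n = 0
        for h in hashes:
            if h not in self.cached:
                break
            n += 1
        return n


def route(replicas, tokens):
    """Pick the replica with the longest cached prefix; break ties by lower load."""
    hashes = block_hashes(tokens)
    best = max(replicas, key=lambda r: (r.hit_blocks(hashes), -r.in_flight))
    best.cached.update(hashes)  # after serving, the replica holds these blocks
    best.in_flight += 1         # completion handling is omitted in this sketch
    return best


if __name__ == "__main__":
    a, b = Replica("a"), Replica("b")
    system_prompt = list(range(8))  # shared prefix, e.g. an agent's system prompt
    print(route([a, b], system_prompt + [100, 101, 102, 103]).name)
    print(route([a, b], system_prompt + [200, 201, 202, 203]).name)
    print(route([a, b], [9, 9, 9, 9]).name)
```

The design choice worth noticing is the sort key: cache hits come first and load second, so a request sharing a long prefix sticks to a warm replica even if that replica is busier, which is the trade-off a flow-control layer then has to bound.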
There is growing demand for time with GPUs, the chips that power artificial intelligence. AI companies need those chips in order to keep their models up and running. And to do that, they can reserve time with a GPU. Now, there's interest from Wall Street in creating a futures market for this AI compute time, essentially treating it like a commodity. Marketplace's Stephanie Hughes spoke with Liz Hoffman, business and finance editor at Semafor and host of the “Compound Interest” podcast, who recently wrote about this.
The energy transition conversation focuses on what connects to the grid. Far less attention goes to whether anyone is coordinating what those assets do once connected. AI training runs swing hundreds of megawatts in seconds as GPUs checkpoint and restart, a profile that looks like a generator tripping offline. At distribution level, millions of inverter-based resources create localised variability that overwhelms individual circuits even when aggregate models look healthy. The planning tools in use today were designed for neither problem.

Host Bridget van Dorsten is joined by Kay Aikin, CEO and Founder of Dynamic Grid, energy engineer, grid architecture advisor to the DOE-supported GridWise Architecture Council, and contributor to the UN Environment Programme's building decarbonisation work. Kay unpacks what an AI training facility actually does to the grid: full GPU load for hours or days, then a drop to ten percent in seconds during checkpointing. She explains that, at the scale now planned, the Stargate project in Texas alone could represent ten percent of ERCOT disappearing in four seconds. The behaviour is stochastic and cannot be modelled with traditional statistical tools. At distribution level, virtual power plants responding to wholesale signals without circuit-level visibility can create competing oscillations, the kind of emergent dynamics that contributed to the Spanish grid failure.

The proposed fix is an AI controller at the substation, sending price-based signals and flexible operating envelopes to large assets and VPP operators, giving them twenty-four-hour forecasts and real-time circuit visibility. Total cost: under a hundred thousand dollars installed. The reason it isn't everywhere is cost-of-service regulation. 
Utilities earn returns on deployed capital, so a million-dollar transformer replacement is more profitable than software that eliminates the need for it.

Without new approaches, rebuilding the US distribution grid could cost up to ten trillion dollars by 2040. Kay is developing grid utilisation metrics with regulators in Maine, Virginia, and Maryland to incentivise extracting more from existing infrastructure. The episode closes on the need for distribution system operators and the affordability death spiral that looms if the structural incentives don't shift. See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
AI infrastructure is breaking the old data center model.

In this episode of Liftoff with Keith, I sit down with Lukas Gentele, CEO & Co-Founder of vCluster Labs, to unpack what it really takes to operate GPU infrastructure at scale in 2026.

As AI workloads explode and neoclouds race to meet demand, Lukas and his team are building the operational backbone for modern AI clouds: from managed Kubernetes and tenant isolation to automated node provisioning and GPU lifecycle management.

We discuss:
Why traditional data center assumptions are collapsing under AI pressure
What's fundamentally changed since the VMware era
How an early partnership with CoreWeave shaped vCluster's trajectory
And the one mistake AI cloud operators are making right now that could hurt them over the next 18 months

If you care about AI infrastructure, GPU economics, hyperscaler strategy, or building category-defining platforms, this conversation is essential.

Sponsor Info: We are strategic business advisors with decades of leadership experience and a proven track record of driving businesses' growth. We specialize in creating custom-tailored strategies to introduce your company, drive growth, build leadership teams, and ensure companies implement appropriate compensation programs. Our mission is to utilize our expansive network to benefit your company https://www.compass-strategic-advisors.com/

Connect with Lukas Gentele: Website: https://www.vcluster.com/ LinkedIn: https://www.linkedin.com/in/gentele/

Subscribe for more founder insights and hit the bell for notifications! Follow us on our channels for exclusive startup content and behind-the-scenes insights from interviews like this one. 
Spotify: https://open.spotify.com/show/3cFpLXfYvcUsxvsT9MwyAD?si=f5a14e779777487d Apple Podcasts: https://podcasts.apple.com/ca/podcast/liftoff-with-keith-newman/id1560219589 Substack: https://keithnewman.substack.com/ Newman Media Studios: https://newmanmediastudios.com/ LinkedIn: https://www.linkedin.com/company/liftoffwithkeith

For sponsorship inquiries, please contact: sponsorships@wherewithstudio.com

From the Host: A special shout-out to our Great Host of the Ignite Studios: https://www.ignitegtm.com/ and Producers of AI Infra5 @Plug and Play World, HQ in Sunnyvale, CA

Liftoff is sponsored by a strategic consulting firm and the M&A specialists at Compass Strategic Advisors - https://www.compass-strategic-advisors.com/ and The GTM Firm - https://www.thegtmfirm.com/
Scott Hanselman talks with Omar Shorbaji from the Anyscale engineering team about how Anyscale on Azure scales Python AI workloads from a single notebook to thousands of CPUs and GPUs. Built on Ray, the most widely adopted AI compute engine, Anyscale gives you a unified runtime to build, train, and serve, running directly on Azure Kubernetes Service without the complexity of managing Kubernetes. See a live demo that fine-tunes a vision-language-action robotics policy, with the metrics you need to push GPU utilization higher.
Chapters
00:00 - Introduction
00:52 - Ray and the Anyscale platform
03:11 - Start of demo: Workspaces
04:38 - Running a job and viewing utilization metrics
05:24 - Choosing the right scale
06:53 - Abstracting Kubernetes on AKS
08:53 - Wrap up and where to learn more
Recommended resources
Learn Docs Anyscale on Azure
Connect
Scott Hanselman | Twitter/X: @SHanselman
Anyscale | Twitter/X: @anyscalecompute
Azure Friday | Twitter/X: @AzureFriday
Azure | Twitter/X: @Azure
I'm excited to work with Microsoft once again as the presenting sponsor of the AI Engineer World's Fair! We'll be streaming live from MS Build today for a special crossover pod with our friends at No Priors and the one and only Satya Nadella. However, we did not hold back in this interview: we asked all the burning questions about uptime and Copilot that we know you have on your minds. Let's go!
For almost two decades, GitHub has been the home of software, where both open and closed source flow through commits, pull requests, reviews, actions, etc.
This ecosystem flourished as open-source maintainers and contributors kept shipping code for the benefit of the community. However, as coding agents began to ship mass quantities of code (growing 1400% in 2026), it marked a new era that was both extremely exciting and challenging for GitHub.
While these agents help more people ship more projects, they also significantly raise the floor of how much code is shipped, how often it is shipped, and how many people commit code, with order-of-magnitude multiples in basically every dimension of GitHub infrastructure.
GitHub now inevitably experiences more pressure on its infrastructure, which was originally designed around human developers moving at human speed. This has resulted in a very publicly notable uptime story.
So it raises the question of whether current systems around code can absorb what AI produces. Can CI/CD keep up when every idea becomes a build? Can open source maintainers survive floods of AI-generated slop contributions? Can GitHub preserve the human social contract of software while becoming the operating layer for agents?
Which brings us to the perfect person to answer these questions: GitHub COO Kyle Daigle. In this episode, he joins swyx to unpack what happens when AI doesn't just autocomplete code, but starts changing how companies operate, how open source works, how pull requests get reviewed, and how GitHub itself has to scale.
We go deep on GitHub's internal AI workflows: micro-skills, WorkIQ, MCP, Slack, Teams, email, Copilot workflows, the new Copilot desktop app, CLI, cloud agents, and how Kyle uses agents to look backwards across company context before deciding what to do next. Kyle also reflects on GitHub's history building webhooks, APIs, Actions, npm, Dependabot, and Semmle, why the AI era is breaking GitHub in new ways, how Actions became a general-purpose compute layer, and what Copilot becomes after code completion.
We discuss:
* Kyle's expanded role across GitHub
* How AI got Kyle coding again after years in leadership
* Why GitHub rolls out AI through existing workflows instead of forcing new tools
* WorkIQ, MCP, Slack, Teams, email, and GitHub as company context
* Why massive “mega-skills” are giving way to small, atomic micro-skills
* How AI changes summarization, communications, marketing, and analyst work
* Why former developers in leadership may have a unique advantage in the AI era
* Kyle's “15 agents on Saturday” workflow
* How Kyle built an AI-generated executive presentation for CRO/CFO teams
* Why AI changes the chief of staff role without removing the human work
* GitHub Actions, webhooks, arbitrary code execution, and secure agent compute
* The npm acquisition, supply-chain security, 2FA, and token invalidation
* Slop forks, vendoring, and whether AI agents change dependency management
* What pull requests become when most PRs come from agents
* Prompt requests, vouching, AI review, and trust in open source
* What counts as a “developer” when AI lowers the barrier to building
* GitHub Spark, low-code, and why GitHub refuses to hide the code
* 14x commit growth, Actions load, databases, monorepos, and availability
* Copilot's evolution from completion to CLI, desktop app, cloud agents, and SDK
* Context, memory, rules, and making GitHub “act like Kyle wants it to act”
* Ambient AI, OpenClaw, enterprise security, and the new operating system for agents
* What swyx should ask Satya Nadella about Microsoft's AI future
Kyle Daigle
* LinkedIn: https://www.linkedin.com/in/kyledaigle
* X: https://x.com/kdaigle
Timestamps
00:00:00 Introduction
00:03:36 Why AI Got Kyle Coding Again
00:07:04 Running GitHub with AI: WorkIQ, MCP, Slack, Teams, and Skills
00:15:39 The Golden Age for Former Developers in Leadership
00:17:31 15 Agents on Saturday and AI-Generated Executive Work
00:20:20 How AI Changes the Chief of Staff Role
00:21:45 GitHub's History: Actions, npm, Webhooks, and Open Source
00:28:45 Slop Forks, Vendoring, and AI Dependency Management
00:33:57 Pull Requests, Prompt Requests, and Trust in Agent-Generated Code
00:41:21 GitHub Stars, 200M+ Developers, and the New AI Builder Wave
00:45:15 GitHub Spark, Low-Code, and Why GitHub Still Shows the Code
00:47:38 GitHub's Hardest Era: 14x Growth, Reliability, and Scale
00:59:21 Actions as the Compute Layer for CI/CD and Automation
01:02:04 The State and Future of GitHub Copilot
01:08:24 Ambient AI, Background Agents, and the Future of the SDLC
01:13:09 OpenClaw, Enterprise Security, and the New OS for Agents
01:18:03 Build Announcements, WorkIQ, FoundryIQ, and Microsoft Context
01:21:41 What Should swyx Ask Satya?
Transcript
Introduction: Kyle Daigle's Expanded Role at GitHub and Microsoft
Swyx [00:00:00]: We're here with Kyle Daigle, COO of GitHub. Welcome.
Kyle [00:00:07]: Hey, thanks for having me.
Swyx [00:00:08]: You're not just COO of GitHub. People know you as that. You have a new role.
Kyle [00:00:11]: So I have an expanded role now. I've been working at GitHub for thirteen years and doing all things developer. Joined as a developer myself. And now, I'm also responsible as the CMO of Developer for Microsoft.
And so all the learnings and passion for developers, how we work with them, how we communicate, and how we bring our products to market, we're also bringing that expertise to the broader Microsoft ecosystem and helping every developer that uses a Microsoft product, or would like to have an experience similar to the one they've had with GitHub over the years. So it's a different role in some ways, but it's also just building on the experience that I've had at GitHub of just: tell the truth, be authentic, show people how to use it, and then let the products speak for themselves. Now I'm just doing that with all of Microsoft.
Swyx [00:01:09]: We'll be releasing this in conjunction with Build. You've got lots of stuff planned, and we can touch on that whenever it's appropriate. I think one of the interesting things is I rarely meet a COO who's also a CMO. I think you're very outward facing and you're very confident publicly. That's rare. Do you actually view yourself as COO? What is your thing?
From GitHub Developer to COO/CMO: Building the Platform and Operating GitHub
Kyle [00:01:33]: I think for me, it's been funny. The titles have always felt a little strange to me. I joined GitHub as a developer. I wrote so much of the...
Swyx [00:01:46]: Let's bring that up. You wrote the back ends?
Kyle [00:01:48]: I was going through some old photos when folks were talking about how things were being built or how there was a build GitHub. I built webhooks and worked with teams building the API, built the platform layer. Anything that integrated with GitHub, up until really twenty eighteen, I built or ran the engineering teams. And that's kind of where the beginning of my passion was: helping people build things and deliver them to their customers. And so being a developer, building for developers, was always super unique.
I think as my role expanded, it became my ability to talk to not just developers, but also enterprise customers or business leaders, and have this translation layer. And then through all those years, GitHub has always operated pretty uniquely. Post-pandemic, working remotely was not as novel as it was when GitHub started in two thousand and eight. But all that expertise of running remote teams, and doing it well, became this sort of bigger role, ultimately turning into the COO role of how do we operate GitHub in the way that GitHub's always operated after the Microsoft acquisition. And kind of so on from there. So for me, I still code. I love coding, but the problem has always been people. It's a much harder problem to support our own employees, and a harder problem to communicate to developers and enterprise buyers what we're building and why it matters, 'cause those are two very different messages. And so getting to work in the mix of COO, CMO, and also just being a dev, I think is what's kept me at GitHub for so long.
AI Workflows for Leadership: Commits, Retrospectives, and Context
Swyx [00:03:40]: Apparently, your commits have gone up. What's this? What's going on?
Kyle [00:03:45]: Rui's called me out pretty aggressively. So, as you can imagine, you can see my normal era of being a dev in the twenty thirteen, twenty fourteen era, and then moving into management, and then ultimately the COO role. I think what you see there is me really getting back to coding thanks to AI. Similar to attaching problems between how to market and how to operate a business and how to code, I find building agents and workflows that connect very disparate problems to be what's driving this. So some of it's writing software. A lot of it is connecting a ton of different data sources to help me out.
But that is completely me really diving in on the AI side: trying out our tools, trying out everyone's tools, but building for me, building for the non-technical leader (though I'm technical), and working out how we're able to use these tools for more than just the simple call and response. I think a lot of the non-technical folks hear from their employers, “You have to use AI,” and so everyone uses ChatGPT or Copilot or Claude or whatever. To really get into “how is this going to help me out,” I find that it's not the “I need to write a blog post” kind of simple examples. It's helping people find the workflows of, “Okay, I need you to go through all the PRs today. I need you to go through everything that we've posted online. I need you to go through what we did the last three months. Go through all of my Obsidian notes for any mentions of this, then go through my transcripts at work.” We use Teams, so, using WorkIQ, go call that MCP server, grab all the transcripts, go through all the Slack, and then build me out the plan of what this week's messaging actually was. That's something that was impossible before. For me, most of this is actually less building forward. It's actually a recursive loop backwards. I'm always looking at what had happened first. Go back through the week and tell me what we did, what worked, what didn't work. And then tell me, in the next three or four days, what would you tweak based on this looking backwards and then looking ahead a little bit? I find that to be so much more valuable, especially for non-technical folks, because retrospection is actually something LLMs are very good at: finding all the patterns, pulling them out, and then applying that retrospection to just a couple of days or a short period of time. So it's all a bunch of apps that I've built, and I've launched a bunch of internal tools. I use the new GitHub Copilot app, the desktop app with workflows.
Every time I crack open my laptop, it's running workflows for me. It's just a ton of different stuff and, of course, it all ends up on GitHub.
Swyx [00:06:47]: Of course. That's where stuff is hosted. Man, there's so much to ask you. I was going to leave the “how do you run a company with AI” thing for the end. I have to double click on one thing. You said you are looking back at the week. You're understanding what happens. When you say “we,” that's three thousand people. How?
Rolling Out AI Internally: Skills, CLIs, and Company Context
Kyle [00:07:09]: I think when we started rolling out AI internally beyond engineering, one of the things that I was really passionate about is that we have to do this in a way where no one has to change how they work. I don't want to have to teach you a tool. I don't want to have to teach you something new. And so for us, we tried out a few tools. Most of them don't work because I've got to get you on board. I've got to teach you how to use it. What we've actually ended up doing is we've built a set of skills internally. We each have our set of skills, and we've just been distributing the CLI, even to the non-technical folks. And then effectively, we're just giving it access to read just about everything that we're writing. So for us, that's usually GitHub, Teams, Email, and Slack. So Teams for video chat, generally speaking.
Swyx [00:08:03]: Teams and Slack?
Kyle [00:08:04]: So we use Teams for video communication, but we don't use it for chat. GitHub, for a long history, right? We're always...
Swyx [00:08:13]: Also Slack.
Kyle [00:08:14]: Talking about ChatOps, and everything is built into Slack.
Like every command, every flow.
Swyx [00:08:18]: So even though you have been acquired for, I don't know, eight years now...
Kyle [00:08:22]: We still...
Swyx [00:08:23]: You still use Slack?
Kyle [00:08:23]: It's a purpose-built tool for us, and I think the reality is that moving off of it would be so, bluntly, expensive, simply because all the tooling is baked in with that paradigm. And they both have their pros and cons, but they don't work the same way at all. We still use a bunch of different tools because it's the purpose-built tools that we need. And then...
Swyx [00:08:47]: Well, the same doesn't go for the rest of Microsoft, presumably.
Kyle [00:08:50]: Various teams operate...
Swyx [00:08:53]: They make their own decisions.
Kyle [00:08:54]: Various ways. I think it just matters what you're trying to do. But we do work across kind of every tool that we use, and then by giving everyone access to all of that context and the new WorkIQ MCP server, which is quite cool if you do live in the M365 world, I can ask it all these backwards-facing questions, and it's incredibly important for our teams that are working remotely. There's a lot of stuff you miss when you're not in an office, and we are spread out all over the world. So most of that is looking back. And then we post, automatically, into GitHub issues or discussions, these sorts of findings or our industry reports. Like what's happening this morning, today, yesterday. A little automation gets run. We'll use the app. We might use GitHub Actions with our agentic workflows just to go do that run, and then we push it into GitHub, and we keep having a conversation. So usually for us, it's about that looking back, looking forward on the non-technical side. And then of course for a lot of those folks, it's also building an app, pushing it to GitHub Pages or pushing it somewhere to host it, et cetera.
But it's just enabling everyone with that power. It used to be, “It's going to take me a week to figure this out.” Instead, we're going, “Okay, I built a skill. Let's put it into a repo. We'll all share that skill together, and then we'll use the CLI, or now the app, just to run it.”
Micro Skills vs. Mega Skills: How GitHub Uses AI at Work
Swyx [00:10:26]: All right. I think we're going straight into the team management and productivity thing. I think a lot of people are getting various levels of LLM psychosis. How do you manage the bloat of skills? Everyone has their thing, and they're trying to promote it to the rest of their peers in their org, right? And obviously, whoever becomes a skill influencer internally becomes an AI leader of sorts. I assume you have those.
Kyle [00:10:50]: I think we have...
Swyx [00:10:52]: And I assume it's a mess. Yeah.
Kyle [00:10:54]: I think the reality is there's two pieces. First, I think that we're ending the era of these massive, beautiful, perfect skills that are just not any of those things. 'Cause for a while, every tweet every day was like, go download the skills, the perfectly managed thing to do this entire workflow. And what we've found (I was just with my team this week, and we were talking about the skill side) is that we're really talking about these incredibly micro skills that are just doing one thing for us very well, versus a skill that's going to do, like I said, that full report. That doesn't really exist on our side anymore. It's usually a single skill that's going to identify the most important marketing information given any MCP server. Like, this is the most important thing.
It's less about stitching a bunch of tools together and having it produce this mega output, because then weeks go by, months go by, things change, and you want to tweak...
Swyx [00:11:58]: It's brittle.
Kyle [00:11:58]: Your mega skill, and you're screwed. You can't do that. And so now we're really just talking about the Legos we're using and just letting the instruction book be something we're all putting together. Whereas I think a lot of AI skills for a while have been that mega instruction book style.
Swyx [00:12:15]: I've thought a lot about Postel's law. I don't know if that's a term that means things to folks. It's the idea that you should be liberal in what you accept and strict in what you output, right? And I think that's a good framing principle for skills. This is my skills, obviously on GitHub. You know how some repos in GitHub are special repos? I feel like we should sort of reify the slash skills and give it some kind of special presentation. Anyway, so, yeah, this is one of those: download anything, transcribe anything, and then you can string together the atomic skills that do one thing well into some kind of orchestration skill that calls other skills. Does that match?
Kyle [00:12:56]: I think so. I think that the...
Swyx [00:13:00]: Summarize anything.
Kyle [00:13:01]: For me, I do communications and PR and analyst relations and marketing and customer activities, and so my “summarize everything” is very different for each one of those contexts. 'Cause if I'm summarizing something for an analyst, that's a very different thing than how I'm probably going to summarize something for a customer meeting or an engagement. So that's, I think, the difference when we're talking about the tools or the skills I might use on a Saturday when it's just for Kyle.
Yeah, those kind of have an atomic actual tool underneath, or maybe a skill, and then “Kyle cares about X.” But I think when we're talking about work and enabling the marketers and communicators there, it's the atomic “this is what good summarization is,” and then “this is what I care about” for marketing, for communications, for whatever. And that, I think, is the interesting matrix problem when we go from a developer set of concerns to all kinds of different professions: what that word means to me is different than what it means to you, which is different than what it means to the analyst or the salesperson, and that's the matrix mess that we're still starting to find. It's about these mega skills, but they're all just slight permutations, and those permutations are really important. It's the difference between someone reading this and going, “Did AI make this?” or “This makes total sense, and I would expect this when I'm giving a briefing to Gartner,” or whatever else.
Swyx [00:14:37]: I think the beauty of it maybe is that you don't have to be that careful about what goes in there. It doesn't have to exactly fit as long as it roughly is contained in there. I used to complain about plugin hell, basically. When you have a framework and then you have a hundred things that you need to integrate, GitHub used to be bloated full of these things. And now we don't need them anymore 'cause now you just use skills.
Former Developers in Leadership: AI as a Creation Multiplier
Kyle [00:15:00]: And I think the most magical thing is just that I can also crack it open. Yes, I could go change how the plugin is coded, or I could go do that now with AI, but I think there's just something more magical about getting a response back and being like, “That's not right,” and then you just crack the skill open, you just type English words, and it's different.
That building block is just, I think, very unique, once I get everyone to understand how to best make those changes to get the most power out of them.
Swyx [00:15:36]: You have your peer group of people like you. Is there a common framing for something I'm feeling, which is: is this a golden age for former developers who are now in leadership? Because you can wield the tools, you would know the right words, you're maybe not too close to the details. Doesn't matter. But you're more effective than someone who doesn't come from that background.
Kyle [00:15:59]: I think that the secret has always been your ability to identify patterns and solve problems, and I think that for folks like myself that don't code day to day anymore, that is what made me successful as a developer, and made me successful as a COO and now CMO. And so now that I have access to get and write code, I'm applying that same pattern finding and problem solving, and I still know enough about how to then go and say, “Oh, I want to make an app, but I don't want to break into jail or create something that's not going to be able to work or to be deployed at scale or whatever.” That ability to apply all that additional business knowledge and still code, I think, is what makes that so interesting to me. It's slightly different, I think, than some of the other technical leaders that became business leaders and now are going back to their apps and updating them. Good for them. But I think the much more interesting thing is, well, now I have this whole new set of expertise over ten plus years. Why not take that and use that as a developer with these AI tools? So I definitely think that makes me more powerful, but I think that's true for every dev as well. Most of the dev friends I still have also have some other underlying skill and passion. There are really talented, very linear computer science software devs, absolutely.
I just find that the folks that came from a different career, went to school for something else, went off and did this random thing, and then became a software dev, or were a dev, did a random thing, and came back: learning that extra set of information, learning those extra skills, and now having the power of an AI where I can crank up fifteen agents on Saturday while my kids are doing lacrosse, that's really powerful. And I think it gets me back to that feeling of creation, and it's very hard to replicate that in most other senses. That first time you build an app and you click it and you show someone, that's magical. And so being able to do that not just in code, but across all kinds of different assets, that's huge. Every year we do our revenue planning. We talk about, okay, what is it going to look like for next year? And of course, as you can imagine, there are slideshows everywhere talking about what are we going to talk about, what's the narrative, et cetera. And so, as you said, I'm like, “Okay, well, I could probably just build something to build this, and that way I don't have to go build the whole spreadsheet or pass it to my team.” So we went through this process, and I got all the information and used the skills I mentioned. I built a little app just so I could look at some of the information in a SQLite database more easily. And I ultimately built this entire presentation without touching any of it, and I was like, “Okay, I'm just going to present this to our CRO, the CFO, their teams,” without mentioning I'd built it with AI. I built a skill to make it look very much not AI driven. Just not pretty.
AI-Generated Presentations, Human Taste, and the Changing Chief of Staff Role
Swyx [00:19:03]: Like a design. Yeah.
Kyle [00:19:03]: Not pretty. But just very clearly not AI.
Kind of like, don't do anything interesting.
Swyx [00:19:08]: That's, yeah, that is valuable.
Kyle [00:19:08]: Exactly. We did the whole thing through. It used my notes from Obsidian, it used all the context I mentioned before, the plans, and it never came up once that it was AI generated.
Swyx [00:19:20]: It didn't matter.
Kyle [00:19:20]: Never once. It didn't matter. And so now I take...
Swyx [00:19:23]: This is a tool.
Kyle [00:19:23]: I can take that tool and go, “Look, I don't want you to go build slideshows.” They're just helping us share information with each other. If this thing can do it with a little bit of crafting from you, and then we can look at it together, awesome. There's no value in all that extra work. I think the ability to make it look humanly bad, and to build a little app to manipulate the data, is part of that upside for devs that are now in leadership roles. Because, like I said before, that's all a people problem. I know if you've used a coworker or not to build a slide deck, unless you spent a bunch of time to not do it.
Swyx [00:20:07]: I know, but I think there's a certain charm to just being blatantly AI. 'Cause I think that you're just honest about it: there may be mistakes here that I cannot vouch for. So how much value is there? But anyway, I think the real question I want to ask is: you were a chief of staff to Thomas. And in the pre-AI world, that would've been a chief of staff job, like, “Can you prep me these slides?” and all that. And now you do it yourself.
Kyle [00:20:35]: I still have a chief of staff. Because it's sort of the discussion we have every time there's some sort of technology evolution: the roles don't all go away, they just change.
And so yeah, I don't have someone spending all their time building out slides and presentations for me, 'cause I don't need that anymore. But now I need that person who is able to go and find all the different connections between humans in those discussions to help me find out, okay, I should be meeting with this group and this team, and they have an opportunity, and I'm going to be in San Francisco today, I'm going to be in Seattle tomorrow. Those sorts of human connection aspects are still incredibly valuable and have always been a big part of that chief of staff role. But now, just like chiefs of staff are not opening up letters to process, they're doing emails. It's the same thing. And now they're not building out as many of these presentations because they have the ability to have an AI take it on and share that with me, and great. Let's keep moving, 'cause it's allowing us to go faster and make better decisions more quickly.
Swyx [00:21:45]: Awesome. Well, we can dive into more productivity insights as we go. I did want to do a little bit of a brief history of Kyle and GitHub. Because we started here. And then you were also involved in the npm acquisition. I do want to touch upon that. And then more recently, I just want to bring us up to the present day, where we're having uptime issues, which transparently we've already addressed publicly, but we'll discuss in the pod. Did I miss anything? Any other major highlights? Obviously, it's a lot of years to cover.
A Brief History of GitHub: Webhooks, Actions, Acquisitions, and Platform Evolution
Kyle [00:22:15]: No, I think one highlight was right before the acquisition closed in twenty eighteen, I got to launch the first version of Actions...
Swyx [00:22:27]: Oh.
Kyle [00:22:27]: At GitHub Universe. So it was...
Swyx [00:22:29]: They're that young?
Kyle [00:22:30]: It was October of twenty eighteen, I think. Yeah.
Yeah.
Swyx [00:22:33]: Gee, Jesus.
Kyle [00:22:34]: I was the engineering leader on that project and got to launch that. And then, yeah, we did acquisitions of npm, as you said, Semmle, Dependabot, Pull Panda, a whole bunch of things. That was a big...
Swyx [00:22:47]: Pull Panda.
Kyle [00:22:48]: Abi is doing well.
Swyx [00:22:51]: DX. Holy crap.
Kyle [00:22:52]: Did well on DX. And that was the big shift after the acquisition. I had to join the sort of business side.
Swyx [00:23:00]: So I need to hit you on some of these things 'cause you were there. Right? And how often do I get to talk to someone who was there? But yeah, Actions. Is that the number one source of security issues on GitHub?
Kyle [00:23:11]: Oh, I think that the number one source of security issues is probably the literal code in everyone's underlying repositories. I would say, going back further than that, if you remember what I showed in this graph (I didn't say this before), this is ultimately webhooks.
Swyx [00:23:30]: Yeah.
Kyle [00:23:31]: Circa whatever it was.
Swyx [00:23:32]: It says Hookshot in there.
Kyle [00:23:32]: I forget. Yeah. Yeah, Hookshot's in there. And back then, it says GitHub Services. Do you see, it says Hookshot FE for front end, and then it says GitHub Services. GitHub Services back in the old days, right? We had a repository that was Ruby code, and you could write any Ruby code in there, and then we would execute that on your behalf as a service. That way, if you were trying to integrate with something, we would run it for you.
Swyx [00:23:57]: And of course no containers 'cause...
Kyle [00:23:58]: No, 'cause it was...
Swyx [00:23:59]: Well, no containers.
Kyle [00:24:00]: Twenty fourteen. And so there was some isolation, obviously, but it was mostly the separations on the server level.
That's an example, along with the very old version of Pages, which ran on its own containerization infrastructure, not on Actions.
Swyx [00:24:15]: Which, all-time great product.
Kyle [00:24:16]: Pages powers the internet at this point, to some degree. Those were places where clearly there were no issues, to my knowledge. But it was those things where I'm looking at it and going, “Okay, well, we can't be running arbitrary Ruby code on everyone's behalf.” Then containerizing all of that up into Actions now, where, yeah, the containerization is really good. The pinning: most folks aren't pinning their workflows to a particular...
Swyx [00:24:48]: Images.
Kyle [00:24:48]: SHA, et cetera, and so that's a big place of pain for folks if they're just doing, similar to any dependency management, just V1 or newest or latest, I think. But that journey from that day, “Okay, we're just going to run all this arbitrary code, and it'll basically be okay,” to now: no, we have really good containerization. We have a new underlying agent containerization service. We're using it under the hood. It's through Azure. They recently announced it. The Azure Dev Compute. It's very fast compute to be able to spin up your own cloud agents or whatnot. We're using it under the hood for some parts of the new...
Swyx [00:25:36]: Microsoft Dev Box?
Kyle [00:25:37]: No. Dev Compute, yeah.
Swyx [00:25:41]: Hmm. Not finding it just yet.
Kyle [00:25:44]: Oh, it's in there somewhere.
Swyx [00:25:46]: All right. Well, we'll cut that out.
Kyle [00:25:47]: Sorry. But with Dev Compute, you can spin up really small VMs really quickly, so when you're doing a tool call...
Swyx [00:25:58]: Same concept.
Kyle [00:25:58]: Just do it containerized, exactly.
So we're using that, so definitely moving in that direction to protect us from every piece of code that we're ultimately running.
Swyx [00:26:07]: Look, that grows into the full SDLC? Code hosting was just the start, and then it's grown beyond that. Let's talk about NPM maybe, ’cause I think that's also a very major point in the industry. I do think it was looking for a home. It was kind of struggling as a business, right? I don't know how you would characterize that whole acquisition and how it...
NPM, Package Security, and Keeping the Internet Running
Kyle [00:26:33]: When we were talking to the team, I think the big thing for both of us was to find a way to keep NPM, which was basically powering the internet then, and way more so now to some degree, running. Keep it going, keep continuing to scale. It was having scaling problems, if I recall, back at that time. They were doing some rewrites. It...
Swyx [00:27:00]: That's cute compared to now.
Kyle [00:27:01]: Well, that's the thing: when I'm talking to folks now, there are so many more underlying uses of NPM than there were back when we had them join in with GitHub. But that was ultimately the goal. It was really, okay, we used to have pages. We have the world's code. Let's make sure that we can keep NPM running well for the world. And we put a bunch of time and investment into fixing some of the underlying backend, changes, some of which we talked about, some of the manifest work, et cetera. And then now, really trying to bring the security posture of NPM up to speed. But it is a unique challenge in that every move that we make to make it more secure will break a lot of people. And security is paramount. And also, we take it very seriously. Any time that we have a problem with GitHub, or we make a change that makes us more secure but hurts, there's a snow day for developers or a really bad fire that they have to go put out.
And so we have changed the 2FA policies. We've changed the way the tokens work. When we find tokens that have been exposed or potentially exposed, we invalidate them, and...
Swyx [00:28:22]: I love that feature in GitHub. Yeah, it's great.
Kyle [00:28:23]: That creates issues, but that's the thing: we're trying to push the community forward without necessarily doing something that is going to break the contract that's been there for 15 years, or close to it, or some amount of years on NPM.
Slop Forks, Vendoring, and the Future of Open Source Supply Chains
Swyx [00:28:43]: So now we're talking about open source and publishing. And I think there's something here with what people are calling slop forks, which I think Malte from Vercel is doing. And part of me thinks, well, the way to get past any vulnerabilities, let's just get rid of the concept of NPM. And we only publish source code. And anytime you want to import it, you have your coding agent look at it and then adapt whatever subset you're going to use, and you vendor it. But the AI vendors it. Is that realistic? I don't know. Will that solve all our security issues? I don't know.
Kyle [00:29:24]: I don't think it'll solve... so Mitchell, Mitchell Hashimoto, was just talking about this today, and I think that in some ways, all things old are new again? Yeah, absolutely, vendoring everything. I do remember twenty thirteen, twenty fourteen.
Swyx [00:29:42]: Yeah. We must return to...
Kyle [00:29:43]: We were vendoring everything. We were having actual discussions around, or at least I remember we were, “Should we take this full thing?” “Why is this so big?
We only need this one file.” And so I do think there's something true there, where either taking only what you need or the dependencies just getting incredibly small over time will help to some degree, but it's not going to solve the fundamental problem, I don't think, because with an agent looking at the vulnerabilities, time and time again there's a million different ways in which we can convince an agent that this thing is secure or not and pull it in. Or we can do static code analysis or runtime testing to say whether the code works or not. That is, I think, the step that needs to continue to be invested in. The question is just how much scope. Should it be this enormous project that I'm pulling down, or should it be this piece? Either way, most companies are running some amount of security checking on the packages that they're bringing in or vendoring. That, I think, won't change. That's what Advanced Security does to some degree, Socket does to some degree. Everyone is doing a piece of that. How we each do that, especially when we're talking to enterprise customers, is just very different. No one wants one single way to do it. And I think that's always been GitHub's unique position in the world. I talk a lot to maintainers, I talk a lot to folks about this. We rarely start a process and a practice and push it onto the community. We usually wait for the sort of RFC process, socially or literally, everyone agreeing, and then we'll cement something in. Because otherwise we're...
Maintainers, RFCs, Vouching, and the Social Layer of Trust
Swyx [00:31:35]: That fits your role in the ecosystem, yeah.
Kyle [00:31:36]: We're GitHub. Yeah, we don't want to shape the whole thing. We want it to be figured out.
But how do you balance that sort of role in the industry, to keep everything as secure as possible and make sure that you're not going to be compromised as a human, ’cause that's usually how it all happens, and not create a process or lock us into a flow that you're not going to, or Mitchell's not going to, or other open source projects aren't going to like? That's always been a tricky balance for us, and I think something that we haven't talked about enough is that we're not going to be able to fix everything for everyone in a way that everyone is going to like. So help us, tell us what is working. When Mitchell was talking about the upvote, the up...
Swyx [00:32:22]: I was going to bring up his thing. Yeah.
Kyle [00:32:23]: I forget what it... Yeah. I was chatting with him about this, and I put it on Twitter, and we talked also over DM. It was, “We're going to keep working,” but I think the important thing is I do actually want to hear what isn't working for you. And be as specific and clear for your project as possible. And to every piece of credit, over the many years that we've known each other through the industry, he's always done that, and I appreciate that, ’cause there are places that we need to fix up, and we hear from him, and we'll fix them up just like we do for all other kinds of maintainers. But that process between making those types of improvements and being more secure and creating... I forget what he calls it. It's not the proof process, not the claims process. Do you know what I'm talking about? His projects have a way for you to kind of...
Swyx [00:33:13]: Vouch.
Kyle [00:33:13]: Vouch. Thank you. Yeah. He has the vouch system for saying, “Hey, you should accept my PRs.” That's been...
Swyx [00:33:20]: Just build this into GitHub.
I don't know.
Kyle [00:33:22]: Well, see, but that's the thing: you say that, and he and his community really like this, and then I'll go talk to other maintainers globally, and they're, “No, this doesn't work for me.” And that is the tension, but also the kind of beauty of GitHub, depending on which way you look at it. We want to help maintainers, so we create all these tools to let you have more control over how much you take in from AI and PRs. But you can also use this. You can go use this project, and if it takes off and becomes the kind of mostly standard, then yeah, we probably wouldn't enforce it, but we would add it in, because that's the flow that we tend to follow.
Swyx [00:34:02]: I hear a lot of people don't know the history of the pull request. That's something that GitHub standardized, basically.
Kyle [00:34:08]: Yeah. It was a very messy process beforehand, and now we have the benefit of it being the process. And now we have to go and figure out the next best process, or what adaptations change, or what a pull request looks like when eighty percent of your PRs are just coming from your agents and not from other devs.
Swyx [00:34:31]: Do you like the prompt request idea from Peter?
Kyle [00:34:34]: I think each idea has its merits. I'm not avoiding saying anything good or bad, but I feel like I've seen a version of... we have that, we have entire Thomas' store. Take all the assets of what you've built and put that in. I think that's got great ideas. There's all these various permutations of the PR flow, but I think the reason why there's not a single answer is ultimately we're trying to codify trust.
We're trying to say, “Okay, if Sean reviews this, I'm going to trust it because you're Sean, or you're the senior dev, or you're the whatever.” And right now, when we are working in a flow where an agent writes code and another agent reviews code and then Kyle goes and looks at it, the trust is kind of diffuse. And most of the tools that we're talking about are more about verification flows. We have more assets to look at, so I can probably say whether this is a good PR or not. But that still doesn't solve, I think, the human problem of: I'm looking at a PR and I want to know if I can trust it. And we still tend to use human signals for that. Mitchell approving it or Kyle approving it or whatever. And so I think that's why most of these options haven't really solved it: it's a social problem ultimately. It's a human problem to review it and agree. Or you fully trust the tool and you're imbuing that tool with full trust, which I think in some cases absolutely exists.
AI-Generated PRs, Trust, and the Waymo Analogy
Swyx [00:36:08]: And so, in the same way that there will be a tipping point in society when we don't allow humans to drive anymore because machines are measurably better than humans, I'm looking for that tipping point, right? Mythos is ridiculously expensive. Someday we'll have Mythos on a desktop. I don't know. Does that change the equation?
Kyle [00:36:30]: I think it's more... I took a Waymo here, and I was on my phone and not looking around at all. There are other self-driving vehicles that I would not trust while staring at the road. And I think that trust is something that is...
Swyx [00:36:48]: Is this a Zoox thing? What is it...
Kyle [00:36:50]: I think that is both. I think that is both. Like...
Swyx [00:36:53]: There's Zoox in this robo taxi. That's it. It's...
Kyle [00:36:56]: Well, depending on what level of self-driving.
But my point is that I strongly believe that's a mixture of verifiable proof, like how many accidents, how much data, and so on, and the human aspect of how I feel when I'm in this car, what it tells me, et cetera. And so that's why I think some of our AI tools tend to imbue me with more of that feeling of trust, even if the data says this is 100% accurate. I feel like it takes more time for us to go, “Should I trust this or not?” And that's in the soft sense of startups with high agency, weekend projects, and open source. And then there's enterprises and regulated industries and everything else, and that is an even harder problem to go solve, because even when it is fully verified, not only do you have to have trust from the humans on the team, you probably have to have trust from multinational...
Swyx [00:37:55]: Oh my God.
Kyle [00:37:55]: Multiple governments around the world and regulating agencies. And so that's where I feel like, until we tip over, to your point, on the sort of human EQ side of it, “I feel okay, this feels okay, it's been proven enough,” then the ball will start to roll a lot faster, where we'll end up getting to the “Okay, we can trust this,” and feel good about it in the most difficult of cases.
Reputation, Sponsors, Stars, and Bot Activity on GitHub
Swyx [00:38:18]: If human trust is the thing that matters, I feel like GitHub, as the developer social network, could maybe do more there. Vouchers are one system. But we have star counts, and then we have contributor rights, and that's it. And I feel like there should be more in that space. I don't know if there's any other design decisions there.
Kyle [00:38:37]: I think that one of the places that we don't really expose right now in this sort of way is some degree of hard trust and support, which for me, Sponsors is a good example of.
Swyx [00:38:49]: Ah.
Kyle [00:38:49]: It costs you something.
To prove that I believe in your project and I trust you to some degree, or I want to support you at the very least.
Swyx [00:38:56]: Solve payments for open source. Why not?
Kyle [00:38:58]: I think that as we keep moving forward, right, there's more and more projects where I'm adding more and more dollars into Sponsors personally because I want to support them. I've probably never met them in person, but I know enough of their work that I want to support them. The thing that I don't love about stars or commit counts or anything else is ultimately, even with all of the various de-spamming and deduplication work or anti-abuse work that we do, these are all not active social signals. They're passive ones that are ultimately gamifiable. And you may trust me, but another open source maintainer may not. And on what heuristic should you be trusting me? That, I think, is kind of where some of our thinking is right now. What signal from me is most important to you? If you can define that, potentially, honestly, in an agentic workflow, that's what we see some of these open source projects do, where you have GitHub Actions, and then you have an agentic workflow that's calling AI, and you're setting these rules. Like, if Kyle has submitted and gotten accepted PRs across any given project and has a social handle tied to his account in GitHub, and that social account's older than a certain amount. Really complex measures that matter to you, ’cause most open source projects have that heuristic built into their heads, if not written down in the contributing guidelines. You could take that and then go apply that and then just say, “Oh, we're not going to accept this PR.” Building something that is, I think, malleable to everyone's needs is a little bit better, rather than going, “Hmm, this account's too young.” Because what happens?
The attackers just go and create a multitude of accounts, and they wait until they age up. Needs to have a certain amount of stars? That's how star inflation happens. Needs to have a certain amount of repos...
Swyx [00:40:46]: Oh my God. Yeah.
Kyle [00:40:47]: With PRs? They all just create repos and submit PRs to each other, and then they come in and do something nefarious. And so it's hard. It's hard to find the measure. So I think we're looking more at how we can provide you tools so you can kind of choose what's best for you. And of course, we'll give you some standards. But the trust vector gets down to, I don't know, some version of human digital ID, like everyone's been talking about. Like, how do I prove that it's me...
Swyx [00:41:13]: Give me your eyeballs.
Kyle [00:41:14]: On the internet. Give me your eyeballs. Exactly.
Swyx [00:41:18]: I've got to keep moving on topics, but obviously I can go all day on this stuff because I've been involved in GitHub and open source my entire professional career. Stars. Very superficial. Everyone knows it. But I think time to one hundred thousand stars is the fastest I've ever seen. People just reached that in, I don't know, months. And then at the same time, I don't trust it, right? How many of these are real or bot or whatever? I don't know how to ask this, but what can we do about it? Like...
Kyle [00:41:49]: Just...
Swyx [00:41:49]: Is stars broken? Is stars fine?
Kyle [00:41:51]: I think that there's two pieces. Obviously we're constantly trying to find ways in which users are producing spam, which I would include only doing star gamification in. When we find them, we pluck ’em out, and we...
Swyx [00:42:08]: But it's like Whac-A-Mole.
Kyle [00:42:10]: It's a hundred percent like Whac-A-Mole.
Swyx [00:42:11]: There's no way...
Kyle [00:42:11]: Now, powered by AI to be helpful.
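Editor's sketch: the per-project contributor rules Kyle describes (accepted PRs in other projects, a linked social account, a minimum age for that account) could be written down as a small configurable policy rather than kept in maintainers' heads. Everything in the snippet is an assumption for illustration: `Contributor`, `TrustPolicy`, the field names, and the thresholds are hypothetical and do not correspond to any GitHub API or feature.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Contributor:
    """Hypothetical contributor snapshot; these fields are not a GitHub API type."""
    login: str
    accepted_prs_elsewhere: int
    social_account_created: Optional[date]  # None means no linked social account


@dataclass(frozen=True)
class TrustPolicy:
    """Per-project thresholds; each maintainer would pick their own values."""
    min_accepted_prs: int = 1
    min_social_account_age_days: int = 365

    def failures(self, contributor: Contributor, today: date) -> list[str]:
        """Return the reasons a contributor fails the policy (empty means pass)."""
        reasons = []
        if contributor.accepted_prs_elsewhere < self.min_accepted_prs:
            reasons.append("too few accepted PRs in other projects")
        if contributor.social_account_created is None:
            reasons.append("no linked social account")
        else:
            age_days = (today - contributor.social_account_created).days
            if age_days < self.min_social_account_age_days:
                reasons.append("linked social account is too new")
        return reasons


# Illustrative values only.
policy = TrustPolicy(min_accepted_prs=3, min_social_account_age_days=365)
today = date(2026, 6, 1)
established = Contributor("established-dev", 12, date(2015, 1, 1))
newcomer = Contributor("new-account", 0, date(2026, 1, 1))

print(policy.failures(established, today))  # → []
print(policy.failures(newcomer, today))
# → ['too few accepted PRs in other projects', 'linked social account is too new']
```

As the conversation points out, any single fixed threshold can be gamed by patient attackers, which is why the sketch keeps the thresholds as per-project configuration and reports reasons instead of a bare yes or no.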
But I think more so what I'm seeing is, a lot of the fastest time to X tends to be because we're now inviting so many more people into software development on GitHub that the zeitgeist is just swarming. And it's...
Swyx [00:42:32]: It's not just developers anymore.
Kyle [00:42:33]: And it's not you and I. However you want to say what a developer is, it's not just folks who have been coding for a very long time. It's folks that have maybe started coding or only joined in since the AI era. And now...
Swyx [00:42:44]: What's the latest Octoverse number? I know eighty million was the last I remember as the number of developers on GitHub.
Kyle [00:42:50]: Oh, we're over 200 million now.
Swyx [00:42:53]: Okay. Well, so you see?
Kyle [00:42:55]: Like over 200 million developers now.
Swyx [00:42:56]: But it's not developers, right? It's people with a GitHub account.
What Counts as a Developer in the AI Era?
Kyle [00:43:00]: So this is the biggest debate that, I would say, everyone loves to have at GitHub at this point. From my perspective, right, I think that there's clearly a difference between a professional enterprise developer and then developers. But I think the idea that we should be, I don't know, splitting hairs or segmenting developers in the early era of software development is not worth the time. So...
Swyx [00:43:29]: When you get into gatekeeping...
Kyle [00:43:31]: 100%.
Swyx [00:43:31]: What is a developer?
Kyle [00:43:31]: 100%. ’Cause I wasn't a developer when I started writing code. I was going to...
Swyx [00:43:36]: Oh, no. I cloned a thing seven years before I learned to code. And then I wrote about my learning-to-code journey, and people just called me a fraud ’cause I had a GitHub account. And I'm, “Well, no, I just used GitHub, but I didn't know what I was doing.”
Kyle [00:43:49]: I remember that. I remember those sets of posts, and that's b******t.
So I fight very clearly on the line of: if you create code, if you have an idea and you create it in some way of, “I'm going to run it and use the app right now,” you may still use AI in that moment, but that's okay. At some point you're going to do the next thing. You're going to have to learn about this database. You're going to fix a bug, whatever. We're all on the same journey, and those people are also hearing about the great new agent skill package or a new CLI tool or a new whatever. And those projects are going up because you want to be a part of this moment, just like I wanted to be a part of the Ruby community when Ruby was popping off when I started becoming a developer, and now I can just click the star button. And so I think that yes, there's clearly some amount of spamming and gamification that we're working against, but I really think we're just seeing this whole new cohort of folks that are moving from technology to technology because they're not working on a 20-year-old software application. They're working on a side app that they built on the weekend for their friends or for their new idea or whatever. And that's how you see these enormous charts going up and to the right with stars.
Swyx [00:44:59]: I think something that's remarkable is the persistence that GitHub extends to those folks. Usually when I see platforms go into a new audience, they have to have a second platform with a different name that wraps the main platform. But somehow GitHub has been able to sort of persist and extend, and it's friendly and whatever. So it's nice.
Spark, Low-Code, and Always Showing the Code
Kyle [00:45:19]: That's partially why, I think, as we've tried to move into, I don't know, more low-code-y things... so we started working on Spark as a way to build an app and run it.
I think the reality is that anytime we put a veneer on top of something, we still always show you the code. That's kind of a tenet. We're never going to hide the code from you, ever, because what...
Swyx [00:45:52]: Why would you?
Kyle [00:45:52]: That's, yeah, that's the whole point. However, I think what we learned with things like Spark is that really the value of Spark for most devs is easy runtime. And you may have a runtime or a host that you're going to use for that, or you just build something and run it, but the package of making that even more simple isn't really needed for folks that are trying to build software and not just trying to build an app, which is a slightly different goal. So I want to get you in, I want to get you comfortable. I think the best thing for me, as someone that did not traditionally come into software dev way back, is I want anyone to be able to bridge that chasm and not be in the... I don't know, I feel like we're still in an era of STEM. I've got a 12-year-old and an eight-year-old, and it's “We've got to get ’em into STEM,” over and over. And I do the things that good parents do. I was, “Oh, you want to do coding?” “Yes, I want to do coding.” Do coding classes. But now they're just not afraid of doing software. And that's, I think, the thing that's honestly kept me at GitHub for so long. Anyone should be able to go and build a thing, just like I can go change a light switch in my house. I'm not going to go into the breaker box ’cause I'll probably kill myself. But I can go change that light switch. Everyone should be able to go and say, “This fricking app doesn't do what I want.
I want it to work like this.” And that, I think, is what's kind of kept us all connected with GitHub through the years, during the easiest of times or in the hard times, because of that opportunity of: we're the home for all developers, and we want everyone to be able to have that feeling that we've had of, “I had an idea, I created it, and holy s**t, here it is.”
Swyx [00:47:37]: Here it is. All right, I'm going to try to do more spicy questions.
GitHub's Hardest Scaling Moment: Growth, Agents, and Uptime
Kyle [00:47:42]: Great.
Swyx [00:47:42]: Is it an easy time now or a hard time?
Kyle [00:47:45]: Oh, at GitHub? It's a hard time. It's a hard time, and also, I was just with my team and I said, “This is also the best and most exciting time that I think I can remember at GitHub.” Because...
Swyx [00:47:57]: Best of times, worst of times. It's never one...
Kyle [00:47:59]: ’Cause we were talking about Octoverse reports, and usually we do an Octoverse report once a year, and we look at the numbers, and we say, “Oh my goodness.” I was at Universe in October saying, “This was the fastest year of growth that we've ever had,” right? And now we're doing more in a month than we did in a year last year.
Swyx [00:48:20]: You're talking about PRs.
Kyle [00:48:21]: Commits.
Swyx [00:48:21]: Commits, yeah.
Kyle [00:48:22]: PRs. Kind of, you name it. By roughly every measure that we're looking at, there's some amount of growth that is much bigger, and that is breaking our system in new ways, not old ways. Like, webhooks were always notoriously unreliable over the years.
Swyx [00:48:38]: Whose fault is that?
Kyle [00:48:39]: Not anymore mine, but for a period of time, I'm sure you could pull up a tweet that was “It was me. I'm sorry.” But now, that got rewritten at a scale level that is still working and is not having problems today.
Now what we're finding isn't the simple stuff that folks, sometimes on Twitter or on the internet, are “Hey, why is this like this?” about. Sure. There are absolutely silly problems that shouldn't exist. But now we're talking about unique, novel permission problems that happen only at scale, across all different objects or whatever, such that now we have to go rewrite this underlying system. And so there are problems that, yeah, caught us off guard, which I think I said. The growth is astronomical, but also we're making such material progress that I'm excited, once we've kind of reimagined the underlying foundation layer, or pieces of it at least, about what's going to be possible when it's not just all of us, and all the new people that are being developers, and all of their agents and all the tools, working together. Because that'll still happen in that GitHub tool, that GitHub community. But it's a hard day anytime we can't give you what you're looking for. We have the same problem internally. We operate through github.com. Of course, we have backups when things go down and whatnot for our own operations, but we feel it too. If it's not working, it's not working for us, and that's kind of the promise of dogfooding for GitHub. It's always been true. We're using the same tool you're using. We're not using a super secret version. And so we also need it to be great for us, for our customers, of course for open source. And now an exponential growth of agents doing it too.
Swyx [00:50:32]: I wanted to load this for audio listeners who maybe haven't seen your tweets, whatever. So one billion commits in twenty-five. Now it's two hundred and seventy-five million per week, on pace for fourteen billion this year if growth remains linear. Is that still the pace? I don't know.
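Editor's sketch: the "fourteen billion" figure follows from the numbers quoted in the conversation if the weekly rate is simply held flat for a year. The constants below are the figures as spoken (one billion commits in 2025, 275 million per week now); the 52-week year and the variable names are assumptions of this sketch.

```python
COMMITS_2025 = 1_000_000_000        # "one billion commits in twenty-five"
COMMITS_PER_WEEK_NOW = 275_000_000  # "two hundred and seventy-five million per week"
WEEKS_PER_YEAR = 52                 # assumption: weekly rate held flat for a full year

projected = COMMITS_PER_WEEK_NOW * WEEKS_PER_YEAR
growth = projected / COMMITS_2025

print(f"{projected:,} commits projected, about {growth:.1f}x last year")
# → 14,300,000,000 commits projected, about 14.3x last year
```

The same figures put four weeks of commits at 1.1 billion, which matches the earlier remark about doing more in a month than in all of last year.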
It's been a...
Kyle [00:50:48]: It's speeding...
Swyx [00:50:50]: Roughly.
Kyle [00:50:50]: It's still speeding up.
Swyx [00:50:51]: It's April, so yeah.
Kyle [00:50:51]: Exactly. This was in April.
Swyx [00:50:53]: All right. So basically you have fourteen x growth, right? Year on year. And I think that's a scaling issue. I'm going to try to really steelman this thing. People have experienced fourteen x growth. They haven't had your downtime. Can we go dig into that? Why? What broke? What are we doing to fix it? Just anything for the community to reassure them.
Why GitHub Reliability Is Breaking in New Ways
Kyle [00:51:18]: So, like I was saying, there's a couple of different places that we've seen the growth issues. Some of the growth issues, which is why I was talking about pushing hard on more CPUs, are in Actions in particular. More tools, more agents, more PRs mean more builds; more builds mean more CPUs. And so we are expanding through not just our data center, but obviously we were talking about moving to Azure and adding additional cloud compute, because we simply need more CPUs. Not as much GPUs. We definitely need GPUs too, but now CPUs are becoming a factor.
Swyx [00:51:53]: It's very CPU heavy.
Kyle [00:51:54]: Underneath the hood, when it comes to some of the underlying services, we've been breaking up our database infrastructure over the years, so that we have more cognitive separation between the various services. The place that we continue to have pain is in permissioning. And so right now many of our permissioning layers sit in a database that we internally call MySQL One, and old Hubbers will know what I'm talking about.
And so we've been pulling things out of MySQL One for many years, and we use Vitess and other technologies to shard, and we do it as one big...
Swyx [00:52:31]: Famous thing, PlanetScale was born from this and...
Kyle [00:52:32]: A hundred percent. Sam, old Hubber and friend. And so finding these opportunities to break this out and then do that globally. The other thing that I think is interesting, and both a unique opportunity and tricky, is we also run everything I just talked about in a black box container with GitHub Enterprise Server for people that work on-prem. So we take everything I just said, and we also do it on-prem, and we also do all of that in a data residency setup for customers that need to have their data in a single location. Each of these has unique characteristics around how we're storing that data in MySQL or in a permissioning setup. That's where some of these outages have occurred, where you're seeing it more across the board rather than just the one piece...
Swyx [00:53:17]: Filling the database.
Kyle [00:53:17]: Isn't quite working. Exactly. And so part of it is that. I think there's been some other places where agents are much more... or more projects appear to be moving towards monorepos, whereas we were going the other direction for many years in the industry. Repos were smaller, but there were more of them, and now we're seeing the opposite. Repos are bigger, and there's not fewer of them per se, ’cause there's new growth, but we're just seeing many more big repos. Big monorepos have always had a unique performance problem, because each one is slightly different, particularly if the underlying blobs inside the repos are incredibly big. And so we've done a ton of work that most people probably haven't experienced, unless you're in this case of the monorepo.
But that Git infrastructure layer improvement does help the overall system, because many of the improvements that make monorepos work better make all repo infrastructure work better. And so I could kind of keep going down the line, where it's another thing where we're changing how we do, I'll just say job queuing for lack of a better explanation, changing the underlying technologies there.
Swyx [00:54:32]: I spent two years being a job queuing guy, so.
Kyle [00:54:34]: And so it's kind of a little bit of piece by piece, and it's mostly because, as it was built, we built everything in a way that assumed, I guess in some ways, that the size of the pipe of work was going to remain the same. There's just going to be more people coming through each of those pipes. But instead now, in places where a git push was generally a certain size, for example, that's no longer true.
Swyx [00:55:03]: Oh, yeah.
Kyle [00:55:03]: Or...
Swyx [00:55:05]: I push a thousand...
Kyle [00:55:06]: On the average. 100%.
Swyx [00:55:06]: A thousand-line commits, like, daily.
Kyle [00:55:07]: Same thing with PRs. And we've talked about optimizing that and making changes, and there were technology choices that did not work there. And it got slow. It was not fast. It did not do what the users wanted. And so we've been reeling that all out and going, “Okay, that's just not right. Let's stop putting good money after bad and do it the right way now.” So it's a lot of things. When I've experienced scale at GitHub historically, it's almost always two options that we've used. We go vertical scaling, particularly with databases, right? And we go horizontal scaling. Oh, we just have more people using this service. Great. We're going to add more servers, and we rack them in our data center, or we use it in a cloud.
And now we're sort of in a diagonal, where vertical doesn't really work anymore. Horizontal doesn't work either, because we all have some CPU or GPU constraints in the world now, and now we have to go in and crack open services that have been running for 10 or 15 years and go, “Okay, the rules of this service have legitimately changed, and now we have to rewrite them.” None of this is an excuse. We have to do the work. We have to make it better.
Swyx [00:56:22]: Actually, as an infra guy, I'm, “This is one of the most fascinating scaling challenges I've ever seen.”
Kyle [00:56:26]: That's the thing that's hard. When we weren't talking about it publicly, and I came out, and I was, “Hey, I just want to explain what's going on.” Part of it comes from a very old GitHub ethos, which is, it's our uptime. It's down. I know you're a developer, so you're inclined to want to understand more of what's going on. But at the same time, us going, “Hey, this service didn't perform the way we expected, and now we have to go change it,” we're not trying to hide anything from you i
What does it take to modernize the systems that keep water flowing, wastewater moving, and nine million New Yorkers served every day?
In this episode, we sit down with Robert "Max" Maxfield, Chief Systems Architect at AITHERAS and the architect behind New York City's SCADA modernization efforts for the Bureau of Wastewater Treatment. Max takes us inside the world of critical infrastructure, where downtime isn't an inconvenience, it's a public risk. From managing decades-old industrial systems and balancing modernization against reliability, to defending essential services against cyber threats, Max shares what it really takes to operate technology that most people never think about until it fails.
We also explore the realities of AI in critical infrastructure, the cybersecurity challenges facing utilities, the surprising longevity of legacy systems, and how Max's passion for motorcycles, racing, and building machines shapes his approach to engineering. It's a conversation about technology, risk, resilience, and why sometimes the most important systems are the ones nobody notices.
Robert “Max” Maxfield is the Chief Systems Architect at AITHERAS, leading the SCADA Modernization Program for NYC's Bureau of Wastewater Treatment. In this role, Max designs and deploys the systems that keep critical water infrastructure operating for nine million New Yorkers. With 20+ years in industrial controls, 27 platform certifications, and prior architect roles on national operations centers and the Doyon Utilities Alaska modernization, Max specializes in the messy intersection of legacy industrial systems, modern SCADA, cybersecurity, and, increasingly, AI. He's been published in Forbes on industrial technology, runs his own GPU lab for local model fine-tuning, and spends his off-hours on custom motorcycles, off-road racing, and drag racing. 
Equal parts engineer, builder, and pragmatist, Max brings a field-tested perspective on what actually works when the stakes are critical infrastructure.
Scott Hanselman talks with Omar Shorbaji from the Anyscale engineering team about how Anyscale on Azure scales Python AI workloads from a single notebook to thousands of CPUs and GPUs. Built on Ray, the most widely adopted AI compute engine, Anyscale gives you a unified runtime to build, train, and serve, running directly on Azure Kubernetes Service without the complexity of managing Kubernetes. See a live demo that fine-tunes a vision-language-action robotics policy, with the metrics you need to push GPU utilization higher.
Chapters
00:00 - Introduction
00:52 - Ray and the Anyscale platform
03:11 - Start of demo: Workspaces
04:38 - Running a job and viewing utilization metrics
05:24 - Choosing the right scale
06:53 - Abstracting Kubernetes on AKS
08:53 - Wrap up and where to learn more
Recommended resources
Learn Docs
Anyscale on Azure
Connect
Scott Hanselman | Twitter/X: @SHanselman
Anyscale | Twitter/X: @anyscalecompute
Azure Friday | Twitter/X: @AzureFriday
Azure | Twitter/X: @Azure
The podcast opens with updates on the closure of the Strait of Hormuz, a German state-owned energy company contracting for Canadian West Coast LNG, and the Pope's theological document warning about AI. Next, Peter and Jackie introduce this week's guest, Marc Spieler, Senior Managing Director for the Global Energy Industry at NVIDIA, joining from Houston, Texas, to discuss the latest developments at the intersection of AI and energy. Energy and AI are deeply interlinked. Energy companies are using AI to improve efficiency across oil and gas, renewables, and emerging sources such as next-generation fission and fusion. At the same time, AI's explosive growth is driving significant new electricity demand, requiring a build-out of both generation and grid infrastructure. Predicting future power demand from AI remains uncertain; it depends on the pace of adoption and whether GPUs, along with other delivery components of the digital infrastructure stack, will become more efficient over time. Marc highlights that data centres are becoming more flexible, with the ability to reduce consumption during periods of grid stress. This would allow new data centre capacity to be added without straining the grid, while also lowering costs for all power consumers by improving system utilization during off-peak periods. 
Content referenced in this podcast:
NVIDIA Blog with examples of energy company AI applications: Efficiency at Scale: NVIDIA, Energy Leaders Accelerating Power‑Flexible AI Factories to Fortify the Grid (March 2026)
NVIDIA's NeMo Framework was used for asset integrity and reliability at Petrobras (March 2025)
NVIDIA's Earth-2 library of open models, libraries, and frameworks that democratize global access to professional-grade weather and climate AI
NVIDIA Vera Rubin DSX AI Factory reference design to maximize efficiency (March 2026)
NVIDIA and Emerald AI, along with other energy companies, pioneer flexible AI factories (March 2026)
Pope Leo XIV, Magnifica Humanitas: On Safeguarding the Human Person in the Time of Artificial Intelligence (May 25, 2026)
Please review our disclaimer at: https://www.arcenergyinstitute.com/disclaimer/
Check us out on social media: X (Twitter): @arcenergyinst, LinkedIn: @ARC Energy Research Institute
Subscribe to ARC Energy Ideas Podcast: Apple Podcasts, Amazon Music, Spotify
We're announcing AIEWF speakers this week! Take the AI Engineering Survey!
Today's guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months.
He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, realtime, long-horizon world models is to work on LLMs (perhaps Interaction Models as well…)
Put it this way: In the near term, the next Sora won't be a better video model, but a video agent.
Generative Media may more closely follow the evolution of AI coding, which went from focusing on one-shot output performance and cost to multiturn reasoning and planning models for agents and systems that can plan, edit, test, debug, and submit PRs.
At a certain point, coding models got so good that the only significant next step to improve performance was handling the orchestration of these models.
Now, as the performance of video models increases significantly across realism, consistency, & prompt adherence while becoming more cost efficient, the next evolution of video generation may also be systems that can plan, generate, edit, critique, and iterate across an entire creative task.
In this episode, Ethan joins swyx and Vibhu to unpack what it actually takes to build frontier image and video systems: data, VAEs, diffusion transformers, audio-video alignment, inference speedups, and the hidden cost of storing and moving massive video datasets. 
From building NVIDIA's Cosmos world model to joining xAI as Grok Imagine was being built from zero to one, Ethan He has been at the center of some of the most important work in video generation, multimodal models, and real-time world models.
We go deep on Grok Imagine, how a small xAI team shipped its first multimodal video model in three months, why iteration speed matters more than almost anything in model development, and why many of the biggest gains come from fixing tiny bugs in data and training pipelines.
Flipbook: The future of Videomaxxing
Video agents are almost a sure bet to be the trend in the coming year. We end with a glance at what's beyond video agents.
Flipbook caused a minor sensation this year when it was released, but most treat it as a fun demo. Ethan takes it very seriously: with the speed and cost of inference coming down every year, the future of custom video JIT UI is closer than you think. We talked about why videogen models may become the front end of AI, how generative UI could replace traditional HTML/CSS, why world models need to be real-time, interactive, and long-horizon, and why the future of video generation may depend more on language models and agents than on diffusion alone.
We discuss:
* Why fast iteration mattered more than meetings
* Why small training bugs can drive huge model quality gains
* Why coding models may make compute the bottleneck again
* How image and video models are trained with synthetic captions
* The role of VAEs and latent space in frontier video models
* Why image models are the foundation for video models
* The tradeoff between temporal compression and real-time interactivity
* Flipbook, Neural OS, and the future of generative UI
* Why future interfaces may go from user intent to pixels
* The hidden cost of training video models: storage, egress, and GPU hours
* How step distillation and consistency models (like OpenAI sCM) make video inference orders of magnitude faster
* Grok Imagine 0.9 and large-scale audio-video generation
* Why audio-video alignment is harder than text-video alignment
* Ethan's definition of world models
* Reference-to-video, video extension, and long-context video generation
* Why xAI's research communication undersells Grok Imagine
* How xAI culture shaped the speed of development
* AI watermarking, SynthID, and detecting generated media
* Why prompt rewriting matters for video models
* Grok Imagine Agent and the rise of video agents
* Why language models may unlock better video generation
* Robotics, physical AI, and embodied world models
* Why Ethan left xAI and shifted focus toward LLMs
* Self-managed context, memory, and the next frontier for language models
Ethan He
* LinkedIn: https://www.linkedin.com/in/ethanhe42
* X: https://x.com/EthanHe_42
Timestamps
00:00:00 Introduction
00:01:25 From NVIDIA Cosmos to xAI
00:03:24 Building Grok Imagine from Zero to One
00:10:07 How Image and Video Models Are Trained
00:18:53 Video Compression, VAEs, and Real-Time Tradeoffs
00:22:10 Generative UI, Flipbook, and Neural OS
00:32:10 The Cost of Training Large Video Models
00:37:04 Distillation, GANs, and Fast Video Inference
00:41:21 Audio-Video Generation and Grok Imagine 0.9
00:48:34 What Makes a World Model?
00:55:51 Reference Videos, Long Context, and Video Memory
01:00:11 xAI Culture, Research, and First-Principles Building
01:09:45 AI Safety, Watermarking, and Prompt Rewriting
01:13:10 Video Agents and AI-Assisted Creation
01:27:32 Why Language Models Unlock Better Video
01:31:15 Robotics, Physical AI, and Embodied World Models
01:32:38 Why Ethan Left xAI
01:34:16 Self-Managed Context and the Future of LLMs
01:38:43 Ethan's Career Path and Closing Thoughts
Transcript
Introduction: Ethan He, Latent Space, and the Path to xAI
Swyx [00:00:00]: We're here in the studio with Ethan He, most recently of xAI. Welcome.
Ethan [00:00:10]: Thank you. Glad to be here.
Swyx [00:00:11]: We're also here with Vibhu. 
You were first coming to us, or joining the Latent Space world, because you were working on Cosmos at NVIDIA, and you did a paper. We loved it. You presented it as well, so thank you for doing that.
Ethan [00:00:23]: I've actually also presented on MoEs twice at Latent Space.
Swyx [00:00:29]: How did you actually hear about us? Did we reach out to you? Is that how it worked?
Ethan [00:00:33]: No, actually, the community. I realized, oh, there is this online community where people talk about AI and also learn from each other through papers every week through the Paper Club. It's very nice.
Ethan [00:00:49]: I learned a lot.
Swyx [00:00:49]: I think three years nonstop. We haven't stopped even on Christmas and New Year's. Many weeks I want to stop, but it keeps going.
Vibhu [00:00:58]: No, that was good. I think you had posted that you worked on a paper, and I was like, “Oh, very cool. We have Paper Club. Present then.”
Vibhu [00:01:04]: But I might have reached out to you after.
Swyx [00:01:05]: Because it's an amateur club, right?
Swyx [00:01:08]: So it's very unusual, but sometimes we have paper authors come by and actually explain the paper. Today we just did the Poolside paper, which was apparently very good.
Vibhu [00:01:18]: Came out yesterday.
Vibhu [00:01:19]: Pretty interesting, right? Fully open. They talk about everything, systems. So it's a good one. We'll recommend people read it.
Swyx [00:01:25]: Bring us up to speed on your transition to xAI, ‘cause I actually don't even know when you joined. Just tell the story of the transition.
From NVIDIA Cosmos to xAI: Scaling Video and World Models
Ethan [00:01:34]: Before xAI, I was working on the Cosmos world model at NVIDIA. Cosmos is a giant video foundation model that aims to simulate the world, and it serves as a foundation for roboticists to build on top of. 
There, once I built Cosmos 1, I realized that this thing also has a scaling law similar to language models, so we need to scale up the video models further. That's why I realized I needed to move somewhere with much more compute resources. That's how I
Swyx [00:02:13]: Than NVIDIA?
Vibhu [00:02:14]: The GPU rich came themselves.
Vibhu [00:02:19]: And timeline-wise, when was Cosmos? It was pretty early, right? It was an open world model, open paper, everything.
Ethan [00:02:25]: It was the end of 2024.
Vibhu [00:02:28]: End of 2024.
Ethan [00:02:30]: Then in mid-2025, I moved to xAI. I joined about the time when xAI was about to build video models and multimodal models. There was no infra, no data, and no model, and as a few engineers, we built it in three months and released the first model, Grok Imagine 0.9.
Ethan [00:02:55]: And since then, I kept working on video models and moved more from training to post-training of the video models. For example, reference-to-video, kind of like the cameo feature, and video extension. And before I left, I worked on a world model, leading a small team focused on real-time, long-horizon video generation.
Building Grok Imagine From Scratch in Three Months
Swyx [00:03:24]: Can you give a rough roadmap of, okay, you're on a brand-new team. Grok previously was only text, or they partnered with BFL for their image gen stuff. What are the building blocks, right? You have compute, data you can procure somewhere. What is the sequence of things that people should think about when you're setting up a new team?
Vibhu [00:03:43]: Actually, even deeper, not just data you can procure. You guys had to go through getting the data too, right? 
So you shipped it pretty fast, but yeah
Swyx [00:03:51]: Three months is like
Vibhu [00:03:52]: From everything
Swyx [00:03:52]: actually very surprisingly fast.
Ethan [00:03:55]: One thing I'd say is thanks to my experience at NVIDIA, ‘cause the first time, when we were building Cosmos together, we built it for about a year. So this is the second time I've done it. I roughly have an idea of what to do. I'd say the most important thing is the talent. Everyone was very strong and clever, and worked very closely with each other toward a common goal. That speeds things up a lot. You reduce the communication bandwidth among people, and everyone can work toward the same goal. Every day there aren't that many meetings on the calendar, maybe a sync a day, and after that it's just all building. It was pretty fun at that time.
Ethan [00:04:47]: And another thing is that xAI has very strong foundations of data inference, model inference, and the support there can help model development a lot. When I look at training models, I think the most important thing is how many iterations you can do per day. The more iterations you can do, the faster you can train the model. So if you have very strong infra and a lot of compute, you can train these models in a very short period of time. That gives you a much larger buffer for errors, and it also gives you the opportunity to spot more bugs.
Iteration Speed, Compute, and Debugging Model Pipelines
Swyx [00:05:46]: What is an iteration? Is it like a few hundred steps or what are you
Ethan [00:05:50]: Let's say just training the model: from acquiring new data, maybe designing new algorithms, to training a new model, maybe at smaller scale or
Swyx [00:06:01]: So cycle time for any hyperparam that you're searching.
Ethan [00:06:04]: Cycle time, and then to eval this model. 
Is this model better than my previous iteration?
Ethan [00:06:11]: So
Swyx [00:06:11]: So before you, someone had already set this up so that you can iterate very quickly.
Ethan [00:06:15]: I think the foundation there is extremely good for developing and researching models.
Ethan [00:06:23]: And often what I find, this is kind of boring, but a lot of the improvements do not come from new algorithms. They come from finding small bugs here and there in the data pipeline, in the model training pipeline. Those give the biggest boost to model quality.
Vibhu [00:06:46]: It's interesting, right? So you say it's a small team, less communication bandwidth, but also a lot of quality is finding little bugs. It seems counterintuitive, right? If you have a lot of people, you can iron out more of those, but it's interesting to see the other side, right?
Swyx [00:07:00]: I also wonder, do you try using LLMs to look for bugs? I don't know.
Ethan [00:07:05]: I remember at that time it was mid-2025, so the coding models weren't quite there yet. I remember by December 2025 they were extremely good. Yeah, I was using them at that time. It's helpful. Sometimes it produces code that is kind of difficult to maintain, even though the first time it builds something extremely fast. It gave me spaghetti code, thousands of lines that I couldn't maintain, and the LLM itself couldn't figure out what was wrong and how to improve on top of it. But now I find it much better. I want to bring up another point here: now coding models are much more efficient and can help us implement stuff much faster. Compute might become a bottleneck again, because previously, if you wanted to train a new model, say you wanted to generate new synthetic data or write a new algorithm, it might take a few weeks. 
And during that period of time, you might not have experiments to run. But now you can build that thing within a few hours, and then you can immediately train a model.
Ethan [00:08:24]: Now you have to have enough compute to try all of the ideas. So compute might be the bottleneck on iteration speed again.
Swyx [00:08:36]: Yeah, honestly, I think it's kind of a stressful job, because you're like, “Well, I should be trying everything, and if I'm not, then I'm not doing my job well.”
Vibhu [00:08:48]: There's also the stress that you're eating thousands of GPUs per hour, which is very expensive, and that compute could go to other researchers.
Swyx [00:08:56]: You got the daddy Elon to
Vibhu [00:08:57]: You got daddy Elon.
Ethan [00:08:59]: It was
Vibhu [00:09:00]: But there's still a finite amount of compute. You want to use it, you want to use it well, you want more of it.
Ethan [00:09:06]: That was quite stressful indeed. I think one thing is, with coding models now, a lot of these jobs can be automated, which is much better. Second, it's a marathon, so you've got to maintain good health and a regular schedule.
Vibhu [00:09:28]: It's hard to hear that when you ship from zero to one in two months.
Swyx [00:09:32]: And I think obviously the culture at xAI is very famously that people work very hard. One thing I did want to dive into: in the notes that you sent ahead of time, you had specific comments about the cost of video gen training. Presumably this is on Colossus 1, right? The two hundred megawatt cluster. Whatever you want to share on that.
Vibhu [00:09:54]: I think there are three things we're talking about, right? There's video gen, there's also the image gen model that you put out. Do you want to complete the, okay, so zero to one, you have a few months. 
Just what are the stages of creating an image gen model?
Swyx [00:10:06]: Oh, yeah, maybe I got distracted.
How Image and Video Models Are Trained: Synthetic Captions, Tokenizers, and VAEs
Vibhu [00:10:07]: Sorry. And then from there there's video gen, there's audio gen. Would love to get into those next. But what are those first few months like? Small team, a lot of bugs, iterations, but what does it look like? Do we take something off the shelf? Do we just get data and compute? What are the few months like? How do you get to a state-of-the-art image gen model? How do you just start?
Ethan [00:10:28]: I cannot comment specifically on how xAI did it, but it's a quite standard process. I can draw some examples from Cosmos. Mainly, to build a video model, you actually need to build an image model first. And to build these two models, the data you need is one hundred percent synthetic pairs of language and image, or language and video. Because on the internet, videos don't naturally come associated with text. You could say, oh, on YouTube you have the title and you have the description and the comments
Swyx [00:11:11]: Title
Ethan [00:11:11]: of a video, but usually they're not relevant to the video itself. Say the video is a natural scene of mountains or something, and the title is, “I'm so happy today.”
Ethan [00:11:26]: So they have no correlation at all. So the first step is, you have to generate synthetic pairs of language with the videos. You gather videos from the internet, and you use a VLM to caption the videos. For that part, here's a question: how do you get a VLM to begin with? So if there's no
Swyx [00:11:55]: So you use the model, right? Like
Ethan [00:11:57]: Say if no VLM exists, how do you generate the text in the beginning, right? 
It's impossible.
Swyx [00:12:04]: I see.
Ethan [00:12:05]: In the beginning, you ask humans to describe the video in as much detail as possible. For example, you ask them to describe everything: all objects, all characters, and all interactions and dialogue in the videos. That's in the protocol of Cosmos labeling. The objective we gave the labelers was that you have to describe the video in as much detail as possible, such that a blind person who hears the blob of text can reconstruct what the video is like in their head.
Swyx [00:12:43]: Video or image? You're talking about images.
Ethan [00:12:44]: Video or image, either one of them.
Vibhu [00:12:47]: This was pretty common when we went from CLIP and DALL-E, right?
Vibhu [00:12:51]: It's all training on really detailed captioning of images. So the same is applied to video, but instead
Ethan [00:12:57]: Same applied
Vibhu [00:12:57]: of using a multimodal model to pass in video or images and write rich descriptions, you can also
Swyx [00:13:04]: I think there's this traditional perspective of supervised, or very highly human-curated, data. I feel like there's an unlock with unsupervised, right? Where you have enough to bootstrap that you can just throw common corpus on it or whatever. Like unsupervised vision and language pairing, right? Where you just have interspersed image and text and it just learns. To me, that is the VLM breakthrough that is different from CLIP, different from the LM era.
Ethan [00:13:36]: It's interesting to see that you kind of need both kinds of data.
Ethan [00:13:41]: For example, for the
Swyx [00:13:41]: You need it to bootstrap it up. Yeah
Ethan [00:13:43]: for generative model training, there's also usually a small percentage of unlabeled data. So the model is instructed to generate a video without any text instruction. That can also help the model generalize. 
So after this stage of generating synthetic pairs, one important common step is to train a compressor, or tokenizer, for the images or videos. Theoretically, you can train image or video models on pure pixels, but the problem is that it's a lot of tokens. One image, a thousand by a thousand, is one million pixels, so one million tokens. It's impossible to train a transformer on that. So you need to train a tokenizer, which can go from image to latent space and from latent space back to image.
Swyx [00:14:45]: That's why we named the podcast.
Swyx [00:14:48]: But basically, you're talking about vocabulary size.
Ethan [00:14:50]: So vocab.
Swyx [00:14:51]: And so, what is, like a million is impossible?
Ethan [00:14:54]: In generative models, the vocab is continuous. It's a continuous space. You can think of it as mapping an image to a vector. It's a fixed-length vector, sixteen or forty-eight, something like that. And then you map that vector back to the image space. And the mapping is patch-based. So say you have
Ethan [00:15:22]: a sixteen-by-sixteen patch, and you map that patch of pixels into this latent space.
Swyx [00:15:29]: We've covered this
Vibhu [00:15:30]: This is like the vision transformers
Swyx [00:15:32]: VAEs,
Ethan [00:15:33]: VAEs.
Vibhu [00:15:34]: You basically compress your input, you do your generation, you're doing all that generation in a smaller dimension, and then you project back out.
Swyx [00:15:43]: A VAE is a form of compression, but for me, the patching thing is from ViT, right?
Ethan [00:15:48]: You can make those.
Swyx [00:15:49]: Literally, yeah, the paper is titled something like “sixteen by sixteen is all you need.” Something like that. 
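To make the token arithmetic in this exchange concrete, here is a small sketch. It assumes a 1024 by 1024 RGB image (rounded up from Ethan's thousand-by-thousand example so it divides evenly), the sixteen-by-sixteen patch and a sixteen-value latent vector as mentioned above, and a hypothetical helper named `patch_token_count`. It is only an illustration of the counting, not the actual Cosmos or Grok Imagine tokenizer.

```python
def patch_token_count(height: int, width: int, patch: int) -> int:
    """Count the tokens produced when an image is cut into patch x patch tiles."""
    if height % patch or width % patch:
        raise ValueError("image dimensions must be divisible by the patch size")
    return (height // patch) * (width // patch)


# Assumed sizes, for illustration only.
HEIGHT, WIDTH, CHANNELS = 1024, 1024, 3
PATCH = 16        # the sixteen-by-sixteen patch from the conversation
LATENT_DIM = 16   # one of the latent vector lengths Ethan mentions

pixel_tokens = HEIGHT * WIDTH                          # one token per pixel
latent_tokens = patch_token_count(HEIGHT, WIDTH, PATCH)

raw_values = HEIGHT * WIDTH * CHANNELS                 # numbers in the raw image
latent_values = latent_tokens * LATENT_DIM             # numbers in the latent grid

print(f"pixel tokens:  {pixel_tokens:,}")
print(f"latent tokens: {latent_tokens:,}")
print(f"sequence is {pixel_tokens // latent_tokens}x shorter")
print(f"raw values compressed {raw_values // latent_values}x")
```

Under these assumptions the transformer sees a few thousand tokens per image instead of roughly a million, which is the point of training the tokenizer first.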
And then I think also, people make a lot of comparisons between this kind of patching and convolutions.
Swyx [00:16:02]: Which is, you're kind of reconstructing the old paradigm with the new.
Ethan [00:16:05]: Actually, in VAEs, there are both convolutional networks and transformers. You can actually do both.
Ethan [00:16:14]: After this VAE, what you've got is latent space tokens and language tokens. So now the training of the diffusion transformer, usually generative models use diffusion transformers, is actually quite standard. It's very similar to how you train language transformer models. There's not that much difference. It's just visual tokens in, visual tokens out. The only difference is there's a denoising process. So you train the model to remove noise. You add random noise to the visual tokens, and then you train the model to remove that noise to generate the clean tokens. At inference, the model can iteratively remove noise starting from one hundred percent noise.
Swyx [00:17:12]: And then, to speed things along on the tech tree of diffusion, there's CFG, and then there's also latent diffusion, that's somewhere in there. I think somewhere along the line, obviously, Stability and all these other guys pioneered a lot of this architecture. I don't know if you want to get into that or just do the video side, up to you.
Bootstrapping Video from Image Models and Temporal Compression
Ethan [00:17:37]: After you train such an image model, the reason it's a foundation for video models is that image models are cheaper to train, and they have a much denser connection between language and text. Sorry, language and images. For example, you train on a billion images, and there's a mapping from the text to the image. 
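The noising step Ethan describes a moment earlier can be sketched in a few lines. This assumes a simple linear blend between clean latent values and Gaussian noise, which is one common choice and not necessarily what Cosmos or Grok Imagine use. `add_noise` and `make_training_example` are hypothetical names, and the four-value list stands in for real visual tokens.

```python
import random


def add_noise(clean, noise, t):
    """Blend clean values with noise: t=0.0 returns clean, t=1.0 returns pure noise."""
    return [(1.0 - t) * c + t * n for c, n in zip(clean, noise)]


def make_training_example(clean, rng):
    """Build one (input, target) pair for a denoising model.

    The model would receive the noisy values plus the noise level t and be
    trained to predict the noise that was added (an assumed training target).
    """
    t = rng.random()                              # random noise level in [0, 1)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]  # Gaussian noise, same length
    noisy = add_noise(clean, noise, t)
    return (noisy, t), noise


rng = random.Random(0)
clean_latent = [0.5, -1.0, 0.25, 2.0]             # stand-in for visual tokens
(noisy, t), target = make_training_example(clean_latent, rng)
print(f"noise level t = {t:.2f}")
print("noisy input:", [round(v, 3) for v in noisy])
```

At inference the process runs the other way: start at t = 1.0 (pure noise) and repeatedly apply the trained model to step toward t = 0.0.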
And the cost to train the same, a billion texts to a billion videos, is much more expensive, because videos naturally have more tokens than images. The diffusion models' understanding of language comes purely from this mapping. So if you don't have enough mapping, if you only train on ten million videos or something, you might not see enough language tokens in your training, so your model does not understand human intention well enough. That's why you first train these image diffusion models, and then you bootstrap the video model from there.
Swyx [00:18:53]: One thing I did want to ask, because actually, I think you're the first video model person I've ever talked to, I think. We've talked to Luma and all those folks. There are all these tricks in video compression where basically, frame by frame, there's not that much difference, so you don't have to regenerate or save the whole frame, right? I think MP4 compression or something like that.
Swyx [00:19:16]: Is it tempting to use that? As far as I can tell, everyone just treats it as, “No, we'll just generate every frame.” Is that roughly the state of the art?
Ethan [00:19:27]: There are a few different approaches. Let's say first, you want to directly use MP4 compression and use that as the tokens for the transformers to train on, right? People have actually tried that, but the main challenge is that the latent space for the MP4 tokens is not very comprehensible to the models. It's extremely hard to train on that. And there's a
Ethan [00:20:01]: So that's why they created VAEs, which create a more continuous latent space, so the models can understand that latent space and learn from it much more easily. Even within VAEs, there are different difficulties of the latent space. 
The simplest, most naive VAE you can imagine is: you have an image, and you just shuffle the whole image into a vector. So you don't need to train any VAE, right? But that latent space is extremely hard for models to train on top of. That's why there is some debate on how you compress the tokens. You mentioned you can compress frame by frame. You can also compress the temporal dimension.
Ethan [00:20:52]: The difference is, if you compress the temporal dimension, you get a much higher compression rate, because there's temporal redundancy between frames: this frame and the last frame are likely mostly similar, so there's only some small difference. For example, I think in the Wan 2.1 VAE, they have an eight-by-eight-by-four compression rate. So four temporal tokens are compressed into one token. That can save a lot of context length. If you do it frame by frame, you have to do maybe eight by eight by one. Your context length will be four times larger. That being said, the benefit of per-frame compression, and we might come back to this later, is real-timeness and interactivity. ‘Cause if you stream the output of the model frame by frame, the model can respond to any user request immediately. If you have four-times temporal compression, then
Swyx [00:22:06]: It might be laggy
Ethan [00:22:07]: there's a lag there by nature.
Swyx [00:22:10]: So you're very pilled on this. Let's just go ahead and bring it up, ‘cause we have the visual prepared anyway. There are some frontier applications of real-time video gen. Flipbook is one of the examples that went viral recently, right? What is Flipbook?
Real-Time Generative UI: Flipbook, Neural OS, and Diffusion Front Ends
Ethan [00:22:23]: Flipbook is kind of like a web browser. You can see it has the web browser UI on top. 
The difference is that all of the UI is generated by a generative image model in real time, and everything here is fake. But you can explore inside this imaginary world. Say, here we have engineering the Great Pyramid. The model generates this for us to understand how it works, and if we want to navigate around and understand further, we can click on some of the descriptions here, and the model will generate a new page, a new subpage describing the details we want to know about.
Swyx [00:23:14]: So it's basically kind of like we're playing a video, but it's pausing for our next interaction, and then it just plays the next thing based on our interaction.
Swyx [00:23:23]: Which is kind of cool.
Vibhu [00:23:25]: And you kind of decide your story. So this was, how do you make a pyramid? The levering technique seemed interesting, right? It shows how do you take. Okay, I want to know what is this
Swyx [00:23:35]: The demo tweet had more animation between frames.
Vibhu [00:23:38]: I think it's just skipping,
Swyx [00:23:39]: Oh, it's just skipping a lot of frames.
Ethan [00:23:40]: They also have a video mode
Vibhu [00:23:42]: It takes a lot. There's a lot of people
Ethan [00:23:42]: but a lot of people are using it.
Ethan [00:23:45]: So it's not available.
Vibhu [00:23:46]: There's a live video stream. We can try,
Swyx [00:23:50]: So this is an example of the kind of future that you see at the extreme. We're obviously not in it today.
Swyx [00:23:56]: But in a world where inference is completely free, this is better than generating code and text?
Ethan [00:24:02]: So this is the final state of where video will be for world models, I think. Imagine the internet doesn't exist, and then you type in google.com. What should a model show you? The model can imagine something, and this is what the model imagines. And these web pages, they completely do not exist. 
So I think as inference costs come down, we are going to have generative UI for everything. If you think about how a coding model works: it writes code for a web page, the code might be converted into binary, and the binary renders the pixels on the screen. In machine learning, every time we have some breakthrough, it's because something became more end-to-end. So why don't we go from user instruction to the pixels directly? Generative UI will be user intention to pixels directly. Say I want email. Let's say everyone has the same interface, but I want it slightly different. I want the email shown to me like TikTok, so I can swipe left and right through the emails. Or maybe you want something else. We can have completely different things. Or I'm looking at Instagram stories, and I don't like the Like button. I always misclick it. Generative UI resolves that. So it's going to be a revolutionary replacement of the interface. So in the future, we might have much more powerful

Ethan [00:25:50]: LLMs and coding models running behind the scenes. And in the front end, the diffusion model will actually be the front end that shows stuff to you. That's how I imagine it.

Swyx [00:26:02]: Diffusion front end, deterministic back end.

Swyx [00:26:04]: Something like that. I find that very expensive, but

Vibhu [00:26:08]: I find it interesting you called LLMs writing code on the back end deterministic, but okay.

Swyx [00:26:14]: You write it once

Vibhu [00:26:15]: Compared to

Swyx [00:26:16]: And then you execute.

Ethan [00:26:17]: If you think about the cost, let's say an H100 costs $1 per hour, and you use this eight hours a day for thirty days. So every month you're paying $240. You'll actually not want to pay for that. That's even more expensive than Claude Code Max.
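The monthly figure quoted here is simple arithmetic. A minimal sketch, assuming only the numbers from the conversation ($1 per H100-hour, eight hours a day, thirty days); the function name is illustrative, not anything from the episode:

```python
def monthly_gpu_cost(price_per_hour: float, hours_per_day: float, days: int) -> float:
    """Cost of keeping one GPU busy for a month of interactive use."""
    return price_per_hour * hours_per_day * days


# Figures quoted in the conversation: $1/hour, 8 hours/day, 30 days.
cost = monthly_gpu_cost(1.0, 8, 30)
print(f"${cost:.0f} per month")  # $240 per month
```

The same function makes it easy to see how sensitive the number is to the hourly price, which is the quantity the speakers expect to fall.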
But if you think about compute costs coming down about two times every year, I think that future will likely arrive within a few years.

Vibhu [00:26:49]: It's everything, right? Compute cost comes down, compute gets faster, model gets smarter

Ethan [00:26:54]: More efficient

Vibhu [00:26:54]: model gets smaller.

Swyx [00:26:55]: I don't know why you say two times, 'cause I think it's like 100 times. In language models, it is roughly one hundred to a thousand times every twelve to eighteen months, for the same given level of LMSYS Elo.

Vibhu [00:27:08]: That's a net of everything, right? That's model performance alongside compute. So it's different than just compute costs coming down. But a very interesting future.

Swyx [00:27:19]: So the web designers will have to shout out that accessibility is an issue, right? How do you deal with screen readers or whatever? But yes, this is higher-bandwidth storytelling than anything you can possibly generate with code, right? So I think that's the rough idea.

Ethan [00:27:34]: And I'd like to add a little bit: humans naturally have the maximum input bandwidth when we are looking at things, looking at videos, and we have the maximum output bandwidth when we are talking. So in the future, it might be something like we talk to AI models, and the AI model responds with a generative UI. That would be the maximum input and output bandwidth for interacting with AI models before Neuralink happens.

Vibhu [00:28:06]: And it's also very custom, right? Some people are very visual, some people are not as visual, right? They prefer the text. But the best thing about generative UI, right, is that it can also be text.

Swyx [00:28:17]: There's another project that we wanted to highlight, which is Neural OS. Kind of a similar idea, but here you're literally simulating an operating system with a video model.

Swyx [00:28:27]: And you can play Doom, you can do Firefox.
I find this mildly less impressive, obviously, because it's an OS that I can run.

Swyx [00:28:37]: But here everything is imagined.

Vibhu [00:28:40]: I was used to Command+W to close the Firefox tab. It didn't crash. That's why I said

Swyx [00:28:45]: It's too immersive.

Vibhu [00:28:46]: It's too immersive for me.

Swyx [00:28:47]: Too immersive.

Vibhu [00:28:48]: I wanted to close the tab.

Vibhu [00:28:49]: But yes, I can play generated diffusion.

Swyx [00:28:51]: This is shockingly fast.

Swyx [00:28:54]: Because I remember there was a demo maybe one to two years ago. Someone tried to do a first-person shooter with an image model. There was no consistency. It was very slow. But here it looks like, realistically, this is Doom.

Vibhu [00:29:07]: I think there are two sides to that, right? There's, okay, what is running a game? The heavy part of it is actually the game engine, all the lighting, all that stuff, the graphics. This is just kind of video, right? Like we've solved consistency. This still looks like image generation from a few years ago. There's some temporal consistency, but it's kind of just images stitched together as frames of video. But it's a good visual representation to picture the future you want to see, right? That's what I see in these, more so.

Ethan [00:29:38]: This reminds me of how video models get better and better. If you just look at Neural OS, it feels like a crappy version of the Windows we could have, right? But the difference is that this model is overfitted on existing operating systems. It can generate nothing different from that. But it's actually also similar to video models. When we train these video models and image models, we train them on the internet. There's no imaginary supernatural stuff on the internet. But once we train the model, you can prompt it to generate something supernatural that has never existed in the data set.
So if you train your Neural OS or neural computer on standard screen recordings from the entire internet, the model can imagine completely new interfaces for interacting with the computer.

Swyx [00:30:43]: This is one of those things that is magical to me. Usually generalizing out of distribution is bad, but somehow we have learned some kind of internal world model where you say, “this, but it looks like rainbows and butterflies,” and it'll do it, and it will kind of make sense.

Swyx [00:31:03]: So yeah, that's kind of cool. I don't know if there's any more comment there. I did want to touch a little bit more on the model architecture stuff, which I think you were getting at. It's really fascinating. We don't get a chance to talk about this enough. So one of the papers that we covered: we've covered every annual Segment Anything release, and I don't know if you follow... you're a computer vision guy, so you

Ethan [00:31:26]: I know.

Swyx [00:31:27]: So they did memory attention, which is kind of interesting. And I always think anything where you can keep some consistency across the temporal dimension is very fascinating. Basically, the CV side bleeding into the video gen side, I think, is underexplored, right? We talk about it for labeling, but actually you can borrow the architecture itself.

Ethan [00:31:50]: There are also completely different approaches, right? You brought up the term world model, so we went from video model to world model. There is diffusion, but there are also other approaches that people are taking. So maybe we get into those after as well?

Swyx [00:32:03]: He has a whole definition of world models and stuff. I feel like we threw a lot at you.
Whatever you want to comment on.

Why Video Models Are Expensive: Storage, I/O, and Training Scale

Ethan [00:32:10]: I think one thing that we should actually come back to is, okay, so we were talking about the steps to go from training image gen to a video model. One thing we don't see as much of is, okay, you brought up the delta in training data, right? So

Ethan [00:32:24]: you won't have as much, and a video model might not generalize, but what is the cost of training a large video model? We know it roughly for LLMs. Okay, even like the Poolside thing that came out today, right? It's a Gemma-level model trained on roughly forty trillion tokens on this many H200s over this much time, right? You can see what the exact cost of that is: how many GPU hours, times how much an H200 costs. So how do we do the back-of-envelope math for the same thing with video models and image models? How do you kind of break that down? I can share some back-of-envelope calculation. So surprisingly, the cost of video models is comparable to language models. Obviously the largest scale is language models, so maybe comparable to a medium-scale language model. Just storing the videos alone costs a lot. You can maybe look it up on AWS or something.

Ethan [00:33:20]: Say you have a billion videos, and let's just say each video is five megabytes; then you need five petabytes just to store those videos. And also, remember we talked about using a VAE to compress the videos. Typically you also need to store those continuous features in your storage, and that's a comparable size to the videos themselves. So just storing these videos and the features is tens of petabytes alone. And

Swyx [00:33:58]: I just looked up the calculation.
Five petabytes on S3 Standard is one hundred K per month.

Ethan [00:34:05]: And

Swyx [00:34:05]: It's comparable

Ethan [00:34:05]: and you need

Swyx [00:34:06]: And

Ethan [00:34:06]: And then tens of petabytes, two hundred K. And even more expensive is the ingress and egress.

Swyx [00:34:13]: Oh, yeah.

Ethan [00:34:14]: Through the internet. Just to download those videos, I believe it's more expensive on AWS than storing them.

Swyx [00:34:25]: Storing, yeah.

Ethan [00:34:25]: And for each training run, you probably need to pull them once. If you train multiple times, it's even more than that. So just the storage and the network, those costs would be a few million per month just for storing everything, not to mention the GPU cost.

Ethan [00:34:45]: And

Swyx [00:34:45]: My side tangent: compute rental, like GPU rental, is very efficient. On one side, okay, you can be xAI and build your own data center. Should we not just build our own storage compute as well? Like

Ethan [00:34:57]: Of course

Swyx [00:34:57]: cloud cost compared to just

Ethan [00:34:59]: You save so much

Swyx [00:35:00]: storage. Yeah, exactly.

Swyx [00:35:01]: Especially with egress and stuff. So.

Ethan [00:35:04]: That's a good idea, but it also comes with some challenges of its own.

Swyx [00:35:09]: Of course, of course.

Ethan [00:35:10]: People who build GPU data centers might not expect this much storage. And people who build storage typically just build it somewhere with only CPUs.

Swyx [00:35:23]: I just looked it up. AWS only charges for egress, not ingress. Tier five for five petabytes is two hundred and thirty K.

Ethan [00:35:32]: Even more expensive than the storage.

Swyx [00:35:34]: But storing is per month, right? You check in, then you cannot check out. So it's so cool. It's okay.
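The storage arithmetic in this exchange can be reproduced in a few lines. This is a rough sketch using only the figures quoted in the conversation (a billion videos at five megabytes each, about $100K per month to store five petabytes, about $230K to egress five petabytes); the per-gigabyte rates below are derived from those quoted totals rather than taken from any price list, and decimal units (1 PB = 10^6 GB) are an assumption:

```python
GB_PER_PB = 1_000_000  # decimal units, assumed: 1 PB = 10^6 GB
MB_PER_PB = 1_000_000_000


def dataset_size_pb(num_videos: int, mb_per_video: float) -> float:
    """Raw dataset size in petabytes."""
    return num_videos * mb_per_video / MB_PER_PB


raw_pb = dataset_size_pb(1_000_000_000, 5)  # one billion videos at 5 MB each
total_pb = raw_pb * 2  # VAE features described as comparable in size to the raw videos

# Rates implied by the totals quoted in the conversation (not official pricing).
implied_storage_usd_per_gb_month = 100_000 / (5 * GB_PER_PB)
implied_egress_usd_per_gb = 230_000 / (5 * GB_PER_PB)

print(f"raw videos: {raw_pb:.0f} PB, with features: {total_pb:.0f} PB")
print(f"implied storage rate: ${implied_storage_usd_per_gb_month:.3f} per GB-month")
print(f"implied egress rate: ${implied_egress_usd_per_gb:.3f} per GB")
```

On these numbers a single full pull of the dataset costs more than two months of storing it, which is the point Ethan makes about egress.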
So there's that side.

Ethan [00:35:41]: So the TLDR, my back-of-envelope math

Swyx [00:35:42]: Data is larger than you think. Yes.

Ethan [00:35:44]: my back-of-envelope math of GPU hours times GPU cost is also missing a lot: I'm missing the storage.

Swyx [00:35:49]: You're basically also more IO bound than normal training.

Swyx [00:35:55]: Yes. 'Cause with data loading, caching everything becomes super important.

Ethan [00:36:00]: So in Cosmos, we did a lot of optimizations to make it not IO bound. Speaking of actually training the model, the GPU cost: if you look at the open source models and how big these video models are, I think LTX has nineteen B parameters. That's a dense model. And people are also exploring MoEs, so it might be twenty B active and hundreds of B total. So that's a similar size to medium-sized LLMs. And if you look at the number of tokens, we disclosed that in Cosmos. It's also tens of trillions of visual tokens. So putting this together, the cost of training these video models is actually comparable with LLMs. Not to mention, the infra is slightly different from LLMs, so it might be less efficient to train these models.

Inference Speedups: Step Distillation, Consistency Models, and GANs

Swyx [00:37:04]: Do you get the benefits of traditional diffusion speed-ups? So for images, there's LCM, LoRAs for fine-tuning. There's a lot of stuff that's been

Ethan [00:37:15]: Flow matching.

Swyx [00:37:16]: There's flow matching. There's a lot of stuff that's been done. Is there some overlap that applies to diffusion on the inference side and stuff?

Ethan [00:37:23]: So the inference side is a completely different story.

Ethan [00:37:28]: I think for the training side, it might be a little bit hard to reduce that cost. And for the inference side, the biggest gain is from the distillation of these models.
It's called step distillation, which is slightly different from knowledge distillation in LLMs. Typically, for flow matching models, you need something like 100 steps. A diffusion model needs even more, like 1,000 steps, to generate a good image or video. Step distillation tries to learn to generate in fewer steps from the model itself. It's kind of like you use the full model to generate in 100 steps, and then you take a model that only generates in 10 steps and let that model learn from the perfect one.

Ethan [00:38:25]: Why this works

Swyx [00:38:27]: Strong to weak, seemingly.

Ethan [00:38:28]: It is. It's kind of

Swyx [00:38:29]: Distillation

Ethan [00:38:29]: kind of like strong to weak. From the modeling perspective, the strong model, the teacher model, is trying to model the images and videos of the internet, and that distribution is extremely complex. But the step-distilled model is just trying to learn from the teacher. The teacher is a model, and its size is fixed, so the distribution is much simpler than the whole internet. That's the intuition I have for why step distillation can work. So usually when these models are served in production, they only run in a few steps. In Cosmos, I believe we have four-step and eight-step versions. If you do some simpler task, like image-to-image translation, it can run in even fewer steps, like one step in Cosmos Transfer.

Swyx [00:39:22]: I think this is the same intuition that guides a lot of the consistency model work. I sent you a link for sCM. I don't know if you covered that. To me, that was actually one of the most impressive papers I've ever seen from OpenAI.

Swyx [00:39:34]: That this is the unifying grand concept of consistency models. I don't know if you have any comments on this.

Ethan [00:39:41]: So there are a few different approaches,

Swyx [00:39:46]: Oh, yeah. Here it is.

Swyx [00:39:47]: Two steps versus twenty or 100 steps, whatever.
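The teacher-student idea behind step distillation can be shown on a deliberately tiny example. The sketch below is a toy under stated assumptions, not the Cosmos recipe or any published method: the "teacher" is a hand-written one-dimensional velocity field integrated with 100 Euler steps, and the "student" is a single linear map trained by gradient descent to reproduce the teacher's outputs in one step. All names and constants are hypothetical.

```python
import random

TARGET = 3.0  # hypothetical "data" value the toy flow moves samples toward


def velocity(x: float) -> float:
    """Toy velocity field standing in for a trained flow-matching network."""
    return TARGET - x


def teacher_sample(x0: float, steps: int = 100) -> float:
    """Many-step teacher: Euler-integrate dx/dt = velocity(x) from t=0 to t=1."""
    x, h = x0, 1.0 / steps
    for _ in range(steps):
        x += h * velocity(x)
    return x


def distill_one_step(num_samples: int = 256, epochs: int = 500,
                     lr: float = 0.1, seed: int = 0) -> tuple[float, float]:
    """Fit a one-step student x1 = w * x0 + b to the teacher's outputs."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    targets = [teacher_sample(x0) for x0 in noise]  # teacher outputs, computed once
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x0, y in zip(noise, targets):
            err = (w * x0 + b) - y  # student output minus teacher output
            grad_w += 2 * err * x0 / num_samples
            grad_b += 2 * err / num_samples
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = distill_one_step()
x0 = 0.5
print("teacher, 100 steps:", teacher_sample(x0))
print("student, 1 step:   ", w * x0 + b)
```

The student never sees "data", only the teacher's outputs, which matches the intuition in the conversation: the teacher's input-to-output map is a much simpler target than the original data distribution. In this toy the map is exactly linear, so one step suffices; real models need far richer students and losses.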
It's already done.

Ethan [00:39:52]: So there are a few different approaches, for example, consistency models. And actually, we shouldn't forget GANs. GAN was actually the OG of

Swyx [00:40:05]: OG

Ethan [00:40:05]: step distillation, 'cause it trained in just one step to begin with. For example, there's distribution matching distillation, which uses a GAN as one of the losses for distillation. A GAN just tells you, “Hey, generate an image,” and then

Ethan [00:40:31]: it has a discriminator to tell whether this image is real or not. So the model just needs to learn one mode of the distribution, not the full distribution. Because in training, the model is asked to reconstruct the ground truth image from the internet, which is extremely hard. And when you're training a GAN, it's a one-step process. It's just, “Hey, you generate an image. Does this image look as real as an image from the internet?” Which is a much simpler task. And people typically combine a lot of these approaches together, like consistency models and distribution matching and GANs, and that's how we get these few-step models.

Audio-Video Generation and Time Alignment

Swyx [00:41:21]: Then there's one step I wanted to add, which is audio and video.

Ethan [00:41:26]: So, Grok Imagine zero point nine, I believe, is the first audio-video joint model deployed at a large scale. So

Swyx [00:41:39]: And that was your first model?

Ethan [00:41:40]: That was Grok Imagine's first model. It's audio-video joint generation. I think the hard part is the modality alignment, 'cause before this joint model, we had text-to-video alignment. We have this correspondence between text and video. Typically, most of the VLMs understand images and videos. Video is rarer, and they mostly don't understand audio.
And if you look at audio generation on the LLM side, you can talk to them perfectly fine, but if you ask them to sing a song or something, it typically is not very good. Also, they don't have music either. The hard part is that audio actually has two components: a discrete component and a continuous component. The discrete component is like the language.

Ethan [00:42:44]: So when we speak, it's just some

Swyx [00:42:47]: It's an ASR issue, yeah.

Ethan [00:42:49]: It's text tokens with some characteristics, I would say.

Ethan [00:42:54]: But music

Swyx [00:42:56]: I think the speech guys would disagree with this.

Swyx [00:42:57]: Like disfluencies, and then

Vibhu [00:43:00]: There are tones. You can get angry.

Ethan [00:43:01]: Well, I say largely.

Ethan [00:43:03]: But the music is completely different. It's very continuous, and you cannot model it like discrete tokens in language models. This is the hard part for models, not to mention we have to align text, video, and audio together.

Ethan [00:43:26]: So

Vibhu [00:43:26]: How?

Ethan [00:43:28]: So some significant challenges. First, like we talked about, most of the VLMs cannot understand audio.

Ethan [00:43:39]: So you have to have some way to do synthetic data generation for audio. You have to caption the audio, and that involves a lot of synthetic data and human data effort. And surprisingly, most of the LLMs are very bad at recognizing the beat, the tone, and the details of music. They can give some general prediction of which song this is, but it's very hard for them to describe the details of the music. Like we mentioned for image generation, you have to describe the image in as much detail as possible so that someone blind can reconstruct it.
So here it's like someone

Vibhu [00:44:32]: Deaf

Ethan [00:44:32]: someone deaf can reconstruct how the music sounds without actually listening to it. Maybe you can think of it as needing to have what they call the script.

Vibhu [00:44:49]: Subtitles, yeah.

Ethan [00:44:49]: You've got to have all the details of the music and the dialogue.

Vibhu [00:44:55]: So is the challenge there typically stuff like music and audio, or is there a baseline? Okay, there's enough data that we can understand narration and conversation, but there are nuances in audio, and that's where you hit all the data issues? Or is it that from stage zero you just have to get it all right?

Ethan [00:45:15]: So one important thing is the alignment. For video and audio, the model has to have a time-based alignment: at which time step the video and audio tokens correspond to each other. But we actually don't have this kind of alignment for most of the other modalities. If you think about text and image, or text and video, they are loosely aligned. You can have a description of what's going on in the video, but you typically don't have an exact description of, oh, at the one-second time step, what happened?

Vibhu [00:46:02]: It's very

Ethan [00:46:03]: At the two-second time step, what happened?

Vibhu [00:46:03]: coarse. Yeah.

Swyx [00:46:05]: So what was the ideal time step? You have to ablate it, and then it's like four seconds or something.

Ethan [00:46:09]: So that comes down to how you design the model to be aware of time as a modality. So the model is time-aware. And that's something pretty unique if you think about LLMs. If you ask an LLM to complete a task, it will say, “Oh, this task will probably take twelve hours to complete,” and then it comes back in one hour.
Or it says, “I've already spent two days on this and I've exhausted everything.”

Ethan [00:46:47]: So the LLMs themselves don't have a sense of time.

Vibhu [00:46:53]: I actually don't think that's just them not having a sense of time. I think it's somewhat based, right?

Vibhu [00:46:58]: Like you tell someone, “Okay, go work on this feature. Go implement this.” There's a general understanding you would have of how long that would take without LLMs working at LLM speed, right? So think back two years ago. If I tell you to build me a new front end for Latent Space, have a search bar, have all this, you'll estimate that it'll take a few days, right?

Vibhu [00:47:19]: So you tell an LLM, “Go build this,” and it says, “It'll take me a few days.” But I think it's somewhat grounded, as opposed to them not having the best... I'm not saying that they have a great understanding, but I think with that example you can see where it comes from, right? You're trained on all of the text.

Swyx [00:47:35]: They're trying to estimate what a human would say.

Vibhu [00:47:37]: Because that's what the data kind of represents. It's not them

Ethan [00:47:41]: It came from the corpus on the internet. People have an estimate of how much time.

Vibhu [00:47:45]: And not even just in direct training samples, right? Just your world understanding, from tokens, of how long stuff takes, right? Go read a book. It'll take you a while, right?

Vibhu [00:47:56]: Even if you do nothing but read a book, it takes a few days. So yeah, the LLM says, “I read it; it took me a few hours.”

Vibhu [00:48:01]: “It'll take me a few hours to go through this research.” But this is a tangent.

Swyx [00:48:05]: Somewhat, yeah.

Swyx [00:48:06]: This is a train of thought I haven't really expressed until now, which is basically that a full world model must also be recursive, meaning that the participant in the world model must also be aware that they have a world model,
which is like this whole recursive thing down the line. But yes, and that the world model can be wrong, and that they need to update it, and blah. Yeah. We've argued this in the newsletter as well, that there need to be sort of recursive or adversarial world models.

World Models: Real-Time, Long-Horizon, Interactive Video

Vibhu [00:48:34]: Just to ask, how do you define world model?

Swyx [00:48:38]: Oh, yeah, let's go there.

Ethan [00:48:40]: So

Vibhu [00:48:40]: So just for context, we talked about video generation, and if you say there's a distinction between that and world models, what's your definition? How do you see the two?

Ethan [00:48:53]: So, disclaimer, I'm not going to debate what a world model is. There are many definitions, so I'll just talk about my definition. Since I came from the multimodal domain, I'm mainly talking from video. So a world model is real-time, interactive, long-horizon video. There are three parts, so let's talk about them one by one. First, interaction. We just looked at Flipbook and the neural computer. For the interaction part, a world model allows you to interact with it through keyboard, mouse, and maybe also voice. All of these are modalities. You can interact with the model, and the model should respond reasonably. The second part is real time. Say you move your mouse, and say the world model is generating a game: how fast can the game respond? If you're a professional CS:GO player, they might say, oh, you have to respond within sub-ten milliseconds, or even less. (He's a beginner.) So that's not most of the... (No, sixty FPS. Let's go. Oh, three hundred FPS. Oh, five hundred FPS. Wait. Okay, yeah, I didn't do the math, but yeah, okay.) Yeah, three hundred FPS, that's about three milliseconds. So you have to respond (Oh, s**t. Okay. Yeah.)

Ethan [00:50:29]: within a millisecond. Most of the video models cannot do that.
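The frame-rate figures tossed around here convert to a per-frame time budget with one division. A small sketch; the function name is illustrative, and the frame rates are simply the ones mentioned in the conversation:

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


for fps in (60, 300, 500):
    print(f"{fps} FPS -> {frame_budget_ms(fps):.1f} ms per frame")
# 60 FPS -> 16.7 ms per frame
# 300 FPS -> 3.3 ms per frame
# 500 FPS -> 2.0 ms per frame
```

A generative model serving a competitive game at these rates would have to finish an entire forward pass inside that budget, which is why the speakers treat it as out of reach for most current video models.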
Yeah. But if you have a video model that is, say, a digital human, the response time can be more generous. Typically, for real-time voice interaction, it's like two hundred milliseconds. So that's much more generous. But even two hundred milliseconds is pretty tricky, 'cause remember we mentioned

Ethan [00:51:01]: you have this temporal compression coming from the VAE. If you don't compress the temporal dimension, your sequence length is going to explode. So if you want real-timeness in your model, you have to deal with this long context problem. And the third part is long horizon. We're not going to play video games for just a few seconds, and most video models only generate a few seconds. We're going to play for minutes, hours. The model has to be able to generate long-form content.

Ethan [00:51:42]: So putting these three together, it's real-time, long-horizon, interactive videos. I think the final state will be, for example, a video version of Flipbook, where you can interact with a neural computer. You move your mouse, you click on the generative interface, and it replies to you through pixels generated in real time. But it's a very long way to get there. So one of the first steps at Grok Imagine, where I led a small world model team, was to build video extension. So video extension, it's the first step of interactivity. (You have it here, video editing, yeah.) Yeah. So it's the first step because this unlocks long-horizon videos. Typically, for most video generation models, you give it a prompt or an image as an initial frame. You generate a video, and that's it. That's just one time, done. And some creators would try to use the last frame as the first frame for the second video.
Sometimes it works, but if you do it a few times, the quality decreases. (It doesn't have that context over the full video, so the temporal... 'Cause you only gave it the last frame, of course, right?) Yeah, exactly. (It's actually a pretty fun hack, if you've seen... Oh, no, he's saying something better.) Yeah. And for example, Veo: I remember Veo 3 has like a second of context from the last video. It is slightly better than using the last frame, but it has a similar problem: the quality decreases. If you extend a few times to one minute, the video quality will look much worse than the first video. Another problem is that the model doesn't have long-range knowledge of what happened before. Say it generates some dialogue, two people speaking: their voices might change over time, especially if the one second of conditioning does not cover the previous context. So these are the core challenges. The Grok Imagine video extension has the historical context of all of the previously generated videos. It has the context of who is speaking and what objects have appeared and everything, and it uses that to generate the next video. If we do this naively, you can imagine, just putting all of the previous history's video tokens into the context, the context length will easily explode. Especially for video models, that can be a few million tokens of context length, I would imagine. Yes. Yeah.

Swyx [00:54:58]: Let's run with that.

Ethan [00:54:59]: For example, in Cosmos, I think just five seconds of video is like fifty K or sixty K tokens. So if you do fifty seconds, that's five hundred K tokens. If you do longer than that, it easily explodes. This long-horizon problem was the first step we were trying to solve for world models. It turns out people love video extension.
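The context-length arithmetic from a moment ago (roughly fifty thousand tokens per five seconds of video) scales linearly with duration. A minimal sketch, assuming the lower of the two figures Ethan quotes and a constant token rate; the function name and the longer durations are illustrative:

```python
def context_tokens(seconds: float, tokens_per_5s: int = 50_000) -> int:
    """Tokens needed to hold `seconds` of video history, assuming a constant rate."""
    return round(seconds / 5 * tokens_per_5s)


for seconds in (5, 50, 600):
    print(f"{seconds:>4} s of video -> {context_tokens(seconds):,} tokens")
#    5 s of video -> 50,000 tokens
#   50 s of video -> 500,000 tokens
#  600 s of video -> 6,000,000 tokens
```

At this rate ten minutes of history is already several million tokens, consistent with the "few million context" figure mentioned for naive full-history conditioning.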
A lot of the creators love using video extension to create longer-form videos. This is the part I liked very much: you have an intermediate step toward the final goal instead of just a straight shot to the final version.

Swyx [00:55:48]: But I can see you have a strong vision of where we want to end up.

Long Context, Redundancy, and Efficient Interactive Video

Vibhu [00:55:51]: Does it seem like it's an efficiency issue? Okay, we're at a few million tokens of context. If you draw the parallel to language models, we had very short context, two thousand, eight thousand, and then you scale it up to one million, ten million. Sure, there's effective context, but at the end of the day, it's just, what's it worth? Sure, there's a whole training data side. In video, it might be slightly easier, 'cause we have a hundred-million-token video, right? Just take a movie with the full context there. Is this about efficiency from an inference standpoint, that it's expensive, but we know how to solve it? Or why is this not the approach? My broader point was on your second point about world models. You say it needs to be interactive and live, right? You should be able to play a game and see the interaction live. One thing I see with research is that a lot of what you actually serve is different from what you build, right? So we talked about distillation. You train a big model, you distill it, you do quantization, speculative decoding. We do all this stuff to serve it efficiently. Should we not just have a solution first, like a world model that can interact well, and then do inference optimization, serve it, distill it second, so you make it real time after you solve it? Another parallel is, say, continual learning, right? What we need is someone to solve it and show it works inefficiently. Give it a few years, and people will make it efficient. Same thing with regular attention, right? It worked.
Over a few years, people came up with different forms of attention, and we've scaled it to be efficient at long context. So there are kind of two things there, right? One is, it seems like it works. You've scaled it. Can we not just scale it a lot more efficiently over time? Do we need a separate approach if this works? And same thing with interaction, right? If we can find some way that it works, we can solve making it more efficient from an inference standpoint later.

Ethan [00:57:53]: That's actually a very good point. In videos, there's actually a lot of redundancy. We remove a lot of the pixel redundancy with the VAE, but there's more redundancy in long-range, long-horizon videos. Say a character appears in the first clip and then disappears, and only reappears at the end of the video. You probably don't need that context in the middle of the generation. You only need that character where you need it. So that's why I helped build another feature. It's reference to video.

Vibhu [00:58:36]: Is it here?

Swyx [00:58:36]: Is it the same model release or a different one?

Ethan [00:58:39]: It's a different one.

Ethan [00:58:41]: You probably need to search on

Swyx [00:58:43]: I'll find it

Ethan [00:58:43]: X, “reference to video.”

Ethan [00:58:46]: So reference to video allows you to upload up to seven images as conditions and generate the video. They can be characters or objects or even scenes. Say I want to condition on Sean's selfie and holding a blade

Swyx [00:59:07]: We have a dog

Ethan [00:59:08]: or whatever.

Swyx [00:59:08]: We put the dog in the thing.

Ethan [00:59:09]: You can put them there, and the video model will generate the video from them and copy the context over. So that can solve a lot of the problems there, like the long context problem. It doesn't need to have a very long context, but I feel like it's an intermediate solution.
The model--

Swyx [00:59:29]: It's cheating.

Ethan [00:59:30]: The model should be able to selectively know where it should draw the references from. Say I want to generate a movie, and I generate it autoregressively, like ten seconds at a time or something. Now when this character appears, I can look back to where it first appeared and bring that back. Yeah, this one, I put the references. Yeah, that's Optimus, Einstein, myself, Annie.

Vibhu [01:00:02]: Oddly enough, I used Grok Search to find it, and it pulled your LinkedIn post. But yeah, we found it.

Ethan [01:00:08]: Interesting.

Vibhu [01:00:10]: But--

xAI's Underrated Work, Culture, and Watermarking

Swyx [01:00:11]: This is a problem. This is not your fault, but xAI doesn't communicate all this work that you do very well, because they just have the model release and then that's it. But actually, these details are very good.

Swyx [01:00:22]: As far as I understand, everything you just described is state of the art, like no one else has done it.

Vibhu [01:00:30]: A lot of-- yeah, I have a lot more--

Swyx [01:00:32]: And then you just put out this blog post with the cookies. I'm like, this is not enough, right?

Swyx [01:00:37]: But obviously these are the high-level numbers that people want to know. But no, okay, so--

Vibhu [01:00:42]: And I wonder, part of that is also that some labs don't share research into what happens. And if--

Swyx [01:00:50]: No, but this is literally bragging about how good they are, right?

Swyx [01:00:54]: Like, why would you not say that you are capable of extending with full context? This is not a secret sauce. This is like, we did the work. Yeah, I don't know.

Ethan [01:01:02]: Different labs have slightly different communication styles.

Swyx [01:01:07]: Anyway, if anyone from xAI is listening, we are always happy to help you tell your story. Yeah, okay, so you did references, and I think the point you're making is that it is sort of a kludge, right?
this is-- you can do seven, but what about 100?Swyx [01:01:23]: Right? Then you need a completely different thing.Ethan [01:01:26]: So I think it's-- this is, a mechanism to, select the context from the history, and you might not put the entire history into the context. for example, there's a paper called Frame Pack, which haveEthan [01:01:41]: a heuristic that the latest history, the last one second, I put the entire history, and the history before that, I would, compress it and makes the video smaller. So they follow this pattern, this build overall pattern that the maximum sequence length is fixed. So the further you are from the current frame, you have a smaller image. So this is just a heuristic. I think it can be more automatic. The model is aware like which history part of it can be select. So this part of the research is actually being actively, worked on by a lot of people. It's also quite interesting. I feel this is actually, this part of long context is a little bit ahead of the LLM part.Ethan [01:02:31]: So for example, like in LLMs, if you-- so contexts keep growing. Let's say if you call tool and the tool call history is extremely long, that's still in context, and keep growing, keep growing. Even if you switch the topic to something else, the whole context was there. There are some agentic harnesses that help you to, say, prune the tool results and, prune Like when you, when you query a file, only show like the top 200 lines or something. Those were very heuristic-driven.Swyx [01:03:08]: For listeners, we did a write-up on the cloud code, leak where there are eight different kinds of pruning, including like you prune the tool results and all that. 
So you can read up on that kind of thing.

Ethan [01:03:17]: I think one breakthrough in continual learning might be a way for the model to automatically manage its own context.

Swyx [01:03:27]: These are all heuristics, and they will be replaced by machine learning.

Ethan [01:03:30]: Interestingly--

Vibhu [01:03:32]: The--

Ethan [01:03:32]: The same thing is being researched in both LLMs and video models.

Vibhu [01:03:36]: The interesting thing is also that in the paper you showed, it's actually happening at the model level, right? Compared to language models, where, sure, we have base attention, but we'll do our own compression, we'll do our own pruning, which is separate from model error.

Vibhu [01:03:49]: Eventually, it all just boils in, hopefully.

Swyx [01:03:52]: I think this is a form of attention, but also, you know, sort of reasoning attention. I feel like that's different from normal attention.

Swyx [01:04:03]: Does that make sense?

Ethan [01:04:04]: It's different in the sense that attention, not to mention, set sparse attention aside,
This is a recap of the top 10 posts on Hacker News on May 31, 2026. This podcast was generated by wondercraft.ai.

(00:30): Cloudflare Turnstile requiring fingerprintable WebGL
Original post: https://news.ycombinator.com/item?id=48345840&utm_source=wondercraft_ai
(01:55): Creatine raises brain energy levels and slows cognitive decline: study
Original post: https://news.ycombinator.com/item?id=48346947&utm_source=wondercraft_ai
(03:21): Please Do Not Vibe Fuck Up This Software
Original post: https://news.ycombinator.com/item?id=48342705&utm_source=wondercraft_ai
(04:47): The Website Specification
Original post: https://news.ycombinator.com/item?id=48343683&utm_source=wondercraft_ai
(06:13): Codex just found a "workaround" of not having sudo on my PC
Original post: https://news.ycombinator.com/item?id=48348578&utm_source=wondercraft_ai
(07:39): Dav2d
Original post: https://news.ycombinator.com/item?id=48344961&utm_source=wondercraft_ai
(09:04): The solution might be cancelling my AI subscription
Original post: https://news.ycombinator.com/item?id=48345896&utm_source=wondercraft_ai
(10:30): 1-Bit Bonsai Image 4B Image Generation for Local Devices
Original post: https://news.ycombinator.com/item?id=48346257&utm_source=wondercraft_ai
(11:56): United Airlines 767 returns to Newark after Bluetooth name sparks alert
Original post: https://news.ycombinator.com/item?id=48345248&utm_source=wondercraft_ai
(13:22): I put a datacenter GPU in my gaming PC
Original post: https://news.ycombinator.com/item?id=48345694&utm_source=wondercraft_ai

This is a third-party project, independent from HN and YC. Text and audio generated using AI, by wondercraft.ai. Create your own studio quality podcast with text as the only input in seconds at app.wondercraft.ai. Issues or feedback? We'd love to hear from you: team@wondercraft.ai
Jason Goldschmidt and Nick Connolly, co-chairs of SNIA's Accelerated Object TWG, discussed the importance of S3 over RDMA for AI processing. SNIA's work addresses the industry's need for faster data transfer to improve GPU utilization during model training and inferencing.
In this episode, Neil explores how agents, foundation models, and AI are set to transform the Computer-Aided Engineering (CAE) and Electronic Design Automation (EDA) landscapes. He shares a comprehensive historical perspective and predicts a near future where AI-driven automation redefines engineering workflows, productivity, and innovation.

Main Topics:
The evolution of simulation codes from the 1960s to modern commercial software
The rise of cloud computing, GPUs, and their impact on CAE and EDA industries
The integration of AI, surrogate modeling, and foundation models into simulation workflows
The emergence of agentic AI systems capable of autonomously performing complex engineering tasks
The strategic responses of major software companies to AI and agent technologies
The potential democratization and automation of engineering design through AI agents
Critical questions on model ownership, transparency, and industry adoption

Timestamps:
00:40 - Introduction: How agents and foundation models will disrupt CAE & EDA
01:40 - Historical overview: From code writing in the 60s to commercial software
03:10 - Growth of aerospace and automotive industry codes and commercialization
04:40 - The impact of HPC, cloud computing, and hardware evolution
06:25 - Rise of cloud SaaS models and "sassification" of simulation tools
07:40 - Big tech entrance: AWS, Microsoft, and Google in CAE & EDA
09:00 - GPU acceleration: Changed landscape in past three to four years
09:10 - The role of AI startups offering surrogate models and real-time simulation
10:40 - Industry consolidation: Mergers and acquisitions among software giants
11:40 - The emergence of foundation models and surrogate systems in simulation
13:00 - The significance of agents: Combining AI, models, and automation
14:10 - Capabilities of autonomous AI agents in complex engineering workflows
15:25 - Practical use cases: Running simulations, setting up experiments, and data analysis
16:40 - How agent-driven automation could democratize engineering expertise
16:10 - Questions about model ownership, open source codes, and licensing
19:40 - The future of AI in engineering: Collaboration, transparency, and scientific rigor
21:25 - Final thoughts: Opportunities, challenges, and the transformative potential of AI

* Please note that this is a personal opinion and not that of NVIDIA
Patrick Moorhead and Daniel Newman cover Daniel's acquisition of Enterprise Technology Research, IBM's historic $15 billion single-day commitment spanning quantum and open-source security, Anthropic's Claude Opus 4.8, and the heaviest single earnings night of the season featuring Dell, Marvell, Salesforce, Synopsys, Snowflake, HP, and Micron crossing $1 trillion in market cap. The handpicked topics for this week are: Anthropic Releases Claude Opus 4.8: Six Weeks After 4.7 Anthropic dropped Opus 4.8 just six weeks after 4.7, claiming it surpasses GPT-5.5 and Gemini 3.1 Pro on agentic coding, knowledge work, and computer use. Benchmark improvements across the board: agentic coding up from 64.3% to 69.2%, knowledge work from 1753 to 1890, agentic computer use from 82.8% to 83.4%. Three new features ship alongside it: Dynamic Workflows for multi-subagent orchestration inside Claude Code, Effort Control for managing token spend, and mid-task system messages via the API. Fast mode is now 2.5x faster and 3x cheaper. Pat's honest take: what it says on paper is good, particularly on tool triggering and citation precision, but he has lost significant trust in the company and is watching closely. (The Decode) IBM Commits $10 Billion to Quantum: The Largest Single Quantum Bet in History IBM announced a $10 billion commitment over five years targeting a large-scale fault-tolerant quantum computer by 2029, landing the same day as the $5 billion Project Lightwell announcement for a single-day IBM strategic commitment of $15 billion. Pat has been calling 2029 to 2031 as the realistic commercial quantum window and calls this the strongest single corporate financial signal yet that the timeline is real. Daniel's framing: IBM wants to be the NVIDIA of quantum, and with a $10 billion commitment, it's sending a flare to the entire industry that pure-play quantum companies cannot compete at this balance sheet level. 
(The Decode) IBM and Red Hat Launch Project Lightwell: $5B to Secure Open-Source Software IBM and Red Hat committed $5 billion and a global force of 20,000 engineers to secure open-source software for enterprises through frontier agentic AI, anchored by 11 of the largest US and Canadian banks including Bank of America, Goldman Sachs, JPMorgan Chase, Mastercard, and Visa. Pat's read: this is the productization answer to Anthropic Mythos. Mythos found the vulnerabilities. Lightwell is the industrial-scale patching and validation layer enterprises can actually buy on a subscription. Daniel adds that IBM is flexing its engineering talent base as a premium strategic asset, a direct counter to the narrative that AI replaces engineers. (The Decode) Anthropic Project Glasswing: 23,000 Vulnerabilities Found Across 1,000 OSS Projects Anthropic's Claude Mythos scanned more than 1,000 widely deployed open-source projects and surfaced approximately 23,000 candidate vulnerabilities, with 1,094 confirmed as critical severity. The Cyber Verification Program now gates the strongest cyber-capable Claude variant behind vetted defenders only. While the tool creates real value, the surface of attack will likely grow as fast as any tool built to defend it. (The Decode) Anthropic in Talks to Run Claude on Microsoft Maia 200 CNBC and The Information reported Microsoft is in active negotiations to supply Anthropic with its custom Maia 200 inference chip, which would make Anthropic the only frontier lab simultaneously running production workloads on four distinct silicon stacks: NVIDIA, AWS Trainium, Google TPU, and Microsoft Maia. Pat's context: Maia 200 delivers 30% better tokens per dollar than the latest Azure fleet per Satya Nadella, and this deal would be Maia's first major external deployment. Daniel's read: what can be built will be sold right now, and Anthropic chasing every available compute source is simply the structural reality of growing at 80x when you planned for 10x. 
(The Decode) The Flip: Is AI CapEx Too Expensive to Earn Its Return? Pat takes the affirmative. With $725 billion in hyperscaler CapEx tracking for 2026, likely $1 trillion next year, memory has become the choke point making it even more expensive, and open-source models have closed enough of the quality gap for most enterprise tasks that the premium of frontier APIs is increasingly hard to justify. A recent Signal65 white paper shows on-prem payback at 18 months. Daniel's counter: Dell just booked $24 billion in AI orders in a single quarter. Agentforce crossed $1 billion ARR at 169% growth. NVIDIA guided to $91 billion. Only 20% of enterprises are using AI and only 2% of consumers. Both hosts admitted off the flip their notes looked nearly identical. (The Flip) Micron Crosses $1 Trillion Market Cap Micron became the 12th US company ever to cross $1 trillion in market cap, surging 19% on May 26th as UBS raised its price target to $1,625, implying a $1.8 trillion market cap. Samsung's Q1 memory ASP jumped 146% year over year. DRAM spot prices spiked 55 to 60% quarter over quarter. Daniel has been pounding this call since sub-$100 and calls it a cycle elongated beyond anything seen in the 27 prior memory cycles, driven by HBM capacity reallocation away from consumer DRAM creating structural shortage. (Bulls and Bears) Dell Technologies Q1 FY27: The Biggest Enterprise AI Infrastructure Print of 2026 Record $43.8 billion revenue, up 88% year over year, crushing the $35.7 billion consensus by $8 billion. AI-optimized servers at $16.1 billion, up 757% year over year. $24.4 billion in AI orders booked in a single quarter. FY27 AI server revenue guide raised from $50 billion to $60 billion. Non-GAAP EPS of $4.86 beat the $2.96 consensus by 64%. Stock up 18% after hours. Pat's framing: Dell was very clear about what they were going to do. Rack engineering, sales, and service. The basics. 
And they executed the basics at an extraordinary level while building a special relationship with NVIDIA who views Dell as a market maker for both enterprise and NeoCloud. Daniel's add: play nice and win. Michael Dell navigated the political landscape brilliantly and pulled the entire Dell brand along with him. (Bulls and Bears) Marvell Technology Q1 FY27: Record Revenue, Data Center at 76% of Mix Record $2.418 billion revenue, up 28% year over year. Data center at $1.833 billion, up 27% year over year, now 76% of total revenue. Q2 guide of $2.7 billion at midpoint accelerates growth to 35% year over year. Operating cash flow a record $638.8 million. Daniel went on TV and said it's "written in the stars," arguing the market had misunderstood this one for too long by conflating its custom AI ASIC story with the full breadth of its connectivity and networking portfolio. Pat's closing: the shorts are eating it now and the custom AI ASIC versus merchant GPU debate is finally settling into the right answer, which is both in lockstep. (Bulls and Bears) Salesforce Q1 FY27: Agentforce Crosses $1 Billion ARR Revenue $11.13 billion, up 13% year over year. Non-GAAP EPS of $3.88 crushed the $3.12 consensus by 24%. Agentforce ARR crossed $1 billion, up 169% year over year, with 28.6 trillion tokens processed, up 152% quarter over quarter. 50% of Agentforce bookings came from existing customers expanding. Daniel flagged the $25 billion accelerated buyback funded by new debt as an interesting signal worth watching. Pat's bottom line: it's not perfect, but certainly no "SaaSpocalypse" in those numbers. (Bulls and Bears) Synopsys Q2 FY26: First Full Quarter With Ansys Integrated Revenue $2.276 billion, up 42% year over year, beating consensus. Non-GAAP EPS of $3.35 beat $3.15. FY26 guide raised to $9.665 billion midpoint. Daniel's framing: every chip runs through Synopsys tools, and the Ansys addition makes it the full-stack co-design platform Jensen Huang keeps talking about. 
Synopsys is not just the pick and shovel of current AI silicon. It is the pick and shovel of quantum, robotics, and space as well. (Bulls and Bears) Snowflake Q1 FY27: Strongest Sequential Dollar Growth in Company History Product revenue $1.33 billion, up 34% year over year, the strongest sequential dollar growth in Snowflake history. Net revenue retention 126%. FY27 product revenue guide raised to $5.84 billion. Natoma acquisition announced for secure agentic enterprise connectivity. New $6 billion multi-year AWS commitment. Daniel's closing: proprietary unique data is the real moat of the agentic era, and that data has to live somewhere. It is going to go to platforms like Snowflake. (Bulls and Bears) HP Inc. Q2 FY26: Eight Straight Quarters of Growth With AI PCs at 44% of Shipments Revenue $14.4 billion, up 9% year over year, the company marks its eighth consecutive quarter of top-line growth. Non-GAAP EPS of $0.86 beat the prior guide. Personal Systems at $10.2 billion, up 13%, with 30% operating profit growth. AI PCs jumped from 35% to 44% of shipments quarter over quarter, with HP guiding to 60 to 70% next fiscal year. FY26 EPS guide raised. Pat's note: they still need a permanent CEO, which would help investors sleep better at night. Daniel's add: the real explosive moment for device companies comes when AI moves to the edge and enterprises shift from expensive frontier model consumption to on-device inference. (Bulls and Bears) Everpure Q1 FY27: Record Revenue, Rebrand Complete Record revenue of $1.1 billion, up 35% year over year. Product revenue $577 million, up 55%. Subscription ARR at $2 billion. FY27 guide raised to $4.41 to $4.51 billion. Pure Storage officially completed its rebrand to Everpure. Daniel's emerging thesis: the agentic era has focused enormous attention on memory and compute, but after the inference runs, the data has to sit somewhere. Storage has not seen its full inflection yet and Everpure is well positioned when that wave arrives. 
(Bulls and Bears) The Decode Anthropic Releases Claude Opus 4.8 May 28 https://techcrunch.com/2026/05/28/anthropic-releases-opus-4-8-with-new-dynamic-workflow-tool/ IBM Commits $10B Over Five Years to Quantum Computing the Same Day as $5B Project Lightwell, Bringing IBM's One-Day AI https://www.barrons.com/articles/ibm-stock-quantum-computing-aafbb1eb IBM + Red Hat Announce Project Lightwell https://newsroom.ibm.com/2026-05-28-ibm-and-red-hat-commit-5-billion-to-redefine-the-future-of-open-source-in-the-ai-era Anthropic Project Glasswing / Claude Mythos Finds 23,000 Potential Vulnerabilities Across 1,000+ Open-Source Projects https://www.securityweek.com/anthropic-mythos-detected-23000-potential-vulnerabilities-across-1000-oss-projects/ Anthropic Negotiating to Run Claude on Microsoft's Maia 200 AI Chips https://www.cnbc.com/2026/05/21/anthropic-microsoft-maia-200-ai-chip.html OpenAI + Anthropic Walk Back the AI Jobs Apocalypse Ahead of IPOs https://finance.yahoo.com/sectors/technology/articles/ai-chiefs-walk-back-job-193605798.html https://x.com/RiskCentre/status/2059397756016611668 The Flip Is AI Capex Becoming Too Expensive to Earn Its Return — and Will the Result Be a Forced Shift to Open-Source and Smaller Use-Case-Specific Models, or a Continued $725B+ Hyperscaler Buildout That Vindicates the Capex on Productivity Gains? 
FOR: The shift is to open-source + smaller use-case-specific models with better token economics, not away from AI https://x.com/danielnewmanUV/status/2059822712122400975 DeepSeek 75% permanent price cut + Anthropic Claude Code restriction reversal https://www.buildfastwithai.com/blogs/ai-news-today-may-26-2026 $190B Microsoft capex + $725B+ aggregate hyperscaler capex with no analog ROI yet https://www.buildfastwithai.com/blogs/ai-news-today-may-26-2026 AGAINST: Salesforce Agentforce ARR crossed $1B this quarter on 28.6T tokens processed https://www.stocktitan.net/sec-filings/CRM/8-k-salesforce-inc-reports-material-event-3b8ead2852bb.html Lenovo +105% AI revenue, +84% Q4; Dell $43B AI backlog: the AI infrastructure flywheel is converting capex to revenue today https://investor.marvell.com/news-events/press-releases/detail/1023/marvell-technology-inc-reports-first-quarter-of-fiscal-year-2027-financial-results NVIDIA $91B Q2 guide + $1T Blackwell+Vera Rubin CY25-CY27 reaffirmed https://www.cnbc.com/2026/05/20/were-raising-our-price-target-on-nvidia-after-another-knockout-quarter-and-guide-.html DeepSeek + Chinese price war is a Chinese export-controls story, not a US economic ceiling story https://www.cnbc.com/2026/05/21/anthropic-microsoft-maia-200-ai-chip.html Bulls & Bears Micron (NASDAQ: MU) Crosses $1 TRILLION Market Cap for the First Time https://www.cnbc.com/2026/05/26/micron-stock-trillion-market-cap.html Dell Technologies Q1 FY27 ACTUALS https://www.cnbc.com/2026/05/28/dell-q1-earnings-report-2027.html Marvell Technology Q1 FY27 ACTUALS https://investor.marvell.com/news-events/press-releases/detail/1023/marvell-technology-inc-reports-first-quarter-of-fiscal-year-2027-financial-results Salesforce CRM Q1 FY27 ACTUALS https://investor.salesforce.com/financials/quarterly-results/ Synopsys SNPS Q2 FY26 ACTUALS https://investor.synopsys.com/events-and-presentations/events/event-details/2026/Q2-Fiscal-Year-2026-Earnings/default.aspx Snowflake SNOW Q1 FY27 ACTUALS 
https://www.businesswire.com/news/home/20260527027931/en/Snowflake-Reports-Financial-Results-for-the-First-Quarter-of-Fiscal-2027 HP Inc. HPQ Q2 FY26 ACTUALS https://finance.yahoo.com/markets/stocks/articles/hp-q2-earnings-call-highlights-230459161.html Everpure (NYSE: P, formerly Pure Storage) Q1 FY27 ACTUALS https://www.prnewswire.com/news-releases/everpure-announces-first-quarter-fiscal-2027-financial-results-302783502.html
Today I'm bringing you an episode that is completely out of the ordinary and that has been a real earthquake in how I approach my content. It all comes from a radical change of strategy that I decided on after stopping to analyze the statistics of the latest shows. I noticed a very silly but crucial detail: I had been talking to you about incredible tools, about the wonderful MCP connectors, and about super advanced databases... but I had not shown you the real star of the movie! I was talking about accessories and add-ons without showing you the Artificial Intelligence agent that rules them all. It's as if I gave you a spark plug manual without showing you the car's engine. So I have decided to pause the rest of the technical topics and bring you Hermes Agent directly. And to do it in the most honest and educational way possible, today I'm not going to tell you about it on my own: I have let my own local AI agent take control of the microphone to show you what it can do in real time, with no cloud and no cuts.

The brain you will hear speaking throughout this podcast is called Lara. It is the agent I have configured using the open source project Hermes Agent as its foundation.

To show that this kind of technology is within anyone's reach and does not require unattainable hardware, I have set Lara up to run on a very modest Slimbook One. It has no graphics card (GPU) and no AI coprocessor (NPU); it runs solely and exclusively on the CPU, a classic processor. So that we can talk to her and hear her, we use local tools both for speech recognition (Whisper) and for text to speech (TTS). Because there is no dedicated acceleration hardware, you will notice that Lara's voice has that classic slightly robotic touch of local software and that she sometimes pronounces English words like "YouTube" or "skills" in a rather peculiar way. But I assure you that once you listen to her interact for a while and negotiate the show's script, you grow incredibly fond of her. Especially because Lara does not have that artificial, cloying friendliness of commercial assistants that tell you "sure, I'd be happy to help"; she has her own personality.

In this show you will hear first-hand how this system works through seven real demonstrations in real time. Although we prepared an initial base script, we ran the last tests completely at random and without a safety net to see how far we could push the Slimbook's CPU:

Demo 1: Lara runs a live Internet search on the latest trends and videos about local AI agents.
Demo 3: My favorite demonstration. We connect a local database with more than 1600 recipes to our smart shopping list.
Demo 4: We access my personal archive of more than 3300 text notes and integrated to-do items.
Demo 5: We connect Lara to my Strava data from the last month.
Demos 6 and 7: The final experiment without a safety net. Lara summarizes the most notable technology news.

Episode chapters
00:00:00 Change of strategy: Why do you need an agent?
00:03:36 Introducing Lara and her local brain
00:05:32 Demo 1: Searching and analyzing information on the Internet
00:07:53 Demo 2: Parallel multitasking with subagents
00:09:51 Demo 3: Cooking recipes and smart shopping
00:13:58 The importance of semantic search in your notes
00:14:48 Demo 4: The connected notes and tasks system
00:16:51 Demo 5: Tracking my workouts with Strava
00:19:14 From theory to chaos: Random demos without a safety net
00:20:21 Demo 6: Up-to-date technology and AI news
00:22:29 Demo 7: Smart summarization of long texts
00:26:14 In-person workshop in Valencia: Tinkering with Hermes
00:28:51 Hermes vs OpenClaw: Daniel Primo's real experience
00:29:52 Privacy and hardware: Models run on a local CPU
00:30:26 Episode wrap-up and the Atareao community
Stewart Alsop sat down with Michael Shackelford to discuss their experiences building applications through vibe coding—the practice of using AI to create software without traditional programming expertise. Stewart, who runs the AI Whispers community in Buenos Aires and hosts the Crazy Wisdom podcast (with over 660 interviews), shared how he went from teaching people prompt engineering to building his own video conferencing software as a Riverside.fm replacement, while Michael opened up about his year-long journey creating Genrupt Inc, an AI-powered content generation tool for e-commerce sellers. The conversation covered everything from the decline in quality of Claude's reasoning capabilities and how Chinese companies used distillation attacks to copy Anthropic's models, to the importance of spaced repetition systems for managing knowledge in the age of LLMs, with both sharing battle-tested prompting strategies like asking AI to "explain it to me in genius terms" and using deep research queries to reverse engineer how competitors build their products.Show Notes:- Dan Martell's book "Buy Back Your Time" was mentioned as one of the best business books for thinking about life and business- Check out John Vervaeke's "Awakening from the Meaning Crisis" for understanding relevance realization and why AI fundamentally cannot determine what's relevant to humans without being toldTimestamps00:00 Michael discusses being exhausted from getting his app ready for launch, working nonstop with AI to prepare landing page for podcast traffic driving beta signups05:00 Stewart explains starting AI Whispers in Buenos Aires after leaving OpenAI vendor company, meeting early adopters like Torin who was building mind-reading EEG technology10:00 Discussion of how corporations resist AI adoption due to political games and job security fears while some companies use AI as excuse for pandemic-era layoffs15:00 Stewart describes teaching workshops on using LLMs as linguistic tools rather than 
coding tools, noting technical people often lack humanities background needed for prompting20:00 Explaining chatbot wrappers, API calls, and how Anthropic's reasoning quality declined after Chinese distillation attacks copied their secret sauce developed with philosophers25:00 Technical discussion of model training, fine-tuning versus RAG for new information, and different approaches to updating AI knowledge beyond initial training30:00 Stewart describes building podcast recording software to replace expensive Riverside, struggling with syncing audio and video files across different computer clocks35:00 Discussion of critical factors in vibe coding, discovering unknown technical requirements, and how AIs don't automatically reveal missing information40:00 Stewart's reverse engineering process using deep research function to study competitors' hiring and technology stacks, separating planning agents from coding agents45:00 Prompting techniques including "explain like I know everything" and using spaced repetition systems to capture valuable prompts and technical knowledge50:00 Michael explains his Generux app for generating ecommerce content using Amazon review data analysis to inform high-converting listing images and videos55:00 Discussion of founder mentality involving self-delusion about project timelines, Michael working nine-plus hours daily for nine months on app development60:00 Comparing Amazon's expert software to prosumer software approach, discussing distribution challenges and future robotics applications for customized products65:00 Stewart demonstrates spaced repetition app for memory improvement and knowledge retention, explaining relevance realization problem that AI agents cannot solve without embodimentKey Insights1. Stewart Alsop started AI Whisperers in Buenos Aires after leaving his role at Invisible Technologies, which was OpenAI's largest vendor for RLHF work. 
He noticed that machine learning engineers at tech companies lacked the humanities background needed to properly interact with large language models, which are fundamentally linguistic tools. This led him to create weekly workshops teaching non-technical people how to use AI effectively, running events every Thursday for two years straight. The group attracted intense geeks from the start and eventually led to Stewart speaking right after Vitalik Buterin at DevConnect, marking a significant milestone for the community.2. Large corporations are resistant to AI adoption due to multiple factors including political dynamics within organizations and employees fearing job loss. Many companies that grew during the pandemic are now using AI as an excuse to downsize when the real issue is inefficiency from rapid expansion. Stewart observed that even technical people in machine learning often don't understand how to properly use AI tools because they lack linguistic and humanities training. The fundamental problem is educational, requiring companies to train people how to use these new tools while those same people resist learning them.3. Vibe coding has evolved significantly with Claude Code being a game changer that reduced the technical barrier to entry. Before Claude Code, developers needed substantial technical knowledge to work through constant doom loops and debugging cycles. The success of coding AI tools stems from thirty years of testing infrastructure that provides clear yes or no feedback on whether code works. This infrastructure doesn't exist in the same way for manufacturing, science, and other fields, which is why software became the dominant area for AI assistance initially.4. Claude's quality degradation over recent months resulted from multiple factors including distillation attacks by Chinese companies who reverse engineered Anthropic's reasoning capabilities. 
Anthropic had hired philosophers, sociologists, and psychologists to develop exceptional reasoning in Claude 4.5, but this was expensive to run. When Chinese models like Kimi copied these capabilities at one tenth the cost, and when mainstream users flooded the platform before Anthropic's planned IPO, the company had to reduce quality to manage computational costs. This represents a significant loss for power users who relied on Claude's superior reasoning abilities.
5. Stewart built a podcast recording application to replace Riverside because he needed API access to automate workflows, which Riverside wanted one thousand dollars monthly to provide. The technical challenge involves syncing audio and video from local recordings on multiple computers with different clocks through a server, then merging them so voices match lip movements. This problem requires understanding complex timing issues across different network conditions and file formats. Stewart has been working through AI psychosis for months on this FFMPEG pipeline problem, illustrating how vibe coding still requires building intuition about technical problems even without traditional coding knowledge.
6. The transition from expert software to prosumer software represents a major opportunity for AI-enabled tools. Expert software like Photoshop, Blender, and terminal interfaces have extreme complexity that intimidates beginners, but AI is making these capabilities accessible through natural language. The reign of specialists is ending as generalists with broad knowledge and curiosity can now build complete applications by leveraging AI to fill technical gaps. This shift particularly benefits entrepreneurs and founders who specialize in getting into difficult situations and figuring them out, even when they originally thought tasks would be easier than they turned out to be.
7. Building applications with AI requires accepting massive time investments beyond initial estimates and developing strategies for overcoming knowledge gaps. Michael estimated his ecommerce content generation app would take months but spent nearly a year working over nine hours daily, while Stewart spent months solving audio-video sync issues. Success requires using tools like deep research to understand how competitors solve problems, maintaining separate planning and coding agents, and learning to ask the right questions. The key insight is that vibe coders can achieve ninety percent of functionality independently, but the final ten percent often requires understanding specific technical concepts that AI cannot intuit without proper context and domain knowledge.
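Insight 5 describes lining up recordings made on computers whose clocks disagree. The episode does not say how Stewart's pipeline handles this, so the following is only a minimal sketch of one common approach, an NTP-style offset estimate taken from timestamped round trips to the server. Every name here (`Ping`, `clock_offset`, `best_offset`, `trim_seconds`) is hypothetical, and the sketch assumes roughly symmetric network delay and ignores clock drift during a long session.

```python
from dataclasses import dataclass


@dataclass
class Ping:
    """One round trip between a recording client and the server (seconds)."""
    client_send: float  # client clock when the request left
    server_recv: float  # server clock when the request arrived
    server_send: float  # server clock when the reply left
    client_recv: float  # client clock when the reply arrived


def clock_offset(ping: Ping) -> float:
    """Estimate (server clock - client clock), assuming symmetric delay."""
    return ((ping.server_recv - ping.client_send)
            + (ping.server_send - ping.client_recv)) / 2.0


def best_offset(pings: list[Ping]) -> float:
    """Trust the ping with the shortest network round trip (least jitter)."""
    def round_trip(p: Ping) -> float:
        return (p.client_recv - p.client_send) - (p.server_send - p.server_recv)
    return clock_offset(min(pings, key=round_trip))


def trim_seconds(starts_on_server_clock: dict[str, float]) -> dict[str, float]:
    """Seconds to cut from the head of each track so all begin together."""
    latest = max(starts_on_server_clock.values())
    return {name: latest - start
            for name, start in starts_on_server_clock.items()}


if __name__ == "__main__":
    # Hypothetical guest whose clock runs 2 s behind the server.
    guest_pings = [
        Ping(100.00, 102.05, 102.06, 100.11),  # 0.10 s on the wire
        Ping(101.00, 103.20, 103.21, 101.31),  # 0.30 s on the wire, noisier
    ]
    guest_offset = best_offset(guest_pings)
    starts = {
        "host": 110.0,                   # host start, already on server clock
        "guest": 110.5 + guest_offset,   # guest's local start, converted
    }
    print(trim_seconds(starts))
```

Each client's local recording start is converted to the server clock by adding its offset, and the resulting trim values could then be passed to whatever tool does the merge, for example as per-input seek offsets.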
SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast
Reconstructing an Akira Ransomware Kill Chain from Perimeter and Endpoint Logs
https://isc.sans.edu/diary/Reconstructing%20an%20Akira%20Ransomware%20Kill%20Chain%20from%20Perimeter%20and%20Endpoint%20Logs/33024
Vaultjacking: One Captured PIN, the Entire Google Password Manager Vault
https://phishu.net/blogs/blog-vaultjacking-phishing-the-google-password-manager-vault-in-the-phishu-framework.html
From poisoned search results to GPU mining: A cryptojacking campaign abusing ScreenConnect and Microsoft .NET utilities
https://www.microsoft.com/en-us/security/blog/2026/05/26/poisoned-search-results-gpu-mining-cryptojacking-campaign-abusing-screenconnect-microsoft-net-utilities/
Mikayla Maki, software engineer at Zed, digs into what makes this Rust-built code editor tick... from GPUI, their GPU-accelerated UI framework with a Tailwind-inspired API, to CRDTs powering real-time live collaboration without merge conflicts. She talks about the Zed 1.0 release, their approach to AI, how the team builds popular features directly into core instead of relying on extensions, and why Rust might be the best language for agentic coding. Plus: native app comeback, GPUI on mobile, and where the framework is heading.
Links
LinkedIn: https://www.linkedin.com/in/mikayla-maki
Bluesky: https://bsky.app/profile/rad.gendervibes.online
GitHub: https://github.com/mikayla-maki
Resources
Zed 1.0 announcement: https://zed.dev/blog/zed-1-0
DeltaDB / Sequoia Series B post: https://zed.dev/blog/sequoia-backs-zed
ACP overview: https://zed.dev/acp
GPUI engineering post: https://zed.dev/blog/leveraging-rust-and-the-gpu-to-render-user-interfaces-at-120fps
Builder.io "Is Zed ready for AI power users in 2026?": https://www.builder.io/blog/zed-ai-2026
Mikayla's RustConf 2025 talk: https://www.youtube.com/watch?v=rpEU9DNbXA4
filtra.io interview with Mikayla: https://filtra.io/rust/interviews/zed-aug-25
We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Fill out our listener survey! https://t.co/oKVAEXipxu
Let us know by sending an email to our producer, Elizabeth, at elizabeth.becz@logrocket.com, or tweet at us at PodRocketPod.
Check out our newsletter! https://blog.logrocket.com/the-replay-newsletter/
Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form, and we'll send you free PodRocket stickers!
What does LogRocket do? LogRocket provides AI-first session replay and analytics that surfaces the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at LogRocket.com. Try LogRocket for free today.
What if the next big AI breakthrough is not a bigger model, but a completely different kind of computer?
Jeff Shainline, co-founder and CEO of Great Sky, joins The Neuron to explain how his team is building brain-inspired AI hardware using superconductors, photonics, and analog computation. Great Sky's architecture, called Superconducting Optoelectronic Networks, or SOENs, is designed to move beyond the traditional GPU roadmap by co-locating memory and processing, communicating with light, and mimicking some of the high-connectivity dynamics found in biological brains.
In this conversation, Jeff breaks down why today's chips can struggle with fast, multimodal inference; why transformers may be powerful but inefficient for some future workloads; how Great Sky's system differs from quantum computing; and why early applications could include fusion reactors, particle physics, video understanding, content moderation, and eventually new model architectures that do not map neatly onto today's hardware.
Subscribe to The Neuron for grounded, practical conversations about where AI is going next, and what actually has to work before the hype becomes real.
In this week's episode of Hands-On Tech, Robert asks Mikah for help choosing a new Windows laptop suited for heavy photo and video editing work, including guidance on GPU and VRAM requirements for his specific software stack, as well as advice on whether switching to a Mac is a viable option after a disastrous previous migration experience. Don't forget to send in your questions for Mikah to answer during the show! hot@twit.tv Host: Mikah Sargent Download or subscribe to Hands-On Tech at https://twit.tv/shows/hands-on-tech Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free audio and video feeds, a members-only Discord, and exclusive content. Join today: https://twit.tv/clubtwit Club TWiT members can discuss this episode and leave feedback in the Club TWiT Discord. Sponsors: outsystems.com/twit shopify.com/hot
Brad's tired of throttling his CPU due to an inadequate heatsink. Will's been spending a lot more time testing PC hardware of late. Between those two things, we thought it was a good time to do a check-in on CPU cooling, and primarily liquid cooling, so we can establish the facts on the ground about modern AIOs and custom loops with an eye toward helping Brad decide what to get. Turns out, there's more to know than ever, and yet it's also never been simpler. We also talk a little about modern air cooling, CPU spikes in Windows, and other stuff! The GamersNexus video on AIO placement: https://www.youtube.com/watch?v=BbGomv195sk Support the Pod! Contribute to the Tech Pod Patreon and get access to our booming Discord, a monthly bonus episode, your name in the credits, and other great benefits! You can support the show at: https://patreon.com/techpod
Today's Post - https://bahnsen.co/3R9QgGV
In this Friday Dividend Cafe, David Bahnsen explains why data centers have become a major economic story, tracing their evolution from 1990s CPU-based server facilities to 2010s cloud-driven hyperscale warehouses and today's AI-focused GPU centers that require far more power, cooling, and infrastructure. He argues data center construction and related spending may have accounted for roughly 80% of last year's GDP growth, even as other real estate and industrial activity has been muted, drawing an analogy to the shale/fracking boom. Bahnsen supports data centers and future productivity potential but opposes federal efforts to override local zoning, warns against cronyism, emphasizes the need for a stronger public relations case, and highlights investment implications in adjacent areas like power, water, natural gas, and pipelines.
00:00 Welcome and Setup
00:52 Why Data Centers Matter
01:43 Three Eras of Data Centers
03:51 AI Shift to GPUs
05:42 Data Centers Driving GDP
08:29 Future Productivity Payoff
09:32 What Growth Is Missing
10:12 Fracking Analogy and Backlash
12:15 Localism Versus Federal Override
14:57 PR Playbook Five Points
17:23 Investing Wisely in the Theme
19:35 Wrap Up and Disclosures
Links mentioned in this episode:
DividendCafe.com
TheBahnsenGroup.com
Download MP3 | Watch Video
Episode Full Timestamps: https://docs.google.com/document/d/e/2PACX-1vT44TUsVZDuKxJqgbhZrh6_hEVMU02wcpfzsQ_7Rfei8DkTcgVVBO3S6sKmPIS8v3-gY5vb0P1CDeeJ/pub
The Hardest Card Game I've Ever Played
ONE HUNDRED DOLLARS?!
Here's The Thing: It's a Bad Version of The Thing
Pokemon Cards Anti-scalping Tech: Answer Our Riddles Three
Seeing The Matrix on Opening Weekend HIT DIFFERENT
Watch full episodes: https://www.youtube.com/@CastleSuperBeastArchive
Reggie In The Lab Limited-time Plushie only available this month! https://www.makeship.com/products/reggie-in-the-lab-plushie
- Visit http://drinkag1.com/SUPERBEAST to get a FREE AG1 Flavor Sampler and a bottle of Vitamin D3+K2 in your AG1 Welcome Kit!
- Head to http://factormeals.com/castle50off and use code castle50off to get 50% off and free daily greens per box
- Sign up for your $1-per-month trial today at http://shopify.com/superbeast
- Invincible VS is out now on PlayStation, Xbox, and PC.
Docket:
PS5 Class Action Lawsuit Targets Sony Over Price Hikes
Jess Cox - This Is A Genius Way To Prevent Scalping
PlayStation Has Started Revealing Public Player Counts - Insider Gaming
Microsoft has confirmed that Windows Update has been downgrading newer GPU drivers that users install manually from Intel, AMD, or NVIDIA websites.
Marvel Tokon: Fighting Souls Was Almost A 1v1 Fighter
PlayStation CEO Hermen Hulst says single-player Sony games won't come to PC going forward
Random Avatar Matches and Avatar Arcade, two brand-new modes for Street Fighter 6, will be added on May 28 alongside the release of Ingrid
GenAI In Games? Most Players Just Don't Care, Study Finds
Capcom Execs Still Excited About GenAI, Is Still Ramping Up Hiring