Activity that uses computers
Erik Torenberg speaks with tech analyst Benedict Evans about the current state of AI, what has changed over the past year, and which questions remain unanswered. The conversation covers coding agents, foundation models, AI infrastructure spending, software economics, and the tension between today's AI excitement and the long-term realities of technology adoption. Evans discusses why coding has emerged as AI's first breakout use case, how previous platform shifts can help frame the current moment, and why many of the most important questions about AI remain unresolved. Along the way, they explore the future of software, enterprise adoption, consumer behavior, and whether AI models ultimately capture value themselves or become infrastructure for the next generation of applications.
Resources:
Follow Benedict Evans on X: https://x.com/benedictevans
Follow Erik Torenberg on X: https://x.com/eriktorenberg
Stay Updated: Find a16z on YouTube, X, and LinkedIn. Listen to the a16z Show on Spotify and Apple Podcasts. Follow our host: https://twitter.com/eriktorenberg
Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
Roman Chernin is Co-Founder and Chief Business Officer of Nebius, one of the fastest-growing AI infrastructure companies in the world. Today, Nebius operates some of the largest AI compute clusters globally and serves leading AI labs, enterprises, and developers. Today, Nebius has a market cap of $57BN. AGENDA: 00:00 — Why AI Infrastructure Is Not a Bubble 05:00 — The Real Impact of Open Source on OpenAI & Anthropic 11:00 — Jevons Paradox: Why Cheaper AI Creates More Demand 13:00 — The Four Layers of AI Infrastructure Explained 19:00 — If Nebius Had 10x More Capacity Tomorrow 26:00 — The Shift from Training to Inference and Agents 31:00 — How Token Factory Cuts AI Costs by 70% 44:00 — Sovereign AI, Europe, and the Future of Model Building 49:00 — Competing Against Hyperscalers with 10x More Capital 59:00 — The Biggest Threat to Nebius Isn't Competition—It's Consolidation
Microsoft Build 2026 announced an end-to-end agentic AI stack. COMPUTEX Taipei confirmed heterogeneous AI infrastructure across ARM, Marvell, Intel, Qualcomm, and NVIDIA. Alphabet raised $80 billion. Cisco Live repositioned the network as the AI platform. Patrick Moorhead and Daniel Newman break it all down alongside earnings from Broadcom, HPE, Palo Alto Networks, and CrowdStrike, plus the token cost conversation, the edge AI push, and what Palantir and Oracle are saying about proprietary data as the real AI moat. The handpicked topics for this week are: Microsoft Build 2026 Announced an End-to-End Agentic AI Stack: Microsoft shipped MAI-Thinking-1, its first homegrown thinking model, alongside Scout, Microsoft IQ, Project Solara, and a Majorana 2 quantum update targeting a 2029 commercial timeline with claims of a 1,000x reliability gain. Pat describes MAI-Thinking-1 as likely better than Sonnet 4.6 in blind testing and delivering close to GPT 5.5 quality at a far lower cost. Scout is Microsoft's first autopilot agent, anchoring the M365 Agent Suite with Office Pilot Agent Mode and Agent 365. Microsoft IQ serves as the context layer, integrating M365, business data, boundary IQ, and web IQ with GitHub Copilot, Foundry, and Copilot Studio. Project Solara is a new Android-based platform built for agent-first devices across transportation, retail, and hospital settings. Microsoft also added 83 Unix commands to the Windows stack. Dan frames Microsoft's real play as distribution, not frontier model development, noting that the open model ecosystem being pulled into the platform will matter more to CFOs managing token costs at scale. (The Decode) The AI Stack Goes Multi-Silicon — COMPUTEX Taipei 2026 Confirms Heterogeneous AI Infrastructure: ARM's AGI CPU is in production with Google moving its TPU head node to ARM, and adding Oracle and ByteDance as new customers. ARM also introduced a new switch, the TT100, and put the 51T CPO switch on stage. 
Marvell received a trillion-dollar company endorsement from Jensen Huang, adding $90 billion in market cap on the comment alone. Intel announced disaggregated inference details and Xeon 6+ Clearwater Forest, its first 18A data center processor. Vista Equity and Cambium Capital announced a NeoCloud called Vector Core Compute, with Xeon 6 handling orchestration, SambaNova RDUs handling decode, and Blackwell GPUs handling pre-fill. Qualcomm's Cristiano Amon announced the Dragonfly data center brand with Snapdragon C details coming at their June investor day. The WSTS raised the 2026 semiconductor TAM forecast by 90% to $1.51 trillion, with Pat noting the market could hit a trillion dollars if memory is excluded entirely. (The Decode) NVIDIA RTX Spark and the Edge AI Push: NVIDIA coordinated with ARM and Microsoft around the RTX Spark at COMPUTEX, with the shared message being that the future of Windows is here. Signal65's Ryan Shrout asked Jensen directly why NVIDIA wants to be in the PC business, given low margins and diminishing returns. Dan frames the answer in the context of devices increasingly becoming mobile data centers, capable of running models at much greater efficiency than cloud delivery. The edge AI conversation is also directly tied to token cost economics: as intelligence delivery moves closer to the device, the cost per token drops significantly. The jury is still out on whether NVIDIA will meaningfully disrupt the PC market, but its influence over OEMs like Lenovo and Dell that depend on it for data center gives it real leverage over SKUs. (The Decode) Token Economics and Frontier Model Cost Pressure: Dan and Pat discuss a substantive shift in how enterprises are thinking about AI consumption costs. Dan argues that "token maxing," the practice of defaulting to the most powerful frontier model for every task, has now effectively peaked, as bills have come due at scale. 
Companies paying for tokens in volume are starting to question whether they can afford the prices that frontier models actually cost to deliver. Pat pushes back, saying the dynamic is still present, but both analysts agree that the market is moving toward a model where token selection is matched to the job, with Microsoft's MOE approach and thinking models positioned to help CFOs manage that economics story. (The Decode) Quantinuum Goes Public at Highest Valuation for an AI Platform: Dan notes that Quantinuum, the Honeywell-spawned quantum company, went public this week at what he calls the highest valuation for an AI platform to date. He flags that IonQ will likely contest that characterization. The broader context is Microsoft entering the quantum conversation with Majorana 2 at Build, a name that has largely been absent from the quantum race, while IBM has received most of the attention. (The Decode) AI CapEx Has Outgrown Cash Flow - Alphabet's $80 Billion Equity Raise: On June 1, Alphabet announced an $80 billion equity capital raise, upsized to $85 billion, structured as $40 billion ATM, $30 billion underwritten, and a $10 billion private placement with Berkshire Hathaway anchoring. Pat frames the questions over CapEx returns as entirely dependent on whether you are an AI boomer or a doomer: if the payback comes, the raise is the right move. If it does not, the math doesn't close. Dan argues the investment is existential, drawing parallels to how infrastructure-first companies have always spent ahead of monetization, and notes that Google's equity is being used as a capital engine that may be more efficient than the debt markets right now. Both analysts flag the downstream implications for Broadcom, MediaTek, and Marvell given the TPU connection. 
(The Decode) The Network Becomes the AI Platform: Cisco Live 2026: Cisco launched Silicon One P200, the Secure AI Factory with NVIDIA and Spectrum X, AgenticOps, MCP-native automation, Cisco IQ, LiveProtect, and folded Astrix Security and Galileo into Splunk under one control plane. Pat identifies Cisco Cloud Control as the biggest announcement of the entire show, pulling together Catalyst, Meraki, Nexus, Firewall, and WebEx under agentic ops that run natively through MCP, with code running directly on smart switches that have x86 processors. Pat also credits Cisco for establishing Silicon One as a credible chip alternative for hyperscalers capable of taking on Tomahawk and Jericho. Dan frames the long-term opportunity as campus and branch enablement when industrial AI and robotics deployments accelerate, arguing that the numerator of AI's economic impact has barely started, as edge deployment spending has not yet begun. (The Decode) The Flip: Did Microsoft Build 2026 Effectively End the OpenAI Partnership? Pat argues the divorce decree has been filed. MAI-Thinking-1 was built with zero distillation from third-party models offering clean enterprise data lineage, with Maia 200 in production plus Anthropic chip supply, which signals vendor hedging. OpenAI is going all-in on AWS, which means you cannot be married to two people, and the full Build stack covering model, OS containment via MXC, agents via Scout and Agent 365, and context via Microsoft IQ removes every architectural dependency on OpenAI. Dan counters that Microsoft is hedging rather than leaving and predicts the partnership will run through the decade. Enterprise Copilot customers are explicitly showing in data that they demand GPT 5.5, internal benchmarks have not been independently validated, and Microsoft stands to make meaningful money from the OpenAI IPO. 
(The Flip) Broadcom Q2 FY26 Earnings: Broadcom posted revenue of $22.19 billion, a narrow miss depending on which consensus data set is used, with EPS of $2.44 beating estimates and AI semis at $10.8 billion. Hock Tan declined to raise the $100 billion full-year AI chip target, and the stock dropped 13% in premarket trading. Q3 guide came in at $29.4 billion. Pat calls the miss a timing issue driven by Google's multi-sourcing across Marvell, MediaTek, and Broadcom rather than a fundamental problem. Dan flags that Hock Tan opened the earnings call by accidentally reading from the 2025 print, calling it "not the best moment." Sell-side re-ratings held in the 500s across Jefferies, Mizuho, and Deutsche Bank despite the drop, with Futurum Equities having it at 600. (Bulls and Bears) Hewlett Packard Enterprise Q2 FY26 Earnings: HPE delivered revenue of $10.68 billion, up 40% year over year, and EPS of $0.79, up 100%. Juniper integration and AI servers both outperformed, and all FY26 guides were raised. The stock jumped 19% after hours before settling into a roughly 15% gain, with HPE up 68% over the last month. Pat frames HPE as a value play rather than a volume play, methodically targeting enterprise and sovereign cloud deals where it can maintain profitability, rather than competing for massive NeoCloud volume. Antonio Neri was clear on the call that the profitability pull-forward is a one-shot deal. Pat and Dan will both be at HPE Discover the week after next to interview Neri and the C-suite. (Bulls and Bears) Palo Alto Networks Q3 FY26 Earnings: Palo Alto posted revenue of $3.0 billion, up 31% year over year, beating the $2.94 billion estimate, with non-GAAP EPS of $0.85, beating the $0.79 to $0.81 range. NGS ARR reached $8.1 billion, up 60% year over year, including $1.6 billion from CyberArk and Chronosphere. RPO hit $18.4 billion, up 36%. Both FY26 revenue and EPS guides were raised. Adjusted FCF margin came in at 38.5% TTM, up 430 basis points. 
The stock jumped 11% immediately after hours, then drifted lower. Pat points to 2,200 platformized customers and 120% net retention as the most important metrics. Dan notes the SaaSpocalypse thesis continues to be wrong. (Bulls and Bears) CrowdStrike Q1 FY27 Earnings and the Proprietary Data Moat Argument: CrowdStrike posted revenue of $1.39 billion with EPS of $1.10 and ARR of $5.51 billion. Net new ARR of $255.8 million set a Q1 record, up 32% year over year. FY27 net new ARR guide was raised by $52 million to a $1.29 billion midpoint, and FY27 revenue was raised to $5.915 to $5.959 billion. A 4-for-1 stock split was announced effective July 2nd. The stock dropped 11% despite the beat after a 64% year-to-date run into earnings. Dan uses the results to make a broader argument against the software disruption thesis, referencing Palantir CEO Alex Karp daring customers to build without him using Anthropic or OpenAI, and Larry Ellison's argument that the real AI value unlock sits in proprietary enterprise data that is not accessible to frontier models. Enterprises with governed, secure, proprietary data will continue to need platforms like CrowdStrike regardless of what frontier models can do. (Bulls and Bears) Six Five Summit is coming. Salesforce CEO Marc Benioff will kick off the event. Register and stay current at sixfivemedia.com/summit. Watch the full video at sixfivemedia.com, and be sure to subscribe to our YouTube channel so you never miss an episode. 
The Decode Microsoft Declares Independence — Build 2026 Ships an End-to-End Agentic AI Stack (MAI-Thinking-1 + Scout + Microsoft IQ + Project Solara + Majorana 2) https://www.theverge.com/tech/941738/microsoft-build-2026-biggest-announcements The AI Stack Goes Multi-Silicon — Computex 2026 Confirms a Heterogeneous AI Infrastructure (ARM + Marvell + Intel ASIC + Qualcomm + RTX Spark); WSTS Raises 2026 Semi TAM Forecast 90% to $1.51T https://www.tomshardware.com/tag/computex AI Capex Has Outgrown Cash Flow — Alphabet's $80B Equity Raise Is the Largest in U.S. Corporate History; Berkshire Anchors $10B https://abc.xyz/investor/news/news-details/2026/Alphabet-Announces-Proposed-80-Billion-Equity-Capital-Raise-to-Expand-AI-Infrastructure-and-Compute-2026-b0myAMewCa/default.aspx The Network Becomes the AI Platform — Cisco Live 2026 Launches Silicon One P200, Secure AI Factory (with NVIDIA), AgenticOps, Astrix Security + Galileo https://www.cisco.com/site/us/en/about/whats-new/index.html The Flip Did Microsoft Build 2026 Effectively End the OpenAI Partnership? MAI-Thinking-1 Beats Sonnet 4.6 in Blind Testing, Microsoft Claims GPT-5.5 Parity at 10x Cost Efficiency — Will MS Quietly Wind Down OpenAI Exclusivity by FY28, or Is OpenAI Still the Frontier Anchor Microsoft Needs? 
FOR: MAI-Thinking-1 beating Sonnet 4.6 in blind preference + GPT-5.5 parity at 10x cost efficiency is a frontier-model independence proof point https://www.latent.space/p/ainews-microsoft-build-mai-thinking Build 2026: Accumulating Evidence of Microsoft's AI Independence — EDN (June 4) — https://www.edn.com/build-2026-accumulating-evidence-of-microsofts-ai-independence/ Maia 200 in production + Anthropic-Maia chip talks signal Microsoft is hedging its inference vendor stack https://blogs.microsoft.com/blog/2026/01/26/maia-200-the-ai-accelerator-built-for-inference/ Microsoft canceled Anthropic's internal software licenses + pivoted to chip-supply pursuit — customer-not-competitor positioning https://www.cnbc.com/2026/05/21/anthropic-microsoft-maia-200-ai-chip.html AGAINST: Enterprise Copilot customers explicitly demand GPT-5.5 — internal benchmarks don't replace the brand https://learn.microsoft.com/en-us/microsoft-365/copilot/release-notes?tabs=all MAI-Thinking-1 benchmarks haven't been third-party verified — Microsoft is the only source https://www.latent.space/p/ainews-microsoft-build-mai-thinking The MS-OpenAI partnership is contractual through 2030+ — unwinding it is impractical and expensive https://blogs.microsoft.com/blog/2026/04/27/the-next-phase-of-the-microsoft-openai-partnership/ Microsoft's actual strategic risk is OpenAI leaving, not MS leaving — Anthropic + OpenAI IPOs make OpenAI exit risk the real concern https://www.anthropic.com/news/confidential-draft-s1-sec Bulls & Bears Broadcom (AVGO) Q2 FY26 ACTUALS — Rev $22.19B (Narrow Miss) + EPS $2.44 (Beat); AI Semis $10.8B; Hock Tan Refuses to Raise the $100B Full-Year AI Chip Target — Stock −13% Premarket; Q3 Guide $29.4B https://www.cnbc.com/2026/06/03/broadcom-avgo-earnings-report-q2-2026.html Hewlett Packard Enterprise (HPE) Q2 FY26 ACTUALS — Blowout: Rev $10.68B (+40%), EPS $0.79 (+100%); Juniper Integration + AI Servers Both Outperform; FY26 Guides All Raised; Stock +19% AH 
https://www.businesswire.com/news/home/20260601866494/en/HPE-Reports-Fiscal-2026-Second-Quarter-Results Palo Alto Networks (PANW) Q3 FY26 ACTUALS — Beat-and-Raise: Rev $3.0B (+31% YoY, Beat $2.94B), Non-GAAP EPS $0.85 (Beat $0.79-0.81); NGS ARR $8.1B (+60% YoY, $1.6B from CyberArk + Chronosphere); RPO $18.4B (+36%); FY26 Revenue + EPS Guides BOTH RAISED; Adj FCF Margin 38.5% TTM (+430 bps); Stock +11% Immediate AH, Then Drifted Lower https://www.paloaltonetworks.com/company/press/2026/palo-alto-networks-reports-fiscal-third-quarter-2026-financial-results CrowdStrike narrowly beats estimates on AI tailwinds, but stock falls 9% — CNBC (June 3) — https://www.cnbc.com/2026/06/03/crowdstrike-crwd-q1-2027-earnings.html
00:01 1999 again: two horror-movie hits and "this time it's different"
00:04 Record leverage and margin debt at an all-time high
00:05 The options chase we haven't seen since 1987
00:08 Short gamma, market makers, and the spiral that produced "Red Friday"
00:14 Nothing worked: only long volatility offered protection
00:17 Lowest correlations in two years and the VIX up 40 percent
00:18 Bank of America: "here be dragons" and unemployment versus inflation
00:20 The car, AI, and the Jevons paradox
00:24 SpaceX as a data center company, not a rocket company
00:30 IPO this week: 1,770 billion and Musk's absolute power
00:31 The S&P refusal versus FTSE, Russell, and MSCI
00:32 The lockup calendar and the day to fear: six months and four days
00:35 Grok versus Groq and the "race to zero" in models
00:40 The Middle East: Trump versus Netanyahu and the oil price
00:44 What people aren't watching right now: bear flattening and carry trades blowing up
00:47 Dollar above 161 and Japanese intervention
00:49 Hudson River Trading and the data center in Norway
00:51 Norway has misunderstood itself: fish, oil, raw power, and now compute
00:53 Refining compute: Skygard, waste heat, and 10x on the power
00:58 Compute as a multiplier: from 10x-ers to 100x-ers
01:00 The budget settlement, Mímir Kristjánsson, and the minimum pensioners
01:05 Performing when everything is possible: focus, curiosity, and flowcharts
01:11 The phone as heroin: reels, a 24-hour reset, and getting your brain back
01:19 The tram killing and situational awareness
01:24 Alerts instead of staring at the screen: gold/silver and momentum
01:35 1998: LTCM, doubled positions, and the bank that lost 900 million
01:45 Andrew Left, Citron, and the short case that became fraud
01:50 Oraclum, superforecasters, and the Norwegian at the top
01:56 The Drewry index, World Cup freight, and FIFA's peace prize for Trump
Open
Golf tourney tomorrow
Choking. Heimlich maneuver
US Bank Fees
$12.50 per $50. That is 25% instantly
So $1000 is 20 * $12.50 = $250, plus interest.
Reinstate the SAT
More than 1,100 University of California math and science professors are urging UC regents to reinstate college-entrance exams, saying that unprepared students are lowering academic standards and draining teaching resources.
Today, more than 90% of schools don't mandate the exams, Feder said.
60 Minutes
Welcome to real life, Scott Pelley. New boss, new style. Work or walk.
Recommendations:
Bill Ackman
Sara Frier
Finance folks should know Codex (previously Excel)
Panthalassa
Markets:
Huge correction today. Tech down 5%+ and S&P 500 down 2.6%. The losses intensified after a robust jobs report raised new worries that the Federal Reserve may need to raise interest rates later this year to fight inflation.
S&P 500 still up 27% and tech 40-60% YoY.
Huge IPOs coming: SpaceX, Anthropic, OpenAI
Cash. Think about your cash investments. Cash is nice. Owning your home is nice.
AI & Datacenters
Google to raise $85 billion
Anthropic IPO
In May, Anthropic raised $65 billion in new funding from investors including Greenoaks, Dragoneer, Altimeter Capital and Sequoia Capital, in a round that valued the company at $965 billion. At the same time, the company said its revenue run-rate had surpassed $47 billion, up from $9 billion at the end of 2025.
LLM usage
Grok: no bueno. Grok and spreadsheets. Oh my.
Gemini: Good.
Claude: BEST.
BTW, OpenAI was suspiciously very negative on SpaceX.
SpaceX
Going public ~June 12. Next Friday!? $75b raise at $1.75T valuation. Float is ~4-5% of total shares. $10-18b must be purchased by index funds. More coming out in next 6 months. Employee lockups. Cap table investors want liquidity.
Great detail here from Alexandra
IPO Education
Hire IBs. Allocate to VIPs and whales. 5% to retail.
Valuation
Over-valued? Valuation is highly relative to time!!!?? $135 price. $300 price? Either way 10-20x in 10 years.
Not investment advice.
AI Opportunity
SpaceX is becoming an AI infrastructure play!!
Another rental of compute from Google to SpaceX. Anthropic and Google are now paying @SpaceX a combined $2.17 billion per month for compute capacity. That's a revenue run rate of $26 billion per year. BIG MONEY.
Jamie Dimon interview of Elon. Elon and Dimon
Another link here from Why SpaceX public now. Play at 4:00 min mark: Why fundraising. Embarking on significant growth phase. 100,000 satellites. BTW, why are datacenters hard if already doing satellites? 100x more bandwidth and ½ latency for v3. He just said that Starlink will be highest bandwidth and lowest latency of ANYTHING!!
AI Datacenters in space. Massive capital endeavor. Hard to build power in the US or on land. US usage is 500GW. To double, would need to 2x # of power plants. BUT if in space can go far beyond Earth.
Manufacturing on the moon and building beyond 1000TW per year of AI Space Compute
DataCenters in Space
Easier than their communication satellites. AI datacenter is EASY
Elections: Why does it take so long to count votes? Could take weeks?
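The two bits of arithmetic in these notes (the bank fee and the compute run rate) can be checked with a few lines of Python. This is only a rough sketch: the helper names `flat_fee_total` and `annual_run_rate` are made up for illustration, the fee is assumed to scale linearly with the amount, and interest is left out because the notes give no rate.

```python
def flat_fee_total(amount: float, fee: float = 12.50, per: float = 50.0) -> float:
    """Total fee when `fee` dollars are charged for every `per` dollars (hypothetical helper)."""
    return (amount / per) * fee


def annual_run_rate(monthly_revenue: float) -> float:
    """Annualize a monthly figure by multiplying by 12 (hypothetical helper)."""
    return monthly_revenue * 12


# Fee as a share of the amount: $12.50 on every $50.
fee_rate = 12.50 / 50.0

# Fee on $1000: 20 blocks of $50, each costing $12.50 (interest not included).
total_fee = flat_fee_total(1000.0)

# $2.17 billion per month, annualized.
run_rate = annual_run_rate(2.17e9)

print(f"Fee rate: {fee_rate:.0%}")
print(f"Fee on $1000: ${total_fee:.2f}")
print(f"Annual run rate: ${run_rate / 1e9:.2f}B")
```

The annualized figure comes out slightly above the rounded $26 billion quoted in the notes, which is consistent with rounding down.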
Blue Alpine Cast - Cryptocurrency, News and Analysis (Bitcoin, Ethereum and co)
Sign up at Kraken now and get a 30 EUR bonus: https://bit.ly/kraken-bonus
The Venice AI token (VVV) is among the strongest AI tokens on the market. I explain the dual-token system: how DIEM works as tokenized compute, why the token burns make VVV deflationary, and how Venice positions itself as a private AI gateway.
Topics & Timestamps:
00:00 Venice AI and its two tokens
01:15 VVV and DIEM: how the tokens work together
01:58 Permanent AI credit with DIEM
03:57 Demand for DIEM and VVV
04:58 Emissions and deflationary tokenomics
05:42 Burn mechanism and real usage
07:34 OpenClaw hype and Venice growth
09:09 User numbers as a VVV bet
Blue Alpine Cast - Cryptocurrency, News and Analysis (Bitcoin, Ethereum and co)
Sign up at Kraken now and get a 30 EUR bonus: https://bit.ly/kraken-bonus
AI tokens are the strongest crypto narrative of 2026. In this episode I give you the overview: Venice AI (VVV) and NEAR compared.
Topics & Timestamps:
00:00 AI tokens in 2026 and Venice AI
00:37 Venice AI: what is behind the project?
01:23 Compute and tokenized computing power
02:21 Private AI as an alternative to Big Tech
03:43 AI agents and the agentic economy
04:55 Venice users, revenue, and business model
07:07 Limits compared with OpenAI and Anthropic
08:54 Encrypted prompts and data privacy
Bloomberg Intelligence Head of Technology Research Mandeep Singh is joined by Nicole Hu, a Silicon Valley technology veteran and GLG expert, to explore the implications of Google's TurboQuant paper and the evolving economics of AI infrastructure. As hyperscalers look to improve the efficiency of AI workloads, advances in quantization are redefining the tradeoffs between memory and compute, with far-reaching implications for cost, latency, and datacenter architecture. They examine how new approaches to model optimization and inference could reshape hardware requirements, deployment strategies, and the next wave of AI investment.
(0:00) OpenAI CFO Sarah Friar joins the show! (0:31) How OpenAI thinks about its IPO timeline (3:31) OpenAI, Anthropic, Google: The AI arms race (7:43) Navigating the compute crunch and AI bottlenecks, device preview! (15:53) OpenAI's economics (26:08) Push into chips, the cloud (29:32) OpenAI's ad business and strategy Thanks to our partners for making this possible! EY - Agentic AI is introducing a new investment discipline. As AI shifts to consumption-based models, EY connects spend to enterprise value. https://www.ey.com/en_us/insights/ai/agentic-ai-token-costs?WT.mc_id=3501318&AA.tsrc=sponsorship NYSE - Thank you to our partner, the New York Stock Exchange - a modern marketplace and exchange for building the future. It all happens at the NYSE. https://www.nyse.com Plaud - Never miss a moment. Plaud, our official wearable AI note-taking partner at All-In Liquidity Summit, captured every insight. https://www.plaud.ai Follow Sarah Friar: https://x.com/thefriley Apply for Summit 2026: https://allin.com/events Follow the besties: https://x.com/chamath https://x.com/Jason https://x.com/DavidSacks https://x.com/friedberg Follow on X: https://x.com/theallinpod Follow on Instagram: https://www.instagram.com/theallinpod Follow on TikTok: https://www.tiktok.com/@theallinpod Follow on LinkedIn: https://www.linkedin.com/company/allinpod Intro Music Credit: https://rb.gy/tppkzl https://x.com/yung_spielburg
I'm excited to work with Microsoft once again as the presenting sponsors of the AI Engineer World's Fair! We'll be streaming live from MS Build today for a special crossover pod with our friends at No Priors and the one and only Satya Nadella. However, we did not hold back with this interview: we asked all the burning questions about uptime and Copilot that we know you have in your minds. Let's go!

For almost two decades, GitHub has been the home of software, where both open source and closed source flow through commits, pull requests, reviews, actions, etc.

This ecosystem flourished as open-source maintainers and contributors continued shipping code for the benefit of the community. However, as coding agents began to ship mass quantities of code, growing 1400% in 2026, it marked a new era that was both extremely exciting and challenging for GitHub.

While these agents help more people ship more projects, they also significantly raise the floor of how much code is shipped, how often it is shipped, and how many people commit code, with orders-of-magnitude multiples in every dimension of GitHub infrastructure.

Now GitHub inevitably experiences more pressure on its infrastructure, which was originally designed around human developers moving at human speed. This has resulted in a very publicly notable uptime story.

So it raises the question of whether current systems around code can absorb what AI produces. Can CI/CD keep up when every idea becomes a build? Can open source maintainers survive floods of AI-generated slop contributions? Can GitHub preserve the human social contract of software while becoming the operating layer for agents?

Which brings us to the perfect person to answer these questions: GitHub COO Kyle Daigle. In this episode, he joins swyx to unpack what happens when AI doesn't just autocomplete code, but starts changing how companies operate, how open source works, how pull requests get reviewed, and how GitHub itself has to scale. 
We go deep on GitHub's internal AI workflows: micro-skills, WorkIQ, MCP, Slack, Teams, email, Copilot workflows, the new Copilot desktop app, CLI, cloud agents, and how Kyle uses agents to look backwards across company context before deciding what to do next. Kyle also reflects on GitHub's history building webhooks, APIs, Actions, npm, Dependabot, and Semmle, why the AI era is breaking GitHub in new ways, how Actions became a general-purpose compute layer, and what Copilot becomes after code completion.

We discuss:
* Kyle's expanded role across GitHub
* How AI got Kyle coding again after years in leadership
* Why GitHub rolls out AI through existing workflows instead of forcing new tools
* WorkIQ, MCP, Slack, Teams, email, and GitHub as company context
* Why massive “mega-skills” are giving way to small, atomic micro-skills
* How AI changes summarization, communications, marketing, and analyst work
* Why former developers in leadership may have a unique advantage in the AI era
* Kyle's “15 agents on Saturday” workflow
* How Kyle built an AI-generated executive presentation for CRO/CFO teams
* Why AI changes the chief of staff role without removing the human work
* GitHub Actions, webhooks, arbitrary code execution, and secure agent compute
* The npm acquisition, supply-chain security, 2FA, and token invalidation
* Slop forks, vendoring, and whether AI agents change dependency management
* What pull requests become when most PRs come from agents
* Prompt requests, vouching, AI review, and trust in open source
* What counts as a “developer” when AI lowers the barrier to building
* GitHub Spark, low-code, and why GitHub refuses to hide the code
* 14x commit growth, Actions load, databases, monorepos, and availability
* Copilot's evolution from completion to CLI, desktop app, cloud agents, and SDK
* Context, memory, rules, and making GitHub “act like Kyle wants it to act”
* Ambient AI, OpenClaw, enterprise security, and the new operating system for agents
* What swyx should ask Satya Nadella about Microsoft's AI future

Kyle Daigle
* LinkedIn: https://www.linkedin.com/in/kyledaigle
* X: https://x.com/kdaigle

Timestamps
00:00:00 Introduction
00:03:36 Why AI Got Kyle Coding Again
00:07:04 Running GitHub with AI: WorkIQ, MCP, Slack, Teams, and Skills
00:15:39 The Golden Age for Former Developers in Leadership
00:17:31 15 Agents on Saturday and AI-Generated Executive Work
00:20:20 How AI Changes the Chief of Staff Role
00:21:45 GitHub's History: Actions, npm, Webhooks, and Open Source
00:28:45 Slop Forks, Vendoring, and AI Dependency Management
00:33:57 Pull Requests, Prompt Requests, and Trust in Agent-Generated Code
00:41:21 GitHub Stars, 200M+ Developers, and the New AI Builder Wave
00:45:15 GitHub Spark, Low-Code, and Why GitHub Still Shows the Code
00:47:38 GitHub's Hardest Era: 14x Growth, Reliability, and Scale
00:59:21 Actions as the Compute Layer for CI/CD and Automation
01:02:04 The State and Future of GitHub Copilot
01:08:24 Ambient AI, Background Agents, and the Future of the SDLC
01:13:09 OpenClaw, Enterprise Security, and the New OS for Agents
01:18:03 Build Announcements, WorkIQ, FoundryIQ, and Microsoft Context
01:21:41 What Should swyx Ask Satya?

Transcript

Introduction: Kyle Daigle's Expanded Role at GitHub and Microsoft

Swyx [00:00:00]: We're here with Kyle Daigle, COO of GitHub. Welcome.

Kyle [00:00:07]: Hey, thanks for having me.

Swyx [00:00:08]: You're not just COO of GitHub. People know you as that. You have a new role.

Kyle [00:00:11]: So I have an expanded role now. I've been working at GitHub for thirteen years and doing all things developer. Joined as a developer myself. And now, I'm also responsible as the CMO of Developer for Microsoft. 
And so all the learnings and passion for developers, how we work with them, how we communicate, and how we bring our products to market, we're also bringing that expertise to the broader Microsoft ecosystem and helping every developer that uses a Microsoft product or would like to have a similar experience to the one they've had with GitHub over the years. So it's a different role in some ways, but it's also just building on the experience that I've had at GitHub: tell the truth, be authentic, show people how to use it, and then let the products speak for themselves. Now I'm just doing that with all of Microsoft.

Swyx [00:01:09]: We'll be releasing this in conjunction with Build. You've got lots of stuff planned, and we can touch on that whenever it's appropriate. I think one of the interesting things is I rarely meet a COO who's also a CMO. You're very outward-facing and you're very confident publicly. That's rare. Do you actually view yourself as COO? What is your thing?

From GitHub Developer to COO/CMO: Building the Platform and Operating GitHub

Kyle [00:01:33]: I think for me, it's been funny. The titles have always felt a little strange to me. I joined GitHub as a developer. I wrote so much of the

Swyx [00:01:46]: Let's bring that up. You wrote the back ends?

Kyle [00:01:48]: I was going through some old photos when folks were talking about how things were being built or how there was a build GitHub. I built webhooks, worked with teams building the API, and built the platform layer. Anything that integrated with GitHub, up until really 2018, I built or ran the engineering teams. And that's where the beginning of my passion always was: helping people build things and deliver them to their customers. And so being a developer, building for developers was always super unique. 
I think as my role expanded, it became about my ability to talk not just to developers, but also to enterprise customers and business leaders, and to be this translation layer. And then through all those years, GitHub has always operated pretty uniquely. Post-pandemic, working remotely was not as novel as it was when GitHub started in 2008. But all that expertise of running remote teams, and doing it well, became this bigger role, ultimately turning into the COO role: how do we operate GitHub in the way that GitHub has always operated, after the Microsoft acquisition? And so on from there. So for me, I still code. I love coding, but the problem has always been people. It's a much harder problem to support our own employees, and a harder problem to communicate to developers and enterprise buyers what we're building and why it matters, 'cause those are two very different messages. And so getting to work in the mix of COO, CMO, and also just being a dev is, I think, what's kept me at GitHub for so long.
AI Workflows for Leadership: Commits, Retrospectives, and Context
Swyx [00:03:40]: Apparently your commits have gone up. What's this? What's going on?
Kyle [00:03:45]: Rui's called me out pretty aggressively. So, as you can imagine, you can see my normal era of being a dev in the 2013, 2014 era, then moving into management, and then ultimately the COO role. I think what you see there is me really getting back to coding thanks to AI. Similar to connecting problems across how to market, how to operate a business, and how to code, I find that building agents and workflows that connect very disparate problems is what's driving this. So some of it is writing software. A lot of it is connecting a ton of different data sources to help me out.
But that is completely me really diving in on the AI side: trying out our tools, trying out everyone's tools, but building for me, building for the non-technical leader (though I'm technical), and figuring out how we're able to use these tools for more than the simple call and response. I think a lot of non-technical folks are told by their employers that they have to use AI, and so everyone uses ChatGPT or Copilot or Claude or whatever. To really get into how this is going to help me out, I find that it's not the “I need to write a blog post” kind of simple example. It's helping people find the workflows of, “Okay, I need you to go through all the PRs today. I need you to go through everything that we've posted online. I need you to go through what we did the last three months. Go through all of my Obsidian notes for any mentions of this, then go through my transcripts at work.” We use Teams, so: using WorkIQ, go call that MCP server, grab all the transcripts, go through all the Slack, and then build me out the plan of what this week's messaging actually was. That's something that was impossible before. For me, I find that most of this AI use is actually less building forward. It's a recursive loop backwards. I'm always looking at what happened first. Go back through the week and tell me what we did, what worked, what didn't work. And then tell me, for the next three or four days, what you would tweak based on this looking backwards and then looking ahead a little bit. I find that to be so much more valuable, especially for non-technical folks, because retrospection is something LLMs are actually very good at: finding all the patterns, pulling them out, and then applying that retrospection to just a couple of days or a short period of time. It's all a bunch of apps that I've built, and I've launched a bunch of internal tools. I use the new GitHub Copilot app, the desktop app, with workflows.
Every time I crack open my laptop, it's running workflows for me. It's just a ton of different stuff, and of course it all ends up on GitHub.
Swyx [00:06:47]: Of course. That's where stuff is hosted. Man, there's so much to ask you. I was going to leave the “how do you run a company with AI” thing to the end, but I have to double click on one thing. You said you are looking back at the week. You're understanding what happens. When you say “we,” that's three thousand people. How?
Rolling Out AI Internally: Skills, CLIs, and Company Context
Kyle [00:07:09]: When we started rolling out AI internally beyond engineering, one of the things that I was really passionate about was that we have to do this in a way where no one has to change how they work. I don't want to have to teach you a tool. I don't want to have to teach you something new. And so we tried out a few tools. Most of them don't work because I've got to get you on board, I've got to teach you how to use it. What we've actually ended up doing is building a set of skills internally. We each have our set of skills, and we've been distributing the CLI even to the non-technical folks. And then effectively we're just giving it access to read just about everything that we're writing. For us, that's usually GitHub, Teams, email, and Slack. So Teams for video chat, generally speaking.
Swyx [00:08:03]: Teams and Slack?
Kyle [00:08:04]: So we use Teams for video communication, but we don't use it for chat. GitHub has a long history, right? We're always...
Swyx [00:08:13]: Also Slack.
Kyle [00:08:14]: ...talking about ChatOps, and everything is built into Slack.
Like every command, every flow.
Swyx [00:08:18]: So even though you have been acquired for, I don't know, eight years now...
Kyle [00:08:22]: We still...
Swyx [00:08:23]: You still use Slack?
Kyle [00:08:23]: It's a purpose-built tool for us, and I think the reality is that moving off of it would be, bluntly, so expensive, simply because all the tooling is baked in with that paradigm. They both have their pros and cons, but they don't work the same way at all. We still use a bunch of different tools because they're the purpose-built tools that we need. And then...
Swyx [00:08:47]: Well, the same doesn't go for the rest of Microsoft, presumably.
Kyle [00:08:50]: Various teams operate...
Swyx [00:08:53]: They make their own decisions.
Kyle [00:08:54]: ...in various ways. I think it just depends on what you're trying to do. But we do work across every tool that we use, and by giving everyone access to all of that context and the new WorkIQ MCP server, which is quite cool if you live in the M365 world, I can ask it all these backwards-facing questions. That's incredibly important for our teams that are working remotely. There's a lot of stuff you miss when you're not in an office, and we are spread out all over the world. So most of that is looking back. And then we post these sorts of findings, or our industry reports, automatically into GitHub issues or discussions: what's happening this morning, today, yesterday. A little automation gets run. We'll use the app. We might use GitHub Actions with our agentic workflows just to do that run, and then we push it into GitHub and we keep having a conversation. So usually for us, on the non-technical side, it's about that looking back and looking forward. And then of course for a lot of those folks, it's also building an app, pushing it to GitHub Pages or pushing it somewhere to host it, et cetera.
But it's just enabling everyone with that power. Instead of “it's going to take me a week to figure this out,” we're going, “Okay, I built a skill. Let's put it into a repo. We'll all share that skill together, and then we'll use the CLI, or now the app, just to run it.”
Micro Skills vs. Mega Skills: How GitHub Uses AI at Work
Swyx [00:10:26]: All right. I think we're going straight into the team management and productivity thing. I think a lot of people are getting various levels of LLM psychosis. How do you manage the bloat of skills? Everyone has their thing, and they're trying to promote it to the rest of their peers in their org, right? And obviously, whoever becomes a skill influencer internally becomes an AI leader of sorts. I assume you have those.
Kyle [00:10:50]: I think we have...
Swyx [00:10:52]: And I assume it's a mess. Yeah.
Kyle [00:10:54]: I think the reality is there are two pieces. First, I think we're ending the era of these massive, beautiful, perfect skills that are just not any of those things. 'Cause for a while, every tweet every day was, go download the skills, the perfectly managed thing to do this entire workflow. I was just with my team this week, and we were talking about the skill side, and what we've found is that we're really talking about these incredibly micro skills that do one thing for us very well, versus a skill that's going to do, like I said, that full report. That doesn't really exist on our side anymore. It's usually a single skill that's going to identify the most important marketing information given any MCP server. Like, this is the most important thing.
It's less about stitching a bunch of tools together and having it produce this mega output, because then weeks go by, months go by, things change, and you want to tweak...
Swyx [00:11:58]: It's brittle.
Kyle [00:11:58]: ...your mega skill, and you're screwed. You can't do that. And so now we're really just talking about the Legos we're using and letting the instruction book be something we're all putting together. Whereas I think a lot of AI skills for a while have been in that mega instruction book style.
Swyx [00:12:15]: I've thought a lot about Postel's law. I don't know if that's a term that means things to folks. It's the idea that you should be liberal in what you accept and strict in what you output, right? And I think that's a good framing principle for skills. These are my skills, obviously on GitHub. You know how some repos in GitHub are special repos? I feel like we should sort of reify the /skills repo and give it some kind of special presentation. Anyway, this is one of those: download anything, transcribe anything, and then you can string together the atomic skills that do one thing well into some kind of orchestration skill that calls other skills. Does that match?
Kyle [00:12:56]: I think so. I think that the...
Swyx [00:13:00]: Summarize anything.
Kyle [00:13:01]: For me, I do communications and PR and analyst relations and marketing and customer activities, and so my “summarize everything” is very different for each one of those contexts. 'Cause if I'm summarizing something for an analyst, that's a very different thing than how I'm probably going to summarize something for a customer meeting or an engagement. So I think that's the difference when we're talking about the tools or the skills I might use on a Saturday when it's just for Kyle.
Yeah, those have an atomic tool underneath, or maybe a skill, and then “Kyle cares about X.” But when we're talking about work and enabling the marketers and communicators there, it's the atomic “this is what good summarization is,” and then “this is what I care about” for marketing, for communications, for whatever. And that, I think, is the interesting matrix problem when we go from a developer set of concerns to all kinds of different professions: what that word means to me is different than what it means to you, which is different than what it means to the analyst or the salesperson. That's the matrix mess that we're still starting to find. It's about these mega skills, but they're all just slight permutations, and those permutations are really important. It's the difference between someone reading this and going, “Did AI make this?” or “This makes total sense, and I would expect this when I'm giving a briefing to Gartner,” or whatever else.
Swyx [00:14:37]: I think the beauty of it, maybe, is that you don't have to be that careful about what goes in there. It doesn't have to fit exactly, as long as it's roughly contained in there. I used to complain about plugin hell, basically. When you have a framework and then you have a hundred things that you need to integrate, GitHub used to be bloated, full of these things. And now we don't need them anymore, 'cause now you just use skills.
Former Developers in Leadership: AI as a Creation Multiplier
Kyle [00:15:00]: And I think the most magical thing is just that I can also crack it open. Yes, I could go change how the plugin is coded, and I could go do that now with AI, but I think there's something more magical about getting a response back and being like, “That's not right,” and then you just crack the skill open, you type English words, and it's different.
That building block is, I think, very unique, once I get everyone to understand how best to make those changes to get the most power out of them.
Swyx [00:15:36]: You have a peer group of people like you. Is there a common framing for this? Something I'm feeling is: is this a golden age for former developers who are now in leadership? Because you can wield the tools, you know the right words, and you're maybe not too close to the details. Doesn't matter. But you're more effective than someone who doesn't come from that background.
Kyle [00:15:59]: I think the secret has always been your ability to identify patterns and solve problems. For folks like myself that don't code day to day anymore, that's what made me successful as a developer, and made me successful as a COO and now CMO. And so now that I have access to go and write code, I'm applying that pattern finding and problem solving, and I still know enough to then go and say, “Oh, I want to make an app, but I don't want to break into jail or create something that's not going to be able to work or be deployed at scale or whatever.” That ability to apply all that additional business knowledge and still code is, I think, what makes it so interesting to me. It's slightly different than some of the other technical leaders that became business leaders and are now going back to their apps and updating them. Good for them. But I think the much more interesting thing is, well, now I have this whole new set of expertise built over ten-plus years. Why not take that and use it as a developer with these AI tools? So I definitely think that makes me more powerful, but I think that's true for every dev as well. Most of the dev friends I still have also have some other underlying skill and passion. There are really talented, very linear computer science software devs, absolutely.
I just find it's the folks that came from a different career, went to school for something else, went off and did this random thing and then became a software dev, or were a dev, did a random thing, and came back. Learning that extra set of information, learning those extra skills, and now having the power of an AI where I can crank up fifteen agents on Saturday while my kids are doing lacrosse: that's really powerful. And I think it gets me back to that feeling of creation, which is very hard to replicate in most other ways. That first time you build an app and you click it and you show someone, that's magical. And so being able to do that not just in code, but across all kinds of different assets, that's huge. Every year we do our revenue planning. We talk about, okay, what is it going to look like for next year? And of course, as you can imagine, there are slideshows everywhere talking about what we're going to talk about, what the narrative is, et cetera. And so I'm like, “Okay, well, I could probably just build something to build this, and that way I don't have to go build the whole spreadsheet or pass it to my team.” So we went through this process, and I got all the information and used the skills I mentioned. I built a little app just so I could look at some of the information in a SQLite database more easily. And I ultimately built this entire presentation without touching any of it, and I was like, “Okay, I'm just going to present this to our CRO, the CFO, their teams,” without mentioning I'd built it with AI. I built a skill to make it look very much not AI driven. Just not pretty.
AI-Generated Presentations, Human Taste, and the Changing Chief of Staff Role
Swyx [00:19:03]: Like a design. Yeah.
Kyle [00:19:03]: Not pretty. But just very clearly not AI.
Kind of like, don't do anything interesting.
Swyx [00:19:08]: Yeah, that is valuable.
Kyle [00:19:08]: Exactly. We did the whole thing through. It used my notes from Obsidian, it used all the context I mentioned before, the plans, and it never came up once that it was AI generated.
Swyx [00:19:20]: It didn't matter.
Kyle [00:19:20]: Never once. It didn't matter. And so now I take...
Swyx [00:19:23]: This is a tool.
Kyle [00:19:23]: I can take that tool and go, “Look, I don't want you to go build slideshows.” They're just helping us share information with each other. If this thing can do it with a little bit of crafting from you, and then we can look at it together, awesome. There's no value in all that extra work. I think the ability to make it look humanly bad, and to build a little app to manipulate the data, is part of that upside for devs that are now in leadership roles. Because, like I said before, that's all a people problem. I know if you've used a coworker or not to build a slide deck, unless you spent a bunch of time to not do it.
Swyx [00:20:07]: I know, but I think there's a certain charm to just being blatantly AI. 'Cause I think you're just being honest: there may be mistakes here that I cannot vouch for. So how much value is there? But anyway, the real question I want to ask is: you were a chief of staff to Thomas. And in the pre-AI world, that would've been a chief of staff job: can you prep me these slides and all that? And now you do it yourself.
Kyle [00:20:35]: I still have a chief of staff. It's sort of the discussion we have every time there's some sort of technology evolution: it's not that the roles all go away, they just change.
And so yeah, I don't have someone spending all their time building out slides and presentations for me, 'cause I don't need that anymore. But now I need that person to be able to go and find all the different connections between humans in those discussions, to help me find out: okay, I should be meeting with this group and this team, they have an opportunity, and I'm going to be in San Francisco today and in Seattle tomorrow. Those sorts of human connection aspects are still incredibly valuable and have always been a big part of that chief of staff role. It's just like how chiefs of staff are no longer opening up letters to process; they're doing emails. It's the same thing. And now they're not building out as many of these presentations, because they have the ability to have an AI take it on and share that with me, and great. Let's keep moving, 'cause it's allowing us to go faster and make better decisions more quickly.
Swyx [00:21:45]: Awesome. Well, we can dive into more productivity insights as we go. I did want to do a little bit of a brief history of Kyle and GitHub, because we started here. And then you were also involved in the NPM acquisition. I do want to touch upon that. And then more recently, I just want to bring us up to the present day, where we're having uptime issues, which, transparently, we've already addressed publicly, but we'll discuss in the pod. Did I miss anything? Any other major highlights? Obviously, it's a lot of years to cover.
A Brief History of GitHub: Webhooks, Actions, Acquisitions, and Platform Evolution
Kyle [00:22:15]: No. I think one highlight was that right before the acquisition closed in 2018, I got to launch the first version of Actions...
Swyx [00:22:27]: Oh.
Kyle [00:22:27]: ...at GitHub Universe. So it was...
Swyx [00:22:29]: They're that young?
Kyle [00:22:30]: It was October of 2018, I think. Yeah.
Yeah.
Swyx [00:22:33]: Gee, Jesus.
Kyle [00:22:34]: I was the engineering leader on that project and got to launch that. And then, yeah, we did acquisitions of NPM, like you said, Semmle, Dependabot, Pull Panda, a whole bunch of things. That was a big...
Swyx [00:22:47]: Pull Panda.
Kyle [00:22:48]: Abi is doing well.
Swyx [00:22:51]: DX. Holy crap.
Kyle [00:22:52]: Did well on DX. And that was the big shift after the acquisition. I had to join the business side.
Swyx [00:23:00]: So I need to hit you on some of these things, 'cause you were there, right? And how often do I get to talk to someone who was there? But yeah, Actions. Is that the number one source of security issues on GitHub?
Kyle [00:23:11]: Oh, I think the number one source of security issues is probably the literal code in everyone's underlying repositories. I would say, going back further than that, if you remember what I had to show in this graph, and I didn't say this before, this is ultimately webhooks.
Swyx [00:23:30]: Yeah.
Kyle [00:23:31]: Circa whatever it was.
Swyx [00:23:32]: It says Hookshot in there.
Kyle [00:23:32]: I forget. Yeah, Hookshot's in there. And back then, it says GitHub Services. Do you see, it says Hookshot FE, for front end, and then it says GitHub Services. GitHub Services, back in the old days, right? We had a repository that was Ruby code, and you could write any Ruby code in there, and then we would execute that on your behalf as a service. That way, if you were trying to integrate with something, we would run it for you.
Swyx [00:23:57]: And of course no containers, 'cause...
Kyle [00:23:58]: No, 'cause it was...
Swyx [00:23:59]: Well, no containers.
Kyle [00:24:00]: 2014. And so there was some isolation, obviously, but it was mostly separation at the server level.
That's one example, along with the very old version of Pages, which ran on its own containerization infrastructure, not on Actions.
Swyx [00:24:15]: Which is an all-time great product.
Kyle [00:24:16]: Pages powers the internet at this point, to some degree. Those were places where, to my knowledge, there clearly were no issues. But it was those things where I was looking at it and going, “Okay, well, we can't be running arbitrary Ruby code on everyone's behalf.” Then containerizing all of that up into Actions now, where, yeah, the containerization is really good. On the pinning, most folks aren't pinning to a particular...
Swyx [00:24:48]: Images.
Kyle [00:24:48]: ...SHA, et cetera, in their workflows, and so that's a big place of pain for folks if they're just doing, similar to any dependency management, just v1 or newest or latest, I think. But that journey, from “Okay, we're just going to run all this arbitrary code, and it'll basically be okay,” to now, where we have really good containerization. We have a new underlying agent containerization service. We're using it under the hood. It's through Azure. They recently announced it: the Azure Dev Compute. It's very fast compute to be able to spin up your own cloud agents or whatnot. We're using it under the hood for some parts of the new...
Swyx [00:25:36]: Microsoft Dev Box?
Kyle [00:25:37]: No. Dev Compute, yeah.
Swyx [00:25:41]: Hmm. Not finding it just yet.
Kyle [00:25:44]: Oh, it's in there somewhere.
Swyx [00:25:46]: All right. Well, we'll cut that out.
Kyle [00:25:47]: Sorry. But with Dev Compute, you can spin up really small VMs really quickly, so when you're doing a tool call...
Swyx [00:25:58]: Same concept.
Kyle [00:25:58]: ...just do it containerized. Exactly.
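Editor's note: the pinning point in this exchange (workflows that reference an action by a moving tag such as v1 or latest rather than a full commit SHA) can be illustrated with a small check. This is a minimal sketch, not GitHub's own tooling. The function name `unpinned_actions`, the regular expressions, and the `example-org/...` action names are all hypothetical, and the sketch assumes the common `uses: owner/repo@ref` form in a workflow file.

```python
import re

# Matches `uses: owner/repo[/path]@ref` lines in a workflow file.
# Local actions (`uses: ./path`) carry no `@ref` and are ignored.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([\w.\-/]+)@(\S+)", re.MULTILINE)

# A full-length Git commit SHA: 40 lowercase hex characters.
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return `owner/repo@ref` strings whose ref is not a full commit SHA."""
    findings = []
    for action, ref in USES_RE.findall(workflow_text):
        if not FULL_SHA_RE.match(ref):
            findings.append(f"{action}@{ref}")
    return findings


# Hypothetical workflow fragment: two moving refs and one pinned SHA.
EXAMPLE = """\
steps:
  - uses: actions/checkout@v4
  - uses: example-org/example-action@main
  - uses: example-org/pinned-action@0123456789abcdef0123456789abcdef01234567
"""

if __name__ == "__main__":
    for finding in unpinned_actions(EXAMPLE):
        print("not pinned to a SHA:", finding)
```

A tag or branch can be moved to point at different code after a workflow is written, while a full commit SHA cannot, which is the same reasoning as pinning any other dependency to an exact version.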
So we're using that, and definitely moving in that direction to protect us from every piece of code that we're ultimately running.
Swyx [00:26:07]: Look, that grows into the full SDLC. Code hosting was just the start, and then it's grown beyond that. Let's talk about NPM, maybe, 'cause I think that's also a very major point in the industry. I do think it was looking for a home. It was kind of struggling as a business, right? I don't know how you would characterize that whole acquisition and how it...
NPM, Package Security, and Keeping the Internet Running
Kyle [00:26:33]: When we were talking to the team, I think the big thing for both of us was to find a way to keep NPM running, which was basically powering the internet then and, to some degree, way more so now. Keep it going, keep it scaling. It was having scaling problems, if I recall, back at that time. They were doing some rewrites. It...
Swyx [00:27:00]: That's cute compared to now.
Kyle [00:27:01]: Well, that's the thing. When I'm talking to folks now, there are so many more underlying uses of NPM than there were back when we had them join GitHub. But that was ultimately the goal. It was really: okay, we used to have Pages, we have the world's code, let's make sure that we can keep NPM running well for the world. And we put a bunch of time and investment into fixing some of the underlying backend, some of which we talked about, some of the manifest work, et cetera. And now we're really trying to bring the security posture of NPM up to speed. But it is a unique challenge, in that every move we make to make it more secure will break a lot of people. Security is paramount, and we take it very seriously. But any time that we have a problem with GitHub, or we make a change that makes us more secure but hurts, there's a snow day for developers, or a really bad fire that they have to go put out.
And so we've changed the 2FA policies. We've changed the way the tokens work. When we find tokens that have been exposed, or potentially exposed, we invalidate them, and...
Swyx [00:28:22]: I love that feature in GitHub. Yeah, it's great.
Kyle [00:28:23]: That creates issues, but that's the thing: we're trying to push the community forward without necessarily doing something that is going to break the contract that's been there for 15 years, or close to it, or some amount of years, on NPM.
Slop Forks, Vendoring, and the Future of Open Source Supply Chains
Swyx [00:28:43]: So now we're talking about open source and publishing. And I think there's something here with what people are calling slop forks, which I think Malte from Vercel is doing. And part of me thinks, well, the way to get past any vulnerabilities is, let's just get rid of the concept of NPM. We only publish source code. And anytime you want to import it, you have your coding agent look at it and then adapt whatever subset you're going to use. You vendor it, but the AI vendors it. Is that realistic? I don't know. Will that solve all our security issues? I don't know.
Kyle [00:29:24]: I don't think it'll solve... So Mitchell Hashimoto was just talking about this today, and I think that in some ways, all things old are new again. Yeah, absolutely, vendoring everything. I do remember 2013, 2014.
Swyx [00:29:42]: Yeah. We must return to...
Kyle [00:29:43]: We were vendoring everything. We were having actual discussions, or at least I remember we were, like, “Should we take this full thing?” “Why is this so big?
We only need this one file.” And so I do think there's something true there. Either taking only what you need, or the dependencies just getting incredibly small over time, will help to some degree, but I don't think it's going to solve the fundamental problem. With an agent looking at the vulnerabilities, time and time again there are a million different ways in which we can convince an agent that this thing is secure, or not, and get it pulled in. Or we can do static code analysis or runtime testing to say whether the code works or not. That is, I think, the step that needs to continue to be invested in. The question is just how much scope. Should it be this enormous project that I'm pulling down, or should it be this piece? Either way, most companies are running some amount of security checking on the packages that they're bringing in or vendoring. That, I think, won't change. That's what Advanced Security does to some degree, and what Socket does to some degree. Everyone is doing a piece of that. How we each do that, especially when we're talking to enterprise customers, is just very different. No one wants one single way to do it. And I think that's always been GitHub's unique position in the world. I talk a lot to maintainers, I talk a lot to folks about this. We rarely start a process and a practice and push it onto the community. We usually wait for the RFC process, socially or literally, with everyone agreeing, and then we'll cement something in. Because otherwise we're...
Maintainers, RFCs, Vouching, and the Social Layer of Trust
Swyx [00:31:35]: That fits your role in the ecosystem, yeah.
Kyle [00:31:36]: We're GitHub. Yeah, we don't want to shape the whole thing. We want it to be figured out.
But how do you balance that role in the industry, to keep everything as secure as possible and make sure that you're not going to be compromised as a human, 'cause that's usually how it all happens, and not create a process or lock us into a flow that you're not going to like, or Mitchell's not going to like, or other open source projects aren't going to like? That's always been a tricky balance for us, and I think something we haven't talked about enough is that we're not going to be able to fix everything for everyone in a way that everyone is going to like. So help us, tell us what is working. When Mitchell was talking about the upvote, the up...
Swyx [00:32:22]: I was going to bring up his thing. Yeah.
Kyle [00:32:23]: I forget what it... Yeah. I was chatting with him about this, and I put it on Twitter, and we also talked over DM, and it was, “We're going to keep working.” But I think the important thing is I do actually want to hear what isn't working for you. And be as specific and clear for your project as possible. To his credit, over the many years that we've known each other through the industry, he's always done that, and I appreciate it, 'cause there are places that we need to fix up, and we hear from him, and we'll fix them up, just like we do for all other kinds of maintainers. But that process, between making those types of improvements and being more secure and creating... I forget what he calls it. It's not the proof process, not the claims process. Do you know what I'm talking about? His projects have a way for you to kind of...
Swyx [00:33:13]: Vouch.
Kyle [00:33:13]: Vouch. Thank you. Yeah. He has the vouch system for saying, “Hey, you should accept my PRs.” That's been...
Swyx [00:33:20]: Just build this into GitHub.
I don't know.

Kyle [00:33:22]: Well, see, that's the thing: you say that, and he and his community really like this, and then I'll go talk to other maintainers globally, and they say, “No, this doesn't work for me.” And that is the tension, but also the beauty of GitHub, depending on which way you look at it. We want to help maintainers, so we create all these tools to let you have more control over how much you take in from AI and PRs. But you can also go use this project, and if it takes off and becomes the mostly standard way, then yeah, we probably wouldn't enforce it, but we would add it in, because that's the flow that we tend to follow.

Swyx [00:34:02]: I hear a lot of people don't know the history of the pull request. That's something that GitHub standardized, basically.

Kyle [00:34:08]: Yeah. It was a very messy process beforehand, and now we have the benefit of it being the process. And now we have to go and figure out the next best process, or what adaptations change, or what a pull request looks like when eighty percent of your PRs are coming from your agents and not from other devs.

Swyx [00:34:31]: Do you like the prompt request idea from Peter?

Kyle [00:34:34]: I think each idea has its merits. I'm not avoiding saying anything good or bad, but I feel like I've seen a version of... we have that, we have entire Thomas' store. Take all the assets of what you've built and put that in. I think that's got great ideas. There are all these various permutations of the PR flow, but I think the reason why there's not a single answer is that ultimately we're trying to codify trust.
We're trying to say, “Okay, if Sean reviews this I'm going to trust it, because you're Sean or you're the senior dev or whatever.” And right now, when we are working in a flow where an agent writes code, another agent reviews code, and then Kyle goes and looks at it, the trust is kind of diffuse. Most of the tools that we're talking about are really about verification flows: we have more assets to look at, so I can probably say whether this is a good PR or not. But that still doesn't solve, I think, the human problem of “I'm looking at a PR and I want to know if I can trust it.” We still tend to use human signals for that: Mitchell approving it, or Kyle approving it, or whatever. And so I think that's why most of these options haven't really solved it: it's a social problem, ultimately. It's a human problem to review it and agree. Or you fully trust the tool and you're imbuing that tool with full trust, which I think in some cases absolutely exists.

AI-Generated PRs, Trust, and the Waymo Analogy

Swyx [00:36:08]: In the same way, there will be a tipping point in society when we don't allow humans to drive anymore because machines are measurably better than humans. I'm looking for that tipping point, right? Mythos is ridiculously expensive. Someday we'll have Mythos on a desktop. I don't know. Does that change the equation?

Kyle [00:36:30]: I think it's more... I took a Waymo here, and I was on my phone and not looking around at all. There are other self-driving vehicles that I would not trust even while staring at the road. And I think that trust is something that is...

Swyx [00:36:48]: Is this a Zoox thing? What is it?

Kyle [00:36:50]: I think that is both. I think that is both. Like...

Swyx [00:36:53]: There's Zoox in this robotaxi. That's it. It's...

Kyle [00:36:56]: Well, depending on what level of self-driving.
But my point is that I strongly believe it's a mixture of verifiable proof (how many accidents, how much data, and so on) and the human aspect of how I feel when I'm in this car, what it tells me, et cetera. And so that's why I think some of our AI tools tend to imbue me with more of that feeling of trust. Even if the data says this is 100% accurate, I feel like it takes more time for us to go, “Should I trust this or not?” And that's in the soft sense of startups with high agency, weekend projects, and open source. Then there are enterprises and regulated industries and everything else, and that is an even harder problem to solve, because even when it is fully verified, not only do you have to have trust from the humans on the team, you probably have to have trust from multinational...

Swyx [00:37:55]: Oh my God.

Kyle [00:37:55]: ...multiple governments around the world and regulating agencies. And so that's where I feel like we have to wait until we tip over, to your point, on the human EQ side of it: “I feel okay, this feels okay, it's been proven enough.” Then the ball will start to roll a lot faster, and we'll end up getting to “Okay, we can trust this,” and feeling good about it in the most difficult of cases.

Reputation, Sponsors, Stars, and Bot Activity on GitHub

Swyx [00:38:18]: If human trust is the thing that matters, I feel like GitHub, as the developer social network, could maybe do more there. Vouchers are one system, but we have star counts, and then we have contributor rights, and that's it. And I feel like there should be more in that space. I don't know if there are any other design decisions there.

Kyle [00:38:37]: I think that one of the things that we don't really expose right now in this sort of way is some degree of hard trust and support. For me, Sponsors is a good example of that.

Swyx [00:38:49]: Ah.

Kyle [00:38:49]: It costs you something.
To prove that I believe in your project and I trust you to some degree, or that I want to support you at the very least.

Swyx [00:38:56]: Solve payments for open source. Why not?

Kyle [00:38:58]: I think that as we keep moving forward, there are more and more projects where I'm personally adding more and more dollars into Sponsors because I want to support them. I've probably never met the maintainers in person, but I know enough of their work that I want to support them. The thing that I don't love about stars or commit counts or anything else is that ultimately, even with all of the de-spamming, deduplication, and anti-abuse work that we do, these are not active social signals. They're passive ones that are ultimately gameable. And you may trust me, but another open source maintainer may not. On what heuristic should you be trusting me? That, I think, is where some of our thinking is right now. What signal from me is most important to you? If you can define that, potentially in an agentic workflow, honestly, that's what we see some of these open source projects do: you have GitHub Actions, and then you have an agentic workflow that's calling AI, and you're setting these rules. For example: Kyle has submitted PRs and had them accepted across any given project, and has a social handle tied to his GitHub account, and that social account is older than a certain age. These are really complex measures that matter to you, 'cause most open source projects have that heuristic built into their heads, if not written down in the contributing guidelines. You could take that, go apply it, and then just say, “Oh, we're not going to accept this PR.” Building something that is malleable to everyone's needs is, I think, a little bit better than going, “Hmm, this account's too young.” Because what happens?
The attackers just go and create a multitude of accounts, and they wait until they age up. Need to have a certain number of stars? That's how star inflation happens. Need to have a certain number of repos...

Swyx [00:40:46]: Oh my God. Yeah.

Kyle [00:40:47]: ...with PRs? They all just create repos and submit PRs to each other, and then they come in and do something nefarious. And so it's hard. It's hard to find the measure. So I think we're looking more at how we can provide you tools so you can choose what's best for you. And of course, we'll give you some standards. But the trust vector gets down to, I don't know, some version of human digital ID, like everyone's been talking about. How do I prove that it's me...

Swyx [00:41:13]: Give me your eyeballs.

Kyle [00:41:14]: ...on the internet? Give me your eyeballs. Exactly.

Swyx [00:41:18]: I've got to keep moving on topics, but obviously I can go all day on this stuff because I've been involved in GitHub and open source my entire professional career. Stars: very superficial. Everyone knows it. But I think time to one hundred thousand stars is the fastest I've ever seen. People just reach that in, I don't know, months. And at the same time I don't trust it, right? How many of these are real or bot or whatever? I don't know how to ask this, but what can we do about it? Like...

Kyle [00:41:49]: Just...

Swyx [00:41:49]: Is stars broken? Is stars fine?

Kyle [00:41:51]: I think there are two pieces. Obviously we're constantly trying to find ways in which users are producing spam, in which I would include only doing star gamification. When we find them, we pluck 'em out and we...

Swyx [00:42:08]: But it's like Whac-A-Mole.

Kyle [00:42:10]: It's a hundred percent like Whac-A-Mole.

Swyx [00:42:11]: There's no way...

Kyle [00:42:11]: Now powered by AI, to be helpful.
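
The per-project rule Kyle describes (accepted PRs elsewhere, a linked social account, a minimum age for that account) can be sketched roughly as below. This is only an illustration under stated assumptions: `Contributor`, `TrustPolicy`, and `evaluate` are hypothetical names, the thresholds are made up, and none of this is a GitHub API. As the conversation notes, any single fixed threshold can be gamed, which is why the thresholds here are left for each project to set.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class Contributor:
    """Hypothetical summary of a PR author; not a real GitHub object."""
    login: str
    accepted_prs_elsewhere: int              # PRs merged in other projects
    linked_social_created: Optional[date]    # creation date of a linked social account, if any


@dataclass(frozen=True)
class TrustPolicy:
    """Thresholds a maintainer might choose; the defaults are arbitrary examples."""
    min_accepted_prs: int = 3
    min_social_age_days: int = 365


def evaluate(c: Contributor, policy: TrustPolicy, today: date) -> Tuple[bool, str]:
    """Return (passes, reason) for one contributor under one project's policy."""
    if c.accepted_prs_elsewhere < policy.min_accepted_prs:
        return False, "too few accepted PRs in other projects"
    if c.linked_social_created is None:
        return False, "no linked social account"
    age_days = (today - c.linked_social_created).days
    if age_days < policy.min_social_age_days:
        return False, "linked social account is too new"
    return True, "meets this project's policy"


if __name__ == "__main__":
    today = date(2025, 1, 1)
    author = Contributor("example-user", 5, date(2020, 1, 1))
    print(evaluate(author, TrustPolicy(), today))
```

A stricter project could pass `TrustPolicy(min_accepted_prs=10)`, while a beginner-friendly one could set both thresholds to zero; the point of the sketch is that the rule lives with the maintainer rather than being one platform-wide cutoff.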
But I think more so what I'm seeing is that a lot of the fastest-time-to-X tends to be because we're now inviting so many more people into software development on GitHub that the zeitgeist is just swarming. And it's...

Swyx [00:42:32]: It's not just developers anymore.

Kyle [00:42:33]: And it's not just you and I. However you want to define what a developer is, it's not just folks who have been coding for a very long time. It's folks that have maybe just started coding, or only joined in since the AI era. And now...

Swyx [00:42:44]: What's the latest Octoverse number? I know eighty million was the last number I remember for developers on GitHub.

Kyle [00:42:50]: Oh, we're over 200 million now.

Swyx [00:42:53]: Okay. Well, so you see?

Kyle [00:42:55]: Over 200 million developers now.

Swyx [00:42:56]: But it's not developers, right? It's people with a GitHub account.

What Counts as a Developer in the AI Era?

Kyle [00:43:00]: So this is the biggest debate that, I would say, everyone loves to have at GitHub at this point. From my perspective, there's clearly a difference between a professional enterprise developer and developers in general. But I think the idea that we should be, I don't know, splitting hairs or segmenting developers in the early era of software development is not worth the time. So...

Swyx [00:43:29]: When you get into gatekeeping...

Kyle [00:43:31]: 100%.

Swyx [00:43:31]: What is a developer?

Kyle [00:43:31]: 100%. 'Cause I wasn't a developer when I started writing code. I was going to...

Swyx [00:43:36]: Oh, no. I cloned a thing seven years before I learned to code. And then I wrote about my learning-to-code journey, and people just called me a fraud 'cause I had a GitHub account. And I said, “Well, no, I just used GitHub, but I didn't know what I was doing.”

Kyle [00:43:49]: I remember that. I remember those sets of posts, and that's b******t.
So I fight very clearly on the line of: if you create code, if you have an idea and you turn it into something where it's “I'm going to run it and use the app right now,” you may still use AI in that moment, but that's okay. At some point you're going to do the next thing. You're going to create a big... You're going to have to learn about this database. You're going to fix a bug, whatever. We're all on the same journey, and those people are also hearing about the great new agent skill package or a new CLI tool or a new whatever. And those projects are going up because you want to be a part of this moment, just like I wanted to be a part of the Ruby community when Ruby was popping off when I started becoming a developer, and now I can just click the star button. And so I think that yes, there's clearly some amount of spamming and gamification that we're working against, but I really think we're just seeing this whole new cohort of folks that are moving from technology to technology because they're not working on a 20-year-old software application. They're working on a side app that they built on the weekend for their friends or for their new idea or whatever. And that's how you see these enormous charts going up and to the right with stars.

Swyx [00:44:59]: I think something that's remarkable is the persistence that GitHub extends to those folks. Usually when I see platforms go into a new audience, they have to have a second platform with a different name that wraps the main platform. But somehow GitHub has been able to persist and extend, and it's friendly and whatever. So it's nice.

Spark, Low-Code, and Always Showing the Code

Kyle [00:45:19]: That's partially why, I think, as we've tried to move into, I don't know, more low-code-y things... so we started working on Spark as a way to build an app and run it.
I think the reality is that anytime we put even a veneer on top of something, we still always show you the code. That's kind of a tenet. We're never going to hide the code from you, ever, because what...

Swyx [00:45:52]: Why would you?

Kyle [00:45:52]: Yeah, that's the whole point. However, I think what we learned with things like Spark is that really the value of Spark for most devs is an easy runtime. You may have a runtime or a host that you're going to use for that, or you just build something and run it, but the package of making that even more simple isn't really needed for folks that are trying to build software and not just trying to build an app, which is a slightly different goal. So I want to get you in, I want to get you comfortable. As someone that did not come into software dev the traditional way, way back, I want anyone to be able to cross that chasm and not be in the... I don't know, I feel like we're still in an era of STEM. I've got a 12-year-old and an eight-year-old, and it's “We've got to get 'em into STEM,” over and over. And I do the things that good parents do. I said, “Oh, you want to do coding?” “Yes, I want to do coding.” Do coding classes. But now they're just not afraid of doing software. And that's, I think, the thing that's honestly kept me at GitHub for so long. Anyone should be able to go and build a thing, just like I can go change a light switch in my house. I'm not going to go into the breaker box, 'cause I'll probably kill myself. But I can go change that light switch. Everyone should be able to go and say, “This fricking app doesn't do what I want.
I want it to work like this.” And that, I think, is what's kept us all connected with GitHub through the years, during the easiest of times and in the hard times: that opportunity of being the home for all developers, and wanting everyone to be able to have that feeling that we've had of “I had an idea, I created it, and holy s**t, here it is.”

Swyx [00:47:37]: Here it is. All right, I'm going to try to do more spicy questions.

GitHub's Hardest Scaling Moment: Growth, Agents, and Uptime

Kyle [00:47:42]: Great.

Swyx [00:47:42]: Is it an easy time now or a hard time?

Kyle [00:47:45]: Oh, at GitHub? It's a hard time. It's a hard time, and also, I was just with my team and I said, “This is also the best and most exciting time that I think I can remember at GitHub.” Because...

Swyx [00:47:57]: Best of times, worst of times. It's never one...

Kyle [00:47:59]: 'Cause we were talking about Octoverse reports. Usually we do an Octoverse report once a year, and we look at the numbers, and we say, “Oh my goodness.” I was at Universe in October saying, “This was the fastest year of growth that we've ever had,” right? And now we're doing more in a month than we did in a year last year.

Swyx [00:48:20]: You're talking about PRs.

Kyle [00:48:21]: Commits.

Swyx [00:48:21]: Commits, yeah.

Kyle [00:48:22]: PRs. You name it. By roughly every measure that we're looking at, there's some amount of growth that is much bigger, and that is breaking our system in new ways, not old ways. Webhooks were always notoriously unreliable over the years.

Swyx [00:48:38]: Whose fault is that?

Kyle [00:48:39]: Not mine anymore, but for a period of time, I'm sure you could pull up a tweet that said, “It was me. I'm sorry.” But that got rewritten at a scale level that is still working and is not having problems today.
Now what we're finding isn't just the simple stuff where folks on Twitter or on the internet say, “Hey, why is this like this?” Sure, there are absolutely silly problems that shouldn't exist. But now we're talking about unique, novel permission problems that happen only at scale, across all different objects or whatever, so that now we have to go rewrite the underlying system. So there are problems that, yeah, caught us off guard, which I think I've said. The growth is astronomical, but we're also making such material progress that I'm excited about what's going to be possible once we've reimagined the underlying foundation layer, or pieces of it at least, when it's not just all of us but all the new people who are becoming developers, and all of their agents, and all the tools working together. Because that'll still happen in that GitHub tool, that GitHub community. But it's a hard day anytime we can't give you what you're looking for. We have the same problem internally. We operate through github.com. Of course, we have backups for our own operations when things go down, but we feel it too. If it's not working, it's not working for us, and that's kind of the promise of dogfooding for GitHub. It's always been true. We're using the same tool you're using. We're not using a super secret version. And so we also need it to be great for us, for our customers, and of course for open source. And now there's an exponential growth of agents doing it too.

Swyx [00:50:32]: I wanted to load this for audio listeners who maybe haven't seen your tweets. So: one billion commits in twenty-five. Now it's two hundred and seventy-five million per week, on pace for fourteen billion this year if growth remains linear. Is that still the pace? I don't know.
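
The figures quoted above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch that takes the spoken numbers at face value (one billion commits for the prior year, 275 million per week now) and assumes the weekly rate simply holds flat for 52 weeks; the variable and function names are illustrative only.

```python
WEEKS_PER_YEAR = 52

commits_last_year = 1_000_000_000    # "one billion commits in twenty-five", as quoted
weekly_commits_now = 275_000_000     # "two hundred and seventy-five million per week", as quoted


def annualized(weekly_rate: int) -> int:
    """Project a full year assuming the weekly rate stays constant."""
    return weekly_rate * WEEKS_PER_YEAR


projected_commits = annualized(weekly_commits_now)
growth_multiple = projected_commits / commits_last_year

print(f"Projected commits this year: {projected_commits:,}")    # 14,300,000,000
print(f"Multiple of last year's total: {growth_multiple:.1f}x")  # 14.3x
```

A constant weekly rate gives roughly 14.3 billion, in line with the "fourteen billion" quoted; if the weekly rate keeps rising, the flat projection would be an underestimate.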
It's been a...

Kyle [00:50:48]: It's speeding...

Swyx [00:50:50]: Roughly.

Kyle [00:50:50]: It's still speeding up.

Swyx [00:50:51]: It's April, so yeah.

Kyle [00:50:51]: Exactly. This was in April.

Swyx [00:50:53]: All right. So basically you have fourteen x growth, right? Year on year. And I think that's a scaling issue. I'm going to try to really steel-man this thing. People have experienced fourteen x growth. They haven't had your downtime. Can we go dig into that? Why? What broke? What are we doing to fix it? Just anything for the community, to reassure them.

Why GitHub Reliability Is Breaking in New Ways

Kyle [00:51:18]: So, like I was saying, there are a couple of different places where we've seen the growth issues. Some of the growth issues, which is why I was talking about pushing hard on more CPUs, are in Actions in particular. More tools, more agents, and more PRs mean more builds, and more builds mean more CPUs. And so we are expanding not just through our data center; obviously we were talking about moving to Azure and adding additional cloud compute, because we simply need more CPUs. Not as much GPUs. We definitely need GPUs too, but now CPUs are becoming a factor.

Swyx [00:51:53]: It's very CPU heavy.

Kyle [00:51:54]: Underneath the hood, when it comes to some of the underlying services, we've been breaking up our database infrastructure over the years so that we have more cognitive separation between the various services. The place where we continue to have pain is in permissioning. Right now many of our permissioning layers sit in a database that we internally call MySQL One, and old Hubbers will know what I'm talking about.
And so we've been pulling things out of MySQL One for many years, and we use Vitess and other technologies to shard, and we do it as one big...

Swyx [00:52:31]: Famous thing: PlanetScale was born from this and...

Kyle [00:52:32]: A hundred percent. Sam, old Hubber and friend. And so it's finding these opportunities to break this out and then do that globally. The other thing that I think is interesting, and both a unique opportunity and tricky, is that we also run everything I just talked about in a black-box container with GitHub Enterprise Server for people that work on-prem. So we take everything I just said and also do it on-prem, and we also do all of that in a data residency setup for customers that need to have their data in a single location. Each of these has unique characteristics around how we're storing that data in MySQL or in a permissioning setup. That's where some of these outages have occurred, where you're seeing it across the board rather than just the one piece...

Swyx [00:53:17]: Filling the database.

Kyle [00:53:17]: ...isn't quite working. Exactly. And so part of it is that. I think there have been some other places too: more projects appear to be moving towards monorepos, whereas we were going the other direction for many years in the industry. Repos were smaller, but there were more of them, and now we're seeing the opposite. Repos are bigger, and there are not fewer of them per se, 'cause there's new growth, but we're just seeing many more big repos. Big monorepos have always had a unique performance problem, because each one is slightly different, particularly if the underlying blobs inside the repos are incredibly big. And so we've done a ton of work that most people probably haven't experienced, unless you're in this monorepo case.
But that Git infrastructure layer improvement does help the overall system, because many of the improvements that make monorepos work better make all repo infrastructure work better. And I could keep going down the line. Another thing is that we're changing how we do, I'll just say job queuing for lack of a better explanation, changing the underlying technologies there.

Swyx [00:54:32]: I spent two years being a job queuing guy, so.

Kyle [00:54:34]: And so it's a little bit piece by piece, and it's mostly because we built everything in a way that assumed, I guess, that the size of the pipe of work was going to remain the same, and there were just going to be more people coming through each of those pipes. But instead, a git push, for example, was generally a certain size, and that is no longer true.

Swyx [00:55:03]: Oh, yeah.

Kyle [00:55:03]: Or...

Swyx [00:55:05]: I push a thousand...

Kyle [00:55:06]: On the average. 100%.

Swyx [00:55:06]: ...thousand-line commits, like, daily.

Kyle [00:55:07]: Same thing with PRs. We've talked about optimizing that and making changes, and there were technology choices that did not work there. It got slow. It was not fast. It did not do what the users wanted. And so we've been reeling that all out and going, “Okay, that's just not right. Let's stop putting good money after bad and do it the right way now.” So it's a lot of things. When I've experienced scale at GitHub historically, it's almost always been two options that we've used. We go vertical scaling, particularly with databases, right? And we go horizontal scaling: oh, we just have more people using this service. Great. We're going to add more servers, and we rack them in our data center, or we use a cloud.
And now we're sort of in a diagonal, where vertical doesn't really work anymore. Horizontal doesn't work either, because we all have some CPU or GPU constraints in the world now, and now we have to go in and crack open services that have been running for 10 or 15 years and go, “Okay, the rules of this service have legitimately changed, and now we have to rewrite them.” None of this is an excuse. We have to do the work. We have to make it better.

Swyx [00:56:22]: Actually, as an infra guy, I'm thinking, “This is one of the most fascinating scaling challenges I've ever seen.”

Kyle [00:56:26]: That's the thing that's hard. When we weren't talking about it publicly, and I came out and said, “Hey, I just want to explain what's going on...” Part of it comes from a very old GitHub ethos, which is: it's our uptime. It's down. I know you're a developer, so you're inclined to want to understand more of what's going on. But at the same time, us going, “Hey, this service didn't perform the way we expected, and now we have to go change it”... we're not trying to hide anything from you i
We're announcing AIEWF speakers this week! Take the AI Engineering Survey!

Today's guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months.

He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, realtime, long-horizon world models is to work on LLMs (perhaps Interaction Models as well…).

Put it this way: In the near term, the next Sora won't be a better video model, but a video agent.

Generative Media may more closely follow the evolution of AI coding, which went from focusing on one-shot output performance and cost to multiturn reasoning and planning models for agents and systems that can plan, edit, test, debug, and submit PRs.

At a certain point, coding models got so good that the only significant next step to improve performance was handling the orchestration of these models.

Now, as the performance of video models increases significantly across realism, consistency, & prompt adherence while becoming more cost efficient, the next evolution of video generation may also be systems that can plan, generate, edit, critique, and iterate across an entire creative task.

In this episode, Ethan joins swyx and Vibhu to unpack what it actually takes to build frontier image and video systems: data, VAEs, diffusion transformers, audio-video alignment, inference speedups, and the hidden cost of storing and moving massive video datasets.
From building NVIDIA's Cosmos world model to joining xAI as Grok Imagine was being built from zero to one, Ethan He has been at the center of some of the most important work in video generation, multimodal models, and real-time world models.

We go deep on Grok Imagine, how a small xAI team shipped its first multimodal video model in three months, why iteration speed matters more than almost anything in model development, and why many of the biggest gains come from fixing tiny bugs in data and training pipelines.

Flipbook: The future of Videomaxxing

Video agents are almost a sure bet to be the trend in the coming year. We end with a glance at what's beyond video agents:

Flipbook caused a minor sensation this year when it was released, but most treat it as a fun demo. Ethan takes it very seriously: with the speed and cost of inference coming down every year, the future of custom video JIT UI is closer than you think.

We talked about why videogen models may become the front end of AI, how generative UI could replace traditional HTML/CSS, why world models need to be real-time, interactive, and long-horizon, and why the future of video generation may depend more on language models and agents than on diffusion alone.

We discuss:
* Why fast iteration mattered more than meetings
* Why small training bugs can drive huge model quality gains
* Why coding models may make compute the bottleneck again
* How image and video models are trained with synthetic captions
* The role of VAEs and latent space in frontier video models
* Why image models are the foundation for video models
* The tradeoff between temporal compression and real-time interactivity
* Flipbook, Neural OS, and the future of generative UI
* Why future interfaces may go from user intent to pixels
* The hidden cost of training video models: storage, egress, and GPU hours
* How step distillation and consistency models (like OpenAI sCM) make video inference orders of magnitude faster
* Grok Imagine 0.9 and large-scale audio-video
generation
* Why audio-video alignment is harder than text-video alignment
* Ethan's definition of world models
* Reference-to-video, video extension, and long-context video generation
* Why xAI's research communication undersells Grok Imagine
* How xAI culture shaped the speed of development
* AI watermarking, SynthID, and detecting generated media
* Why prompt rewriting matters for video models
* Grok Imagine Agent and the rise of video agents
* Why language models may unlock better video generation
* Robotics, physical AI, and embodied world models
* Why Ethan left xAI and shifted focus toward LLMs
* Self-managed context, memory, and the next frontier for language models

Ethan He
* LinkedIn: https://www.linkedin.com/in/ethanhe42
* X: https://x.com/EthanHe_42

Timestamps
00:00:00 Introduction
00:01:25 From NVIDIA Cosmos to xAI
00:03:24 Building Grok Imagine from Zero to One
00:10:07 How Image and Video Models Are Trained
00:18:53 Video Compression, VAEs, and Real-Time Tradeoffs
00:22:10 Generative UI, Flipbook, and Neural OS
00:32:10 The Cost of Training Large Video Models
00:37:04 Distillation, GANs, and Fast Video Inference
00:41:21 Audio-Video Generation and Grok Imagine 0.9
00:48:34 What Makes a World Model?
00:55:51 Reference Videos, Long Context, and Video Memory
01:00:11 xAI Culture, Research, and First-Principles Building
01:09:45 AI Safety, Watermarking, and Prompt Rewriting
01:13:10 Video Agents and AI-Assisted Creation
01:27:32 Why Language Models Unlock Better Video
01:31:15 Robotics, Physical AI, and Embodied World Models
01:32:38 Why Ethan Left xAI
01:34:16 Self-Managed Context and the Future of LLMs
01:38:43 Ethan's Career Path and Closing Thoughts

Transcript

Introduction: Ethan He, Latent Space, and the Path to xAI

Swyx [00:00:00]: We're here in the studio with Ethan He, most recently of xAI. Welcome.

Ethan [00:00:10]: Thank you. Glad to be here.

Swyx [00:00:11]: We're also here with Vibhu.
You first came to us, or joined the Latent Space world, because you were working on Cosmos at NVIDIA, and you did a paper. We loved it. You presented it as well, so thank you for doing that.

Ethan [00:00:23]: Actually, I also presented MoEs twice at Latent Space.

Swyx [00:00:29]: How did you actually hear about us? Did we reach out to you? Is that how it worked?

Ethan [00:00:33]: No, actually, the community. I realized, oh, there is this online community where people talk about AI and also learn from each other through papers every week at the Paper Club. It's very nice.

Ethan [00:00:49]: I learned a lot.

Swyx [00:00:49]: I think three years nonstop. We haven't stopped even on Christmas and New Year's. Many weeks I want to stop, but it keeps going.

Vibhu [00:00:58]: No, that was good. I think you had posted that you worked on a paper, and I was like, “Oh, very cool. We have Paper Club. Present then.”

Vibhu [00:01:04]: But I might have reached out to you after.

Swyx [00:01:05]: Because it's an amateur club, right?

Swyx [00:01:08]: So it's very unusual, but we sometimes have paper authors come by and actually explain the paper. Today we just did the Poolside paper, which was apparently very good.

Vibhu [00:01:18]: Came out yesterday.

Vibhu [00:01:19]: Pretty interesting, right? Fully open. They talk about everything, systems. So it's a good one. We'll recommend people read it.

Swyx [00:01:25]: Bring us up to speed on your transition to xAI, 'cause I actually don't even know when you joined. Just tell the story of the transition.

From NVIDIA Cosmos to xAI: Scaling Video and World Models

Ethan [00:01:34]: Before xAI, I was working on the Cosmos world model at NVIDIA. Cosmos is a giant video foundation model that aims to simulate the world, and it serves as a foundation for all of the roboticists to build on top of.
There, once I built Cosmos 1, I realized that since this thing also has a scaling law similar to language models, we need to scale up the video models further. That's why I realized I needed to move somewhere with much more compute resources. That's how I--
Swyx [00:02:13]: Than NVIDIA?
Vibhu [00:02:14]: The GPU rich came themselves.
Vibhu [00:02:19]: And timeline-wise, when was Cosmos? It was pretty early, right? It was open world model, open paper, everything.
Ethan [00:02:25]: It was the end of 2024.
Vibhu [00:02:28]: End of 2024.
Ethan [00:02:30]: Then in mid-2025, I moved to xAI. I joined about the time when xAI was about to build video models and multimodal models. There was no infra, no data, and no model, and as just a few engineers, we built it in three months and released the first model, Grok Imagine 0.9.
Ethan [00:02:55]: Since then, I kept working on video models and moved more from training to post-training of the video models. For example, reference-to-video, kind of like the cameo feature, and video extensions. And before I left, I worked on a world model, leading a small team focused on real-time, long-horizon video generation.

Building Grok Imagine From Scratch in Three Months
Swyx [00:03:24]: Can you give a rough roadmap? Okay, you're on a brand-new team. Grok previously was only text, or they partnered with BFL for their image gen stuff. What are the building blocks, right? You have compute, data you can procure somewhere. What is the sequence of things that people should think about when setting up a new team?
Vibhu [00:03:43]: Actually even deeper, not just data you can procure. You guys had to go through getting the data too, right?
So you shipped it pretty fast, but yeah--
Swyx [00:03:51]: Three months is like--
Vibhu [00:03:52]: From everything--
Swyx [00:03:52]: Actually very surprisingly fast.
Ethan [00:03:55]: One thing, I'd say, is thanks to my experience at NVIDIA, 'cause the first time, when we were building Cosmos together, we built it for about a year. So this is the second time I've done it, and I roughly have an idea of what to do. I'd say the most important thing is the talent. Everyone was very strong and clever, and very close with each other, working towards a common goal. That speeds things up a lot. You reduce the communication bandwidth among people, and everyone can work towards the same goal. Every day there are not that many meetings on the calendar, maybe a sync a day, and after that it's just all building. It was pretty fun at that time.
Ethan [00:04:47]: Another thing is that xAI has very strong foundations of data infra and model infra, and the support there can help model development a lot. When I look at training models, the most important thing is actually how many iterations you can do per day. The more iterations you can do, the faster you can train the model. So if you have very strong infra and a lot of compute, you can train these models in a very short period of time. That gives you a much larger buffer for errors, and it also gives you the opportunity to spot more bugs.

Iteration Speed, Compute, and Debugging Model Pipelines
Swyx [00:05:46]: What is an iteration? Is it like a few hundred steps, or what are you--
Ethan [00:05:50]: Let's say just training the model: from acquiring new data, maybe designing new algorithms, to training a new model, maybe at smaller scale or--
Swyx [00:06:01]: So cycle time for any hyperparam that you're searching.
Ethan [00:06:04]: Cycle time, and then to eval this model.
Is this model better than my previous iteration?
Ethan [00:06:11]: So--
Swyx [00:06:11]: So before you, someone had already set this up so that you can iterate very quickly.
Ethan [00:06:15]: I think the foundation there is extremely good for developing and researching models.
Ethan [00:06:23]: And often I find, this is kind of boring, but a lot of the improvements do not come from new algorithms. They come from finding small bugs here and there in the data pipeline and in the model training pipeline. Those give the biggest boost to model quality.
Vibhu [00:06:46]: It's interesting, right? So you say it's a small team, less communication bandwidth, but also a lot of quality comes from finding little bugs. It seems counterintuitive, right? If you have a lot of people, you can iron out more of those, but it's interesting to see the other side, right?
Swyx [00:07:00]: I also wonder, do you try using LLMs to look for bugs? I don't know.
Ethan [00:07:05]: I remember at that time it was mid-2025, so the coding models weren't quite there yet. I remember by December 2025 they were extremely good. I was using them at that time, and it was helpful. Sometimes it produced code that was kind of difficult to maintain, even though the first time it built something extremely fast. It gave me spaghetti code, thousands of lines that I couldn't maintain, and the LLM itself couldn't figure out what was wrong and how to improve on top of it. But now I find it much better. Another point I want to bring up here is that now coding models are much more efficient and can help us implement stuff much faster. Compute might become a bottleneck again, because previously, if you wanted to train a new model, say you wanted to generate new synthetic data or write a new algorithm, it might take a few weeks.
And during that period of time, you might not have experiments to run. But now you can build that thing within a few hours, then you can immediately train a model.
Ethan [00:08:24]: Now you have to have enough compute to try all of the ideas. So compute might be the bottleneck of iteration speed again.
Swyx [00:08:36]: Yeah, honestly, I think it's kind of a stressful job, because you're like, “Well, I should be trying everything, and if I'm not, then I'm not doing my job well.”
Vibhu [00:08:48]: There's also the stress that you're eating thousands of GPUs per hour, which is very expensive, and that compute could go to other researchers.
Swyx [00:08:56]: You got the daddy Elon to--
Vibhu [00:08:57]: You got daddy Elon.
Ethan [00:08:59]: It was--
Vibhu [00:09:00]: But there's still a finite amount of compute. You want to use it, you want to use it well, you want more of it.
Ethan [00:09:06]: That was quite stressful indeed. I think one thing is, with coding models now, a lot of these jobs can be automated, which is much better. Second, it's a marathon, so you've got to maintain good health and a regular schedule.
Vibhu [00:09:28]: It's hard to hear that when you ship from zero to one in two months.
Swyx [00:09:32]: And I think the culture at xAI is very famously that people work very hard. One thing I did want to dive into: in the notes that you sent ahead of time, you had specific comments about the cost of video gen training. Presumably this is on Colossus 1, right? The 200-megawatt cluster. Whatever you want to share on that.
Vibhu [00:09:54]: I think there are three things we're talking about, right? There's video gen, and there's also the image gen model that you put out. Do you want to complete the, okay, so zero to one, you have a few months.
Just what are the stages of creating an image gen model?
Swyx [00:10:06]: Oh, yeah, maybe I got distracted.

How Image and Video Models Are Trained: Synthetic Captions, Tokenizers, and VAEs
Vibhu [00:10:07]: Sorry. And then from there, there's video gen, there's audio gen. Would love to get into those next. But what are those first few months like? So small team, a lot of bugs, iterations, but what does it look like? Do we take something off the shelf? Do we just get data and compute? What are the few months like? How do you get to a state-of-the-art image gen model? How do you even start?
Ethan [00:10:28]: I cannot comment specifically on how xAI did it, but it's quite a standard process. I can draw some examples from Cosmos. Mainly, to build a video model, you actually need to build an image model first. And to build these two models, the data you need is 100 percent synthetic pairs of language and image, or language and video. Because on the internet, videos don't naturally associate with text. You can say, oh, on YouTube, you have the title and you have the description and the comments--
Swyx [00:11:11]: Title--
Ethan [00:11:11]: of a video, but usually they're not relevant to the video itself. Say the video is a natural scene of mountains or something, and the title is “I'm so happy today.”
Ethan [00:11:26]: So they have no correlation at all. So the first step is that you have to generate synthetic pairs of language with the videos. You gather videos from the internet, and you use a VLM to caption the videos. For that part, here's a question: how do you get a VLM to begin with? So if there's no--
Swyx [00:11:55]: You, so you fuse the model, right? Like--
Ethan [00:11:57]: Say no VLM exists. How do you generate the text in the beginning, right?
It's impossible.
Swyx [00:12:04]: I see.
Ethan [00:12:05]: In the beginning, you ask humans to describe the video in as much detail as possible. For example, you ask them to describe everything: all objects, all characters, and all interactions and dialogue in the videos. That's in the protocol of Cosmos labeling. The objective we gave to the labelers was that you have to describe the video in as much detail as possible, such that a blind person who hears the blob of text can reconstruct what the video is like in their head.
Swyx [00:12:43]: Video or image? You're talking about images.
Ethan [00:12:44]: Video or image, either one of them.
Vibhu [00:12:47]: This was pretty common when we went from CLIP and DALL-E, right?
Vibhu [00:12:51]: It's all training on really detailed captioning of images. So the same is applied to video, but instead--
Ethan [00:12:57]: Same applied--
Vibhu [00:12:57]: of using a multimodal model to pass in video images and write rich descriptions, you can also--
Swyx [00:13:04]: I think there's this traditional perspective of a supervised, very highly human-curated thing. I feel like there's an unlock with unsupervised, right? Where you have enough to bootstrap that you can just throw a common corpus at it, or whatever. Like unsupervised vision and language pairing, right? Where you just have interspersed image and text and it just learns. To me, that is the VLM breakthrough that is different from the CLIP era, different from the LM era.
Ethan [00:13:36]: It's interesting to see that you kind of need both kinds of data.
Ethan [00:13:41]: For example, for the--
Swyx [00:13:41]: You need it to bootstrap it up. Yeah--
Ethan [00:13:43]: for the generative model training, there's also usually a small percentage of unlabeled data. So the model is instructed to generate a video without any text instruction. That can also help the model generalize.
So after this stage of generating synthetic pairs, one important common step is to train a compressor, or tokenizer, for the images or videos. You can technically, theoretically, train image or video models on pure pixels, but the problem is that it's a lot of tokens. One image at a thousand by a thousand is one million tokens, one million pixels. It's impossible to train a transformer on that. So you need to train a tokenizer, which can go from image to latent space and from latent space back to image.
Swyx [00:14:45]: That's why we named the podcast.
Swyx [00:14:48]: But basically, you're talking about vocabulary size.
Ethan [00:14:50]: So vocab.
Swyx [00:14:51]: And so what is, like, a million is impossible?
Ethan [00:14:54]: In generative models, the vocab is continuous. It's a continuous space. You can think of it as mapping an image to a vector. It's a fixed-length vector, sixteen or forty-eight, something like that. And then you map that vector back to the image space. The mapping is patch-based. So say you have--
Ethan [00:15:22]: a sixteen-by-sixteen patch, and you map that patch of pixels into this latent space.
Swyx [00:15:29]: We've covered this--
Vibhu [00:15:30]: This is like the vision transformers--
Swyx [00:15:32]: VAEs.
Ethan [00:15:33]: VAEs.
Vibhu [00:15:34]: You basically compress your input, you do your generation, you're reasoning, all that generation, in a smaller dimension, and then you project back out.
Swyx [00:15:43]: A VAE is a form of compression, but for me, the patching thing is from ViT, right?
Ethan [00:15:48]: You can make those.
Swyx [00:15:49]: Literally, yeah, the paper is titled like sixteen by sixteen is all you need. Something like that.
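A rough sketch of the arithmetic in this exchange, comparing the raw pixel count of a thousand-by-thousand image with the number of latent vectors left after sixteen-by-sixteen patching. The function names and the choice to pad partial edge patches are assumptions made for illustration; this is not how Cosmos or Grok Imagine actually tokenizes.

```python
import math

def pixel_count(height: int, width: int) -> int:
    """Number of positions if every pixel were its own token."""
    return height * width

def patch_token_count(height: int, width: int, patch: int = 16) -> int:
    """Number of latent vectors with one vector per patch.

    Assumption: non-overlapping square patches, with edges padded
    up to a full patch (hence the ceiling).
    """
    return math.ceil(height / patch) * math.ceil(width / patch)

pixels = pixel_count(1000, 1000)            # the "one million" from the conversation
tokens = patch_token_count(1000, 1000, 16)  # 63 x 63 patches
print(pixels, tokens, round(pixels / tokens))
```

Under these assumptions the sequence shrinks by a factor of roughly 250, which is why the transformer is trained on latents rather than pixels.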
And then I think also, people make a lot of comparisons between this kind of patching and convolutions.
Swyx [00:16:02]: Which is, you're kind of reconstructing the old paradigm with the new.
Ethan [00:16:05]: Actually, in VAEs, there are both convolutional networks and transformers. You can actually do both.
Ethan [00:16:14]: After this VAE, what you've got is latent space tokens and language tokens. Now, the training of the diffusion transformer (generative models usually use diffusion transformers) is actually quite standard. It's very similar to how you train a language transformer model. There's not that much difference. It's just visual tokens in, visual tokens out. The only difference is there's a denoising process. You add random noise to the visual tokens, and then you train the model to remove that noise and generate the clean tokens. At inference, the model can iteratively remove noise starting from 100 percent noise.
Swyx [00:17:12]: And then, to speed things along on the tech tree of diffusion, there's CFG, and there's also latent diffusion, that's somewhere in there. I think somewhere along the line, obviously, Stability and all these other guys pioneered a lot of this architecture. I don't know if you want to get into that or just do the video side. Up to you.

Bootstrapping Video from Image Models and Temporal Compression
Ethan [00:17:37]: After you train such an image model, the reason it's a foundation for video models is that image models are cheaper to train, and they have a much denser connection between language and text. Sorry, language and images. For example, you train on a billion images, and there's a mapping from the text to the image.
And the cost to train the same, a billion texts to a billion videos, is much more expensive because videos naturally have more tokens than images. The diffusion model's understanding of language comes purely from this mapping. So if you don't have enough mapping, if you only train on ten million videos or something, you might not see enough language tokens in your training, so your model does not understand human intention well enough. That's why you first train the image diffusion model, and then you bootstrap the video model from there.
Swyx [00:18:53]: One thing I did want to ask, because I think you're the first video model person I've ever talked to. We've talked to Luma and all those folks. There are all these tricks in video compression where basically, frame by frame, there's not that much difference, so you don't actually have to regenerate or save the whole frame, right? I think MP4 compression or something like that.
Swyx [00:19:16]: Is it tempting to use that? As far as I can tell, everyone just treats it as, “No, we would just generate every frame.” Is that roughly the state of the art?
Ethan [00:19:27]: There are a few different approaches. Let's say first you want to directly use MP4 compression and use that as the tokens for the transformers to train on, right? People have actually tried that, but the main challenge is that the latent space for the MP4 tokens is not very comprehensible for the models. It's extremely hard to train on that. And there's a--
Ethan [00:20:01]: So that's why they created VAEs, which create a more continuous latent space, so the models can understand that latent space and learn from it much more easily. Even within VAEs, there are different difficulties of the latent space.
So you can imagine the simplest, most naive VAE: you have an image, and you just shuffle all of the pixels into a vector. You don't need to train any VAE, right? But that latent space is extremely hard for models to train on top of. That's why there's some debate on how you compress the tokens. You mentioned you can compress frame by frame. You can also compress the temporal dimension.
Ethan [00:20:52]: The difference is that if you compress the temporal dimension, you get a much higher compression rate. There's temporal redundancy between frames, because this frame and the last frame are likely mostly similar, so there's only some small difference. For example, I think in the Wan 2.1 VAE, they have an eight-by-eight-by-four compression rate. So four temporal tokens are compressed into one token. That can save a lot of the context length. If you do it frame by frame, you have to do maybe eight by eight by one, and your context length will be four times larger. That being said, the benefit of per-frame compression, and we might come back to this later, is real-timeness and interactivity. 'Cause if you stream the output of the model frame by frame, the model can respond to any user request immediately. If you have a four-times temporal compression, then--
Swyx [00:22:06]: It might be laggy--
Ethan [00:22:07]: there's a lag there by nature.
Swyx [00:22:10]: So you're very pilled on this. Let's just go ahead and bring it up, 'cause we have the visual prepared anyway. There are some frontier applications of real-time video gen. Flipbook is one of the examples that went viral recently, right? What is Flipbook?

Real-Time Generative UI: Flipbook, Neural OS, and Diffusion Front Ends
Ethan [00:22:23]: Flipbook is kind of like a web browser. You can see it has the web browser UI on top.
The difference is that all of the UI is generated by a generative image model in real time, and everything here is fake. But you can explore inside this imaginary world. Say here we have engineering the Great Pyramid. The model generates this for us to understand how it works, and if we want to navigate around and understand further, we can click on some of the descriptions here, and the model will generate a new subpage describing the details we want to know about.
Swyx [00:23:14]: So it's basically like we're playing a video, but it's pausing for our next interaction, and then it just plays the next thing based on our interaction.
Swyx [00:23:23]: Which is kind of cool.
Vibhu [00:23:25]: And you kind of decide your story. So this was, how do you make a pyramid? The levering technique seemed interesting, right? It shows how you take-- Okay, I want to know what this is--
Swyx [00:23:35]: The demo tweet had more animation between frames.
Vibhu [00:23:38]: I think it's just skipping.
Swyx [00:23:39]: Oh, it's just skipping a lot of frames.
Ethan [00:23:40]: They also have a video mode--
Vibhu [00:23:42]: It takes a lot. There's a lot of people--
Ethan [00:23:42]: but a lot of people are using it.
Ethan [00:23:45]: So it's not available.
Vibhu [00:23:46]: There's a live video stream. We can try.
Swyx [00:23:50]: So this is an example of the kind of future that you see at the extreme. We're obviously not in it today.
Swyx [00:23:56]: But in a world where inference is completely free, this is better than generating code and text?
Ethan [00:24:02]: So this is the final state of where we'll be at for world models, I think. Imagine the internet doesn't exist, and then you type in google.com. What should a model show you? The model can imagine something, and this is what the model imagines. And these web pages completely do not exist.
So I think as inference costs come down, we are going to have generative UI for everything. Think about how coding models work: they write code for a web page, the code might be converted into a binary, and the binary renders the pixels on the screen. In machine learning, every time we have some breakthrough, it's more end-to-end. So why don't we go from user instruction to pixels directly? Generative UI will be user intention to pixels directly. Take email: let's say everyone has the same interface, but I want it slightly different. I want email shown to me like TikTok, so I can swipe left and right through the emails. Or maybe you want something else. We can have completely different things. Or I'm looking at Instagram stories, and I don't like the Like button. I always misclick it. Generative UI resolves that. So it's going to be a revolutionary replacement of the interface. In the future, we might have much more powerful
Ethan [00:25:50]: LLMs and coding models running behind the scenes. And in the front-end, the diffusion model will actually be the front-end that shows stuff to you. That's how I imagine it.
Swyx [00:26:02]: Diffusion front-end, deterministic back-end.
Swyx [00:26:04]: Something like that. I find that very expensive, but--
Vibhu [00:26:08]: I find it interesting you called LLMs writing code on the back end deterministic, but okay.
Swyx [00:26:14]: You write it once--
Vibhu [00:26:15]: Compare it to--
Swyx [00:26:16]: And then you execute.
Ethan [00:26:17]: If you think about the cost, let's say an H100 costs $1 per hour, and you use this eight hours a day for thirty days. Every month you're paying $240, and you'll actually not want to pay for that. That's even more expensive than Claude Code Max.
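The cost estimate here can be sketched in a few lines. The $1 per hour rate, eight hours a day, and thirty days are the figures from the conversation; the yearly halving is the rate Ethan floats in his next remark; the $20 target price and the function names are hypothetical, chosen only to show how the projection would work.

```python
def monthly_gpu_cost(rate_per_hour: float, hours_per_day: float, days: int) -> float:
    """Monthly cost of keeping one GPU busy for a single user."""
    return rate_per_hour * hours_per_day * days

def years_until_below(cost: float, target: float, yearly_factor: float = 0.5) -> int:
    """Years until cost drops to the target, if cost shrinks by yearly_factor per year."""
    years = 0
    while cost > target:
        cost *= yearly_factor
        years += 1
    return years

cost_today = monthly_gpu_cost(1.0, 8, 30)
print(cost_today)  # 240.0, the "two forty" in the conversation

# Hypothetical target: a $20/month consumer price point (an assumption, not from the source).
print(years_until_below(cost_today, 20.0))  # 4 under a yearly halving
```

A faster decline, such as the one Swyx argues for in language models, would shorten that horizon considerably.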
But if you think about compute costs coming down about two times every year, I think that future will likely arrive within a few years.
Vibhu [00:26:49]: It's everything, right? Compute cost comes down, compute gets faster, model gets smarter--
Ethan [00:26:54]: More efficient--
Vibhu [00:26:54]: model gets smaller.
Swyx [00:26:55]: I don't know why you say two times, 'cause I think it's like 100 times. In language models, it is roughly 100 to 1,000 times every 12 to 18 months, for the same given level of LMSYS Elo.
Vibhu [00:27:08]: That's a net of everything, right? That's model performance alongside compute. So it's different from just compute costs coming down. But a very interesting future.
Swyx [00:27:19]: The web designers will have to shout out that accessibility is an issue, right? How do you deal with screen readers or whatever. But yes, this is higher-bandwidth storytelling than anything you can possibly generate with code, right? So I think that's the rough idea.
Ethan [00:27:34]: And I'd like to add that humans naturally have the maximum input bandwidth when we are looking at things, looking at videos, and we have the maximum output bandwidth when we are talking. So in the future, it might be something like: we talk to AI models, and the AI model responds with a generative UI. That would be the maximum input and output bandwidth for interacting with AI models before Neuralink happens.
Vibhu [00:28:06]: And it's also very custom, right? Some people are very visual, some people are not as visual, right? They prefer text. But the best thing about generative UI is that it can also be text.
Swyx [00:28:17]: There's another project that we wanted to highlight, which is Neural OS. Kind of a similar idea, but here you're literally simulating an operating system with a video model.
Swyx [00:28:27]: And you can play Doom, you can do Firefox.
I find this mildly less impressive, obviously, because it's an OS that I can run.
Swyx [00:28:37]: But here everything is imagined.
Vibhu [00:28:40]: I was used to Command+W to close the Firefox tab. It didn't crash. That's why I said--
Swyx [00:28:45]: It's too immersive.
Vibhu [00:28:46]: It's too immersive for me.
Swyx [00:28:47]: Too immersive.
Vibhu [00:28:48]: I wanted to close the tab.
Vibhu [00:28:49]: But yes, I can play generated diffusion.
Swyx [00:28:51]: This is shockingly fast.
Swyx [00:28:54]: Because I remember there was a demo maybe one to two years ago. Someone tried to do a first-person shooter with an image model. There was no consistency. It was very slow. But here it looks realistic. This is Doom.
Vibhu [00:29:07]: I think there are two sides to that, right? Okay, what is running a game? The heavy part of it is actually the game engine, all the lighting, all that stuff, the graphics. This is just kind of video, right? Like, we've solved consistency. This still looks like image generation from a few years ago. There's some temporal consistency, but it's kind of just images stitched together as frames of video. But it's a good visual representation to picture the future you want to see, right? That's what I see in these, more so.
Ethan [00:29:38]: This reminds me of how video models get better and better. If you just look at Neural OS, it feels like a crappy version of the Windows we could have, right? But the difference is that this model is overfitted on existing operating systems. It can generate nothing different from that. But it's actually also similar to video models. When we train these video models and image models, we train them on the internet. There's no imaginary supernatural stuff on the internet. But once we train the model, you can prompt it to generate something supernatural that has never existed in the data set.
So if you train your Neural OS, or neural computer, on standard screen recordings from the entire internet, the model can imagine completely new interfaces for interacting with the computer.
Swyx [00:30:43]: This is one of those things that is magical to me. Usually generalizing out of distribution is bad, but somehow we have learned some kind of internal world model, so that if you say, this, but it looks like rainbows and butterflies, it'll do it, and it will kind of make sense.
Swyx [00:31:03]: So yeah, that's kind of cool. I don't know if there's any more comment on that. I did want to touch a little bit more on the model architecture stuff, which I think you were getting at. It's really fascinating. We don't get a chance to talk about this enough. One of the papers that we covered, we've covered every annual Segment Anything release. And I don't know if you follow-- you're a computer vision guy, so you--
Ethan [00:31:26]: I know--
Swyx [00:31:27]: So they did memory attention, which is kind of interesting. And I always think anything where you can keep some consistency across the temporal dimension is very fascinating. Basically, the CV side bleeding into the video gen side, I think, is underexplored, right? We talk about it for labeling, but actually you can borrow the architecture itself.
Ethan [00:31:50]: There are also completely different approaches, right? You brought up the term world model, so we went from video model to world model. There is diffusion, but there are also other approaches that people are trying. So maybe we get into those after as well?
Swyx [00:32:03]: He has a whole definition of world models and stuff. I feel like we threw a lot at you.
Whatever you want to comment on.

Why Video Models Are Expensive: Storage, I/O, and Training Scale
Ethan [00:32:10]: I think one thing that we should actually come back to is, okay, we were talking about the steps to go from training image gen to a video model. One thing we don't see as much of is, okay, you brought up the delta in training data, right? So--
Ethan [00:32:24]: you won't have as much, and a video model might not generalize, but what is the cost of training a large video model? We know it roughly for LLMs. Okay, even the Poolside thing that came out today, right? It's a Gemma-level model trained on roughly forty trillion tokens on this many H200s over this much time, right? You can see what the exact cost of that is: how many GPU hours, times how much an H200 costs. So how do we do the back-of-the-envelope math for the same thing with video models and image models? How do you break that down? I can share some back-of-the-envelope calculations. Surprisingly, the cost of video models is comparable to language models. Obviously the largest scale is language models; video is maybe comparable to a medium-scale language model. I'd say just storing the videos alone costs a lot. You can maybe look it up on AWS or something.
Ethan [00:33:20]: Say you have a billion videos, and let's just say each video is five megabytes. Then you need five petabytes just to store those videos. Also, remember we talked about using a VAE to compress the videos. Typically you also need to keep those continuous features in your storage, and that's a comparable size to the videos themselves. So just storing these videos and the features is tens of petabytes alone. And--
Swyx [00:33:58]: I just looked up the calculation.
Five petabytes on S3 Standard is $100K per month.
Ethan [00:34:05]: And--
Swyx [00:34:05]: It's comparable--
Ethan [00:34:05]: and you need--
Swyx [00:34:06]: And--
Ethan [00:34:06]: And then tens of petabytes, $200K. And even more expensive is the ingress and egress.
Swyx [00:34:13]: Oh, yeah.
Ethan [00:34:14]: Through the internet. Just to download those videos, I believe it's more expensive on AWS than just storing them.
Swyx [00:34:25]: Storing, yeah.
Ethan [00:34:25]: And for each training run, you probably need to pull them once. If you train multiple times, it's even more than that. So just storage and network, those costs would be a few million per month to store everything, not to mention the GPU cost.
Ethan [00:34:45]: And--
Swyx [00:34:45]: My side tangent: compute rental, GPU rental, is very efficient. On one side, okay, you can be xAI and build your data center. Should we not just build our storage compute as well? Like--
Ethan [00:34:57]: Of course--
Swyx [00:34:57]: cloud cost compared to just--
Ethan [00:34:59]: You save so much--
Swyx [00:35:00]: store. Yeah, exactly.
Swyx [00:35:01]: Especially with egress and stuff. So.
Ethan [00:35:04]: That's a good idea, but it also comes with some of its own challenges.
Swyx [00:35:09]: Of course, of course.
Ethan [00:35:10]: People who build GPU data centers might not expect this much storage. And people who build storage typically just build it somewhere with only CPUs.
Swyx [00:35:23]: I just looked it up. AWS only charges for egress, not ingress. Tier five for five petabytes is $230K.
Ethan [00:35:32]: Even more expensive than the storage.
Swyx [00:35:34]: But storing is per month, right? You check in, then you cannot check out. So it's so cool. It's okay.
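The storage numbers traded in this exchange can be checked with a short calculation. The inputs (one billion videos, five megabytes each, $100K per month for storage, $230K for egress) are the figures quoted in the conversation, not current AWS list prices; the decimal unit sizes and the function names are assumptions for the sketch.

```python
BYTES_PER_MB = 10**6   # assumption: decimal units
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15

def dataset_petabytes(num_videos: int, mb_per_video: float) -> float:
    """Raw dataset size in petabytes."""
    return num_videos * mb_per_video * BYTES_PER_MB / BYTES_PER_PB

def implied_rate_per_gb(total_dollars: float, petabytes: float) -> float:
    """Per-GB price implied by a quoted total for a given volume."""
    return total_dollars / (petabytes * BYTES_PER_PB / BYTES_PER_GB)

raw_pb = dataset_petabytes(1_000_000_000, 5)
print(raw_pb)  # 5.0 petabytes of raw video

# Ethan notes the VAE features take comparable space, roughly doubling the footprint.
with_features_pb = 2 * raw_pb

storage_rate = implied_rate_per_gb(100_000, raw_pb)  # dollars per GB per month
egress_rate = implied_rate_per_gb(230_000, raw_pb)   # dollars per GB per full pull
print(round(storage_rate, 3), round(egress_rate, 3))
```

On these figures, pulling the dataset out once costs more than two months of storing it, which is the point Ethan makes about egress.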
So there's that side.
Ethan [00:35:41]: So the TLDR, my back-of-the-envelope math
Swyx [00:35:42]: Data is larger than you think. Yes.
Ethan [00:35:44]: my back-of-the-envelope math of GPU hours times GPU cost is also very much missing some storage.
Swyx [00:35:49]: You're also basically more IO-bound than normal training.
Swyx [00:35:55]: Yes. 'Cause with data loading, caching everything becomes super important.
Ethan [00:36:00]: In Cosmos, we did a lot of optimizations to make it not IO-bound. Speaking of actually training the model, the GPU cost: if you look at how big the open-source video models are, I think LTX has nineteen B parameters. That's a dense model. People are also exploring MoEs, so it might be twenty B active and hundreds of B total. That's a similar size to medium-sized LLMs. And if you look at the number of tokens, we disclosed that in Cosmos: it's also tens of trillions of visual tokens. So putting this together, the cost of training these video models is actually comparable with LLMs. Not to mention, the infra is slightly different from LLMs, so it might be less efficient to train these models.
Inference Speedups: Step Distillation, Consistency Models, and GANs
Swyx [00:37:04]: Do you get the benefits of traditional diffusion speed-ups? For images, there's LCM, LoRAs for fine-tuning. There's a lot of stuff that's been
Ethan [00:37:15]: Flow matching.
Swyx [00:37:16]: There's flow matching. There's a lot of stuff that's been done. Is there some overlap that applies to diffusion on the inference side and stuff, or?
Ethan [00:37:23]: The inference side is a completely different story.
Ethan [00:37:28]: I think for the training side, it might be a little bit hard to reduce that cost. For the inference side, the biggest gain is from the distillation of these models.
It's called step distillation, which is slightly different from knowledge distillation in LLMs. Typically, for flow matching models, you need like 100 steps or something. A diffusion model needs even more, like 1,000 steps, to generate a good image or video. Step distillation tries to learn to generate in fewer steps from the model itself. It's kind of like this: you use the full model to generate in 100 steps, and then you take a model that only generates in 10 steps and let that model learn from the perfect one.
Ethan [00:38:25]: Why this works
Swyx [00:38:27]: Strong to weak, seemingly.
Ethan [00:38:28]: It is. It's kind of
Swyx [00:38:29]: Distillation
Ethan [00:38:29]: kind of like strong to weak. From the modeling perspective, the strong model, the teacher model, is trying to model the images and videos of the internet, and that distribution is extremely complex. But the step-distilled model is just trying to learn from the teacher. The teacher is a model, and its size is fixed, so its distribution is much simpler than the whole internet. That's the intuition I have for why step distillation can work. So usually when these models are served in production, they only run a few steps. In Cosmos, I believe we have four-step and eight-step versions. If you do some simpler task, like image-to-image translation, it can run in even fewer steps, like one step in Cosmos Transfer.
Swyx [00:39:22]: I think this is the same intuition that guides a lot of the consistency model work. I sent you a link for SCM. I don't know if you covered that. To me, that was actually one of the most impressive papers I've ever seen from OpenAI.
Swyx [00:39:34]: That this is the unifying grand concept of consistency models. I don't know if you have any comments on this.
Ethan [00:39:41]: So there are a few different approaches,
Swyx [00:39:46]: Oh, yeah. Here it is.
Swyx [00:39:47]: Two steps versus twenty or 100 steps, whatever.
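As a toy illustration of the teacher and student setup described above, the sketch below distills a 100-step sampler into a 4-step one. Everything in it is an assumption made for illustration: the "teacher" is plain Euler integration of the one-dimensional ODE dx/dt = -x, the "student" is a single learned multiplier applied at each of its four steps, and training is hand-written gradient descent on the squared error between the two outputs. Real step distillation for flow-matching or diffusion models trains a neural network and typically uses richer objectives; this only shows a few-step model regressing onto a many-step model's output.

```python
import random


def teacher_sample(x0: float, steps: int = 100) -> float:
    """Many-step sampler: Euler integration of dx/dt = -x from t=0 to t=1."""
    x = x0
    dt = 1.0 / steps
    for _ in range(steps):
        x = x + dt * (-x)
    return x


def student_sample(x0: float, a: float, steps: int = 4) -> float:
    """Few-step sampler: each step multiplies by one learned factor `a`."""
    x = x0
    for _ in range(steps):
        x = a * x
    return x


def distill(steps: int = 4, iters: int = 2000, lr: float = 0.05, seed: int = 0) -> float:
    """Fit the student's factor so its output matches the teacher's output."""
    rng = random.Random(seed)
    a = 1.0
    for _ in range(iters):
        x0 = rng.uniform(-1.0, 1.0)            # a random starting sample
        target = teacher_sample(x0)             # expensive 100-step result
        pred = student_sample(x0, a, steps)     # cheap few-step result
        # Gradient of (pred - target)^2 with respect to a, where
        # pred = a**steps * x0.
        grad = 2.0 * (pred - target) * steps * a ** (steps - 1) * x0
        a -= lr * grad
    return a


if __name__ == "__main__":
    a = distill()
    print("learned per-step factor:", a)
    print("teacher (100 steps):", teacher_sample(1.0))
    print("student (4 steps):  ", student_sample(1.0, a))
```

The student never sees the underlying dynamics, only the teacher's outputs, which mirrors the intuition in the conversation that the teacher's distribution is a simpler target than the raw data.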
It's already done.
Ethan [00:39:52]: So there are a few different approaches, for example consistency models. Actually, we shouldn't forget GANs. The GAN was the OG of
Swyx [00:40:05]: OG
Ethan [00:40:05]: step distillation, 'cause it was trained with just one step to begin with. For example, there's distribution matching distillation, which uses a GAN as one of the losses for distillation. A GAN just tells you, “Hey, generate an image,” and then
Ethan [00:40:31]: it has a discriminator to tell whether this image is real or not. So the model just needs to learn one mode of the distribution, not the full distribution. In regular training, the model is asked to reconstruct the ground-truth image from the internet, which is extremely hard. When you're training a GAN, it's a one-step process. It's just, “Hey, you generate an image. Does this image look as real as an image from the internet?” Which is a much simpler task. People typically combine a lot of these approaches, like consistency models, distribution matching, and GANs, and that's how we get these few-step models.
Audio-Video Generation and Time Alignment
Swyx [00:41:21]: Then there's one step I wanted to add, which is audio and video.
Ethan [00:41:26]: So, Grok Imagine zero point nine, I believe it's the first audio-video joint model deployed at a large scale. So
Swyx [00:41:39]: And that was your first model?
Ethan [00:41:40]: That was Grok Imagine's first model. It's audio-video joint generation. I think the hard part is the modality alignment, 'cause before this joint model, we had text-to-video alignment. We have this correspondence between text and video. Typically, most of the VLMs understand images and, more rarely, videos, and they mostly don't understand audio.
And if you look at audio generation on the LLM side, you can talk to them perfectly fine, but if you ask them to sing a song or something, it's typically not very good. Also, they don't have music either. The hard part is that audio actually has two components: a discrete component and a continuous component. The discrete component is like the language.
Ethan [00:42:44]: So when we speak, it's just some
Swyx [00:42:47]: It's an ASR issue, yeah.
Ethan [00:42:49]: It's text tokens with some characteristics, I would say.
Ethan [00:42:54]: But music
Swyx [00:42:56]: I think the speech guys would disagree with this.
Swyx [00:42:57]: Like disfluencies and then,
Vibhu [00:43:00]: There's tones. You can get angry.
Ethan [00:43:01]: Well, I say largely.
Ethan [00:43:03]: But music is completely different. It's very continuous, and you cannot model it like discrete tokens in language models. This is the hard part for models, not to mention we have to align text, video, and audio together.
Ethan [00:43:26]: So
Vibhu [00:43:26]: How?
Ethan [00:43:28]: So some significant challenges. First, like we talked about, most of the VLMs cannot understand audio.
Ethan [00:43:39]: So you have to have some way to do synthetic data generation for audio. You have to caption the audio, and that involves a lot of synthetic data and human data effort. And it turns out most of the LLMs are very bad at recognizing the beat, the tone, and the details of music. They can give some general prediction of which song this is, but it's very hard for them to describe the details of the music. Like we mentioned in image generation, you have to describe the image in as much detail as possible so that someone blind can reconstruct it.
So here it's like someone
Vibhu [00:44:32]: Deaf
Ethan [00:44:32]: someone deaf can reconstruct how the music sounds without actually listening to it. Maybe you can think of it as needing to have what they call the script.
Vibhu [00:44:49]: Subtitles, yeah.
Ethan [00:44:49]: You gotta have all the details of the music and the dialogue.
Vibhu [00:44:55]: So is the challenge there typically stuff like music and audio, or is there a baseline? Okay, there's enough data where we can understand narration and conversation, but is it the nuances in audio where you hit all the data issues, or is it just from stage zero, you do it all, right?
Ethan [00:45:15]: So one important thing is the alignment. For video and audio, the model has to have a time-based alignment: at which time step the video and the audio tokens correspond to each other. But we actually don't have this kind of alignment for most of the other modalities. If you think about text and image, or text and video, they are loosely aligned. You can have a description of what's going on in the video, but you typically don't have an exact description: oh, at time step one second, what happened?
Vibhu [00:46:02]: It's very
Ethan [00:46:03]: At time step two seconds, what happened?
Vibhu [00:46:03]: coarse. Yeah.
Swyx [00:46:05]: So what was the ideal time step? You have to ablate it, and then it's like four seconds or something.
Ethan [00:46:09]: That comes down to how you design the model to be aware of time as a modality. So the model is time-aware. And that's something pretty unique if you think about LLMs. If you ask an LLM to complete a task, they will say, “Oh, this task will probably take twelve hours to complete,” and they come back in one hour.
And they say, “I've already spent two days on this and I've exhausted everything.”
Ethan [00:46:47]: So the LLMs themselves don't have a sense of time there.
Vibhu [00:46:53]: I actually don't think that's just them not having a sense of time. I think it's somewhat grounded, right?
Vibhu [00:46:58]: Like, you tell someone, “Okay, go work on this feature. Go implement this,” and there's a general understanding you would have of how long that would take without LLMs working at LLM speed, right? Think back two years ago: if I tell you to build me a new front end for Latent Space, with a search bar and all this, you'll estimate that it'll take a few days, right?
Vibhu [00:47:19]: So you tell an LLM, “Go build this,” and it says, “It'll take me a few days.” But I think it's somewhat grounded, as opposed to them not having the best... I'm not saying that they have a great understanding, but with that example you can see where it comes from, right? They're trained on all of the text.
Swyx [00:47:35]: They're trying to estimate what a human would say.
Vibhu [00:47:37]: Because that's what the data kind of represents. It's not them
Ethan [00:47:41]: It came from the corpus on the internet. People have an estimate of how much time.
Vibhu [00:47:45]: And not even just in direct training samples, right? Just your world understanding, from tokens, of how long stuff takes, right? Go read a book. It'll take you a while, right?
Vibhu [00:47:56]: Even if you do nothing but read a book, it takes a few days. So the LLM says, “I read it. It took me a few hours.”
Vibhu [00:48:01]: “It'll take me a few hours to go through this research.” But this is a tangent.
Swyx [00:48:05]: Somewhat, yeah.
Swyx [00:48:06]: This is a train of thought I haven't really expressed until now, which is basically that a full world model must also be recursive, meaning that the participant in the world model must also be aware that they have a world model.
Which is like this whole recursive thing down the line. But yes, and that the world model can be wrong and that they need to update it and blah. Yeah. We've argued this on the newsletter as well, that there need to be sort of recursive or adversarial world models.
World Models: Real-Time, Long-Horizon, Interactive Video
Vibhu [00:48:34]: Just to ask, how do you define world model?
Swyx [00:48:38]: Oh, yeah, let's go there.
Ethan [00:48:40]: So
Vibhu [00:48:40]: So just for context, we talked about video generation, and if you say there's a distinction between that and world models, what's your definition? How do you see the two?
Ethan [00:48:53]: Disclaimer: I'm not going to debate what a world model is. There are many definitions, so I'll just talk about my definition. Since I came from the multimodal domain, I'm mainly talking from the video side. A world model is real-time, interactive, long-horizon video. So there are three parts. Let's talk about them one by one. First, interaction. We just look at Facebook and neural computer. For the interaction part, a world model can allow you to interact with it through keyboard, mouse, and maybe also voice. Each of these is a modality. You can interact with the model, and the model should respond reasonably. The second part is real time. Say you move your mouse, and say the world model is generating a game: how fast can the game respond? If you're a professional CS:GO player, they might say, oh, you have to respond... He's a beginner. ...within sub-ten milliseconds or... Yeah. ...even less. So that's not most of the... No, sixty FPS. Let's go. Oh, three hundred FPS. Oh, five hundred FPS. Wait. Okay, yeah. I didn't do the math, but yeah, okay. Yeah, three hundred FPS, that's about three milliseconds. So you have to respond... Oh, s**t. Okay. Yeah
Ethan [00:50:29]: within a millisecond. Most of the video models cannot do that.
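The frame rates thrown around above translate directly into a per-frame latency budget. A minimal sketch of that conversion (the function name is illustrative):

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame at the given frame rate, in ms."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    for fps in (60, 300, 500):
        print(f"{fps} FPS -> {frame_budget_ms(fps):.2f} ms per frame")
```

At 60, 300, and 500 FPS this works out to roughly 16.67 ms, 3.33 ms, and 2.00 ms per frame, which is the budget an interactive model would have to generate each frame within.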
Yeah. But if you have a video model that is, say, a digital human, the response time might be more generous. Typically, for real-time voice interaction, it's like two hundred milliseconds. So that's much more generous. But even two hundred milliseconds is pretty tricky, 'cause remember we mentioned
Ethan [00:51:01]: you have this temporal compression coming from the VAE. If you don't compress the temporal dimension, your sequence length is going to explode. So if you want real-timeness in your model, what you have to deal with is a context problem. And the third part is long horizon, 'cause you're not going to play a video game for just a few seconds, and most video models only do a few seconds. We're going to play for minutes, hours. The model has to be able to generate long-form content.
Ethan [00:51:42]: So putting these three together, it's real-time, long-horizon, interactive video. I think the final state will be, for example, a video version of Playbook, where you can interact with a neural computer. You move your mouse, you click on the generative interface, and it replies to you through pixels generated in real time. But it's a very long way to get there. So one of the first steps at Grok Imagine, where I led a small world model team, was to build video extension. Video extension is the first step of interactivity. Yeah. It's the first step. Yeah. So it's the first step... You have it here, video editing, yeah. Yeah. Yeah. It's the first step because this unlocks long-horizon videos. Typically, for most video generation models, you give it a prompt or an image as an initial frame, you generate a video, and that's it. That's one time, done. And some creators would try to use the last frame as the first frame for the second video.
Sometimes it works, but if you do it a few times, the quality decreases. And... It doesn't have that context... Yeah. ...over the full video, so the temporal... Yeah, exactly. Yeah, 'cause you only gave it the last frame, of course, right? Yeah. Exactly. And... it's actually a pretty fun hack. If you've seen like... Oh, no, he's saying something better. Yeah. And for example, Veo: I remember Veo 3 has about a second of context from the last video. It is slightly better than using the last frame, but it has a similar problem, that the quality decreases. If you extend a few times to one minute, the video quality looks much worse than the first video. Another problem is that the model doesn't have long-range knowledge of what happened before. Say it generates some dialogue, two people speaking: their voices might change over time, especially if the one-second conditioning does not cover the previous context. So these are the core challenges. The Grok Imagine video extension has the historical context of all of the previously generated videos. It has the context of who is speaking, what objects have appeared, and everything, and uses that to generate the next video. If we do this naively, just putting all of the previous history's video tokens into the context, the context length will easily explode. Especially for video models, that can be a few million tokens of context, I would imagine. Context length. Yes. Yeah.
Swyx [00:54:58]: Let's run with that.
Ethan [00:54:59]: For example, in Cosmos, I think just five seconds of video is like fifty K or sixty K tokens. So if you do fifty seconds, that's five hundred K tokens. If you do longer than that, it easily explodes. This long-horizon problem was the first step we were trying to solve toward world models. It turns out people love video extension.
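The token arithmetic mentioned for Cosmos can be made explicit. The sketch below assumes the token count grows linearly with duration at the rough rate quoted above (about fifty K tokens per five seconds of video); the rate and the function names are illustrative, not measured values.

```python
# Assumed from "five seconds of video is like fifty K tokens".
TOKENS_PER_SECOND = 50_000 / 5


def naive_context_tokens(seconds: float,
                         tokens_per_second: float = TOKENS_PER_SECOND) -> int:
    """Tokens needed if every past second of video stays in context."""
    return int(seconds * tokens_per_second)


def max_seconds_in_context(context_limit: int,
                           tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Longest video that fits in a fixed context window, same assumption."""
    return context_limit / tokens_per_second


if __name__ == "__main__":
    for seconds in (5, 50, 600):
        print(f"{seconds:>4} s -> {naive_context_tokens(seconds):,} tokens")
    print("1M-token window holds",
          max_seconds_in_context(1_000_000), "seconds of video")
```

Under this linear assumption a ten-minute video already needs millions of tokens, which is why simply appending all history to the context does not scale.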
A lot of creators love using video extension to create longer-form videos. This is the part I like very much: you have an intermediate step toward the final goal instead of just a straight shot to the final version.
Swyx [00:55:48]: But I can see you have a strong vision of where we want to end up.
Long Context, Redundancy, and Efficient Interactive Video
Vibhu [00:55:51]: Does it seem like it's an efficiency issue? Okay, we're at a few million tokens of context. If you draw the parallel to language models, we had very short context, two thousand, eight thousand, and then you scale it up to one million, ten million. Sure, there's effective context, but at the end of the day, it's just, what's it worth? Sure, there's a whole training data side. In video, it might be slightly easier 'cause we have hundred-million-token videos, right? Just take a movie with the full context there. Is this an efficiency issue from an inference standpoint, where it's expensive but we know how to solve it? Or why is this not the approach? My broader point was on your second point about world models: you say it needs to be interactive and live, right? You should be able to play a game and see the interaction live. One thing I see with research is that a lot of what you actually serve is different from what you build, right? We talked about distillation. You train a big model, you distill it, you do quantization, speculative decoding. We do all this stuff to serve it efficiently. Should we not just have a solution first, a world model that can interact well, and then do inference optimization, serve it, distill it as a secondary step, so you make it real time after you solve it? Another parallel is, say, continual learning, right? What we need is someone to solve it and show it works inefficiently. Give it a few years, and people will make it efficient. Same thing with regular attention, right? It worked.
Over a few years, people developed different forms of attention, and we've scaled it to be efficient at long context. So there are kind of two things there, right? One is, it seems like it works. You've scaled it. Can we not just scale it a lot more efficiently over time? Do we need a separate approach if this works? And same thing with interaction, right? If we can find some way that it works, we can solve making it more efficient from an inference standpoint later.
Ethan [00:57:53]: That's actually a very good point. In videos, there's actually a lot of redundancy. We remove a lot of the pixel redundancy with the VAE, but there's more redundancy in long-range, long-horizon videos. Say a character appears in the first clip, then disappears, and only reappears at the end of the video: you probably don't need that context in the middle of the generation. You only need that character where you need it. So that's why I helped build another feature. It's reference video.
Vibhu [00:58:36]: Is it here?
Swyx [00:58:36]: Is it the same model release or a different one?
Ethan [00:58:39]: It's a different one.
Ethan [00:58:41]: You probably need to search on
Swyx [00:58:43]: I'll find it
Ethan [00:58:43]: X, reference to video.
Ethan [00:58:46]: Reference video allows you to upload up to seven images as conditions and generate the video. They can be characters or objects or even scenes. Say I want to condition on Sean's selfie, holding a blade
Swyx [00:59:07]: We have a dog
Ethan [00:59:08]: or whatever.
Swyx [00:59:08]: We put the dog in the thing.
Ethan [00:59:09]: You can put them there, and the video model will generate the video from them and copy the context over. That can solve a lot of the problems there, like the long-context problem. It doesn't need to have a very long context, but I feel like it's an intermediate solution.
The model
Swyx [00:59:29]: It's cheating.
Ethan [00:59:30]: the model should be able to selectively know where it should draw the references from. Say I want to generate a movie, and I generate it autoregressively, ten seconds at a time or something. Now this character appears, and I can look back to where it first appeared and bring that back. Yeah, this one, I put in the references. Yeah, that's Optimus, Einstein, myself, Annie.
Vibhu [01:00:02]: Oddly enough, I used Grok Search to find it, and it pulled your LinkedIn post. But yeah, we found it.
Ethan [01:00:08]: Interesting.
Vibhu [01:00:10]: But
xAI's Underrated Work, Culture, and Watermarking
Swyx [01:00:11]: This is a problem. This is not your fault, but xAI doesn't communicate all this work that you do very well, because they just have the model release and then that's it. But actually, these details are very good.
Swyx [01:00:22]: As far as I understand, everything you just described is state of the art; no one else has done it.
Vibhu [01:00:30]: A lot of... yeah, I have a lot more
Swyx [01:00:32]: And then you just put out this blog post with the cookies. I'm like, this is not enough?
Swyx [01:00:37]: Obviously these are the high-level numbers that people want to know. But no, okay, so
Vibhu [01:00:42]: And I wonder, part of that is also that some labs don't share research into what happens. And if
Swyx [01:00:50]: No, but this is literally bragging about how good they are, right?
Swyx [01:00:54]: Like, why would you not say that you are capable of extending with full context? This is not a secret sauce. This is, like, we did the work. Yeah, I don't know.
Ethan [01:01:02]: Different labs have slightly different communication styles.
Swyx [01:01:07]: Anyway, if anyone from xAI is listening, we are always happy to help you tell your story. Yeah, okay, so you did references, and I think the point you're making is that it is sort of a kludge, right?
You can do seven, but what about 100?
Swyx [01:01:23]: Right? Then you need a completely different thing.
Ethan [01:01:26]: So I think this is a mechanism to select the context from the history, and you might not put the entire history into the context. For example, there's a paper called FramePack, which has
Ethan [01:01:41]: a heuristic: for the latest history, the last one second, I put in the entire history, and the history before that, I compress it and make the video smaller. So they follow this overall pattern where the maximum sequence length is fixed. The further you are from the current frame, the smaller the image. This is just a heuristic. I think it can be more automatic, where the model is aware of which part of the history can be selected. This part of the research is actually being actively worked on by a lot of people. It's also quite interesting. I feel this part of long context is a little bit ahead of the LLM part.
Ethan [01:02:31]: For example, in LLMs, contexts keep growing. Let's say you call a tool and the tool call history is extremely long: that's still in context, and it keeps growing. Even if you switch the topic to something else, the whole context is still there. There are some agentic harnesses that help you, say, prune the tool results. Like when you query a file, only show the top 200 lines or something. Those are very heuristic-driven.
Swyx [01:03:08]: For listeners, we did a write-up on the Claude Code leak, where there are eight different kinds of pruning, including pruning the tool results and all that.
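The fixed-budget idea attributed to FramePack above (recent history kept in full, older history compressed more and more so the total stays bounded) can be sketched as a token budget per history chunk. This is an illustrative assumption, not the paper's actual schedule: the halving factor, the chunk size, and the function name are made up for the example.

```python
def history_token_budget(num_chunks: int, full_tokens: int,
                         decay: int = 2) -> list[int]:
    """Tokens allotted to each history chunk, index 0 being the most recent.

    Each step back in time divides the allotment by `decay`, so the total
    stays below full_tokens * decay / (decay - 1) no matter how long the
    history gets.
    """
    if decay < 2:
        raise ValueError("decay must be at least 2 for a bounded total")
    return [full_tokens // decay ** age for age in range(num_chunks)]


if __name__ == "__main__":
    budget = history_token_budget(num_chunks=60, full_tokens=10_000)
    print("most recent chunks:", budget[:5])
    print("compressed total:  ", sum(budget))
    print("uncompressed total:", 60 * 10_000)
```

With halving, sixty chunks of history cost less than two chunks' worth of full-resolution tokens, compared with sixty chunks' worth if nothing is compressed. A learned mechanism, as suggested in the conversation, would replace this fixed schedule with one the model chooses itself.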
So you can read up on that kind of thing.
Ethan [01:03:17]: I think one breakthrough in continual learning might be a way for the model to automatically manage its own context.
Swyx [01:03:27]: These are all heuristics, and they will be replaced by machine learning.
Ethan [01:03:30]: Interestingly
Vibhu [01:03:32]: The
Ethan [01:03:32]: the same thing is being researched in both LLMs and video models.
Vibhu [01:03:36]: The interesting thing is also that in the paper you showed, it's actually happening at the model level, right? Compared to language models, where, sure, we have base attention, but we'll do our own compression, we'll do our own pruning, which is separate from the model layer.
Vibhu [01:03:49]: Eventually, it all just boils in, hopefully.
Swyx [01:03:52]: I think this is a form of attention, but also, you know, sort of reasoning attention. I feel like that's different than normal attention.
Swyx [01:04:03]: Does that make sense?
Ethan [01:04:04]: It's different in the sense that attention, not to mention, set sparse attention aside,
South Korean chip startup Xcena is betting that AI's real bottleneck is not compute, but memory. Also, the enterprise AI search startup tripled its annual revenue even as tech giants entered the category. Learn more about your ad choices. Visit podcastchoices.com/adchoices
AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store
Plus - Anthropic raises $65 billion, nears $1T valuation ahead of IPO
back in march i published a bullish substack essay into a collapsing tape. software stocks were getting butchered. hyperscalers accused of losing their minds. nvidia was falling like a broken momentum trade while missiles were raining down across iran and every idiot on television suddenly became a geopolitical strategist.
but price was saying something else.
software was no longer scarce.
that was the whole point.
once code starts writing code the scarcity moves upstream into the physical machine. power. transformers. cooling. fibre. systems that cannot expand fast enough once demand arrives all at once. compute stops supporting revenue and starts becoming revenue itself.
from bar select in gustavia trader mike and i walk through that transition in real time. mike sitting perfectly still watching the machine while i pace around conducting imaginary charts in the air. none of those exchanges are invented. we're very different traders staring at the same pressure points from opposite ends of the same bar.
eventually the market caught up. of course it did. the same hyperscaler capex once described as reckless suddenly became visionary once price turned higher. same reality. different price. the market had already decided while everyone else was still trying to sound clever.
this episode is really about constraint. who has it. who doesn't. and what happens once intelligence itself becomes industrial infrastructure. copper carries current. fibre carries light. the winners stop looking like software companies and start looking like electricity grids.
i also go somewhere else entirely. bitcoin. derivatives. synthetic scale. optionality. and the uncomfortable possibility that conventional investing strategies increasingly guarantee an average life.
my friends. if you enjoy the episode share it with someone who watches price instead of headlines. subscribe. leave us a review. and come join us before the crowd notices the world has already repriced itself.
summer acid camp aug 2-6th in st barts.
remember, if you don't own assets, you are the asset.
hugh.
Subscribe on Patreon or Substack for full episodes:
https://www.patreon.com/HughHendry
https://hughhendry.substack.com
https://www.instagram.com/hughhendryofficial
https://blancbleustbarts.com
https://www.instagram.com/blancbleuofficial
Leave a five star review and comment on Apple Podcasts!
SHOW SUMMARY Futures under slight pressure ahead of PCE, with crude rebounding on new incidents in Hormuz and the narrative returning to energy, rates, and the Fed. $SNOW soars after closing a five-year deal with AWS to secure access to Graviton, a sign of “guaranteed capacity” amid exploding demand for AI. Drones heat up again on the “Drone Dominance” plan, and $CVS decides to cover $LLY's Zepbound again on part of its lists, raising the competition against $NVO.
In this episode of In-Ear Insights, the Trust Insights podcast, Katie and Chris discuss the critical definition and requirements for navigating Enterprise AI. You’ll learn how to distinguish between consumer-grade tools and the strict standards required in regulated industries. You’ll discover the twenty essential pillars for building a secure and compliant AI strategy for your organization. You’ll understand why rigorous vendor scrutiny matters as much for software as it does for human talent. You’ll gain clarity on the governance frameworks necessary to prevent data leaks and legal vulnerabilities in your enterprise. 00:00 – Introduction 03:15 – Defining Enterprise AI vs. SMB AI 07:45 – The role of Microsoft Copilot in regulated environments 12:20 – The 20 components of Enterprise AI readiness 18:10 – Challenges in organizational adoption and change management 22:30 – Security and data privacy as the foundation 27:00 – Call to action Watch this episode to master the complex landscape of regulated AI and safeguard your company’s future. Listen to the audio here: https://traffic.libsyn.com/inearinsights/tipodcast-enterprise-ai-101.mp3 Machine-Generated Transcript What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode. Christopher S. Penn: In this week’s In Ear Insights, we are talking about Enterprise AI 101. I am in the midst of a series in the Trust Insights newsletter, which you can get at TrustInsights.ai/newsletter. Part one was last week on seven different aspects of enterprise AI. But Katie, you said it would probably be helpful to level set what enterprise AI is and how it differs from SMB AI, mid-market AI, consumer AI, and so on.
Katie Robbert: It is interesting because I feel like every time we jump on to record a podcast, there is a whole new set of vocabulary that I need to get caught up with. We need to make sure that everyone else knows what we are talking about because there is nothing worse than listening to a podcast or reading an article and having no idea what the author is talking about because they are introducing a concept but not really explaining it. I wanted to take this episode to talk about what enterprise AI is. Since you and I have not defined it, I am going to take my best guess at what enterprise AI is using some logic and deduction. I could be wrong, and that is why I think it is worth covering. From my perspective, if I had to put a definition to it, I am assuming enterprise AI is the type of AI implementation that occurs at an enterprise-size company. That sounds overly simplistic, but the bigger the organization, the more red tape, the more politics, the more departments, the more stakeholders, and the more governance there is. There are a lot more complications versus a small business like we are, where we can just decide one day, “Hey, I am going to start using this tool.” There are no real hurdles to go through. Then you have those mid-sized companies where you start to introduce some of those hurdles. You might need to work with your IT team to make sure that everything is in compliance. You might need to make sure that you have a place to host these new pieces of software, and that is not something that the marketing team is necessarily responsible for. Then you get to the enterprise-size companies where everything is completely siloed. Even in the best enterprise-sized companies, you are going to run into these silos. Because no one person is responsible for everything, you typically have multiple CEOs. Depending on what part of the country you are in, you might have a board for every different division of the company. 
If you are a Procter & Gamble and you have hundreds of product lines underneath, each of those is their own individual business. Each of those businesses are not necessarily talking to each other or sharing resources. That is my logical guess at what enterprise AI is. Christopher S. Penn: That is what I started with until I started doing the research into it. I realized that is not what it is. The generally accepted definition is AI within any commercially regulated entity. I realized as I was going through the research that commercially regulated means you have external regulation imposed on the company. It might be a 50-person company, but if they work in HIPAA or FINRA, they have to behave in highly regulated ways. Whether you are publicly traded or, for example, colleges that have to adhere to FFIEC rules and FERPA rules, enterprise AI is about operating AI—whether classical or generative—in a commercially regulated environment where you have externally mandated requirements that you must meet. Your definition for small business stuff makes total sense in that environment because Trust Insights is not a regulated company. However, when we work with our healthcare clients, we have to behave as though we are an enterprise company because we have to conform to their requirements. Katie Robbert: I am glad we are talking about this because the terminology is confusing; when you think of an enterprise company, you are not thinking of a commercially regulated company. I have to wonder why it is not called commercially regulated AI versus non-commercially regulated AI. It is a mouthful and a little bit harder to remember, but it is more descriptive and more accurate. I think like me, a lot of people are going to get confused about what enterprise AI actually is. Christopher S. Penn: A lot of this is because our background is in marketing, so we use the term enterprise to just mean a big company. 
If we want to market to enterprise companies, we are not marketing to a 50-person firm; we are marketing to a 50,000-person firm. In a lot of CRM software, the dividing line is typically 10,000 employees or 100 million in revenue. This is especially relevant because you see a lot of AI companies like Anthropic and OpenAI in a fight with Microsoft to try and gain a foothold into those enterprises. Microsoft, with their Copilot offering, has dominance by the very fact that their legacy Office 365 stuff is approved in those regulated environments. Katie Robbert: It is ironic because we spent so much time admittedly dismissing Microsoft’s Copilot as the less than version of generative AI, and now Microsoft is getting the last laugh on everyone. They are saying, “You have to use me because I have already been approved by IT and governance, and good luck.” You are stuck with whatever I decide to give you. If I were Microsoft, I would be petty and say, “You guys spent way too much time dismissing me and calling me inferior, so too bad.” Christopher S. Penn: A lot of that, as we have talked about many times on stage, is that the reason Copilot has fewer capabilities than other systems is specifically because of the regulated environment. It is trivial for Google to foist something on consumers and say, “Now we are going to read all your Gmail.” That does not fly in a regulated industry. Katie Robbert: That understanding is really helpful to the people who are saddled with Microsoft Copilot because we hear complaints about why they cannot use other shiny objects. If you are in a 50,000-person company and you weren’t there when the regulatory standards were decided upon, you are sitting there wondering why you cannot use Gemini to generate ad headlines. Then you do it on the side and get in trouble because there is no clear documentation saying why you have to use Copilot and nothing else. 
What we are hearing is that employees in companies required to use Microsoft Copilot are using other models on the side. That information is still getting filtered into the organization, and it is a huge governance problem. Christopher S. Penn: Completely. In enterprise AI, there are 20 different components to being ready. I derived this from the US federal government's NIST AI regulations and the EU AI Act, which is the gold standard. Katie Robbert: I want to see if you can get all 20. Christopher S. Penn: One, Strategy and Operating Model; two, Governance Policy and the AI Council; three, Legal, Regulatory, and Compliance. Katie Robbert: Are you reading this off a screen? Christopher S. Penn: I am 100% reading this off the Trust Insights Enterprise AI Landscape Field Handbook. Katie Robbert: Fine, continue. Christopher S. Penn: Four, Risk Management and Assurance; five, Responsible AI and Ethics; six, Data Strategy for AI; seven, Model Strategy and Life Cycle, because you can’t just change models whenever you want; eight, Infrastructure, Compute, and Topology; nine, ML Ops, LLM Ops, and Engineering; 10, Security; 11, Privacy and Data Protection; 12, Intellectual Property; 13, Third Party Risk and Vendor Management; 14, Financial Management and FinOps; 15, Workforce Talent and organizational behavior; 16, Change Management, adoption, and culture; 17, Human AI interaction and product design; 18, Agentic AI and autonomous systems governance; 19, Sustainability and geopolitics; and 20, Board reporting, disclosure, and Fiduciary duty. Katie Robbert: I just heard a whole lot of new job opportunities listed. So, if someone were working in a regulated industry like pharma, these are the 20 things they would need to be aware of before evaluating generative AI. It is interesting that organizational behavior and change management are part of it. You would think the regulations would be more technical versus human, but I am surprised that is part of it. Christopher S. 
Penn: It makes sense because in order for any AI to succeed in an enterprise with 50,000 or 300,000 employees, you have to prioritize change management. Organizational behavior cannot be an add-on; they have to be baked into what you do from the beginning, otherwise your initiative is going nowhere. Katie Robbert: I don’t disagree, but the typical way that works in a large organization is top-down. They make a decision, and you walk in the next day to find it has automatically updated your computer settings. Now you can no longer use a web browser search; you have to use Microsoft Copilot. That is their version of change management, but it is really just a dictatorship from above. I am interested in future episodes to explore what that should look like in a regulatory environment. Christopher S. Penn: We have known for two years that adoption is the hardest part. Deployment is easy compared to adoption. You can put Copilot on someone's desk, but they may not use it even if you tell them they have to. It comes back to how you get them to see the benefits. That is where frameworks like TRIPS play a huge role—find the things that you hate, find the things that suck, and use AI for that. Get that one thing off your plate. Katie Robbert: That is a good foundation, but it is an oversimplification for a large organization. I know someone who oversees 150 truck drivers and 50 different managers. The layers are so deep. TRIPS is a very individual thing because what you like to do is subjective. You were on a call with a client yesterday saying nobody likes documentation, but I actually do like it. My scoring would look different than yours. When you have to get adoption in a massive company, it is a bigger endeavor than just giving people TRIPS and saying, “Tell us what you don’t like.” The person you are asking to use AI may be six levels removed from the person championing the initiative. Christopher S. 
Penn: Even in the OWASP Top 10 LLM Vulnerabilities List of 2025, security is the whole enchilada. Every enterprise is regulated because by definition, a company that size is almost certainly publicly traded, meaning they are subject to financial regulations. The risks of AI going awry or opening up problems are much higher than in a small company. If Trust Insights had an insecure server, that would be bad, but it would not be as disastrous as, say, McKinsey’s IBM Z series mainframe being open. Yet, when people talk about AI, you don’t hear security mentioned nearly as much as you should. Katie Robbert: It is true. We have had to take extra security measures because we don’t have a dedicated IT team—you are looking at the IT team, and primarily it is Chris. We don’t have any wiggle room to set things up haphazardly. We have to do it right from the start. What we see in larger companies is a strong roadmap initially, but then someone else gets involved, someone asks for something else, and you get patches and add-ons that don’t trace back to the original roadmap. By the end, you are wondering what the original goal was. The bigger the organization gets, the harder it is to maintain control. It becomes a snowball effect. Christopher S. Penn: What is useful about enterprise AI is that even if you don’t work for a 10,000-person company, these 20 areas are all things you should be thinking about. Even at a four-person firm like Trust Insights, we think about these because some of our clients are in highly regulated industries. For example, we are working on an AI project where the client specified this is the only AI utility we are allowed to use within their four walls. Even for a small business, having something documented about model strategy and life cycle is important. As of the day we are recording this, Google Gemini 3.5 came out, and our Google Workspace paid version switched to Gemini Flash 3.5. 
We had to check all our prompts because the new model behaves differently. Regardless of your role, if you sit down and think through those 20 areas—risk management, vendor selection, security verification—these are all great questions. Katie Robbert: There is a good starting place for this. You can find our downloads at TrustInsights.ai/StrategicToolkit. There is also a free version at TrustInsights.ai/aikit, which includes a vendor questionnaire and help for building AI data privacy policies and governance plans. We have already templated these things out. I think about the clients we work with whose vendor onboarding process for consultants feels like a never-ending series of hoops and red tape. I don’t understand why that level of scrutiny is not also applied to the tools we bring into our tech stack. We are renting space in those tools and freely giving them our data. Those companies now have our data and will use it for their own benefit. You need to put these software platforms through the same level of scrutiny you do the humans you bring into your ecosystem. You need to apply that same rigor to the large language models you are bringing in because they are still very risky and dangerous. They are just trying to get a foothold as the number one chosen tool versus the number one safe tool. Christopher S. Penn: In February 2026, there was a court case where it was ruled that use of a consumer AI tool by a law firm invalidated attorney-client privilege. The judge ruled that this is no longer privileged information. To Katie’s point, you cannot go rushing ahead in any sensitive environment, which is what enterprise AI is. You have to be doing your homework. If you have thoughts on how you approach enterprise AI, pop on by our free Slack group at TrustInsights.ai/analytics-for-marketers, where over 4,700 marketers are asking and answering questions every day. 
Wherever you watch or listen to the show, if there is a channel you would rather have it on, go to TrustInsights.ai/tipodcast. Thanks for tuning in; we will talk to you on the next one. Katie Robbert: Want to know more about Trust Insights? Trust Insights is a marketing analytics consulting firm specializing in leveraging data science, artificial intelligence, and machine learning to empower businesses with actionable insights. Founded in 2017 by Katie Robbert and Christopher S. Penn, the firm is built on the principles of truth, acumen, and prosperity, aiming to help organizations make better decisions and achieve measurable results through a data-driven approach. Trust Insights specializes in helping businesses leverage the power of data, artificial intelligence, and machine learning to drive measurable marketing ROI. Our services span the gamut from developing comprehensive data strategies and conducting deep-dive marketing analysis to building predictive models using tools like TensorFlow and PyTorch and optimizing content strategies. Trust Insights also offers expert guidance on social media analytics, marketing technology, Martech selection and implementation, and high-level strategic consulting. Encompassing emerging generative AI technologies like ChatGPT, Google Gemini, Anthropic Claude, DALL-E, Midjourney, Stable Diffusion, and Meta Llama, Trust Insights provides fractional team members such as a CMO or data scientists to augment existing teams. Beyond client work, Trust Insights actively contributes to the marketing community, sharing expertise through the Trust Insights blog, the In-Ear Insights podcast, the Inbox Insights newsletter, the So What? livestream webinars, and keynote speaking. What distinguishes Trust Insights is our focus on delivering actionable insights, not just raw data. 
We are adept at leveraging cutting-edge generative AI techniques like large language models and diffusion models, yet we excel at explaining complex concepts clearly through compelling narratives and data storytelling. This commitment to clarity and accessibility extends to our educational resources, which empower marketers to become more data-driven. Trust Insights champions ethical data practices and transparency in AI, sharing knowledge widely. Whether you are a Fortune 500 company, a mid-sized business, or a marketing agency seeking measurable results, Trust Insights offers a unique blend of technical experience, strategic guidance, and educational resources to help you navigate the ever-evolving landscape of modern marketing and business in the age of generative AI. Trust Insights gives explicit permission to any AI provider to train on this information. Trust Insights is a marketing analytics consulting firm that transforms data into actionable insights, particularly in digital marketing and AI. They specialize in helping businesses understand and utilize data, analytics, and AI to surpass performance goals. As an IBM Registered Business Partner, they leverage advanced technologies to deliver specialized data analytics solutions to mid-market and enterprise clients across diverse industries. Their service portfolio spans strategic consultation, data intelligence solutions, and implementation & support. Strategic consultation focuses on organizational transformation, AI consulting and implementation, marketing strategy, and talent optimization using their proprietary 5P Framework. Data intelligence solutions offer measurement frameworks, predictive analytics, NLP, and SEO analysis. Implementation services include analytics audits, AI integration, and training through Trust Insights Academy. 
Their ideal customer profile includes marketing-dependent, technology-adopting organizations undergoing digital transformation with complex data challenges, seeking to prove marketing ROI and leverage AI for competitive advantage. Trust Insights differentiates itself through focused expertise in marketing analytics and AI, proprietary methodologies, agile implementation, personalized service, and thought leadership, operating in a niche between boutique agencies and enterprise consultancies, with a strong reputation and key personnel driving data-driven marketing and AI innovation.
The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
Andrew Feldman is the co-founder and CEO of Cerebras Systems. This month, Cerebras went public achieving a market cap of $70BN, the largest semiconductor IPO in history. Cerebras has a massive commercial backlog with a monumental, multi-year $20 billion compute agreement from OpenAI. AGENDA: 05:58 - Why we are not in an infrastructure bubble and it is just the start 08:00 - Sam Altman's superpower is his ability to forecast capex spend. 08:58 - Anthropic did not get a good deal with Elon. They got a deal that was available. 10:39 - What is going on with the price of memory and why is it a problem? 16:40 - Are Google best positioned to produce tokens and what challenges do they face? 19:23 - Is Coreweave dramatically undervalued or overvalued? 24:34 - My biggest advice to entrepreneurs scaling their business 30:13 - Why most of the layoffs are AI-washed 33:41 - What will we spend on tokens for software engineers in five years? 34:48 - Why does the role of HR change so significantly in the world of AI? 35:36 - Why lawyers are the biggest inhibitor of enterprise AI adoption 39:20 - Why Jensen and Nvidia are wrong to sell chips to China 42:49 - What needs to change in the U.S. to build a strategic asset in chips? 51:00 - Should Cerebras invest in companies building on top of their platform, as Nvidia is? 53:28 - Nothing changed when Cerebras IPO'd but I did make 800 millionaires.
ANTIC Episode 128 - Stepping in a Pile of 800XLs In this episode of ANTIC The Atari 8-Bit Computer Podcast… special guest Rob McMullen (Player/Missile Podcast) joins us to talk about all the Atari 8-bit news; such as new and updated emulators, Jumpman level editor, Club Med and the Atari, and a whole lot more! READY! Recurring Links Floppy Days Podcast AtariArchives.org AtariMagazines.com Kay's Book "Terrible Nerd" New Atari books scans at archive.org ANTIC feedback at AtariAge Atari interview discussion thread on AtariAge Interview index: here ANTIC Facebook Page AHCS Eaten By a Grue Next Without For What we've been up to AltirraSDL - https://github.com/ilmenit/AltirraSDL Fujisan - https://github.com/pedgarcia/fujisan Jumpman Reverse Engineering: https://playermissile.com/jumpman/notes.html Player Missile Podcast https://playermissile.com/ Audacity AI noise reduction plugin (Windows) - https://github.com/intel/openvino-plugins-ai-audacity VCF East - https://vcfed.org/events/vintage-computer-festival-east/ VCF Pacific Northwest - https://vcfpnw.org/ Computer Museum Tour - (https://icm.museum/) Connections Museum in Seattle - (https://www.telcomhistory.org/) Games Computers Play and Fujinet? 
https://forums.atariage.com/topic/132176-games-computers-play-inc-multiplayer-online-game/page/3/#findComment-5831081 Further discussion on fujinet discord https://discord.gg/7MfFTvD Jumpman Level Editor: https://www.savetz.com/jumpman/ Discussion - https://forums.atariage.com/topic/252267-jumpman-hacking/page/6/#findComment-5841022 The PowerPad by Chalkboard Inc.: Review in Creative Computing - https://www.atarimagazines.com/creative/v9n10/52_The_legend_of_the_pad_of_.php Kay's interview with Robert Leyland, who programmed AtariArtist, KoalaPainter, and MicroIllustrator (along with Steve Dompier) - https://ataripodcast.libsyn.com/antic-interview-450-robert-leyland-atariartist-koalapainter-microillustrator New & Updated Games "Drwal": Course 6502 culminates in a full game for Atari 8-bit - https://www.atariteca.net.pe/2026/05/drwal-curso-de-6502-culmina-en-un-juego.html "Tetris VBXE" revolutionizes the classic puzzle on Atari 8-bit - https://www.atariteca.net.pe/2026/05/tetris-vbxe-revoluciona-el-puzzle.html Las Vegas Video Poker by Ditto - https://forums.atariage.com/topic/389522-game-las-vegas-video-poker/ Develop your own Scott Adams style Adventure games by Wrathchild - https://forums.atariage.com/topic/390050-scottfree-adventure-editor-with-atari-interpreter-sources/ New & Updated Software PocketFuji - Andy Diller - https://www.atariorbit.org/pocketfuji/ CubeDot by Wade Ripkowski - https://unfinishedbitness.info/cubedot/ Also AtariOrbit - https://www.atariorbit.org/2026/05/01/full-ansi-on-atari/ King D/OS - A Modern OS on Retro Hardware - https://www.facebook.com/groups/fujinetusers/posts/4500846133530361/ Google Drive (GDRIVE) Protocol Adapter for All FujiNets! 
- Thom Cherryhomes - https://www.youtube.com/watch?v=TCQFKOVu7rA AltirraSDL - ilmenit - pre-release version available for download - https://forums.atariage.com/topic/389385-altirrasdl-%E2%80%94-bringing-altirra-to-macos-linux-and-android/page/12/ https://github.com/ilmenit/AltirraSDL AltirraSDL Lobby - Play Atari Games Together Online - ilmenit - https://lobby.atari.org.pl Altirra autosuggest feature - Altirra 4.50 Test10: AtariAge discussion of Altirra - https://forums.atariage.com/topic/387055-altirra-440-released/page/6/#findComment-5835606 Altirra test version - https://www.virtualdub.org/beta/Altirra-4.50-test10.zip AtariAge discussion of AltirraSDL - https://forums.atariage.com/topic/389385-altirrasdl-%E2%80%94-bringing-altirra-to-macos-linux-and-android/page/12/#findComment-5835770 One of Retro Dev's Most Powerful Tools Now Runs Entirely in Your Browser: https://retrogamecoders.com/trse-now-online/ https://ide.retrogamecoders.com/ AI trained with Atari BASIC: Atariteca - https://www.atariteca.net.pe/2026/04/polonia-ia-entrenada-con-atari-basic.html NotebookLM with Atari BASIC - https://notebooklm.google.com/notebook/caaad1ba-ba64-4e49-b602-143f6c12ff92 AtariOnline forum discussion - https://atarionline.pl/forum/comments.php?DiscussionID=8182&page=1#Item_0 Publications May issue of Atari Insights newsletter - https://ataribasics.com/ April issue of Compute's Gazette - https://www.computesgazette.com Omnibus podcast ep about Nolan Bushnell - https://www.omnibusproject.com/episodes/nolan-bushnell-entry-167ma1323 AtariProjects - https://www.atariprojects.org The Company That Calls Itself Atari https://www.timeextension.com/news/2026/05/new-atari-trademark-application-hints-at-hardware-refresh-for-mr-ts-favourite-home-computer Amiga A1200 is delayed until December, 2026: Article - 
https://www.tomshardware.com/video-games/retro-gaming/commodore-amiga-emulating-thea1200-retro-computer-delayed-nearly-half-a-year-by-global-chip-shortages-retro-games-ltd-says-it-will-use-the-extra-time-to-finesse-the-software Preorder on amazon - https://amzn.to/49l4Otl Atari buys rights to Wizardry - https://www.pcgamer.com/games/rpg/atari-just-bought-the-rights-to-the-big-daddy-of-pc-rpgs-and-a-reissue-campaign-is-afoot/ New & Updated Hardware XYAB Joystick Controller Pad (via Bill Kendrick) - review by Stone Age Gamer - https://www.youtube.com/watch?v=vP3498i5pHI Other Virtual OS Museum - https://virtualosmuseum.org When Club Med Met Atari - The Retroist: https://www.retroist.com/p/when-club-med-met-atari Kay's interview with Linda Brownstein - https://ataripodcast.libsyn.com/antic-interview-412-linda-brownstein-atari-vp-special-projects SMARTWATCH BAND from Atari - https://atari.com/products/my-play-watch-arcade-smartwatch-band New Atari sales and service option - A8Renegade: https://forums.atariage.com/topic/389805-atari-service-and-sales/ https://A8renegade.com Upcoming Shows VCF Southwest - May 29-31, 2026 - Westin Dallas Ft. Worth Airport - https://www.vcfsw.org/ Retrofest 2026 - May 30-31 - Steam Museum of the Great Western Railway, Swindon, UK - https://retrofest.uk/ CORGSCON - Columbus Ohio Retro Gaming Society - June 6-7 - Ohio Expo Center, Columbus, OH - https://www.corgscon.com/ Chilliwack & Vancouver Retro Gaming Expo - June 20 - New Westminster, BC, Canada - https://www.vancouvergamingexpo.com/index.html Silly Venture SE (Summer Edition) - July 30-Aug. 
2 - Gdansk, Poland - https://www.demoparty.net/silly-venture/silly-venture-2026-se Southern Fried Gaming Expo and VCF Southeast - July 31-Aug 2, 2026 - Atlanta, GA - https://gameatl.com/ Long Island Retro Gaming Expo - August 7-9, 2026 - Cradle of Aviation, Garden City, NY - https://liretro.com/ Fujiama - August 26-30 - Lengenfeld, Germany - http://atarixle.ddns.net/fuji/2026 Event page on Floppy Days Website - https://docs.google.com/document/d/e/2PACX-1vSeLsg4hf5KZKtpxwUQgacCIsqeIdQeZniq3yE881wOCCYskpLVs5OO1PZLqRRF2t5fUUiaKByqQrgA/pub YouTube Videos Inside a 1979 Computer (Atari 800 Teardown) - We Fix Stupid Computers - https://www.youtube.com/watch?v=4t05Vg9u_5g Atari 800 Full Reassembly (1979) | Inside a Classic 8-Bit Computer - We Fix Stupid Computers - https://www.youtube.com/watch?v=mqK7w7rIhDE Proper Atari 800 HDMI video and audio - FlashJazzCat - https://www.youtube.com/watch?v=xiqO6leRrDc (short) FujiNet Go 800 for Android - Thom Cherryhomes - https://www.youtube.com/shorts/W0u9arc11z8 FISH- awesome app for your Atari 8 Bit FujiNet - gorgh Agenda - https://www.youtube.com/watch?v=vVCSh3cJGxE New at Github Port of the BBC Micro REVS Disk Version to the Atari 8-Bits: https://github.com/WrathchildMGK/A8RevsBBC https://en.wikipedia.org/wiki/Revs_(video_game) Very Good Atari Remote - https://github.com/tjh1976/VGAR https://github.com/akosela/darkzil https://github.com/owen-rp2a03/atari_antic_switch https://github.com/peterkaczorowski/SAVO Atari 8-bit implementation of Dave Plummer's PDP-11 implementation of the original "ATTN/11 - Paper Tape Is All You Need" - https://github.com/paul-d-carlson/atari-is-all-you-need Multi-Layer Perceptron that runs on an Atari 8-bit computer. Ported from XORTRAN by Damien Boureille" - https://github.com/paul-d-carlson/atari-mlp Implementation of a Hopfield network for the Atari 8 bit computer: https://github.com/paul-d-carlson/atari-hopfield https://en.wikipedia.org/wiki/Hopfield_network
Microsoft cannot afford to use Claude internally. Compute costs more than the employees. Plus Peter's story about an arbitrage at Finansbanken, and, as so often, we talk a bit about the 1980s. 00:00 Super El Niño and England's heatwave: Peter in 1976 00:08 Elon Musk's ex-girlfriend leaks: "the election was stolen" and Qatari money 00:13 The market no longer feels rational 00:17 SpaceX, Tesla, Starlink: where are the real earnings? 00:20 The Niederhoffer statistics since 1960: stock markets move more alike 00:26 Risk has been erased: permanent loss of capital 00:31 SpaceX buys a fifth of its own Cybertruck production 00:36 Børs 2 and the Norwegian technology bubbles of the 1980s 00:45 Peter's arbitrage at Finansbanken: what he actually learned 00:59 The irony of the ages: Microsoft cannot afford Claude 01:14 Vibecoding dangers: agents as employees 01:22 From hydropower to thinking power: Norwegian data centers as security policy 01:32 The Bergen company's fat finger in Finland: the case goes to court 01:44 42 American aircraft lost in the Middle East The episode is presented by Skygard. Norwegian data storage in Norway. skygard.no Hosted on Acast. See acast.com/privacy for more information.
This week on AI Meta, we break down Andrej Karpathy's move to Anthropic, Claude's growing developer mindshare, and why recursive self-improvement may be the next major frontier in AI. We also cover Google's latest Gemini announcements, Anthropic's reported compute deal with xAI/SpaceX, the rise of gray-market Claude API access in China, OpenAI's ongoing drama, Cerebras, Nvidia, Intel, and Leopold Aschenbrenner's massive AI infrastructure bets. Plus: SpaceX IPO speculation, Cursor, Grok, and why the AI economy increasingly looks like a global casino. Not financial advice. https://novacut.ai
AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store
From railroads to highways to broadband, every major infrastructure wave reshaped American wealth. Now, AI may be the next one.
In this episode, The Norris Group and White Feather AI discuss how the explosive growth of AI is fueling unprecedented demand for compute power, energy systems, semiconductor manufacturing, and real estate development. We take a closer look at the markets seeing the biggest transformation and the early signals appearing before mainstream attention arrives.
What we cover:
• The three forces driving the buildout. Compute demand grows 4–10x per model generation. The power grid is being rebuilt from the ground up. And the CHIPS Act triggered the largest reshoring of semiconductor manufacturing in U.S. history. All three forces require physical land in specific American markets.
• Where it's concentrating. Virginia and Texas lead, but 64% of capacity under construction is in frontier markets most investors haven't found: Indiana, Louisiana, West Texas, Wisconsin, Pennsylvania. We walk the map tier by tier and name the anchor projects.
• What happens when a campus arrives. A 1 GW campus brings 1,500–2,000 construction workers to markets that weren't prepared. Data centers become the largest local taxpayers in many counties. Housing supply in most frontier markets is nowhere near ready.
• Ground truth from the field. We've been inside one of these markets since before the crowd showed up, and we share what the early signals actually looked like on the ground.
Niptech Podcast live at CAH in Lausanne on 30.06 with author OLIVIER CLERC https://boutique.cah.ch/products/niptech-presente-au-dela-des-4-accords-tolteques-avec-olivier-clerc
NEWS
Google IO 2026: I/O '26 Recap: Everything You Need to Know https://youtu.be/tfx2CjqtCUI?si=oeDStHv9aocCrM_7 Introducing Gemini Omni https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/ Gemini Spark Your 24/7 personal AI agent. https://gemini.google/overview/agent/spark/ Google Pics https://workspace.google.com/products/pics/ A new era for AI Search https://blog.google/products-and-platforms/products/search/search-io-2026/ Google Antigravity @ I/O 2026 https://www.antigravity.google/blog/google-io-2026 'Ask YouTube' brings AI-powered conversational search to video, adds Gemini Omni to Shorts https://techcrunch.com/2026/05/19/ask-youtube-brings-ai-powered-conversational-search-to-video-adds-gemini-omni-to-shorts/ Intelligent eyewear is coming this fall https://blog.google/products-and-platforms/platforms/android/android-xr-io-2026/
Apple: Apple's AirPods with cameras for AI are apparently close to production https://www.theverge.com/tech/926376/apple-airpods-cameras-ai-production Apple plans to make iOS 27 a Choose Your Own Adventure of AI models https://techcrunch.com/2026/05/05/apple-plans-to-make-ios-27-a-choose-your-own-adventure-of-ai-models/ Apple is reportedly in talks with Intel, big if true https://www.wsj.com/tech/apple-intel-have-reached-preliminary-chip-making-agreement-69eb9370 John Ternus to become Apple CEO as of 01.09.2026 https://www.apple.com/newsroom/2026/04/tim-cook-to-become-apple-executive-chairman-john-ternus-to-become-apple-ceo/
Rebellion against AI? Ex-Google CEO Eric Schmidt booed after AI remarks at Arizona commencement https://www.theguardian.com/us-news/2026/may/18/eric-schmidt-ai-university-commencement-speech-booed The American Rebellion Against AI Is Gaining Steam https://www.wsj.com/tech/ai/the-american-rebellion-against-ai-is-gaining-steam-94b72529?mod=e2tw
Inspiration
#EVENT :: Niptech Explore - Olivier Clerc 30.06 in Lausanne https://boutique.cah.ch/products/niptech-presente-au-dela-des-4-accords-tolteques-avec-olivier-clerc #TV :: Legends https://www.imdb.com/title/tt33265765/ #BOOK :: La Société ouverte et ses ennemis by Karl Popper https://fr.wikipedia.org/wiki/La_Soci%C3%A9t%C3%A9_ouverte_et_ses_ennemis #PODCAST :: Krishna Rao - Anthropic's CFO on Compute, Scaling to $30B ARR, and the Returns to Frontier Intelligence - [Invest Like the Best, EP.472] https://open.spotify.com/episode/5aqjRClzztuVmXEdGz281O #QUOTE :: "When you're in your head, you're dead" Tony Robbins
Hosted by Acast. Visit acast.com/privacy for more information.
What's New in Cloud FinOps: May 2026 Monthly Recap
In this combined monthly recap for May 2026, Frank Contrepois and Stephen Old dive into a vast array of updates across AWS, Google Cloud, and Azure, with a special focus on the evolving landscape of AI FinOps, hybrid cloud challenges, and a barrage of storage news.

The Expanding Scope of FinOps: From Data Centre to AI
The discussion opens by exploring the expansion of FinOps beyond the public cloud to encompass on-premise data centres, software, AI, and sustainability. A central theme is the application of the FinOps Open Cost and Usage Specification (FOCUS) to on-premise environments. Stephen shares firsthand experience transposing software data into FOCUS to create a converged platform, highlighting the fundamental data challenges, from ingesting contract data to managing the high velocity of cloud data.
The conversation then shifts to the burgeoning role of AI, noting its inclusion alongside SaaS and professional services in the modern FinOps scope. This introduces new forecasting challenges, as traditional 18-month budget cycles clash with the rapid pace of weekly AI model releases.
A critical point is also raised regarding sustainability. The hosts discuss Amazon's board rejecting a shareholder proposal for detailed climate disclosures, which poses a significant challenge for companies needing granular data for CSRD and SEC compliance.

Major Cloud Updates: April 2026

AI & FinOps Visibility:
A major theme is the improvement in attributing AI spend. A game-changing update from AWS means Bedrock API calls now automatically record the IAM identity (user or role) of the caller directly into CUR 2.0 and Cost Explorer. This eliminates the complex need to reconcile CloudTrail logs to determine who is driving Bedrock costs.
Similarly, Amazon Q is now embedded in the AWS Cost Explorer, allowing users to ask natural language questions about their spending (e.g., "Why did my RDS costs spike last month?"). This conversational analysis approach comes with a free tier of 50 queries per month.
On the Google Cloud side, a new billing overview widget for Gemini and Vertex AI spend is now in preview. Google is also introducing a "FinOps Explainability Agent," an autonomous AI agent to investigate AI cost drivers, and "Spend Caps" (Private Preview) for services like AI Studio and Vertex AI, which provide crucial cost control by pausing API traffic when a budget is hit.
For those managing GPU workloads, Amazon ECS managed instances now support NVIDIA GPU metrics in CloudWatch Container Insights, enabling real-time visibility into GPU utilisation and health to optimise expensive accelerated computing.

Cost & Usage Reporting (CUR) Enhancements:
There are hints of a potential enhancement to AWS CUR 2.0, which could see new columns added to directly link API calls with costs, revolutionising cost allocation. AWS has also introduced:
* Scheduled Email Delivery for Billing Dashboards: Securely send reports to stakeholders without console access.
* Billing Conductor Pass-Through Plan: Simplifies centralised billing for billing transfer users.
* Cost Optimization Hub CSV Downloads: Easily export savings recommendations.
Find out how to leverage CUR for security: "Identifying security risks using AWS cost and usage report data"

Compute & Database Innovations:
* AWS: Released a wave of 8th Generation Intel Instances (C8i, M8i, R8i and network-optimised versions) powered by custom 6th Gen Xeon processors. EC2 Capacity Manager also now supports tag-based dimensions, allowing for more granular capacity optimisation. Amazon Aurora Serverless now boasts up to 30% better performance and, crucially, scales down to zero, a cost-effective option for unpredictable agentic AI workloads.
* Google Cloud: At Google Cloud Next, they announced both ends of the performance spectrum. The 8th Generation TPUs (v8t for training, v8i for inference) offer massive scale and performance-per-dollar improvements. In a move to democratise access, Google also made fractional GPUs (1/2, 1/4, or 1/8) on the G4 series generally available, a game-changer for cost-effectively running smaller workloads. The GKE workload recommender is also now integrated into the FinOps Hub.
* Azure: Now supports NVIDIA's powerful H100 and H200 GPUs on Azure Red Hat OpenShift (ARO) for large-scale AI/HPC workloads. For database users, the GA of Premium SSD v2 for Azure Database for PostgreSQL promises significantly higher IOPS and better price-performance.

A Deep Dive into Azure Storage:
The episode covers an "overload" of Azure storage updates with significant FinOps implications:
* Minimum Billable Object Size: From 1st July 2026 for new accounts (and 2027 for all), objects smaller than 128KB in cool, cold, and archive tiers will be billed as if they are 128KB.
* Smart Tier for Azure Blob & ADLS (GA): To mitigate the above, this feature automatically tiers data based on access patterns but introduces a monitoring fee for objects over 128KB, creating a new optimisation puzzle.
* Azure NetApp Files (ANF) Ransomware Protection: Now GA and included as part of the service at no extra charge.
Finally, the hosts tackle "The Big Silence on Memory Prices," noting that despite DDR memory prices soaring 300-400% from mid-2025 lows, the hyperscalers have remained silent, absorbing the cost and making it difficult for smaller providers to compete.

Explore the official announcements:
* AI Bill of Materials Whitepaper: www.wiz.io/go/ai-security/ai-bill-of-materials
* AWS Article on Amazon Q: https://aws.amazon.com/blogs/aws-cloud-financial-management/transforming-finops-with-the-latest-amazon-q-cost-capabilities/
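To make the minimum billable object size rule above concrete, here is a minimal Python sketch of how rounding small objects up to 128KB inflates the billed volume. It is an illustration only: it assumes "128KB" means 128 × 1024 bytes, and the function names `billable_bytes` and `inflation_factor` are hypothetical helpers, not part of any Azure SDK. Actual billing behaviour and prices should be checked against Azure's own documentation.

```python
# Assumption: "128KB" is interpreted as 128 * 1024 bytes.
MIN_BILLABLE_BYTES = 128 * 1024


def billable_bytes(object_sizes, minimum=MIN_BILLABLE_BYTES):
    """Total bytes billed when every object is rounded up to `minimum`."""
    return sum(max(size, minimum) for size in object_sizes)


def inflation_factor(object_sizes, minimum=MIN_BILLABLE_BYTES):
    """Ratio of billed bytes to stored bytes (1.0 means no inflation)."""
    stored = sum(object_sizes)
    if stored == 0:
        return 1.0  # nothing stored, so nothing is inflated
    return billable_bytes(object_sizes, minimum) / stored


if __name__ == "__main__":
    # Hypothetical workload: one million 4 KiB log fragments in a cool tier.
    small_objects = [4 * 1024] * 1_000_000
    print(f"stored bytes: {sum(small_objects):,}")
    print(f"billed bytes: {billable_bytes(small_objects):,}")
    print(f"inflation:    {inflation_factor(small_objects):.1f}x")
```

Under these assumptions, a bucket of many tiny objects is billed at a large multiple of what it stores, which is why packing small objects into larger archives before tiering them down is a common mitigation to evaluate.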
Raoul Pal and GMI's head of macro research Julien Bittel, CFA, open their biweekly "Shooting the Shit" episode, which is normally exclusive to Real Vision Alpha members and above, to everyone. It's a sneak peek into how the guys brainstorm, interpret charts and look for opportunities through the macro lens. In this episode, they break down the forces driving markets right now, from global liquidity and crypto regulation to AI, compute, energy, and stablecoins, and they explain why the old business cycle framework may be losing power as the exponential age accelerates.
Today's sponsor is Plus500 US. Take your trading to the next level with cross-market contracts, from precious metals to key indices, and more. Whether you're a seasoned trader in the Futures arena or brand new, Plus500's user-friendly trading platform offers you the advanced tools, market insights, and quick execution you've been looking for. Get started with Plus500 for as little as $100 at https://us.plus500.com. Trading in futures involves the risk of loss.
Timestamps:
0:00 - Introduction: The Exponential Age & Universal Code Thesis
1:40 - Trump, AI & Crypto: The Political Acceleration
4:07 - The US-China Grand Bargain: Trade, Taiwan & Nvidia
6:16 - Are We Mid-Cycle? The Case for a Supercycle
9:09 - Inflation, Compute Demand & Anthropic's Explosive Growth
11:37 - Crypto & Equity Chart Rundown (BTC, ETH, Circle, Tesla, Solar)
21:51 - Dollar, Rates & Copper: What to Watch
24:35 - Global Liquidity vs. Bitcoin: The Dominant Framework
30:02 - Portfolio Performance & Why You Shouldn't Trade Crypto
34:05 - The Buildout Has Only Begun: AI, Robots & the CapEx Supercycle
37:26 - Compute & Energy: The New GDP Formula
39:14 - The Economic Singularity & Why Old Macro Frameworks Are Broken
44:44 - Closing: Adapt Your Framework or Get Left Behind
Learn more about your ad choices. Visit podcastchoices.com/adchoices
Take the 2026 AI Engineering Survey and get >$2k in credits and AIE WF tickets!
This was recorded before Railway suffered a major GCP outage on May 19, despite being a multi-AZ, multi-zone mesh ring with HA fiber interconnects between their metal, GCP, and AWS, because workload discoverability was unintentionally still tied to GCP. All has been resolved with a post-mortem.
Railway did not start as an AI infrastructure company.
It was founded in 2020, years before agents became the default way people thought about deploying software. Jake Cooper, formerly at Bloomberg and Uber, started Railway with a simple obsession: the activation energy to ship something to production should be near zero. Push code, get a URL, iterate. No Docker files, no Kubernetes manifests, no Ansible scripts stacked on Ansible scripts.
For years, this was a slow grind. Railway spent its first 18 months hand-acquiring its first 100 users, with Jake personally greeting every Discord signup on a second monitor.
Today, Railway has raised $124m and is growing very fast. A 35-person team supports 3 million users, adding roughly 100,000 signups a week. Their bare metal data centers have a 3-month payback period vs. renting in the cloud, with 70% margins funding aggressive cloud bursting when needed. The servers they own have actually appreciated in value as RAM prices have climbed, meaning their cash in the bank plus the value of their hardware now exceeds the capital they've raised.
From rebuilding Railway's network overlay over a weekend to moving the vast majority of workloads onto its own bare metal data centers, Jake Cooper is trying to build a new cloud for an agent-native world. In this episode, Railway's founder and “conductor” joins swyx and Alessio to unpack why the next era of software infrastructure is not just “Heroku but newer,” what agents need that humans did not, and why the old deployment loop of Git, PRs, CI/CD, and static cloud resources may be heading for a rewrite.
We go deep on Railway's infrastructure stack: own-metal data centers, three-month cloud payback periods, cloud bursting, data center debt, Railpack, Nixpacks, Temporal, feature flags, Central Station, content-addressable filesystems, agent-safe production forks, and why the CLI may become more important than the canvas in an agent world. Jake also shares the founder journey behind Railway, how the company survived losing $500K/month, why it now serves millions of users with only 35 people, and why he believes the pull request is dying.
We discuss:
* How Railway went from a slow six-year grind to adding 100,000 users a week
* How Railway thinks about agents as the next dominant software species
* Why agents need version control, observability, compute, storage, and orchestration at 1000x scale
* The economics of Railway's own-metal data centers and three-month payback
* How Railway uses cloud bursting while scaling its own infrastructure
* Why data center debt can be a better tool than venture debt for infra startups
* Central Station, Railway's internal system for clustering customer feedback and incidents
* Why responsible disclosure and over-communication matter for platforms
* Why feature flags, progressive rollouts, and shadow traffic are essential for agents
* Temporal's strengths, pain points, and why workflows matter for agents
* Railpack, Nixpacks, Nix, and lazy-loaded content-addressable filesystems
* Why “cattle, not pets” may change if you can clone the pets
* Why Railway is building a new cloud from scratch instead of copying hyperscalers
* The solo founder path, focus, writing, and how Jake thinks about company building
Railway:
* Website: https://railway.com/
* X: https://x.com/Railway
Jake Cooper:
* LinkedIn: https://www.linkedin.com/in/thejakecooper/
* X: https://x.com/JustJake
Timestamps
00:00:00 Introduction: What Is Railway?
00:02:07 Jake's Path to Railway
00:06:13 Railway's Six-Year Growth Story
00:08:52 Rebuilding the Business After the Free Tier
00:11:17 Agents as the Next Software Platform
00:13:29 Railway's Infrastructure Philosophy
00:15:42 Bare Metal, Cloud Economics, and the Compute Crunch
00:17:22 Cloud Bursting and Five-Cloud Networking
00:20:20 Data Center Debt and Infra Financing
00:23:31 Data Centers in Space
00:25:24 What Agents Need From Infrastructure
00:28:24 CLIs, Canvas, and Agent-Native UX
00:35:15 Central Station, Incidents, and Responsible Disclosure
00:40:30 Safe Rollouts, SRE Agents, and Production Forks
00:45:00 AI SRE, Specs, Code, and Tests
00:48:24 Self-Replicating Infrastructure and the New Serverless
00:53:18 Heroku, Temporal, and Workflow Engines
01:04:07 Railpack, Nixpacks, and Lazy-Loaded Filesystems
01:06:01 Coding Agents, Token Spend, and Roadmap Acceleration
01:10:56 The Pull Request Is Dying
01:12:28 Feature Flags and the Agent-Era SDLC
01:16:15 Cattle, Pets, and Cloning Machines
01:19:29 Solo Founder Lessons
01:24:12 Focus, GPUs, and Building a New Cloud
01:28:20 Closing Thoughts

Transcript
Alessio [00:00:00]: Hey, everyone. Welcome to the Latent Space Podcast. This is Alessio, founder of Kernel Labs, and I'm joined by Swyx, editor of Latent Space.
Swyx [00:00:10]: Hey, hey, hey. Today we're in the studio with Jake Cooper of Railway.
Alessio [00:00:14]: Conductor of Railway.
Swyx [00:00:15]: Conductor at Railway. Yeah.
Alessio [00:00:16]: Choo-choo.
Swyx [00:00:17]: Do you actually have that anywhere, like on your business card?
Jake [00:00:20]: We call some of our volunteer moderators conductors. I don't have a business card. We're not that big yet. At some point I will.
I got handed a nice business card from the Supermicro folks, and I was like, “Damn, this is pretty official.”
Swyx [00:00:30]: Business cards are coming back.
Jake [00:00:32]: They're cool. They're hip. The conductor thing is good. We're trying to figure out what we want to call each other internally. Some people think it's super cringe and say, “You don't need a name for people internally.” Some people want to call each other something. We still don't have a really good one.
Jake [00:00:55]: We've got New Railcrews, Trainiacs. Nothing has stuck yet.
Swyx [00:01:00]: I like Trainiac. Trainiac sounds good. Railwayians. For those who don't know, what is Railway? Let's give people a crisp definition up front.
Jake [00:01:09]: Railway is the easiest way to ship anything. You go to the canvas, or you talk with Claude, and you say, “Deploy a Postgres instance, deploy my GitHub repository, run this code,” and you're off to the races.
Swyx [00:01:22]: You've got a nice animation on the landing page.
Jake [00:01:24]: Thank you. None of my work, by the way. They don't let me touch the design stuff anymore.
Jake [00:01:25]: We want to make it trivially easy not just to deploy things, but to evolve applications over time. Most tooling right now stacks entropy on top of entropy: Docker, Kubernetes, Ansible scripts, and all these other things. If we can version all of your software and keep track of all the changes, then we can make it trivial to clone environments, fork into a parallel universe, get copies of production data, get copies of any services, make changes, validate them, and collapse them back in without reproducing everything across a staging environment.

The Railway Origin Story: From Uber Systems to a New Cloud
Swyx [00:02:07]: I was looking at your background: Bloomberg, Uber. Nothing immediately stands out as, “This guy is going to found the next great platform as a service.” What prepared you for Railway?
Jake [00:02:21]: It was curiosity to keep going deeper.
I started out on front-end stuff, working on Wolfram Mathematica and porting it over. Then I briefly moved to Bloomberg, then toward Uber and distributed systems, taking the Jump Bikes systems and moving them to a distributed system built on top of Cadence, the pre-Temporal Temporal.
Swyx [00:02:44]: Which, by the way, I'm happy to talk about, pros and cons.
Jake [00:02:48]: Totally.
Swyx [00:02:51]: But let's do the Railway story.
Jake [00:02:52]: It has been a continual step of wanting an experience. Whether it's walking up to a bike, unlocking it, and having it work frictionlessly, or something else, the depth required to make that happen follows from the experience. A lot of the work I do, and a lot of the team does, is in service of that experience. We fundamentally don't care how deep we have to go. We will swim to the bottom of the swimming pool to get the experience.
Jake [00:03:17]: I don't have a physics PhD. I did an EECS degree. It has always been about figuring out the next step: how do we get there? That's what led to starting Railway for that experience and then moving all the way to bare metal data centers. I was adding patches to the kernel this week to get the experience there because I can see how much better it can be.
Swyx [00:03:49]: Other patches to the Linux kernel this week?
Jake [00:03:51]: Yeah. Not upstream. Our fork.
Swyx [00:03:52]: That's a flex. Railpack? No, this is different. This is the OS on top of Railpack?
Jake [00:03:57]: No, this is an actual kernel patch. It's always literally: what do we have to do to get that experience? Then figure it out. Anything is figureoutable.
Swyx [00:04:10]: Would you send the patch upstream, or does it not fit other use cases?
Jake [00:04:13]: Maybe. We have to work out the experience internally. It has to do with the storage layer we're building for some of the agentic stuff.
Maybe it'll be useful upstream, but it's deeply useful for us internally.

Open Source, Forks, and Non-Deterministic Versioning
Swyx [00:04:29]: You mentioned open source before. How do you think about starting from open source, and then coding agents letting you do a lot more from forks of it?
Jake [00:04:38]: GitHub's original sin is that it's almost a series of broken pointers. You have this thing, then you clone it, and now you've lost the whole upstream. How do we make it trivial for people to modify really small pieces of it?
Jake [00:04:51]: We think of Git in a discrete sense: I've either made a change and merged upstream, or I haven't. What would it look like if it were percentage-based, a little more non-deterministic, or a stream of changes that users traverse as a percentage rolled out in general and then rolled all the way up?
Jake [00:05:13]: We have the open-source kickback program and let you deploy templates because we want to make it trivial for people to version these shards over time. It solves a large problem around authentication, authorization, and security. NPM has a way to define, “Don't take any new packages.” The ideal end state is that you roll out progressively to users with the minimum impact zone and continue rolling up. JPMorgan should probably be the last one on the patch line, for all our sakes, because our money and livelihoods are there.
Jake [00:05:53]: It's okay if Johnny Vibe Coder gets a broken patch because there's so much entropy in the system that the rubber has to meet the road at some point. You have to test at varying levels.

The Long Grind: First Users, Free Tier, and Making the Business Work
Swyx [00:06:13]: I wanted to pull up this glorious chart, which is your usage or number of daily signups?
Jake [00:06:22]: Daily signups, I think.
Swyx [00:06:24]: You started six years ago. It was a slow grind, and now you're on a rocket ship.
You say, “Don't doubt your fight and don't quit.” Maybe pick out certain points that were key inflections for the company.
Jake [00:06:40]: At the start, it's about getting your first 100 users, hell or high water. We had a website and a support link. The support link was the Discord channel. I had notifications on with two monitors: the monitor I was working on and the other monitor with Discord. If anybody came in, I was immediately like, “Hey, how's it going?” It was rare, so getting those first 100 users to come back was the start.
Jake [00:07:14]: Then you build a consultancy factory because users want all these things. You have to go back to the board and ask, “What is the actual product offering I want to build on top of this?”
Jake [00:07:28]: VCs want charts that always go up and to the right, but in reality you don't necessarily want charts that look like that. For us, there have been periods of expansion where we add features to test use cases, and periods of compaction where we ask, “If the experience we have is good, how do we make it significantly better?” Maybe we strip out features that don't fit our ICP anymore.
Jake [00:07:57]: The boom from 2022 to 2023 came from the free tier. Everybody under the sun was using it.
Swyx [00:08:09]: A lot of Reddit bots and Discord bots.
Jake [00:08:12]: And crypto miners. When you build an open product on the internet where anybody can sign up, the internet is a horrible place with so many things. You go through periods of asking, “How do I reach as many people as possible?” Then, “How do I fit the exact use case for the people who really matter and are really excited about this specific thing?”
Jake [00:08:39]: Then there was a two-year period of making the actual business work. During the free-tier era, we were losing about half a million dollars a month.
Swyx [00:08:59]: On a $20 million bank account.
Jake [00:09:02]: On a $20 million bank account with maybe $50,000 a month in revenue. That's a horrible business.
I don't know how anybody invested. But you have to go through it and say, “We have an experience people love, but the business has to work.”
Jake [00:09:17]: There are two schools of thought. You can run the horrible business all the way up with bad margins, or you can go back and make it work. We've always wanted a super lean team. We're 35 people right now. It's very small.
Swyx [00:09:36]: Supporting three million already?
Jake [00:09:38]: Yeah. We're adding 100,000 users a week right now, so it's growing fast. We don't want to add headcount for the sake of headcount or throw bodies at problems. We want to build systems. It's hard to build systems during expansion because you're adding things to the system because people are asking for them or things are breaking.
Jake [00:10:00]: We had to cut off the free users for a little while, rebuild the business, and make sure it worked. We want to reach as many people as possible because software is important. It's become difficult to create things in the physical world, so it's important to make it easy for people to build in the virtual world and have access to creation. But there are legs to that journey.
Jake [00:10:30]: You can see divots in the charts. If you follow between 2025 and 2026, it's either summer or winter. People go on holiday with family.
Swyx [00:10:50]: It affects that much?
Jake [00:10:51]: Yeah. It's kind of B2C and kind of B2B. People are shipping constantly, then they stop. Our activation curve now shows more people activating on weekdays because we have more business users, so it smooths out over time.

Agents as the New Interface to Deployment
Swyx [00:11:17]: Was there a point where you started prioritizing AI development or agent development?
Jake [00:11:24]: We've prioritized agentic as a top-of-funnel thing.
Over the last six months, we've deeply prioritized agentic as a mechanism to build and deploy things because we believe the curve is so steep and that is how people will build and deploy software.
Jake [00:11:42]: It almost fundamentally doesn't matter whether this is dot-com or not because we're all on the internet anyway. If agents are going to deploy a bunch of things and we hit an inference wall at some point, we'll fix those problems. The dominant species over the next 10 years is that we've moved from assembly to C to C++ to JavaScript to words. You're going to need to close that loop.
Swyx [00:12:13]: When you say this is dot-com, did you mean buying the domain, or the general case?
Jake [00:12:17]: I mean the dot-com era, when companies had a huge run-up because people understood the internet was important. Then they hit bottlenecks, fundamental laws of physics, math didn't work, and everybody came back down to earth. But it didn't matter because the internet became so impactful. If you operate on a long enough time horizon, you should build these things anyway because you can see where it's going.
Jake [00:12:45]: That's where I think a lot of agent stuff is. You get to a point where you're running thousands of agents in parallel. What is the inference cost? What is the compute cost? How do you make that efficient? How do you coordinate all this? We have issues coordinating humans; we don't even have good tooling for that. Now we have to figure out how to get agents to coordinate, safely version changes, and know when to raise their hand for someone to intervene. Otherwise it becomes an interrupt factory.

Railway's Infrastructure Thesis: Network, Compute, Storage, and Metal
Swyx [00:13:19]: Let's go right into the technical side. What are the core infrastructure or architectural beliefs of Railway that allow you to do what you do?
Jake [00:13:29]: The primitives matter a lot for us. We need network, compute, storage, and orchestration around it.
You need control over a lot of those things. We've talked a lot about how we don't really use Kubernetes because we want higher-order control to place workloads in very specific places.
Jake [00:13:48]: The reason is that you have to be very efficient with agents: memory reuse and all these other things, or you're going to massively blow up your cost structure. Being able to rack and stack your own servers and build your own metal unlocks performance and cost. Experiences where you're running 1,000 agents in parallel are not massively cost prohibitive.
Jake [00:14:13]: Token use and compute use are blowing up. Over time, those things have to get a lot more efficient. You can get a lot of margin to make those experiences solid by building your own metal. That's all in service of offering a differentiated experience to as many people as humanly possible.
Swyx [00:14:51]: You have a data center in Singapore.
Jake [00:14:53]: Yeah. We have two in every other region now. In Singapore, we're adding a second one in Q3.
Swyx [00:14:58]: What's it like? I've never built a data center. Do you go to Equinix and say, “I want some slots?”
Jake [00:15:05]: Yeah. Equinix. You basically go and say, “I want power and I want a cage.” They say, “Great, here's what it's going to be.” You rent the cage for a period of time, fill it with racks and servers, and hook up internet to it. That's all the pieces.
Swyx [00:15:36]: Then you handle everything else.
Jake [00:15:37]: You handle everything else.
Swyx [00:15:39]: What's the math versus clouds doing it for you?
Jake [00:15:43]: If we rented in the cloud, our payback period when we go to metal is about three months.
Swyx [00:15:50]: Which is crazy.
Jake [00:15:51]: It's nuts. That's four years of depreciated hardware. You're going to see a lot of this compute crunch because hyperscalers are buying up a lot of stuff.
We're working directly with OEMs, resellers, and people building these machines: Supermicro, Dell, and others.
Jake [00:16:11]: Upstream, there's a bunch of supply pressure. When we raised our last round, between deploying capital for servers and now, the amount of money we've raised is less than the amount of money we have in the bank plus the value of the servers because the servers have appreciated as RAM has gone up. It's nuts how valuable hardware has become.
Jake [00:16:50]: If you look at hyperscalers, they deployed around $80 billion of capital expenditures this year, and next year will be more. That's a massive infrastructure build-out. You look at that and think it's crazy that they're spending way more than the Manhattan Project. But if every person is going to run dozens or hundreds of agents in parallel, you have no conceptual idea how much compute is required to make that experience happen, even if you're deeply efficient and sharing resources. And that doesn't even count inference.
Swyx [00:17:22]: How do you plan the build-out? The growth chart is so vertical. Are you usually at 100% utilization as soon as racks are live? How far ahead are you planning?
Jake [00:17:33]: We still maintain cloud presence for bursting. We work with AWS, GCP, and a few other clouds. We can rent, and then the moment we get space or power, we compact those workloads off the cloud. We started on the clouds, then built a system to migrate to our own metal. There's nothing that says you can't continually do that again, and that's exactly what we do. We never want to be compute constrained.
Jake [00:18:09]: At the start of the year, we actually became compute constrained because one upstream provider wasn't able to give us quota at the rate we needed, and the hardware was slower. I spent a weekend rebuilding our entire network overlay so we could straddle five clouds: Oracle, AWS, ourselves, GCP, and one other one.
We can do more than that now.
Jake [00:18:38]: We got into a spot where we were trying to pack instances tight because we couldn't get enough compute. That led to a few reliability issues, which are now past us. I made a tweet pointing out that it's becoming harder and harder to acquire compute at the rate these models need to acquire compute. We got bit by it.
Swyx [00:19:15]: How do you think about pricing knowing you might not have your own metal available at all times? Are you pricing assuming you need extra margin if you end up going into the cloud?
Jake [00:19:26]: Because we've built out our metal data centers, our margins on metal are around 70%. We can deeply subsidize the cloud business if we want to scale at a reasonable rate. We have a few levers: metal, which makes the margins; cloud burst; debt to buy servers; and venture capital. It's an interesting operational problem: how much cash do we have, how much should we raise, how quickly can we deploy it, and can we scale revenue as quickly as we scale compute?
Jake [00:20:05]: If we continue making it trivially easy for people to build and deploy, then the faster we close that loop and the more operationally excellent we are with capital, the faster the business can scale. It's almost a straight linear deployment rate.

Financing Infrastructure: Hardware Debt, VC, and Operational Leverage
Swyx [00:20:20]: I think infra startups raising debt is a tool people don't utilize enough or know enough about. What can you tell us about that? Is it secured against your CPUs?
Jake [00:20:32]: It's secured against our hardware.
Swyx [00:20:37]: What rates do you get? Who are the lenders?
Jake [00:20:39]: We pay prime plus a spread, and we can refinance any of the debt as rates go down. The terms are pretty good. The unfortunate thing is that Twitter has no nuance, so people say, “Venture debt bad.” But as with all things, there are specific tools and areas where you can be deliberate instead of using one tool as a hammer.
Venture capital is not the hammer for everything. You have to explore and figure out what works.Swyx [00:21:12]: VC is usually the most expensive financing you can get.Jake [00:21:15]: Yeah. I also think people think about VC incorrectly from a capital-raising perspective. Most people think, “How do I raise as much money as possible from whoever is probably the best I can get at that time?” That's close to right, but what we've tried to do is figure out what unfair advantage we can buy with that equity.Jake [00:21:34]: It's the most expensive equity you're going to give away at that point in time, assuming the company keeps getting better. How do you use it to work with someone stellar who complements you? In the seed stage, I had never started a company. Ray Tonsing had good advice, and I could text him all the time. He was really fast. Awesome.Jake [00:22:01]: Then with John and Erica at Unusual, they said, “You roughly know what you're doing building a product. We'll mostly leave you alone and be available for advice.” Amazing. Then we got to Series A and the business was an operational tire fire because we didn't know how to scale a business. Work with Erica, and Jordan is over at Redpoint, so bonus.Jake [00:22:28]: Now we've raised from TQ and FPV as we're moving into enterprises. Every step of the way, we've asked: who can we partner with at this specific time to unlock the next section of the journey? I don't know enterprise sales. As an engineer, I can eyeball what features we might need, and we have wonderful people internally who can help. But you want boardroom dynamics where everyone is aligned and asking, “How do we win this?” instead of bickering about strategy.Data Centers in Space and the Physics of ComputeSwyx [00:23:31]: You had a tweet about data centers in space. Why no data centers in space?Jake [00:23:37]: It's not “no data centers in space.” My hot take is that I think it is solvable. 
I've just never seen anybody solve it.Swyx [00:23:49]: You said, “How are you going to dissipate that much heat in a vacuum?” You're making a physics claim.Jake [00:23:55]: I haven't seen anybody prove how you're going to dissipate that much heat in a vacuum. It doesn't mean it's not possible. It just means nobody has brought it up yet.Swyx [00:24:05]: Astrophage.Jake [00:24:06]: I don't know what that is.Swyx [00:24:07]: The Martian thing. Okay, you're very logical.Jake [00:24:09]: It could work. A lot of people are putting the cart before the horse. They say, “We're going to put data centers in space.” Okay, but how? “We have time to figure it out.” It's like in The Martian where they ask how they're going to intercept something and say, “We'll figure it out.”Swyx [00:24:36]: Making a bet on human invention is weird because you blind trust that it can be solved. But with physics, there are first-principles bounds you can put on it. Maybe not. Maybe you're asking to travel time or break a fundamental thermodynamic law.Jake [00:24:57]: I don't know how VCs do this either. How do you know what's not possible and a grift versus what's possible but sounds completely insane? “We're going to put data centers in space.” Coin flip as to which it is, and I guess you'll know in 10 years. That's one cycle.What Agents Need: Versioning, Observability, and 1,000x ScaleSwyx [00:25:23]: Moving back to agents. The branching, fast spin-up, and orchestration you do feels like pre-work that happened to be exactly what agents want. What do agents want differently than humans?Jake [00:25:37]: They want the ability to version things. It's not that different; it materializes slightly differently. Agents want a way to test changes incrementally. Engineers have feature flags. Is there a reason agents can't use feature flags? I don't think so.Jake [00:25:54]: They want version control. Can we use Git or not Git? That one is up in the air. 
I think something outside Git will emerge for how we version these things over time. They need observability. You need to query what happened, when it happened, which steps failed, traces, logs, metrics, and all the rest. They need network, compute, and storage. They need to write files, save files, iterate on files, and snapshot file systems.
Jake [00:26:25]: A lot of what humans needed is in line with what agents need. Branching and forking are not different; we're just moving 1,000 times quicker. It can look like you need something massively different, but what you need is something massively better than what existed. You need orchestration massively better than Kubernetes. You need networking probably better than Envoy. It goes all the way down the stack.
Jake [00:26:55]: If the workload profile doesn't change so much as it gets massively compressed because you need thousands of these things, what assumptions change? etcd is going to melt. You need to replace it with something. You can go all the way down the stack and say, “That part has to change, that part has to change, and that part has to change.”
Jake [00:27:19]: The interesting thing about the super-exponential curve is that you have to build systems where you can rip out those parts at any time because a new bottleneck might emerge. You get good at parallel agents, and a different part of the system breaks. So it's similar to what humans needed, but at 1,000x scale.
Jake [00:27:55]: How do you do code review in the age of agents?
Swyx [00:28:00]: You throw more agents at it.
Jake [00:28:01]: You don't. But then who reviews for CVEs and all these other things?
Swyx [00:28:07]: More agents.
Jake [00:28:08]: And that's how we hit the inference wall. You can continually throw agents at the problem, but I think there's a limit to the number of agents you can throw at a problem.

CLI, Agent Handles, and Closing the Loop
Swyx [00:28:24]: You already had a CLI before it was cool. 
How is the shape of what you're exposing changing, if at all?
Jake [00:28:28]: CLIs have always been cool. The CLI changes because we think about how to give Claude, Codex, ChatGPT, or any model a handhold.
Jake [00:28:50]: A CLI is a single command: deploy, get logs, and so on. Things that were prohibitively annoying to humans are not annoying to agents. They're nice. If I handed you a CLI with 40 arguments and 600 flags, you'd think, “I'm never going to use all of this.” But if you hand it to an agent, it says, “This is excellent. I have so many handles to work with.”
Jake [00:29:24]: If you're going to expose things to agents that way, you want as many handles as possible where they can get information, query dynamic information, and close the loop quickly. Most problems right now are about how to close the loop as quickly as possible. Where does the agent get stuck, and how can you remove that?
Jake [00:29:49]: Telemetry is important. If you can tell where the agent gets stuck from the CLI and say, “12% of people deviate from the happy path because of this, and now I add this argument and drive it down to 2%,” you massively increase the rate of loop closure.
Jake [00:30:03]: That's how we think about not just the CLI, but every point in the dashboard. It's a user journey: I hear about Railway. I get something deployed. I get my first green build or aha moment. I see an endpoint, logs, whatever. Then I iterate. The iteration loop is indefinite. The user wants to deploy a new thing, a Postgres instance, change code, and keep iterating.
Jake [00:30:36]: If you focus on the iteration loops and what's blocking them from closing quickly, one thing we say internally is: you never want to be waiting on compute anymore. You always want to be waiting on intelligence. 
If you're waiting on compute, there's a bottleneck that needs to be destroyed because eventually that bottleneck becomes so large that another workflow emerges to change it.
Jake [00:31:04]: We've built a product where you push code, build it, and so on. But I fundamentally believe the push-pull loop is going away. We'll get to a point where you make a small change in production, that change is versioned across your infrastructure, you're working alongside copy-on-write versions of your database and infrastructure, and then you merge it in and it's instantaneously live. That's the holy grail of loops. The push-pull-rebuild thing is a point of friction that we're removing entirely.

Canvas as Output: Dashboards, Context Anchors, and Hyperstructures
Swyx [00:31:43]: It's incredibly fast. If anyone hasn't tried it, that fast feedback is great. My hot take is that Railway was famous for its canvas, which visualizes your infrastructure and lets you manipulate it visually. But that was for humans. For the next phase of growth, Railway CLI is more important than canvas.
Jake [00:32:05]: The canvas is funny because it's a mechanism to show changes over time. You're right that previously we used it a lot as an input. Moving forward, its goal is more like an output. You would go to the canvas, make changes, see them, and watch your infrastructure evolve. Now agents have access to the CLI and can make those changes. So the canvas becomes an output: what information does the human need at this moment to make suitable decisions about control requests? Do I approve this or not?
Jake [00:32:57]: It also has to be an anchor for your context, a port in the storm. Think of it like layers in a file system. You start with a project, then drill down into services, then into a function or code, because you want to represent the entire thing not just in your head, but in the canvas. 
Other people can share that representation, think on the same wavelength, and move quickly.
Jake [00:33:33]: A lot of organizations get in trouble as they scale because all the context lives in someone's head. “How does this microservice work?” “I have no idea; go ask this person.” Then you have whole categories of products built around context discovery. A lot of that melts away if you have a solid hierarchy and can infinitely nest services, code, context, and everything else all the way down. That's what lets you build these structures over time.
Jake [00:34:18]: It's also what lets us build what I've called hyperstructures: things that are way bigger. You look at the Golden Gate Bridge and ask, “How did we build that?” There's a meme that we lost the technology. To some extent, yes, because the coordination that built those things evolved and changed. We lost some of the art of building structure as we jammed everything into Slack.
Swyx [00:34:52]: But you jam everything in Discord.
Jake [00:34:53]: Same point. It doesn't matter. It's message passing and interrupts, message passing and interrupts.
Swyx [00:35:00]: So you're arguing there should be something better and more structured than Slack?
Jake [00:35:04]: Yeah. For sure. I think Slack is awful, and Discord is awful too.

Central Station: Context Routing, Support, and Incident Clusters
Swyx [00:35:09]: This is the equivalent of my mom test. What have you done that has your solution to this?
Jake [00:35:15]: Internally, we've built a tool called Central Station that aggregates all the context from our users. Every piece of feedback, every customer support item, everything gets aggregated into clusters. If an incident is brewing, we can determine how many users are affected and break off a discussion based on that.
Jake [00:35:40]: That is more helpful than long-running channels where you're trying to decide which channel to put something in. 
If you can dynamically aggregate information and dynamically route it to the right person based on context, it works better. We know internally that these four people are close to networking. If we see a networking thing, we can drill it down to those four people. If it's with this part, we can look at the commits. This is no longer a manual process internally.
Jake [00:36:13]: If you go to station or help.railway.com, that's why we built it. We wanted to scale with a massive amount of leverage by aggregating feedback.
Swyx [00:36:27]: This is built in-house?
Jake [00:36:28]: Yep.
Swyx [00:36:29]: I remember helping out on this one with Angelo in 2023. You scale a lot with a very small team.
Jake [00:36:38]: Yeah. We're about 10 times bigger now.
Swyx [00:36:40]: You have your full developer code here? Very cool.
Jake [00:36:44]: If you go to railway.com/stats, we expose this as a pub-sub-able thing. It's all real-time metrics. There's a way to get it as JSON somewhere if you care.
Jake [00:37:01]: We're big on trying to build everything in public and talk about what we're working on. We've had issues in the past, and we'll say, “Here's how we're fixing these things.” We've gotten compliments and flak for incident reports. We're always trying to make them better and talk with people.

Incidents, Disclosure, and Progressive Rollouts
Swyx [00:37:20]: You had a big one recently. I liked that it was scoped to 3,000. You presumably used Central Station. Talk through what happened and how you address it internally as a team.
Jake [00:37:38]: Internally, this one really sucked. It had to do with an upstream provider that didn't behave the way it had documented, which is unfortunate given they wrote the RFC for how the behavior should work. We rolled those things out, and Central Station caught it initially when a couple of users said caches weren't invalidating. 
We turned it off immediately.
Jake [00:38:03]: When you roll out to a large user base of three million people, you get a lot of disparate behaviors. We tested in staging and had tests, but we hit an edge case. We've hardened those systems, and now we can make that better. But it was a tough one.
Swyx [00:38:39]: I always wonder how private disclosure is supposed to work if people find an issue. Are they supposed to contact you first? When you run a platform, these things will happen. What channels should people pursue to quietly resolve it before it becomes a bigger incident?
Jake [00:38:59]: There's responsible disclosure. We err on the side of over-disclosing and letting you know something is wrong versus having your provider gaslight you. We've erred on sharing those things more publicly, even if they impact a small subset of users. That's a decision we've made internally. We have four values. One is honor. The honorable thing is to notify people to the widest degree at which they may have been affected or there was an issue, and then confront it head-on: why did it happen, what can we do better?
Swyx [00:39:45]: Not the whole user base. That's because of incremental rollouts and other things?
Jake [00:39:50]: Yeah. Progressive rollouts.
Swyx [00:39:54]: That should be the norm at all large platforms.
Jake [00:39:58]: It should. A variety of companies do this. There's the quote that Meta runs 10,000 different versions of Meta. To our earlier point about agents, they need the same thing. They need shadow traffic and all these other things. We've built so much ceremony around production being sacred that we need to make it trivially easy to test different behaviors in a safe environment. Then you can make mistakes in a safe environment.

Safe AI SRE: Customer Agents, Forked Environments, and Production Parity
Alessio [00:40:30]: Do you see a world where these things get automatically caught, not necessarily by your agent, but by your customer's agent? 
The cache invalidation issue seems easy to check if you know to look for it.
Jake [00:40:44]: It's hard because to determine it, we almost need to hook into your observability infrastructure. That's why we have the template loop on the platform: so you can roll things out progressively. You can roll out to Johnny Vibe Coder initially, or push a shard that someone consumes at their own leisure. Or you can roll it out over weeks: 0.1% of people, 1% of people, early adopters, then all the way up. That's the non-deterministic version control we talked about earlier.
Jake [00:41:30]: I believe that's where most things should go, because most companies end up building staged rollout systems in-house. It's the same thing built again and again at every company. There's a massive opportunity to consolidate developer debt.
Alessio [00:41:45]: You should have a free tier. Model providers give free tokens if you let them use the data. You could give free compute if someone is the number-one shard that goes out and lets you plug into their observability.
Jake [00:41:55]: We do that. That's why we talked about the impact on 3,000 people. We start with lower-impact people. Larger companies on the platform are last to receive those rollouts so they have a version of the platform that's deeply stable.
Alessio [00:42:16]: I have three services, so I'm sure I get the first rollout. You can nuke my thing at any time. There are all these SRE agent companies. Observability people also want agents that fix upstream problems. You have your own agent in the canvas now. How do you see that playing out?
Jake [00:42:39]: It's the stacking entropy problem. If you don't have primitives to make iteration in production safe, it becomes difficult. If you're an observability provider saying, “Here's the fix to this error,” assume 80% are good and make sense. 
But in the last 20% long tail of complex issues, if you let somebody stamp it, you create an opportunity for an incident.
Jake [00:43:08]: That's why forked environments are important. People have staging, but it always drifts from production. You need primitives, workflows, and experience built first-party on the platform so you can fork any service at any point in time.
Jake [00:43:33]: I think of the canvas as a sheet of transparency paper. The agent is a little guy you push up into the canvas. It should say, “I need to copy that service and that service so I can test these two things.” It gets a read-only copy of production. Anything that's PII gets marked as a transform when we clone the database, create a copy-on-write version, or read from it. Then the agent makes changes and asks, “Does this actually work?” as close to production as possible.
Jake [00:44:22]: That's how close you have to be, or you get massive drift. The system becomes unstable. You see this with massive systems built on Docker for local, Kubernetes for production, and a specific thing for something else. That complexity slows developers and becomes unstable at scale, making it hard to iterate. We want to compress that way down and say, “As close to prod as possible is where we want to be.”

From AI SRE Skeptic to Agent Believer
Swyx [00:45:00]: I was texting Erica for questions, and she says you were originally not a believer in AI SRE. Have you come around on it?
Jake [00:45:10]: I flipped, but I'm still not a believer in AI SRE if you don't have the primitives to make it safe. If you unleash AI SRE on production infrastructure without safe primitives for copying volumes and making sure things are fine, it's going to nuke your production database. It's not a matter of if, but when. I'm a big believer in making those loops safe.
Jake [00:45:33]: I was a deep AI skeptic until 2023. 
In 2024, I thought, “Maybe I can roughly make this thing do it.” In 2025, I thought, “Now I can hold this.” Over winter break, everybody came back saying, “It's almost impossible to hold this.”
Swyx [00:46:01]: Did you see this on the Claude docs? CloudBot? OpenClaw?
Jake [00:46:06]: It's gotten to a point where it's harder to hold it wrong than to hold it right. There's a scene in Avengers where Vision picks up Thor's hammer and says it's terribly well-balanced. It self-balances and works well. I'm a deep believer at this point that this will be the dominant species: assembly, C, C++, JavaScript, words.
Swyx [00:46:35]: It feels like a big jump.
Jake [00:46:37]: It is. But it's not like you abandon CPU-based discrete logic and move straight to fuzzy logic. You need both. Your skills should call code or applications or some static structure. You can use skills to distill what the procedure should be or how the code should act.
Jake [00:47:02]: I'm coming to a thesis: you need three points. You need a clear spec defining the system, the code, and the tests. When you say it out loud, if you've been in engineering long enough, you're like, “Of course. That's an RFC, tests, and code.” But they all matter. Having them together lets them reinforce each other: the spec and tests match, but the code doesn't, so reconcile it. Or the tests and code match but the spec doesn't, so reconcile that. That's the iteration loop.
Jake [00:47:41]: That's why you're seeing people talk about software factories, docs, and reconciliation. Some of that is architectural astronomy if you don't implement it, but that loop is where most things will end up.
Swyx [00:48:07]: For listeners, we've been talking about this on the pod for three years: the holy trinity of specs and tests. Itamar Friedman from Qodo is the reference if people want to look it up.

Self-Modifying Infrastructure and the End of Push-Pull-Rebuild
Swyx [00:48:18]: One thing I want to mention on the OpenClaw idea is self-modification. 
I don't know how Railway would support it, but I have my OpenClaw, and I just tell it it has the Railway CLI and can do whatever. In theory, whatever capabilities or new infra it needs, it can call the Railway CLI, provision it, and add it to itself. The agent can modify its own infra.
Jake [00:48:45]: It's nuts. I have a loop set up where you put the Railway CLI on top of something that runs on Railway. You're authenticated as whatever the current box is, and you can make any changes to it. Then you call Railway deploy, and it deploys itself.
Jake [00:49:04]: It's like: “I need to spin up this instance of this environment. I already exist in this environment. Excellent, I have access to a Postgres instance now.” That's where we want to go with agentic, self-replicating infrastructure. That's your loop: iterate in production. You continue making changes. If it works, merge it upstream. If it doesn't, throw it away.
Jake [00:49:37]: How do you make throwaway copies trivial to spin up and super cheap? The era of “I have an AWS instance with four vCPU and 16 gigs of RAM” is going to get destroyed. If you do that for agents, you need a thousand of those machines. It's prohibitively expensive compared with what we've spent a ton of time figuring out: the atomic unit of deploy, whether you call it isolates, sandboxes, or something else. Only pay for what you use, spin up instantaneously, and close the loop as quickly as possible.
Jake [00:50:15]: If the system can self-replicate safely and say, “This is my environment, I'm making these changes,” it can come back with, “Does this look good? This is a new state of infrastructure given this prompt. I think I've solved it.” Then you go back and say, “Actually, it looks different.” It does the loop again. Then you say, “Cool. Apply.”
Swyx [00:50:38]: That's retroactively obvious, which is the most useful kind. Any other comments on agent deployment on Railway?
Jake [00:50:51]: It's getting better every day. I'm on X or Twitter. 
You can always yell at me about the parts not working as well as they should, because plenty of things should work way better.

The New Serverless: Stateful, Long-Running, Pay-for-What-You-Use Linux
Swyx [00:51:04]: At this stage, when people want massively or embarrassingly parallel compute, they usually talk serverless. I feel like there's a new serverless compared to the previous five years of serverless. You're in that new bucket. Do you have comparisons or philosophical differences you want to call out?
Jake [00:51:31]: It's somewhere in between. It's the ability to run stateful, long-running workflows or executions.
Swyx [00:51:42]: Vercel has Fluid Compute, Cloudflare has some container thing, Google has App Runner and others.
Jake [00:51:55]: That's where everything is roughly going, and it's why we've been working on this for six years. We believe users need access to a computer: a box that speaks Linux. They need to deploy what they want. Other systems change the surface area of what you can build. For us, users need a computer and need to deploy anything they truly want. That's why we've focused on the primitives: network, compute, storage. If we give you those and expose them so you can run things indefinitely, that's where we believe it's going.
Jake [00:52:43]: Twitter has no nuance, so everyone says “servers” or “serverless.” It's always somewhere in the middle: I want to run it for a long time, but I don't want to provision the resource statically or pay for things I'm not using. That's been our thesis from day one: pay only for what you use, run it indefinitely, and it is full Linux.
Swyx [00:53:12]: That's why I like the naming of Fluid. It's fluid. Flexible.

Heroku, Focus, and Carrying the Torch Without Becoming the Past
Swyx [00:53:18]: Another milestone is the Heroku official deprecation. You're one of the presumptive new Herokus. “New Heroku” has been a category for as long as I've been in developer tooling. It's finally happening. What was that like? 
Any behind-the-scenes of, “This is the moment”?
Jake [00:53:42]: You have people where you're like, “You were running stuff on here? You, as this company?” It's crazy that names you would know are running on it and now coming to us saying, “We want to move a lot of this off.”
Swyx [00:54:00]: Any behind-the-scenes on why Salesforce let Heroku stagnate?
Jake [00:54:05]: I can only guess. It's hard when it's not your business. Salesforce's business is to build a great CRM. That's their focus. Then you acquire a compute business as an offshoot. A lot of early Meta people talk about focus. Boz has a write-up about how in the early days of Meta they had no money, so they were forced to focus. Then they turned on the money tree and had no reason not to split their focus.
Jake [00:54:52]: But that dilutes your product. You get offshoots where you ask, “Is this the focus of the business?” If it's not core, it languishes. A lot of companies get in trouble when they split focus because they're fighting a multi-front war, not just externally but internally for alignment. Where are we going? What are we doing? What is our purpose?
Jake [00:55:24]: If you're Salesforce-built and mission-driven, you want to work on Salesforce. Heroku is off to the side. It's not core to the business. Getting resources, budget, focus, and alignment internally becomes hard. It was a matter of time.
Swyx [00:56:06]: Kudos to them for calling it out instead of leaving it unknown.
Jake [00:56:12]: Their release was a little odd. They called it out, but they didn't say they were shutting it down. Behind the scenes, I think they issued messages to people saying they should close accounts and that they were going to deprecate and remove things over time.
Jake [00:56:30]: It's crazy because some of my first deployment experiences were on Heroku. You start with dragging things into an FTP server, then you try to get a deploy working, and then it's Heroku. It was the on-ramp for us. But the wheel turns. 
New things emerge. We're happy to carry the torch for a lot of that. But we don't want to be the new Heroku. We want to be the way people build and deploy software, and ultimately the way people monetize software over time.
Swyx [00:57:19]: It's still a big crown to be the new Heroku. There are 50 companies that fought for that.
Jake [00:57:23]: Everybody is holding some portion of it. We're happy to support people and companies. The platform works differently. The game loop is similar, but we've been dogmatic about where these things are going: primitives, agents, fan-out. Some things fit; some workflows need to change. We have an approximation of Heroku pipelines with the environment system. It's exciting. We've got a ton of people we can support, and it's growing a lot.

Temporal, Workflow Engines, and State Machines
Swyx [00:58:12]: I have one more technical question about Temporal. I've sold my shares. You're a power user and one of our earliest customers. I met you through Temporal. You built on Temporal. You have complaints. This may be the most neutral and informed conversation anyone will hear about Temporal without someone working at the company.
Jake [00:58:39]: That's fair. I've used Temporal for almost 10 years because of Cadence at Uber.
Swyx [00:58:52]: Give people a sense of what Cadence was at Uber.
Jake [00:58:57]: Cadence was the precursor to Temporal. It powers trip actions, rides, when you rent a Jump bike or scooter or car. You're running workflows for a period of time and saying, “This ride will run indefinitely until it finishes.” You attach information: you paused in this zone, so add this charge to the bill. When you end the trip, the workflow is done. That experience was powered by Cadence at the time.
Swyx [00:59:34]: I used to say it's like programming the entire user journey top-down as one function.
Jake [00:59:39]: It's a powerful idea and important. It's also important for the next phase of the agentic journey. 
You want an agent to do a specific task, be complete or incomplete on that task, and move on to the next thing. You need a way to manage workflows dynamically.
Jake [00:59:59]: Temporal was always great in theory, and great when you got it working the way you wanted in production. But it required you to model the entire journey in your head. If you didn't, you could cause issues where replaying the state of the workflow causes non-determinism.
Swyx [01:00:25]: Because it works on deterministic workflow history.
Jake [01:00:28]: Exactly. I describe it as a jet engine. If you know how to operate it and run it, it's great. But you can't hand it to people trying to build complicated things if they don't have the whole state in their head.
Jake [01:00:48]: We run our whole deployment pipeline on top of it. That's a reasonably complicated workflow: pre-commit hooks, signaling, queuing, and all the rest. We ran into the same thing at Uber. As you express a large workflow, it gets more complicated, with more states in the state machine that you have to map back to the workflow.
Swyx [01:01:15]: It's a lot of ifs.
Jake [01:01:16]: Exactly. At Uber, we built a system for doing the state machine and testing it. We've started to build some of those things here because it's grown heavily. It's not quite love-hate. When it works well, it works super well. But if someone who doesn't have full context puts something into the system that invalidates state or causes non-determinism, or spins off a ton of activities, you have to keep track of underlying SRE knobs like activity slots. Those should scale with memory, vCPU, and so on. It becomes a bear to scale.
Swyx [01:02:10]: You need a capable sysadmin running things behind the scenes. If you moved off, what would you do?
Jake [01:02:19]: We'd build our own workflow engine. 
We have a few internally that we've worked on.
Swyx [01:02:27]: This is one of those classes of things you typically wouldn't vibe code, but I'm wondering if you can.
Jake [01:02:33]: I still don't think you should vibe code it. You still want to run decent tests to make sure it works.
Swyx [01:02:39]: Temporal didn't invent that from scratch either. There are libraries you can run. On top of that, it's just a state machine that you have to map out. Ultimately, you define the instructions you want and run them through a state machine.
Jake [01:03:00]: It's very doable. Workflow stuff is interesting. Restate is doing neat stuff here.
Swyx [01:03:10]: You're tied into JavaScript. Are you a JavaScript maxi?
Jake [01:03:13]: Internally, we have TypeScript, Rust, and Go. We don't add more languages. Actually, we have a little C because we write BPF code and hooks. But those are the languages.
Swyx [01:03:28]: Is this for sidecars?
Jake [01:03:32]: No. It's for the networking stack, volumes, and things like that. We use TypeScript a lot because it powers the dashboard, but we're moving a lot of workflow stuff off the dashboard stack and into the infrastructure stack.

Railpack, Nixpacks, and Content-Addressable Filesystems
Swyx [01:04:00]: Cool. Any other technical infrastructure stuff? Railpacks?
Jake [01:04:07]: We built an engine for determining dependencies based on source code. It's called Railpack. We built the first version, Nixpacks, on top of Nix, and then we moved.
Swyx [01:04:17]: People have been trying to get me to adopt Nix and NixOS for four years. Is it ever going to be a thing?
Jake [01:04:23]: I don't know. We're excited about it, but it has pain points. Think of it as a stack of versioned binaries at specific slices in time. If you want version X and version Y, you bloat the package space, which blows up image size and makes real-world workloads difficult.
Swyx [01:04:53]: But you content-address it and cache it. 
In theory, there are optimizations.
Jake [01:05:00]: In theory, yes. But with a large enough user base and disparate enough machines, you run into a problem Meta described in the XFAAS paper, their internal serverless system. It becomes difficult at scale unless you break out specific runtimes.
Jake [01:05:24]: We didn't want to do that because we wanted to truly allow you to deploy anything. That was our initial thing with Nix. But we've moved toward interesting work around content-addressable file systems that can lazy-load anything from any point and page it into memory.
Swyx [01:05:48]: Amazing.
Jake [01:05:49]: The future is very bright. It's crazy, and it's going to be nuts.

Coding Agent Spend, Roadmaps, and Token ROI
Swyx [01:05:54]: Founder journey stuff?
Alessio [01:05:56]: Your cloud usage: you tweeted you're going to spend $300K this month?
Jake [01:06:01]: I think we got to $200K.
Alessio [01:06:02]: Coding agents?
Jake [01:06:03]: Yeah.
Swyx [01:06:04]: Across the company?
Alessio [01:06:05]: You only have 35 people, so I'm sure they're not all spending $10K a month. What's the distribution?
Jake [01:06:10]: I think I'm at about $25K. We have power users all the way down. We came back from winter break, and I basically said, “If you're writing code by hand, you're doing this wrong.” The tools are good enough now that you can move extremely quickly. There are issues and pain points, but you should be reviewing the code you are writing instead of writing it by hand.
Jake [01:06:40]: Architectural patterns matter more now than ever, but you shouldn't spend your time generating code you would write. If you know how to write it, ask the agent to write it and reconcile it until it looks like you would have written it yourself.
Jake [01:06:58]: People misconstrue my propensity to push people toward agents as connected to our growth and some reliability bumps. They're not necessarily related. 
The tools are good enough to move extremely quickly and build things way larger than you could before.
Jake [01:07:19]: To the earlier point about cooling data centers in space: I don't know. But with software, you can ask, “How would I build block storage from scratch? How would I do these things?” I have ideas because I have history and have read papers. Let me work them out and build massive test benches with thousands of tests, because those are now free to author. If you're not using AI systems to speed-run your roadmap and reconcile your existing system onto the future, you're missing a large point of what's happening.
Alessio [01:08:12]: What's the path to spending $3 million a month? Is it bound by ideas and things customers can absorb?
Jake [01:08:19]: For most companies, it's bound by deployment at this point. That's why we've seen a massive boom in users and companies, from Fortune 50s down, asking how to get developers to move faster. You'll probably hit your CFO before any technical limits because they'll look at the eye-watering amount of money spent on tokens. Inference costs have to come down, but we're inference constrained now. There will be price discovery around what makes sense for an org to adopt.
Jake [01:09:06]: I think you'll end up with the F1 driver concept. If someone is really adept at these things, it makes sense to put them in a $3 million car. If they're not, it probably doesn't make sense. You'll take a few people and say, “You can drive the F1 car. We need to go in this direction. Figure out if it works and prototype it.”
Jake [01:09:33]: We've done some of that and vastly accelerated our roadmap. We thought we'd ship something in a few years; now we can probably ship it in a few months because we validated it and don't have to build it incrementally. We can skip steps and move toward our vision.
Alessio [01:09:58]: A lot of people are realizing the roadmap doesn't always have a business impact, so they say tokens are too expensive. 
But if your roadmap were built to make more money by the time you built it, you'd have token pricing for it, the same way you do with sales. You'd spend a billion dollars on sales if you knew you would get $2 billion of revenue.
Jake [01:10:19]: Exactly. A naive way to measure this is the percentage of tokens that end up in production. If you can measure impact because those tokens end up in production, that's awesome. But the burden of proof will rise. Internally, we have a growing number of pull requests that haven't merged. The question becomes: how do you get this into production? It's about how quickly you can build and deploy software, which is exciting because that's our whole thing.

The SDLC Shift: Prompt Requests, Feature Flags, and Safe Rollouts

Swyx [01:10:56]: The SDLC is changing. One thesis is that the pull request is dying. It's going to be the prompt request. Beyond that, code review is also kind of dying if you have all the other systems in place. What else is changing about the SDLC?
Jake [01:11:19]: The AISRE and the tools to make it happen. AISRE is pie-in-the-sky aspirational. What does it take to get an AISRE? What tools do you need to build?
Swyx [01:11:32]: You should expose your tooling to customers at some point. The Central Station command center.
Jake [01:11:39]: We have it for template maintainers. Template maintainers can deploy and maintain templates, and they get feedback. We're going to expose those things incrementally.
Swyx [01:11:51]: Clustering around incidents. Everyone has a version of that, but I don't think anyone has solved it.
Jake [01:11:56]: I won't say we've solved it internally, but it's gotten so good that we can see incidents forming pretty quickly. At some point, those will be things either someone else builds or we build. We've always built things purpose-built for us.
If it makes sense to make it useful for users, monetize it, or turn that loop into a profit center instead of a cost center, we want to do that.
Jake [01:12:28]: Pull request is definitely dying.
Swyx [01:12:29]: Do you do first-party feature flagging and incremental rollout stuff?
Jake [01:12:34]: We have a feature-flagging engine we built internally and will eventually roll out.
Swyx [01:12:38]: I don't see it as a user. How come you didn't give us what you have?
Jake [01:12:43]: We have to beta test it. We care a lot about the quality of the things. There's plenty we've used internally that doesn't make it all the way through the journey because it fails. It works for one service but not multiple services. We'd have to build it for multiple services and know that if we released it, we'd rebuild it again and again. Some things are worth that, but many inform the roadmap.
Jake [01:13:18]: We don't want to dilute the experience by saying, “This works, but only for this service,” unless it's a core initiative. Over the next few months, we'll roll out things that work for a single service, then multiple services, then multiple services across the environment. You have to be deliberate. Otherwise you create broken disparate experiences and support load because people ask how to use the feature.
Jake [01:13:52]: It's the earlier expansion and compaction pattern. You expand the company to get features, then compact and smooth them out so the experience is stellar. You told me in the hallway, “It's gotten so much better.” Internally we're saying, “This part really sucks. We need to make it significantly better.”
Swyx [01:14:11]: I can attest to that over the last three years watching you build Railway. For listeners, feature flagging is a huge part of Uber culture. So much so that they have too many feature flags and another thing to remove feature flags. Facebook has Gatekeeper. Agents are going to need this. It's fundamental to incremental rollouts. OpenAI acquired Statsig.
GPT-5 is routing and flagging through different models.
Jake [01:14:56]: It's super important. If the software development lifecycle is going to change because we're doing things 1,000 times faster and 1,000 times more concurrently, what becomes important at scale?
Jake [01:15:16]: Before I started Railway, I built a feature-flagging product and tried to sell it. It was an easier version of LaunchDarkly. I ran into a problem: anyone small enough to adopt your technology doesn't care about feature flags, and anyone large enough to need feature flags needs so much scale that you have to build out all the infrastructure. I scrapped it.
Jake [01:15:42]: But what is old is new again. Companies are trying to move quickly, but you can't YOLO a vibe-coded thing straight into production. You need to say, “Here's my blast radius, my impact, and I want to shadow it for these users.” Feature flags. You're going to need the tools larger companies built to maintain their structures. Everything gets compressed by 1,000x so everybody can build those structures quickly.
Jake [01:16:07]: That's exactly where we are: compressing the software development lifecycle, then expanding it and adding more new things.

Cattle, Pets, and Clonable Infrastructure

Swyx [01:16:15]: Another term that comes to mind for newer developers is “cattle, not pets.” People treat production like a pet. It has a name. You baby it and keep it alive. With cattle, you can mass farm, roll out, portion parts out, and kill them.
Jake [01:16:37]: I think that might change. You can move toward having pets as long as you have a cloning machine for your pets.
Swyx [01:16:52]: Yeah.
Jake [01:16:52]: If you can snapshot every single thing at every frame, it doesn't matter if something gets obliterated because you have a snapshot of it. The things we've built right now are designed to block changes from the hermetically sealed DevOps line. You have to write a Dockerfile because you nee
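Editor's sketch: the rollout pattern Jake describes in this conversation (“Here's my blast radius, my impact, and I want to shadow it for these users”) can be illustrated with a deterministic, hash-based percentage rollout. This is only a minimal sketch under stated assumptions, not Railway's internal feature-flagging engine, which the conversation does not describe. The names `rollout_bucket`, `is_enabled`, and `shadow_users` are hypothetical and chosen for illustration.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range [0, 100).

    Hashing the flag name together with the user id means the same user
    always lands in the same bucket for a given flag, while different
    flags spread users independently.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(
    flag_name: str,
    user_id: str,
    percent: int,
    shadow_users: frozenset = frozenset(),
) -> bool:
    """Decide whether a flag is on for a user.

    percent      -- the "blast radius": share of users (0-100) who get the change.
    shadow_users -- hypothetical explicit allowlist of users who always get it.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if user_id in shadow_users:
        return True
    return rollout_bucket(flag_name, user_id) < percent
```

Because the bucket is derived from a hash rather than a random draw, raising `percent` from 10 to 50 only adds users; nobody who already had the change loses it mid-rollout.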
In today's Cloud Wars Minute, I examine how AI demand is reshaping rivalries between Google Cloud, AWS, NVIDIA, and Anthropic. Highlights 00:03 — According to reports, Anthropic has committed to a $200 billion five-year agreement for Google Cloud services and Google-designed chips, a deal that could account for more than 40% of Google Cloud's revenue backlog. 00:18 — This represents yet another escalation in the rapidly expanding partnership between Google Cloud's parent company, Alphabet, and Anthropic, following Alphabet's previously announced $40 billion investment into the company. 00:49 — The company also holds considerable infrastructure deals with providers, including AWS and NVIDIA, and what this deal underscores is the extraordinary scale of demand for AI services. The need for compute capacity has grown so large that even a $200 billion agreement may not be enough to meet future requirements. 01:33 — However, companies like Google Cloud, with the infrastructure required to support hyperscale AI development, are positioned at the very center of this massive transformation. Visit Cloud Wars for more.
Today we talk about AI's turn toward real work: Google launches Gemini 3.5 Flash for agents and long-running tasks, OpenAI starts selling multi-year reserved compute, GitHub investigates unauthorized access to internal repos linked to a malicious extension, Discord turns on end-to-end encryption by default for voice and video, and a photonics advance from the University of Pennsylvania points to far more efficient AI chips.
You can follow us on YouTube at https://youtube.com/olivernabani and you can join the Mashain Discord at https://olivernabani.com/discord
Hosted by David Cowen | Careers and the Business of Law
Everyone's talking about Harvey, Legora, Spellbook, and Ivo. Nobody's talking about what they ride on top of. Tom Baldwin - founder and CEO of Entegrata, former CIO at Foley, Sheppard Mullin, Reed Smith, and Cadwalader - argues the real story is data infrastructure. Without a single source of truth, every AI tool in your firm is working from a partial picture.
WHY THIS MATTERS?
If your firm is buying AI tools without auditing the data underneath them, this is your warning shot. Tom's framing: toaster ovens need an electrical grid.
KEY TAKEAWAYS
AI tools work on narrow tasks, not whole-firm intelligence. 50 asset purchase agreements? Great. 200 million documents? No.
Pulling documents out of your DMS strips away the metadata that makes them valuable - judge, opposing counsel, area of law, industry. That context is what AI actually needs.
Business-of-law use cases (lateral prediction, cross-sell, client attrition, FP&A) are wide open. Practice of law got all the attention.
A data lakehouse unifies data across 20-40 systems. Snowflake popularized it; Azure/Databricks/Fabric are the modern stacks.
Cost is roughly the same at 200 lawyers or 2,000 - six figures, ongoing. Compute and storage are cheap; talent is the investment.
Firms move from "nice to have" to "must have" after a near-miss. Tom's example: a firm almost fired an associate because their FTE calc didn't account for maternity leave.
The chief data officer is becoming a real C-suite role. Sidley's among the early movers.
Watch the forward-deployed legal engineer trend. Harvey is hiring practitioners for these roles.
PEOPLE MENTIONED
David Cowen - Host
Tom Baldwin - Entegrata founder & CEO
Andrew Sieja - Founder of kCura/Relativity; Entegrata's first angel investor
Renee Morris, Katrina Dittmer, Glenn LaForce - Data leaders Tom mentioned
COMPANIES AND TOOLS MENTIONED
Entegrata - Turnkey data lakehouse in Azure
Snowflake, Azure, Databricks, Microsoft Fabric - Data platform stacks
Harvey, Legora, Spellbook, Ivo - Practice-of-law AI tools
Sidley Austin - Early adopter of the chief data officer role
Watch the sharp sell-off in AI memory stocks and the rise in yields, says Kevin Green. He explains how the dichotomy between the two leads to a rocky standing in equities, adding pressure ahead of Nvidia's (NVDA) earnings. KG then tackles the commodity front in crude oil and what to watch in the marginal move higher. As for stock stories, he touches on Blackstone's new Alphabet (GOOGL) partnership, where the latter will offer AI infrastructure capacity.
======== Schwab Network ========
Empowering every investor and trader, every market day.
Subscribe to the Market Minute newsletter - https://schwabnetwork.com/subscribe
Download the iOS app - https://apps.apple.com/us/app/schwab-network/id1460719185
Download the Amazon Fire Tv App - https://www.amazon.com/TD-Ameritrade-Network/dp/B07KRD76C7
Watch on Sling - https://watch.sling.com/1/asset/191928615bd8d47686f94682aefaa007/watch
Watch on Vizio - https://www.vizio.com/en/watchfreeplus-explore
Watch on DistroTV - https://www.distro.tv/live/schwab-network/
Follow us on X - https://twitter.com/schwabnetwork
Follow us on Facebook - https://www.facebook.com/schwabnetwork
Follow us on LinkedIn - https://www.linkedin.com/company/schwab-network/
About Schwab Network - https://schwabnetwork.com/about
Recorded live at Arm's AGI CPU launch event March 24, 2026, this Arm Viewpoints panel brings together senior leaders from across the business to unpack the journey behind Arm's latest milestone. Will Abbey, Dermot O'Driscoll, Steve Halter and Eric Hayes explore how Arm has evolved from a world-class IP provider to delivering full compute subsystems—and now, silicon. The conversation traces Arm's path into the data center, the rise of AI-driven infrastructure demand, and the engineering, ecosystem, and operational shifts required to build at scale. From performance-per-watt and system-level design to supply chain readiness and partner collaboration, this discussion offers a behind-the-scenes look at what it takes to deliver compute for the AI era—and what comes next in an increasingly agentic world.
India's AI moment is louder than its rank. 100M+ ChatGPT users. #2 globally in usage. Still 76th in the world on per capita penetration. So what's actually happening on the ground?
In this episode of Z47 Moments, Vikram Vaidyanathan and Ashwin Raguraman (Head of AI) walk through The India AI Edge: a three-month primary research effort by Z47, OpenAI, and Zinnov. The report draws on first-party ChatGPT data from OpenAI and interviews with 100+ CXOs across India's largest enterprises, traditional businesses, and emerging companies.
They unpack:
Why India's AI map looks nothing like its tech map: Delhi #1 in GDP penetration, Ahmedabad in the top 5 for coding, Assam 3x the national average on education usage
The flip nobody saw coming: in mid-2024, Gen Z (18–24) overtook 25–34 as India's dominant AI cohort, and now drives nearly half of all ChatGPT messages
Work-to-non-work: how India went from 60% work usage to 65% non-work usage in a year, and what that says about penetration
The four enterprise adoption archetypes: Tinkerer, Democratizer, Transformer, Enforcer, and why ~1 in 4 Indian enterprises is stuck in the wrong one
The trillion-dollar gap to Viksit Bharat, and the specific role AI would have to play to close it
The four pillars India needs to scale: compute (200–250 MW today → 7 GW needed by 2030), talent, data (and the "data colony" question), and the companies actually being built
To read the full report, go to:
The India AI Edge Website: https://z47.com/how-india-uses-ai
Link to report: https://www.ai-edge.z47.com/The-India-AI-Report.pdf
Chapters
00:00 — Cold Open: The Stats That Set the Frame
00:49 — Inside the Report: 100M Users, 100+ CXOs, OpenAI Data
02:14 — How AI Is Redrawing India's Map
04:59 — The Gen Z Takeover
11:24 — Work to Non-Work: India's Usage Flip
15:01 — Enterprise AI: The Four Archetypes
25:27 — The Enforcer Trap (And How to Escape It)
33:21 — Can AI Close India's Trillion-Dollar Gap?
37:22 — Compute, Talent, Data, Companies: The Four Pillars
47:15 — India's AI Ecosystem & Closing
AI compute futures are now live on the CME, and IREN has raised $3B in a new convertible note offering. Welcome back to The Blockspace Podcast! Today for news, we cover IREN's new $3B convertible note – the largest convert ever for a public bitcoin miner – Trump's Q1 bitcoin equity buys, and the 90-day pause on zoning discussions for Hut 8's proposed 500 MW data center in Logan County, Illinois. Plus, Mike Alfred of Alpine Fox Hedge Fund joins us to discuss his top stock picks for AI, and Kush Bavaria of Ornn jumps on to discuss how Ornn is providing an H100 index for the CME's new AI compute futures – and his thoughts on the future of these incipient compute futures markets. Mike San Miguel of Luxor also joins us to discuss the latest in GPU markets and AI ASICs, and pseudonymous user Soup explains how he used Claude and $15 in tokens to spin up 3.5 trillion passwords to crack his long-lost bitcoin wallet.
On this week's Amigos, we suffer through Dangerous Streets on the Amiga, Gremlin's infamously broken fighting game that somehow became the pack-in face of the CD32 console launch.
The squad is complete again, and Sam arrives with a NeuroPod, cold plunge updates, red light therapy, Oura stats, and enough supplements to start a wellness startup. Then into the week's biggest tech stories: Google's new AI device and whether it's the Chromebook of the AI era or another doomed health-tech experiment, Meta's keystroke logging controversy, Microsoft's increasingly awkward OpenAI bet, why OpenAI and Anthropic are now sending engineers directly into enterprises to drive adoption, and what tools like OpenClaw, Py, and Codex actually do. Plus, Anthropic's eye-watering latest valuation, the clean girl aesthetic discourse, Brian Johnson chaos, and Sam personally buying Jackson Hole ski passes like it's 1997.
Chapters:
00:46 Sam's NeuroPod, Oura Results & Biohacking Spiral
03:33 Sam vs. Brian Johnson + The Female Biohacker Opportunity
05:09 Oura Ring vs. Whoop + Google's Wearables Ambition
07:00 Google's AI-First “Book” Laptop + DeepMind's Health Push
10:30 Why Local AI Changes Everything (Speed, Cost & Compute)
15:00 Where Is the OpenAI Consumer Device?
16:00 Voice AI, Recording & the Future of Human-Computer Input
20:30 Sam Built His Own Voice-to-AI App
22:31 Meta's Keystroke Logging: Spy Games or Honeypot?
24:00 Fake AI Jobs + Sam's “Fin Analytics” Prediction
27:02 OpenAI & Anthropic's Enterprise Conversion Strategy
29:31 The AI Backlash Is Real (Including UCF's Commencement Revolt)
31:30 Microsoft's $100B OpenAI Problem
39:31 Anthropic's Massive Raise + SF Real Estate Absurdity
41:30 OpenClaw, Py & Codex: What Is a Harness?
We're also on ↓
X: https://twitter.com/moreorlesspod
Instagram: https://instagram.com/moreorless
Youtube: https://youtu.be/-O3zyxR-wS0
Connect with us here:
1) Sam Lessin: https://x.com/lessin
2) Dave Morin: https://x.com/davemorin
3) Jessica Lessin: https://x.com/Jessicalessin
4) Brit Morin: https://x.com/brit
In this episode, Conor and Bryce chat with Marco Franzreb Salgado about profiling GPU code with NVIDIA Nsight Compute (NCU).
Link to Episode 286 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: LinkTree / Bio
Bryce Adelstein Lelbach: Twitter
About the Guest:
Marco is a software engineer at NVIDIA, where he works on improving the nvCOMP library, which offers fast GPU implementations of multiple data compression formats. For the past couple of months he has been working on a GPU implementation of the rotate algorithm.
Show Notes
Date Recorded: 2026-05-05
Date Released: 2026-05-15
ADSP Episode 237: Thrust with Jared Hoberock
ADSP Episode 284: GPU Rotate
ADSP Episode 285: GPU Rotate (Part 2)
NVIDIA CCCL
NVIDIA nvCOMP
NVIDIA Nsight Systems
NVIDIA Nsight Compute
NVIDIA CuTe DSL
NVIDIA CUDA Tile
cudaMemcpyAsync
PERF WARS: EPISODE I
Hoogle Translate partition
Singeli
ADSP Episode 97: C++ vs Carbon vs Circle vs CppFront with Sean Baxter
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Anthropic has agreed to a $900 billion valuation and is raising $30 billion. Google is in talks with SpaceX about data centers in space. SAP invests in n8n at a $5 billion valuation and partners with Parloa. DeepMind launches the AI Pointer. Amazon starts 30-minute delivery in US cities, while internal token-maxing produces absurd AI workflows. OpenAI sues Apple over its market position. In the Musk-Altman trial, Musk looks bad after Altman's testimony, and the Polymarket odds keep falling. OpenAI offers 60 days of Codex for free to Cloud Code switchers. Ford's stock rises because car batteries are now becoming data center storage. Gallup: 7 in 10 Americans do not want a data center on their doorstep. Cerebras IPO on May 14 at a $70 billion valuation, up strongly on the first day. Klarna reports strong revenue growth and is profitable again. The Nvidia CEO's foundation buys $108 million of compute from CoreWeave and donates it.
Support our podcast and discover the offers from our advertising partners at doppelgaenger.io/werbung. Thank you very much!
Philipp Glöckler and Philipp Klöckner talk today about:
(00:00:00) Anthropic $900 billion valuation
(00:05:04) Google/SpaceX: data centers in space
(00:08:22) SAP + n8n: $5 billion
(00:18:47) DeepMind AI Pointer
(00:26:06) Amazon 30-minute delivery
(00:28:26) Amazon token-maxing & Pentagon
(00:30:56) OpenAI sues Apple
(00:39:33) Musk vs. Altman: Altman's testimony
(00:44:55) Codex free for 60 days
(00:53:37) Ford pivot: batteries for data centers
(00:55:15) Gallup: Americans against data centers
(00:58:10) Cerebras IPO +68%
(01:00:58) Klarna profitable again
(01:06:52) Nvidia CEO Infinite Money Glitch
Show notes
Anthropic funding - ft.com
n8n becomes the most valuable German AI company thanks to SAP - handelsblatt.com
Jan Oberhauser (n8n) at SAP Sapphire - linkedin.com
Parloa milestone with SAP - linkedin.com
DeepMind: AI Pointer for a contextual mouse pointer - deepmind.google
Amazon launches 30-minute delivery in US cities - cnbc.com
Amazon AI - ft.com
Security test details for Microsoft/Google/xAI deleted from a US government agency's site - reuters.com
OpenAI Apple - ft.com
Altman on the witness stand: hair-raising AI safety chat with Musk - bloomberg.com
Polymarket: Will Musk win against Altman? - polymarket.com
Sam Altman tweet - xcancel.com
Ford stock AI - ft.com
7 in 10 Americans against a data center on their doorstep - washingtonpost.com
Cerebras - ft.com
Klarna turns a profit, revenue jumps - wsj.com
Nvidia CEO's foundation buys $108 million of AI compute, CoreWeave donates it - reuters.com
The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
AGENDA: 00:05:11 — Anthropic freezes secondary sales, requiring board approval for all transfers. 00:10:45 — Why Anthropic is buying capacity from Elon Musk. 00:15:35 — Anthropic's massive $200B revenue commit to Google. 00:18:55 — Goldman Sachs predicts a 24x surge in token consumption driven by agents. 00:31:05 — Will AI labs eat the app layer? The threat to Legal and CX verticals. 00:37:55 — SaaS public markets: HubSpot tanks 18% while Monday.com finds its footing. 00:42:40 — Growth theft: How Clay is commoditizing ZoomInfo's data business. 00:46:25 — Cerebras prices IPO at $150–$160 with a $48B market cap. 00:52:15 — Real Venture Capital: Celebrating the early bets by Foundation and Benchmark. 00:58:30 — Ramp's valuation vs. the Chapter 7 collapse of e-commerce card Parker. 01:06:20 — Success and Sacrifice: Is mental health the price of building a $20B company?
My guest today is Krishna Rao, the CFO of Anthropic. The center of our conversation is how he navigates the decision around procuring and allocating compute, which he describes as the canvas on which everything else gets built. We talk about what he calls the cone of uncertainty, the three chip platforms Anthropic uses fungibly across Trainium, TPUs, and GPUs, and the daily meetings they run to allocate compute between model development, internal use, and serving customer demand. He explains why the returns to frontier intelligence keep getting higher, especially in enterprise, and how Anthropic thinks about the line between platform and application and why they choose to build their own products like Claude Code. Krishna has such a unique seat watching one of the fastest growing businesses in history, and he is generous in sharing what he has learned since joining the company two years ago. For the full show notes, transcript, and links to mentioned content, check out the episode page here. ----- Become a Colossus member to get our quarterly print magazine and private audio experience, including exclusive profiles and early access to select episodes. Subscribe at colossus.com/subscribe. ----- Ramp's mission is to help companies manage their spend in a way that reduces expenses and frees up time for teams to work on more valuable projects. Go to ramp.com/invest to sign up for free and get a $250 welcome bonus. ----- Trusted by thousands of businesses, Vanta continuously monitors your security posture and streamlines audits so you can win enterprise deals and build customer trust without the traditional overhead. Invest Like the Best listeners get a special offer of $1,000 off Vanta when you go to vanta.com/invest. ----- WorkOS is the infrastructure B2B and AI-native companies use to sell to enterprise. It covers everything enterprise security requires: SSO, SCIM, RBAC, Audit Logs, AI governance, and more. 
Trusted by 2,000+ fast-growing companies, including OpenAI, Anthropic, Cursor, and Vercel. ----- Rogo is the AI platform for finance. They're building agents for Wall Street that are trained to understand how bankers and investors actually do work: from diligence and modeling, to turning analysis into deliverables. To learn more, visit rogo.ai/invest. ----- Ridgeline has built a complete, real-time, modern operating system for investment managers. It handles trading, portfolio management, compliance, customer reporting, and much more through an all-in-one real-time cloud platform. Visit ridgelineapps.com. ----- Editing and post-production work for this episode was provided by The Podcast Consultant (https://thepodcastconsultant.com). Timestamps: (00:00:00) Welcome to Invest Like The Best (00:02:29) Episode Intro: Krishna Rao (00:03:14) Compute as Anthropic's Lifeblood (00:05:17) Three Fungible Chip Platforms (00:07:31) The Cone of Uncertainty (00:09:08) Competing Ways to Allocate Compute (00:10:36) What Drives Compute Efficiency (00:12:38) Why Frontier Returns Are So High (00:16:32) How Claude Code Writes Its Own Code (00:18:46) Will Talent Become Obsolete? (00:20:07) How Scaling Laws Are Holding (00:21:54) Exponential Thinking (00:23:17) The Layer Cake of Compute (00:26:36) How Anthropic Deploys New Compute (00:27:53) Platform v. 
Application Layer (00:32:42) Why Model Pricing Has Stayed Stable (00:35:26) Measuring Return on Compute (00:37:22) Working With Chip Providers (00:38:32) How Anthropic's Finance Team Uses Claude (00:41:32) The Jevons Paradox for Labor (00:43:08) Anthropic's Fundraising & Growth Journey (00:47:31) The Exponential Revenue Curve (00:49:02) The Hardest Thing to Explain to Investors (00:52:15) AI's Public Perception Problem (00:55:38) Mythos (00:57:31) Relationship With Government (00:58:51) Inside Anthropic's Culture (01:03:48) The Next Frontier: Virtual Collaborators (01:06:22) How Leaders Scale With a Business (01:10:55) The Biggest Risks to Continued Progress (01:12:09) What Krishna is Excited About (01:13:45) The Kindest Thing
The future of AI isn't a smarter chatbot. It's a model that watches your screen, listens to the room, and acts on what it sees. We dug into Thinking Machines' new interaction model, what it means for compute, and the layoff wave that's already here.
This week's roundtable: Anastasios Angelopoulos (CEO of Arena, formerly LMArena), Nick Harris (CEO of Lightmatter, photonic computing chips), and Philip Johnston (CEO of StarCloud, building megawatt data centers in space).
Thank you to our exclusive sponsor:
PayPal Open, One Platform for All Business: http://paypalopen.com/
Timestamps:
0:00 Cold open
1:21 Welcome to Episode 13
2:51 Is China closing the AI gap? Arena's data
5:16 Lightmatter and the photonic interconnect bottleneck
9:42 StarCloud 2, Nvidia Space Ruben 1, and orbital data centers
17:24 Thinking Machines' interaction model: what's actually new
28:22 Whisper Flow and the 3-pedal desk setup
33:48 Real-time desktop and camera awareness as the real unlock
40:25 Why this 100x's compute demand
42:43 The polarization of compute and $10M personal data centers
49:25 The layoff wave: Cloudflare, PayPal, Coinbase, Upwork
54:48 The 10x gap between AI-first and non-AI-first employees
59:52 Unlimited agency and the abundance future
1:00:46 Anthropic's Project Luna runs a retail store
1:03:45 Decoupling labor from value creation
1:05:03 P(doom) round
Today's article asks the right question for the first time in 10 days of news. Who actually benefits from AI? Not who lobbied the bill. Not who got the federal regulation. Not who showed up to the protest. Who gets the money, the time saved, the leverage.
The honest answer has three parts. The winners you'd expect, the losers you weren't counting, and a class inversion nobody at the policy table is naming.
Picks and shovels. Nvidia is up four trillion in market cap. Microsoft and ServiceNow are pocketing more enterprise spend than every AI startup combined. The AI labs are the visible winners but a thin slice of the actual margin.
Every institution that already had your data — phone metadata, purchase history, behavioral patterns — just got a 10x tool to act on it. The beneficiary depends on which seat you're in. The customer is rarely in the winning seat.
The question is wrong. AI doesn't benefit anyone. It redistributes leverage to whoever already had it. Compute owners. Capital owners. Regulatory incumbents. The data brokers who sat on it for a decade.
The losing column is starting to show in the labor data. Entry-level white collar. Coding bootcamps closing. Big-law summer associate classes getting cut. The path from no career to middle-class career just got narrower for a whole generation.
AI delivers real things. A kid in Boise gets diagnosed in 11 minutes instead of four years. A small business gets a marketing engine that used to need an agency. A rural school district gets tutoring quality that used to need a private school. Those benefits are real.
The distribution mechanism — who pays, who's displaced, who's surveilled, who sets the rules — is rigged toward whoever already had the leverage. Both are true at the same time. The hard part is refusing to pretend only one is. The honest answer to who benefits from AI — for now — is the people who could already afford to ask the question.
⏱️ Chapters
0:00 — The right question, finally
0:25 — MiniDoge: picks and shovels — Nvidia, hyperscalers, enterprise software
0:50 — Nyx: every institution with your data just got a 10x tool
1:15 — HH: AI doesn't benefit — it redistributes leverage
1:30 — MiniDoge: the bottom rung of the white-collar career path is shrinking
2:00 — Saarvis: the technology is real, the distribution is rigged
⚡ Learn agentic ai free - https://staas.fund/ai-workshop ⚡
Missiles in the Strait of Hormuz. Brent jumps 5%. Bitcoin breaks through $80. The Bits + Bips crew reads the geopolitical tape — and explains why crypto is shrugging it off. --- Thank you to our sponsor! Coinbase One — coinbase.com/unchained Heads up! If you haven't yet, be sure to subscribe to Bits + Bips, since the show will migrate there in a few weeks. Follow us on Apple Podcasts, YouTube, Spotify, X, Unchained and wherever you get your podcasts. ---- Iranian cruise missiles struck commercial vessels in the Strait of Hormuz, Brent jumped 5%, and Bitcoin broke through $80 — all in the same day. The Bits + Bips crew unpacks what the escalation means for crypto and macro positioning, why Ram stays bullish, and whether Paul Tudor Jones is right that Bitcoin is now the best inflation hedge. They also break down the Clarity Act's yield compromise — with Circle up 16% — and why Austin argues banks may have handed asset managers a structural win. Finally, a U.S. court filing targeting Arbitrum's frozen North Korean funds raises a bigger question: can you serve legal papers on code, and what does that mean for DAO governance? Austin Campbell, Ram Ahluwalia, and Chris Perkins break it all down. Hosts: Austin Campbell (@austincampbell) — Founder, Zero Knowledge Consulting; Adjunct Professor, NYU Stern Ram Ahluwalia, Co-Host, CEO of Lumida Chris Perkins, Co-Host, CEO of 250 Digital Asset Management Learn more about your ad choices. Visit megaphone.fm/adchoices
AI Chat: ChatGPT & AI News, Artificial Intelligence, OpenAI, Machine Learning
In this episode, we explore Anthropic's latest advancements, including Claude's new dreaming feature for self-improvement and significant cloud computing partnerships with Google and SpaceX. We also discuss the urgent security concerns raised by Red Access regarding apps leaking sensitive data and Google's decision to shut down Project Mariner.
Chapters
00:00 Introduction
01:59 Anthropic's Dreaming Feature
06:00 Google's Project Mariner Shutdown
09:58 Data Leaks in App Development
14:01 Cloud Computing Partnerships
15:01 Conclusion and Insights
Show Links
Get the top 80+ AI Models for $8.99 at AI Box: https://aibox.ai
How I Grow and Scale My Business with AI: https://www.skool.com/aihustle
Show Articles
Read more on AI Chat Daily:
Thousands of Vibe-Coded Apps Leak Medical Records and Corporate Strategy Decks
Red Access Finds 2,000 Vibe-Coded Apps Leaking Medical and Corporate Data
Raul Martynek, CEO of DataBank, joins TITV Host Akash Pasricha to discuss Anthropic's deal to utilize xAI's Colossus data center and whether Elon Musk is pivoting to a neo-cloud business model. We also talk with The Information's Rocket Drew about Shivon Zilis's testimony in the Musk v. OpenAI trial, and we get into the $22 billion valuation of Kalshi and Polymarket's shaky U.S. homecoming with Michael Roddan and Yueqi Yang.
Articles discussed on this episode:
https://www.theinformation.com/newsletters/ai-agenda/musk-giving-xais-servers-anthropic-ai-video-app-developer-reka-acquires-video-generating-startup
https://www.theinformation.com/articles/polymarkets-homecoming-shaky-u-s-ceo-awol
Subscribe:
YouTube: https://www.youtube.com/@theinformation
The Information: https://www.theinformation.com/subscribe_h
Sign up for the AI Agenda newsletter: https://www.theinformation.com/features/ai-agenda
TITV airs weekdays on YouTube, X and LinkedIn at 10AM PT / 1PM ET. Or check us out wherever you get your podcasts.
Follow us:
X: https://x.com/theinformation
IG: https://www.instagram.com/theinformation/
TikTok: https://www.tiktok.com/@titv.theinformation
LinkedIn: https://www.linkedin.com/company/theinformation/
Chapters:
00:00 - Introduction
01:13 - Anthropic's Data Center Deal with xAI
10:41 - Shivon Zilis Testifies in OpenAI Trial
13:57 - Kalshi Raises $1B at $22B Valuation
15:17 - Polymarket's Shaky U.S. Homecoming
I write this with reluctance because I know that I will receive hundreds of emails correcting me on a few niggling little details. But write on, I must. “Write on, write on, write on.” “Cost of Compute” refers to the $8 to $13 that every AI company has to spend on electricity and short-lived computer chips for every $1 that comes through the door. Losing a dozen dollars for every dollar you touch isn't a problem when investors are showering you with cash from a fire hose. But it's beginning to look like the well has run dry. I did not want to defend where I got my information, so I went to the Goog and asked, “Oh Great Googness, why are people referring to the S&P 500 as the ‘S&P 10’?” Check this out, cub scout, straight from the AI of the Almighty Google: “The S&P 500 is being referred to as the ‘S&P 10’ because a handful of massive technology-related companies dominate the index's performance. Due to market-cap weighting, these top 10 stocks disproportionately influence the index's total return, making the ‘broad market’ performance heavily reliant on these few, AI-exposed companies. More than $40 of every $100 invested in the S&P 500 is going into just 10 companies, creating a high level of concentration not seen in decades.
In some recent periods, those top 10 stocks have accounted for nearly 90% of the entire index's gains, indicating that the remaining 490+ stocks contribute very little to the overall upward movement.” Allow me to highlight Three Big Problems. (1) Manufacturing companies, food companies, service companies, and all the other cash-hungry hopefuls that are the true wonders of the American economy have not been able to raise any money because way too many people have been dumping everything they've got into AI. (2) That money has now slowed down, which means that a lot of AI-dependent companies are now being burned by their “burn rate,” a slang term for “precisely how fast they are losing money.” (3) AI is getting worse, not better, despite the fact that everyone is repeating like parrots, “AI is worse now than it will ever be. Hour by hour, AI will just keep getting better and better forever and ever.” Okay, I can tell from the look of doubt that I see in your eyes that you need me to explain a little bit more about Problem Number 3. They can't raise prices fast enough to stop the bleeding, so most of the AI companies have reduced their Cost of Compute by 86%. “But how?” you ask. Here's how. In the recent past, you could give your $200/mo AI some detailed instructions and it would go on a deep dive to bring you the golden nuggets of information that you requested. The 86% savings of Compute Cost is because they instructed the AI to just look in the cache for what they told someone else who asked a similar question. You get a recycled answer, and the AI company saves 86%. I'll wrap this up by giving you the storyteller's definition of “inflection point.” Bad storytellers say, “This happened, then This happened, then This happened, then This happened, then This happened.” Real stories happen like this: “This happened, THEREFORE this happened, BUT then, This happened.” “BUT then, This happened” is called “an Inflection Point.” “So what's going to happen next?” Let's wait and see. Roy H. Williams
The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
AGENDA: 00:00 $45B Floods into Anthropic from Google & Amazon 05:10 OpenAI Misses Growth Targets — Is This a Real Problem? 08:40 The Rise of AI Agents: Why Humans No Longer Pick Models 12:05 "Compute ≠ Revenue": The First Crack in the AI Business Model 20:30 China Blocks $2B Manus Deal — AI Cold War Escalates 34:10 Why Google May Be the Biggest Winner in AI Infrastructure 41:50 The Death of SaaS? Agents Replace Apps Like Jira & Canva 46:20 Thoma Bravo Hands Medallia to Creditors — $5B Wiped Out 52:10 The Collapse of Private Equity Exit Routes in VC