Podcasts about Andrej Karpathy

AI researcher at Tesla

  • 165 PODCASTS
  • 278 EPISODES
  • 54m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Jun 4, 2026 LATEST
Andrej Karpathy

POPULARITY (chart; x-axis years 2019 to 2026)


Best podcasts about Andrej Karpathy

Latest podcast episodes about Andrej Karpathy

Microsoft Cloud IT Pro Podcast
Episode 429: Getting started with LLM Wikis

Jun 4, 2026 · 44:04 · Transcription Available


Welcome to Episode 429 of the Microsoft Cloud IT Pro Podcast. In this episode, Scott and Ben dig into the concept of LLM wikis, specifically building personal knowledge management vaults using Obsidian, markdown, and AI tooling like Claude Code, GitHub Copilot CLI, and Copilot Cowork. The core idea comes from a gist by Andrej Karpathy and involves creating a structured folder of markdown clippings that an LLM can reason over to extract entities, concepts, and sources, building a searchable, graph-linked knowledge base over time.

Scott walks through how he wired up Obsidian Web Clipper and an RSS Dashboard plugin to feed articles into his vault automatically, then had the LLM help build a Python script to automate the ingest workflow and cut down on token usage. The conversation expands into how Copilot Cowork fits into this workflow as a scheduling harness, with practical examples of using it to pull email from an inbox daily, convert messages to markdown, and generate a prioritized to-do list.

Ben shares how he applied the same approach to 428 episodes of podcast transcripts, and both hosts note that token costs can run high fast without some upfront thinking about optimization. Scott closes with a reminder that pulling data into plain markdown sidecars outside of IRM and sensitivity label protections means teams should stay mindful of organizational data policies.

Your support makes this show possible! Please consider becoming a premium member for access to live shows and more. Check out our membership options.

Show Notes
  • LLM Wiki
  • GitHub Copilot Wiki: An AI-Powered Second Brain Template
  • Karpathy’s LLM Knowledge Base Wiki for Enterprise
  • Karpathy’s LLM Wiki? No Code with Claude or Github Copilot!
  • sametbrr/llm-wiki-manager

Sponsors
  • TrustedTech is a leading Microsoft Cloud Solution Provider (CSP) specializing in Microsoft Cloud services, Microsoft perpetual licensing, and Microsoft Support Services for medium and enterprise-sized businesses. Their robust team of in-house, U.S.-based Microsoft architects and engineers are certified in all 6/6 Microsoft Solutions Partner Designations in the Microsoft Cloud Partner Program. M365 Licensing Consultation, M365 Tenant Assessment, Copilot Readiness Assessment.
  • ShareGate is your migration and governance solution for Microsoft 365. ShareGate helps your teams simplify tenant migrations, get Copilot-ready, and take control of Microsoft 365 governance.
  • Nasuni is a leading unstructured data platform for enterprises where file data is mission-critical for both people and AI. Nasuni powers the operational file layer where work happens, helping organizations manage, protect, and activate data so teams can work smarter, reduce costs, and operate securely without limits.
  • Intelligink: Would you like to become the irreplaceable Microsoft 365 resource for your organization? Let us know!
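The Episode 429 description mentions a Python script that automates the ingest workflow and cuts down on token usage. The script itself is not shown on this page, so the following is only a minimal sketch of one way such a step might work, assuming a folder of markdown clippings and a JSON manifest of content hashes. The function names, the manifest format, and the folder layout are all hypothetical. The idea is to send a clipping to the LLM only when it is new or has changed since the last run.

```python
import hashlib
import json
from pathlib import Path


def content_hash(text: str) -> str:
    """Return a stable fingerprint for a clipping's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def load_manifest(manifest_path: Path) -> dict:
    """Load the {relative_path: hash} record of already-ingested clippings.

    The manifest file and its format are assumptions made for this sketch.
    """
    if manifest_path.exists():
        return json.loads(manifest_path.read_text(encoding="utf-8"))
    return {}


def find_new_clippings(clippings_dir: Path, manifest: dict) -> list:
    """List (relative_path, hash) pairs for new or changed markdown files."""
    pending = []
    for md_file in sorted(clippings_dir.rglob("*.md")):
        key = md_file.relative_to(clippings_dir).as_posix()
        digest = content_hash(md_file.read_text(encoding="utf-8"))
        if manifest.get(key) != digest:
            pending.append((key, digest))
    return pending


def mark_ingested(manifest: dict, pending: list, manifest_path: Path) -> None:
    """Record processed clippings so the next run skips them."""
    manifest.update(dict(pending))
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
```

A run would call `find_new_clippings`, pass only the returned files to whatever LLM tooling is in use, and then call `mark_ingested`. Unchanged clippings never reach the model, which is one plausible source of the token savings the hosts describe.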

The Startup Podcast
Why Gary Lo's product strategy has evolved in the era of DIY software

Jun 1, 2026 · 52:40


What if writing software became as easy as taking a selfie?

This episode, Yaniv Bernstein sits down with Gary Lo - founder of OpenBA and one of the sharpest AI-and-startups thinkers Yaniv knows - to discuss the concept of 'selfie software': disposable, hyper-personal, AI-generated tools that anyone can create for themselves, with no hand-written code.

AI-generated tools like these are changing the startup landscape. While founders now have more tools at their disposal, it's now more necessary than ever to create a product that truly disrupts the market.

Gary and Yaniv discuss all of this and more, likening Claude and ChatGPT to Windows and Mac, and exploring what this tech landscape means if you're building a software startup today.

In this episode, you will:
  • Understand the 'selfie software' concept: why AI is making software disposable, personal, and low-stakes, and what that means for the market you're building in
  • Learn why AI platforms are forcing startups to rethink whether they should build on their own infrastructure or embed into Claude and ChatGPT instead
  • Hear Gary's 'burn it down' exercise: how to identify which parts of your product are genuinely defensible, and which will simply catch fire in the next AI wave
  • Understand why software engineering isn't dead, but the problems worth solving with it have fundamentally shifted

Timestamps
00:00 Coming Up...
01:09 On Today's Show: Gary Lo on 'Selfie Software'
02:48 About Gary
03:16 How 'Hyper-Personalized' AI Is Like Photography
05:36 Gary's Real Estate Workflow (OpenBA)
07:29 Defining 'Selfie Software': Why Custom Tools Win
10:33 So... Is It Bad Software?
13:29 'Can't You Just Add This One Thing...'
15:30 When Personalization Becomes Bloat
17:42 Working In-App with Anthropic and OpenAI APIs
20:29 Token Economics and Moats
25:28 Microsoft's Lessons in Platform Power
30:27 But What If Anthropic Comes For My Vertical?
32:48 How Open Source Keeps AI in Check
35:18 Unlearning and Rebuilding
39:05 Gary's 'Burn It Down' Test
44:01 Is Software Engineering Dead? (No.)
50:17 Closing Thoughts

Resources in this episode
  • Gary Lo's previous TSP episode (on OpenClaw and Claude Cowork): https://youtu.be/V3YFghiy8p0
  • Garry Tan's gstack: https://github.com/garrytan/gstack
  • Andrej Karpathy on Software 2.0: https://karpathy.medium.com/software-2-0-a64152b37c35
  • Vera (Yaniv's startup, AI-supported guidance for people caring for ageing parents): https://vera.guide

The Pact
Honor the Startup Podcast Pact! If you have listened to TSP and gotten value from it, please:
  • Follow, rate, and review us in your listening app
  • Secure your official TSP merchandise at https://shop.tsp.show/
  • Follow us here on YouTube for full-video episodes: https://www.youtube.com/channel/UCNjm1MTdjysRRV07fSf0yGg
  • Give us a public shout-out on LinkedIn or anywhere you have a social media following

Key links
This episode of the Startup Podcast is sponsored by .tech domains. Forget weird prefixes and creative misspellings; the availability for .tech domains is simply way better than .com. For a clean name that highlights your tech credentials, get a .tech domain at your favorite registrar.

This episode of the Startup Podcast is sponsored by Vanta. Vanta helps businesses get and stay compliant by automating up to 90% of the work for the most in demand compliance frameworks. With over 200 integrations, you can easily monitor and secure the tools your business relies on. For a limited time offer of US$1,000 off, go to https://www.vanta.com/tsp

The Startup Podcast website: https://www.tsp.show/episodes/

Learn more about Chris and Yaniv
  • Work 1:1 with Chris: http://chrissaad.com/advisory/
  • Follow Chris on Linkedin: https://www.linkedin.com/in/chrissaad/
  • Follow Yaniv on Linkedin: https://www.linkedin.com/in/ybernstein/
  • Producer: Justin McArthur https://www.linkedin.com/in/justin-mcarthur
  • Assistant Producer: Steph Hefferan https://www.linkedin.com/in/steph-heff/
  • Intro Voice: Jeremiah Owyang https://web-strategist.com/

El podcast de El Club de Inversión
312 - Are We in an AI Bubble? 4 Tips to Protect Your Money NOW

May 28, 2026 · 18:50 · Transcription Available


Discover your investor profile here.

In this podcast I explain why 92% of United States economic growth in 2025 depends exclusively on artificial intelligence, and why this level of concentration may be a sign of a possible bubble. I walk through what a financial bubble is, how AI investment fits this historical pattern, and why the Magnificent Seven have reached an unprecedented level of dominance within the S&P 500.

Next I show you the latest Goldman Sachs data to understand whether current valuations really justify the price of technology stocks. We review the P/E ratio, real earnings growth, and cases such as NVIDIA to assess whether we are facing a speculative bubble or growth grounded in real revenue.

Later I dig into the enormous risk posed by AI capex, circular financing, hidden debt, and the role of AI as a pillar of US GDP. I also explain why experts such as Andrej Karpathy contradict optimistic predictions about the arrival of AGI, and what this decoupling between technological expectations and reality implies.

Finally I give you four practical tips as an investor for navigating this scenario: how to invest for the long term, how to diversify across the AI value chain, how to avoid FOMO, and how to prioritize companies with solid fundamentals. We close by analyzing whether we are facing a bubble, a revolution, or both at once, and what this means for your investments in the coming years.

Let's Talk AI
#246 - Gemini 3.5 + Omni, Musk Loses, OpenAI vs Erdős

May 25, 2026 · 93:59


Our 246th episode with a summary and discussion of last week's big AI news!

Recorded on 05/22/2026

Hosted by Andrey Kurenkov and Jeremie Harris

Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
  • Google I/O highlights included Gemini 3.5 (with 3.5 Flash emphasized for speed and benchmarks), the always-on agent Gemini Spark running on Google Cloud with MCP tool support, and Gemini Omni multimodal video generation/editing, plus updates like Antigravity 2.0, Gemini for Science, and Genie world-model navigation using Street View and Waymo simulation.
  • Coding-agent competition accelerated with Cursor Composer 2.5 (fine-tuned on Moonshot's Kimi K2.5) and xAI's early Grok Build release, alongside discussion of potential Cursor–xAI ties and xAI's talent churn and compute utilization concerns.
  • Business and legal updates included Elon Musk losing his OpenAI lawsuit on statute-of-limitations grounds, reported OpenAI–Apple partnership tensions, Anthropic agreeing to a $30B funding round at a $900B valuation and projecting its first profitable quarter, and Cerebras' IPO surging about 90%.
  • Research and safety stories covered OpenAI's result on an 80-year-old Erdős geometry problem, findings on “negation neglect” in training, interpretability work showing multiple redundant circuits per capability, agent benchmarks like TerminalWorld, new deepfake takedown enforcement under the Take It Down Act, demonstrations of autonomous hacking/self-replication, rapidly improving AI cyber capabilities, and steps toward image provenance metadata and watermarks.

Timestamps:
(00:00:10) Intro / Banter
(00:01:15) News Preview

Tools & Apps
(00:05:05) Google unveils AI model Gemini 3.5 and AI agent Gemini Spark
(00:11:43) Google's Gemini Omni turns images, audio, and text into video — and that's just the start | TechCrunch
(00:17:27) Google launches Antigravity 2.0 with an updated desktop app and CLI tool at IO 2026 | TechCrunch
(00:22:35) Google Debuts AI-Powered Tools To Optimize Scientific Research Workflows
(00:27:20) Google's Genie world model can now simulate real streets with Street View | TechCrunch
(00:29:51) Cursor's Composer 2.5 matches Opus 4.7 and GPT-5.5 benchmarks at a fraction of the cost
(00:37:37) xAI Introduces Its Coding Agent Called Grok Build

Applications & Business
(00:41:55) Musk loses OpenAI court battle as he waited too long to sue
(00:48:08) Anthropic agrees terms of $30bn funding deal at $900bn valuation
(00:53:12) OpenAI co-founder Andrej Karpathy joins Anthropic's pre-training team | TechCrunch
(00:56:49) Greg Brockman Officially Takes Control of OpenAI's Products in Latest Shake-Up | WIRED
(00:58:15) OpenAI-Apple Partnership Frays, Setting Up Possible Legal Fight - Bloomberg
(01:01:13) AI chipmaker Cerebras soars 90% in year's biggest IPO so far

Research & Advancements
(01:07:10) AI just solved an 80-year-old ‘Erdős problem,' and mathematicians are amazed | Scientific American
(01:11:50) Negation Neglect: When models fail to learn negations in training
(01:13:18) All Circuits Lead to Rome: Rethinking Functional Anisotropy in Circuit and Sheaf Discovery for LLMs
(01:16:20) Autonomous AI research for nanogpt speedrun
(01:21:59) TerminalWorld: Benchmarking Agents on Real-World Terminal Tasks

Policy & Safety
(01:23:15) America's dangerous, messy deepfakes crackdown is here | The Verge
(01:25:17) Language Models Can Autonomously Hack and Self-Replicate
(01:28:48) How fast is autonomous AI cyber capability advancing?
(01:31:32) Positive Alignment: Artificial Intelligence for Human Flourishing

Synthetic Media & Art
(01:33:15) OpenAI is making it easier to check if an image was made by their models | TechCrunch
(01:33:56) How Chinese short dramas became AI content machines | MIT Technology Review

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

The Generative AI Meetup Podcast
Karpathy Joins Anthropic and the AI Compute Gold Rush

May 24, 2026 · 99:57 · Transcription Available


This week on AI Meta, we break down Andrej Karpathy's move to Anthropic, Claude's growing developer mindshare, and why recursive self-improvement may be the next major frontier in AI. We also cover Google's latest Gemini announcements, Anthropic's reported compute deal with xAI/SpaceX, the rise of gray-market Claude API access in China, OpenAI's ongoing drama, Cerebras, Nvidia, Intel, and Leopold Aschenbrenner's massive AI infrastructure bets.

Plus: SpaceX IPO speculation, Cursor, Grok, and why the AI economy increasingly looks like a global casino. Not financial advice.

https://novacut.ai

Leveraging AI
295 | The Foothills of the Singularity: Connecting the Dots on the Week AI Quietly Became “a Profound Moment for Humanity" (quotes from Demis Hassabis, CEO Google Deep Mind) May 22, 2026

May 23, 2026 · 44:00 · Transcription Available


This week wasn't just another wave of AI announcements. It may have been the week the industry quietly crossed into a different phase entirely.

In this episode, Isar connects the dots behind one of the biggest weeks in AI so far, from Anthropic's explosive growth, to Google I/O, OpenAI's legal win, NVIDIA's record earnings, and Andrej Karpathy joining Anthropic to work on recursive self-improvement.

Individually, each story matters. Together, they point to something bigger: accelerating AI capability, accelerating infrastructure buildout, and growing signals from the people closest to the frontier that we may be entering a very different era.

The quote that framed the episode came from Demis Hassabis: “We were standing at the foothills of the singularity. It will be a profound moment for humanity.”

This episode breaks down what that actually means, and why the implications go far beyond new models and product launches.

In this session, you'll discover:
- Why Anthropic's projected $44B annualized revenue shocked the industry
- How Anthropic became more profitable per user than OpenAI, Google, and Microsoft
- Why Andrej Karpathy joining Anthropic may be one of the year's biggest AI stories
- What recursive self-improvement (RSI) means, and why labs are racing toward it
- How OpenAI's legal win against Elon Musk clears the runway for a potential IPO
- Why Google's AI strategy suddenly looks both confusing and incredibly ambitious
- What Google's shift from “search” to autonomous AI agents means for websites and SEO
- Why AI solving an 80-year-old math problem matters more than most people realize
- How NVIDIA, SpaceX, and compute infrastructure are becoming central to the AI race
- Why electricity, not chips, may become the biggest bottleneck in AI expansion
- What Demis Hassabis means when he says we're at the “foothills of the singularity”

About Leveraging AI
- The Ultimate AI Course for Business People: https://multiplai.ai/ai-course/
- YouTube Full Episodes: https://www.youtube.com/@Multiplai_AI/
- Connect with Isar Meitis: https://www.linkedin.com/in/isarmeitis/
- Join our Live Sessions, AI Hangouts and newsletter: https://services.multiplai.ai/events

If you've enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!

All-In with Chamath, Jason, Sacks & Friedberg
SpaceX's $2T Case, Nvidia's Shock Selloff, America Turns on AI, Trump Pulls AI Order, Bond Crisis?

May 22, 2026 · 102:00


(0:00) Gavin Baker joins the show!
(0:30) Andrej Karpathy joins Anthropic; hypergrowth and profitability
(12:42) Why Americans have turned on AI, anti-human perception
(27:22) Trump pulls AI EO, US-China AI relationship, dystopian AI layoffs
(45:19) SpaceX S-1 tear down! Breaking down the three major businesses and the case for a $2T valuation
(1:11:22) Nvidia smashes earnings but stock falls, why people are shorting chips
(1:22:25) Market update: Flashing red signals, oil, inflation, yields up
(1:32:45) China trip flops, or was progress made behind the scenes?

Follow Gavin Baker: https://x.com/GavinSBaker
Apply for Summit 2026: https://allin.com/events

Follow the besties:
https://x.com/chamath
https://x.com/Jason
https://x.com/DavidSacks
https://x.com/friedberg

Follow on X: https://x.com/theallinpod
Follow on Instagram: https://www.instagram.com/theallinpod
Follow on TikTok: https://www.tiktok.com/@theallinpod
Follow on LinkedIn: https://www.linkedin.com/company/allinpod

Intro Music Credit: https://rb.gy/tppkzl https://x.com/yung_spielburg
Intro Video Credit: https://x.com/TheZachEffect

Referenced in the show:
https://www.cnbc.com/2026/05/19/anthropic-hires-openai-cofounder-andrej-karpathy-former-tesla-ai-lead.html
https://github.com/karpathy/autoresearch
https://github.com/multica-ai/andrej-karpathy-skills/stargazers
https://techcrunch.com/2026/05/19/openai-co-founder-andrej-karpathy-joins-anthropics-pre-training-team
https://www.wsj.com/tech/ai/mind-blowing-growth-is-about-to-propel-anthropic-into-its-first-profitable-quarter-7edbf2f4
https://x.com/i/broadcasts/1dxYljYVREYJX
https://apnews.com/article/trump-ai-executive-order-ee318f35acc8a2c43e47f3ebf26cb459
https://x.com/wallstengine/status/2057378437485216031
https://x.com/MorePerfectUS/status/2056842597117636890
https://x.com/lulumeservey/status/2057239284487201043
https://polymarket.com/event/spacex-ipo-closing-market-cap-above
https://x.com/elonmusk/status/2057228707606196434
https://www.sec.gov/Archives/edgar/data/1181412/000162828026036936/spaceexplorationtechnologi.htm
https://s201.q4cdn.com/141608511/files/doc_financials/2027/Q127/NVDA-F1Q27-Quarterly-Presentation-FINAL.pdf
https://www.ibtimes.co.uk/leopold-aschenbrenner-investment-shift-agi-over-ai-chips-1797606
https://polymarket.com/event/may-inflation-us-annual
https://www.cnbc.com/2026/05/15/inflation-rate-projected-to-hit-6percent-in-the-second-quarter-top-economic-forecasters-say.html
https://polymarket.com/event/fed-rate-hike-in-2026
https://www.cnbc.com/2026/05/18/treasury-yields-inflation-bond-rout-oil.html
https://www.cnbc.com/quotes/US10Y

The AI Breakdown: Daily Artificial Intelligence News and Discussions

A week of AI news added up to something bigger than any single story: Anthropic's path to profitability, OpenAI's math breakthrough, Google pushing AI deeper into Search and Docs, Cursor's cheaper coding model, SpaceX becoming an AI compute player, Andrej Karpathy joining Anthropic, and the political fight over AI policy all pointed in the same direction. AI acceleration is showing up across business models, model capabilities, consumer products, compute infrastructure, and regulation at the same time.

Enterprise Claw Cohort 3 Registration: https://enterpriseclaw.ai/

Brought to you by:
  • KPMG - Agentic AI is powering a potential $3 trillion productivity shift, and KPMG's new paper, Agentic AI Untangled, gives leaders a clear framework to decide whether to build, buy, or borrow; download it at www.kpmg.us/Navigate
  • Granola - The AI notepad for people in back-to-back meetings. 100% off your first 3 months with code AIDAILY at http://granola.ai/aidaily
  • Scrunch - The AI customer experience platform - https://scrunch.com/
  • Mercury - Modern banking for business and now personal accounts. Learn more at https://mercury.com/personal-banking
  • Zenflow Work - Agents for knowledge work - https://zenflow.free/
  • Drata - The agentic trust management platform - https://drata.com/
  • Blitzy - Want to accelerate enterprise software development velocity by 5x? https://blitzy.com/
  • AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
  • Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/

The AI Daily Brief helps you understand the most important news and discussions in AI.

Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614

Our Newsletter is BACK: https://aidailybrief.beehiiv.com/

Interested in sponsoring the show? sponsors@aidailybrief.ai

Software Defined Talk
Episode 573: How many quadrillions in a Googol?

May 22, 2026 · 62:24


This week, we discuss Google I/O, the OpenAI soap opera, and ChatGPT going full financial advisor. Plus, thoughts on improving the conference hallway track.

Watch the YouTube Live Recording of Episode 573

Runner-up Titles
  • Stupid Macs
  • I like my idea
  • What was I thinking?
  • Opt-in AI
  • Kentucky Derby's this Weekend
  • It's a low plateau
  • There's no vibe in X-Code
  • Matt's trading with AI
  • Everyone's watching

Rundown

Google I/O
  • A new era for AI Search
  • All the news from the Google I/O 2026 Developer keynote
  • I/O 2026: Welcome to the agentic Gemini era

AI Stuff
  • Elon Musk lost his case against Sam Altman
  • Greg Brockman Officially Takes Control of OpenAI's Products in Latest Shake-Up
  • OpenAI launches ChatGPT for personal finance, will let you connect bank accounts
  • Anthropic hires OpenAI co-founder Andrej Karpathy, former Tesla AI leader

Relevant to your Interests
  • Datacenter NIMBYism: What Did You Think Was Going to Happen?
  • Cerebras raises $5.5B, then stock pops 108%, in the first huge tech IPO of 2026
  • Andreessen Horowitz Is Spending on Politics Like No Other
  • Cisco announces record revenue and 4,000 layoffs in the same day
  • Amazon ditches Rufus chatbot, launches Alexa shopping agent in AI strategy pivot
  • Open source tool maker Grafana Labs says hackers stole its code, refuses to pay ransom
  • Anthropic has acquired the dev tools startup used by OpenAI, Google, and Cloudflare
  • From Open Source Software to Open Source Strategy
  • CISA Admin Leaked AWS GovCloud Keys on Github
  • Shai-Hulud Returns: npm Worm hits @antv in latest ongoing campaign
  • Intel CEO says foundry business is gaining momentum as customer interest grows
  • Removing the Modem and GPS from my 2024 RAV4 Hybrid

Sponsors
  • WebRTC.ventures - Real-time communication & Voice AI integration
  • Sentry - use the code: sdt26 for $100 in Sentry credits for new users

Nonsense
  • "solutions architects" in 2026

Conferences
  • WeAreDevelopers Europe, July 8-10, 2026, Berlin, Coté speaking.
  • DevOpsDays Graz, Sept 4-5, 2026
  • DevOpsDays Rockies, Sept. 22-23, 2026, Discount Code: 26DODSWEDEFTALK
  • WeAreDevelopers NA, Sept 23-25, 2026, Discount Code: DEVPOD26
  • DevOpsDays Dallas, Sept 28-29, 2026
  • DevOpsDays Vilnius, Sep 30 - Oct 1, 2026
  • DevOpsDays Istanbul, October 24th, 2026 - Coté keynoting.
  • VMware User Groups (VMUGs): Dallas (June 9-11, 2026), Orlando (October 20-22, 2026)

SDT News & Community
  • Join our Slack community
  • Email the show: questions@softwaredefinedtalk.com
  • Free stickers: Email your address to stickers@softwaredefinedtalk.com
  • Follow us on social media: Twitter, Threads, Mastodon, LinkedIn, BlueSky
  • Watch us on: Twitch, YouTube, Instagram, TikTok
  • Book offer: Use code SDT for $20 off "Digital WTF" by Coté
  • Sponsor the show
  • Sponsor more podcasts with Failover Media

Recommendations
  • Brandon: Varlock
  • Matt: Sheep Detectives

Doppelgänger Tech Talk
SpaceX S-1 & Anthropic 130% Growth | Google I/O | Nvidia Earnings | Karpathy Joins Anthropic #564

May 22, 2026 · 119:57


Naturally, today is about the SpaceX S-1 IPO filing, but first we talk about Google I/O (Universal Cart, Gemini Spark, Gemini 3.5 Flash, and much more). We compare the revenue and losses (or profits) of OpenAI and Anthropic. Andrej Karpathy moves to Anthropic, and Cursor reaches a $3 billion annual sales rate. Starlink is SpaceX's cash cow ($11 billion in revenue, $4.4 billion in profit) and subsidizes the AI division, which is growing at 12.5%. On the day of the S-1 filing, OpenAI announces a surprisingly early IPO. Binance launches SpaceX pre-IPO perpetuals. Bezos also spoke up this week. We also discuss the Forum AI study on wrong news answers from AI, Google bringing in the Contextual AI team for $100 million, and Airbnb expanding into hotels and rental cars. Earnings from Nvidia and Workday. SAP, Mistral, and our digital minister Wildberger, who apparently has his texts and speeches ghostwritten.

Support our podcast and discover our advertising partners' offers at doppelgaenger.io/werbung. Thank you!

Philipp Glöckler and Philipp Klöckner talk today about:
(00:04:00) Google I/O Recap
(00:25:44) OpenAI Q1 Earnings: $5.7 billion
(00:34:33) Anthropic Q1 & profitable in June
(00:41:21) Karpathy to Anthropic
(00:42:40) Cursor at $3 billion run rate
(00:44:39) SpaceX S-1 Filing Deep Dive
(01:19:30) SpaceX buys Cybertrucks for $140 million
(01:24:30) OpenAI IPO filing comes earlier
(01:28:29) SpaceX Pre-IPO Perpetuals
(01:32:37) Worker dies at Starbase
(01:33:28) Bezos: space data centers & tax debate
(01:38:34) Forum AI: Grok unreliable on news
(01:41:19) Google brings in Contextual AI team for $100 million
(01:41:51) Airbnb: Hotels, Rental Cars, Everything-Travel
(01:43:39) Nvidia Earnings +85%
(01:46:51) Workday Earnings +14%
(01:47:22) Zuckerberg audio: employee surveillance
(01:47:44) Enhanced Games (Steroid Olympics)
(01:52:21) Reuters: Grok in 3 of 400 US agency cases
(01:53:12) WaPo: DOGE data access kept secret
(01:54:04) Trump shields himself from the IRS
(01:54:46) Cohere acquires Reliant AI

Shownotes
  • Google I/O 2026: biggest AI announcements - theverge.com
  • OpenAI keeps $1 billion revenue lead over Anthropic in Q1 - theinformation.com
  • OpenAI action figure ad on Instagram - instagram.com
  • Anthropic becomes profitable for the first time - wsj.com
  • Andrej Karpathy moves to Anthropic - bloomberg.com
  • Cursor reaches $3 billion annual sales rate - bloomberg.com
  • SpaceX IPO: Founders Fund facing a $60 billion return - theinformation.com
  • OpenAI IPO filing comes early - wsj.com
  • OpenAI steals SpaceX's show with IPO announcement - marketwatch.com
  • Binance launches pre-IPO perpetuals for SpaceX - prnewswire.com
  • SpaceX: worker dies at Starbase - futurism.com
  • Bezos / Blue Origin: data centers in space - cnbc.com
  • WOLF Financial tweet (please verify manually) - xcancel.com
  • Study: ChatGPT, Claude, Gemini, Grok unreliable on news - bloomberg.com
  • Google: $100 million acqui-license of Bezos' Contextual AI - bloomberg.com
  • Airbnb adds hotels and rental cars - cnbc.com
  • Nvidia earnings: +85% on AI boom - theguardian.com
  • Workday Q1 earnings: stock +14% - cnbc.com
  • LayoffAI tweet - xcancel.com
  • Steroid Olympics - ft.com
  • Christian Angermayer and the Enhanced Games - theguardian.com
  • Grok flops in Washington: only 3 of 400 agency cases - reuters.com
  • Agencies refuse to disclose DOGE data access - washingtonpost.com
  • Trump shields his own tax returns from IRS audit - spiegel.de
  • Cohere acquires German AI startup Reliant AI - manager-magazin.de
  • Reliant AI founder Karl-Moritz Hermann announces Cohere deal - linkedin.com
  • Is ChatGPT writing the digital minister's speeches? - de.linkedin.com

The top AI news from the past week, every ThursdAI
AI just cracked an 80-year-old math problem nobody could solve — plus everything from Google I/O 26

May 22, 2026 · 109:18


Hey, Alex here, just got back from the sunny Shoreline Theater in Mountain view, so let me catch you up! This week was definitely Google heavy, we are covering Google's IO conference for the third year in a row, and today we have a special guest, Logan Kilpatrick, is joining to discuss the announced Gemini 3.5 Flash, Google Omni model, and the new Managed Agents offerings. Plus, this week, for the first time, OpenAI announced that AI solved a Math problem that humans couldn't solve for 80 years, Cursor is showing off Composer 2.5 which is partly trained on XAI data, Karpathy joins Anthropic and much more! Let's dive in! P.S - We've announced our upcoming hackathon, Weavehacks-4, June 6-7, I'll be there, we're expecting the seats to run out very soon so register nowThursdAI - We'd love to have your subscription, and if you're already subscribed, please hit that bell on YT to never miss an episode!Google I/O 2026 - Google goes agentic everywhereI went to cover Google I/O for the third year in a row, shoutout to the DeepMind team for inviting ThursdAI again, and folks, this one felt different.Last year, Google I/O was still very model-centric. This year, the story was not “here is another benchmark chart.” The story was: Google is putting Gemini into everything, and the agentic layer is becoming the product layer. Search, Gemini app, Android, Workspace, YouTube, AI Studio, Cloud, Antigravity, Flow, managed agents, smart glasses, all of it is now orbiting around one pretty clear strategy: Gemini is the intelligence, Antigravity is the agent harness, Google's products are the distribution. I saw many reactions that were milquetoast, as in, “we expected more” and those seem to dominate the X feed. But I think the distribution is the part that many folks on X are missing. Yes, we can argue about Gemini 3.5 Flash pricing. Yes, we can argue whether “Flash” still means what Flash used to mean. 
But when Google says the Gemini app itself has 900 million monthly active users, before even counting Search, Gmail, YouTube, Docs, Drive, Android, and the rest of the Google surface area, that's massive! OpenAI's ChatGPT has supposedly stagnated at ~900M; I don't remember them crossing 1B. Meanwhile, Google is gaining traction. And they just updated all those folks with a new model!

Wolfram said it really well on the show: his mother is not sitting there reading model cards. She just uses her Pixel, voice unlocks Gemini, asks for help, and suddenly the default intelligence available to her goes up.

Antigravity 2.0 - the agent harness takes center stage

The biggest strategic signal from Google I/O for me was Antigravity.

Remember, Antigravity was an IDE that came from the Windsurf acquisition saga. Part of the Windsurf team went to Google, part went to Cognition, and now Google is very clearly putting Antigravity in the middle of its agentic future. And I mean very clearly. Sundar mentioned it. Demis mentioned it. Varun Mohan, the co-founder, was on stage immediately after them! If you've ever watched a Google I/O keynote, you know how carefully every minute is allocated. Google has YouTube, Search, Gmail, Android, Cloud, Ads, Workspace, and a thousand VP-level products that could be on stage. The fact that Antigravity was that prominent should tell you everything.

Logan Kilpatrick joined us and framed this in a way I loved: Gemini became the through-line across Google products, and now the Antigravity agent harness is becoming the through-line for agentic experiences.

The new Antigravity 2.0 is a complete overhaul: it shows only an agentic interface (previously a separate window called Agent Manager), separates the IDE layer completely into its own app, and presents a Codex-like agent-first interface, which got a few folks furious.
This move may be weird to some folks, but if you follow where everyone's going, this seems to be the way of the future: coding is no longer about lines of code, it's about managing fleets of agents. The new Gemini 3.5 absolutely shines inside the new Antigravity; the model was trained with this harness in mind and is currently offered at an incredible speed (12x), so I'm definitely going to try it!

Gemini 3.5 Flash - fast, determined, and maybe not the old “Flash”

The most debated model release of the week was Gemini 3.5 Flash.

Some folks saw the pricing and token usage and immediately went “this is not Flash.” I get that reaction. Flash used to mean cheap, fast, lightweight chat model. But Logan's framing on the show was important: Flash is now being built for the agentic era.

In a chat era, you optimize for one user message and one model answer. In an agentic era, the real token volume is in tool loops, intermediate reasoning, retries, file reads, web searches, code execution, and self-correction. That's a different product profile.

Wolfram already ran Gemini 3.5 Flash through WolfBench, and the results were fascinating. With the Hermes agent harness, Gemini 3.5 Flash hit an 87% ceiling on Terminal Bench 2.0, meaning across runs it could solve more of the benchmark than even GPT-5.5 extra high in that setup. The variance was higher with the simpler Terminus harness, but with a real agent harness, the model looked much stronger.

That tracks with what Nisten saw in his “Martian railgun from Olympus Mons” test. Gemini 3.5 Flash went extremely detailed, almost too determined, kept correcting itself, overcorrecting itself, and built a whole game-like simulation. Logan laughed and basically said: yeah, this model is very determined, possibly an overcorrection from the “Gemini is lazy” feedback. It also tracks with the mismatch in other benchmarks: in some, Gemini 3.5 Flash shines (like Apex-agents from AA), and in some, it doesn't match the other frontier models.
In my tests, it was definitely over-eager, using a million and a half tool calls and reading tons of files just to help me review this draft inside Antigravity. It's like a super eager robotic golden retriever!

Gemini Omni - Nano Banana for video, but actually more than that

The biggest update from last year's I/O was Veo 3! This year, the biggest wow factor was also visual, but it wasn't Veo 4; it was a new multimodal model, trained end-to-end, that they call Omni. Google is calling this their first “create anything from anything” model, and the first version, Gemini Omni Flash, starts with conversational video editing. The easy description is: Nano Banana for video. You upload or create a video, then talk to it. Change this character. Replace this person. Add an object. Make this scene claymation. Keep the scene, but change the environment.

I played with it live and showed a few examples. I asked for a claymation explainer of protein folding, then gave it my face and asked it to replace the character with me. It did it. I uploaded pictures of Sonia, my cat, and it generated a talking cat video with the right kind of cat teeth, which is weirdly important because so many pet generations accidentally add human teeth and become nightmare fuel.

The failure modes are still there. I asked it to make Sonia a Russian-speaking female cat, and it only partly switched languages and didn't really change the voice. Audio upload support is also not fully productized yet, even though the underlying model is multimodal. But the direction is very clear.

This is not just “Veo with a chat model glued on.” I asked Jeff Dean, Google's chief scientist, about this at I/O, and he explained that Omni is trained end-to-end. The intelligence and the generative media capabilities are part of the same model family, not a hacky two-model pipeline.
He also said the intelligence is around a recent Flash-level model, which is a big deal when you think about video editing as reasoning over physics, identity, scene continuity, and intent.

A lot of people compared Omni to Seedance 2.0, and I think that's the wrong comparison. Seedance is amazing at cinematic generation (largely due to the lack of copyright concerns at ByteDance). Omni's unlock is iterative editing on real footage and coherent multi-turn creative control.

Other Google I/O 2026 releases I found notable

This was a concentrated effort by a huge company to insert AI into every product surface they have, so of course I can't cover ALL of it here, but the most notable things for me were:

* Gemini Spark - a new agentic experience from Google to help you with tasks across Gmail, Drive and more. It should support skills, and is a de facto OpenClaw/Hermes alternative from Google for regular folks. It's not live yet, so we'll talk more about it when I can test it out
* Managed Agents in the Gemini API - We chatted with Logan about this one. Google is re-imagining how agents are going to get built, and is offering one API call to spin up an agent in a full Linux env, with security and sandboxing in mind. I'll expand more on this in a future episode, as I recorded a complete conversation about this with Ali Çevic, a PM for Google APIs
* AI overhaul of Google Search - AI Overviews will now expand into AI Mode, and the iconic Google search box itself will change, for the first time in 25 years, to include AI Mode!
* SynthID expansion and OpenAI collab - Google showed off that OpenAI is joining in marking all AI-generated imagery and video with an invisible SynthID watermark. I think this is amazing and more companies should adopt this standard
* AI Glasses! We got Google Glasses demos - Together with Warby Parker and Gentle Monster, Google finally showed off their answer to Meta Raybans/Oakleys.
They look like regular glasses too, but can hear and talk to you, with the full power of Gemini multimodality. Available in the fall sometime!
* Demis Hassabis “we're on the cusp of the singularity” closer - CEO and Co-Founder of DeepMind, Demis Hassabis, closed the show with his remarks about the positive future and that we are nearing this Singularity point after which the future is very uncertain. I found it to be very inspiring and closed our show with that clip as well!
* Personally, I got to chat with Demis Hassabis, have breakfast with Jeff Dean, ask Josh Woodward a bunch of questions, and pester about 20 other great folks on a live stream, and had a lot of fun! Huge thanks to the DeepMind folks, Lucie, Dimple, JD and many others for the continued belief in ThursdAI and for inviting me to cover this great event.

OpenAI LLMs solve an 80yo math problem - Erdős Unit Distance Conjecture

Outside of Google I/O, the biggest story of the week was OpenAI announcing that a general-purpose reasoning model made progress on the Erdős planar unit distance problem.

This problem goes back to 1946. For nearly 80 years, mathematicians believed the best constructions looked roughly like square grids. OpenAI's model found a new family of constructions with a polynomial improvement, using algebraic number theory ideas that humans apparently had not explored in this context.

Important caveat: this does not fully solve every version of the asymptotic Erdős conjecture. Some mathematicians are pushing back on the framing, and fair enough. Precision matters. But even with the caveat, this is still a huge moment.

The reason it matters is not that I personally understand the math. I absolutely do not. The reason it matters is that this was not a special-purpose IMO model fine-tuned only for math competitions. This was a general-purpose reasoning model exploring a real open problem, generating candidates, verifying them, and finding a path humans hadn't taken.
Extrapolate this to other sciences, physics for example? This means an amazing future. LDJ pointed out that mathematicians have been skeptical because there have been previous false alarms. But this one landed differently. When Fields Medalist-level mathematicians verify the proof, the discourse changes from “lol stochastic parrot” to “wait, what does this mean for my PhD?”

My answer is: yes, still study math. Please study math. The mathematicians who use these tools will do much more than people who don't understand the domain. Same with software engineering. Senior engineers with Codex, Claude Code, Hermes, Antigravity, Cursor and other agents are becoming dramatically more effective because they can steer, evaluate, and recover the work.

This being published a day after Demis's “foothills of the singularity” is a great conjecture.

Cursor Composer 2.5 - Opus 4.7 performance model from Cursor, at 10x better efficiency

Cursor dropped Composer 2.5, and folks, this is a serious release.

Composer 2.5 is built on Moonshot's Kimi K2.5 base, like Composer 2, but Cursor scaled the post-training dramatically. They used 25x more synthetic tasks and introduced targeted textual feedback during RL rollouts, where the model gets hints inserted at the point of failure instead of only getting a noisy final reward.

The benchmark story is strong: around 69.3 on Terminal Bench 2.0, basically neck and neck with Opus 4.7 in Cursor's chart, and strong results on SWE-bench multilingual and CursorBench. The pricing is the part that makes this especially interesting: $0.50 per million input tokens and $2.50 per million output tokens, with a faster variant at $3 / $15. That is much cheaper than the frontier models it is trying to replace for day-to-day coding work.

Cursor engineers are reportedly dogfooding Composer 2.5 heavily and rarely switching away. That matters more to me than any single benchmark.
If the people building Cursor can use it as a daily driver, that is a very real signal.

The wild part is what comes next. Cursor is partnering with SpaceXAI to train a much larger model from scratch using 10x more compute on Colossus 2. Cursor has the workflow data. xAI has enormous compute. If this works, Cursor stops being just the IDE company and becomes a coding-model lab.

We've been saying for months that coding agents are the path toward general agents. Anthropic has Claude Code. OpenAI has Codex. Google has Antigravity. xAI has Grok Build. Cursor has Composer. I'm looking forward to seeing how well it performs on our own benchmarks!

Anthropic, xAI, Karpathy, and the compute wars

The compute story this week was bonkers.

The SpaceX IPO filing reportedly revealed that Anthropic is paying SpaceXAI $1.25B per month for AI compute at the Memphis Colossus facility. Per month. That's about $15B a year, through May 2029, for access to more than 220,000 NVIDIA GPUs including H100s, H200s and GB200s.

This is apparently inference compute for Claude Pro, Max and API users, not training. And it explains a lot of the recent quota changes. Anthropic doubled some Claude usage limits, and suddenly the product feels less constrained.

Also, can we just acknowledge the comedy here? Elon Musk, who publicly called Anthropic “misanthropic” and went off against every competitor to xAI, is now selling spare GPU time to Cursor and Anthropic? Who's next, OpenAI? The bigger point is that the AI capex story is no longer just NVIDIA. It's also whoever owns the data centers, power, cooling, networking, and GPU clusters. Compute is becoming the land under the AI economy.

Also, Andrej Karpathy joined Anthropic. Karpathy could work anywhere.
He co-founded OpenAI, led Tesla Autopilot vision, taught half the AI world how neural nets work, and now he's going back into frontier LLM R&D at Anthropic.

Open source LLMs - Cohere, Qwen, Nous

Open source had a strong week too.

Cohere released Command A+, a 218B total parameter sparse MoE model with only 25B active parameters per token, under Apache 2.0. This is their first model that unifies reasoning, vision, multilingual, tool use and citations in one package.

The hardware story is great: with W4A4 quantization it can run on 2 H100s or a single B200. Cohere says it supports 48 languages, 128K input context, 64K output, and gets big jumps over Command A Reasoning, including Tau-squared Bench Telecom from 37% to 85% and Terminal-Bench Hard from 3% to 25%.

Cohere is one of those labs that doesn't always chase the loudest consumer hype, but they are very serious on enterprise and multilingual. Apache 2.0 makes this one especially useful.

Alibaba also dropped Qwen 3.7-Max, positioned as an agentic frontier model. The headline from their testing is wild: 35 hours of continuous autonomous operation with more than 1,000 tool calls. They also showed it controlling a physical robot inside Alibaba offices and finding an umbrella after about 20 minutes of agent interaction.

This digital-to-physical bridge is where things start feeling very real. An agent loop that can write code and use tools can also navigate physical tasks if you give it the right robotics stack.

And our friends at Nous Research released Lighthouse Attention, a sparse attention method for long-context pretraining. At 512K context, they report a 17x faster forward+backward pass than standard attention on a single B200, and the recovered checkpoints actually beat dense-from-scratch final loss at the same token budget.

The clever part is that the selection logic sits outside the attention kernel, so you still use regular FlashAttention on a gathered dense subsequence. No custom sparse kernel nonsense.
If this holds up, this could matter a lot for long-context training.

Tools and agentic engineering - X subscriptions, Grok Build, Codex Mobile

One really practical tool update: Hermes and OpenClaw can now use your X subscription directly.

This is more important than it sounds. You can connect your X Premium subscription and get access to semantic X search and Grok-related tooling without using sketchy browser automation or unofficial APIs that might get you banned. Wolfram already used this to have his agent go through his likes and bookmarks from the past week and send me news items for the show. That is exactly the kind of “small but real” agent workflow that becomes addictive.

xAI also launched Grok Build, their agentic CLI coding tool, in early beta for SuperGrok Heavy subscribers. Early users are already running parallel Grok Build agents through tmux supervisors and using it for more than coding: fleet data triage, security patching, training label work, and general automation.

The pricing being discussed is aggressive, around $1 per million input tokens and $2 per million output tokens for the API. The model version is grok-build-0.1, and folks have already wired it into Hermes with a 256K context window.

And then there's Codex Mobile, which OpenAI shipped inside the ChatGPT mobile apps. This is one of those releases that sounds small until you start using it. You can control Codex sessions remotely from your phone, connected to your machine, and because Codex has native connectors to Gmail, Calendar and other surfaces, it sometimes feels faster and more reliable than local CLIs duct-taped to third-party integrations.

I ported Wolfred into Codex with skills and everything, and I've been comparing the same tasks in Hermes and Codex. Codex is often faster, not necessarily because the model is always smarter, but because the connectors and harness are cleaner. Harness matters.
We keep coming back to this.

This Week's Buzz - W&B, CoreWeave, WolfBench and robotics

This week in the Buzz, Wolfram walked us through a few things from the Weights & Biases / CoreWeave world.

CoreWeave is a gold sponsor at ICRA 2026 in Vienna, the International Conference on Robotics and Automation. NVIDIA is also going big there with a keynote on generalist humanoid robots, 17 accepted papers and workshops around sim-to-real, robot foundation models, autonomous driving, manipulation, and physical AI.

Wolfram will be there later in the week, after speaking at the AI Developer event in Cologne about WolfBench. If you're in Europe and into robotics or agent evals, find him.

We also looked at WolfBench results for Gemini 3.5 Flash, which honestly became one of the more interesting empirical points of the episode. The model looks variable in simple harnesses, but very capable in better agent loops. That's the whole thesis of measuring model + harness together instead of pretending the model card tells the whole story.

The water discourse, almonds, and data center reality

We also got into the data center water discourse, because this talking point is everywhere right now.

There are real infrastructure questions around AI. Power, land, cooling, grid capacity, permitting, local impact, all of that matters. But the “AI is stealing drinking water” version of the argument is often wildly detached from scale.

The stat I brought up on the show: California almonds use roughly 3 to 5.5 million acre-feet of water per year, multiple times more than all North American data centers combined in 2025. Nisten and LDJ added the important cooling nuance: many large data centers use closed-loop cooling, and evaporative cooling is not universal. Some data centers can avoid water use almost entirely, but at the cost of higher electricity usage.

This doesn't mean “no concerns are valid.” It means if we're going to regulate or pause data centers, let's be honest about the actual tradeoffs.
AI compute is becoming the substrate for medicine, robotics, science, logistics, software, education and every other productivity layer. We should build responsibly, but not based on viral fear math.

Closing thoughts - foothills of the singularity

Demis closed I/O saying we're in the foothills of the singularity, and I know how that lands when you write it down. But I was in the room, and after the keynote he told me something I haven't been able to shake: he thinks AI is going to be 10x as impactful as the Industrial Revolution, and 10x as fast. Basically 100x. This is the AlphaFold guy. Not someone loose with his words.

Then look at the week. A general reasoner cracked an 80-year-old math problem. Cursor is training near-frontier coding models on a fraction of the big-lab budget. Anthropic is paying Elon $15B a year for inference. Karpathy left education to go back into pre-training. Google rolled out an intelligence uplift to a billion people who don't even know a model dropped.

If you put that on a whiteboard in 2023, it reads like a sci-fi pitch.

LDJ's mathematician friends are asking if they should keep doing their PhDs. My answer hasn't changed: yes, please keep going. The people who combine domain taste with these tools are going to ship more in 5 years than the previous generation did in 50. The tool doesn't replace the taste. It just removes the bottleneck.

That's the whole reason ThursdAI exists.
Not to hype every drop, not to dunk for engagement, but to give you a shot at being one of the people who knows what's happening, with the receipts.

This week, a lot changed.

See you next Thursday.

TL;DR and Show Notes

* Hosts and Guests
  * Alex Volkov - AI Evangelist at Weights & Biases / CoreWeave, @altryne
  * Co-hosts: @WolframRvnwlf, @nisten, @ldjconfirmed
  * Guest: Logan Kilpatrick, MTS at Google DeepMind / AI Studio, @OfficialLoganK
* Google I/O 2026
  * Google went all-in on agents across Search, Gemini, Antigravity, Workspace, Android, Cloud and YouTube (I/O site, Alex thread)
  * Antigravity 2.0 became the central agentic coding harness across Google (Sundar, Google OS demo)
  * Gemini 3.5 Flash launched as a fast, determined workhorse model for agentic loops (Logan, Noam Shazeer, Jeff Dean)
  * Gemini 3.5 Flash is rolling out across the Gemini app, Search AI Mode, Gemini API, Google AI Studio, Antigravity and Gemini Enterprise Agent Platform (Koray Kavukcuoglu)
  * Google Search is getting new Gemini 3.5 Flash-powered agentic capabilities, including a new AI-powered Search box and background information agents (Sundar)
  * Gemini Spark was announced as a 24/7 personal AI agent that can proactively work across Google surfaces (News from Google)
  * Google teased Gemini-powered Android XR smart glasses with eyewear partners Gentle Monster and Warby Parker (Google, Alex live reaction)
  * Google AI Studio and the Gemini API got major agentic developer updates, including Managed Agents (Google AI Developers)
* Vision & Video
  * Google DeepMind launched Gemini Omni, a “create anything from anything” multimodal model starting with conversational video editing (DeepMind, Google DeepMind on X)
  * Omni is available in the Gemini app, Google Flow and YouTube, with API support coming soon (Logan, Gemini App, Sundar)
  * Key distinction: Omni is not just text-to-video, it is an iterative multi-turn video editing model that combines Gemini intelligence, world knowledge, multimodal inputs and generative media (Google)
* Big CO LLMs + APIs
  * OpenAI announced a general-purpose reasoning model made progress on the Erdős planar unit distance problem, challenging an 80-year-old mathematical belief (OpenAI, X)
  * Cursor launched Composer 2.5, built on Kimi K2.5, with Opus-class coding performance at much lower cost (Cursor blog, X)
  * Alibaba released Qwen 3.7-Max, an agentic frontier model with long autonomous runs and robotics demos (Qwen blog, X, robot demo)
  * Andrej Karpathy joined Anthropic to work on frontier LLM R&D (X)
  * SpaceX IPO filing revealed Anthropic is paying $1.25B/month for AI compute at the Memphis Colossus facility (Axios, Sawyer Merritt)
  * The jury in Musk v. Altman found Musk's OpenAI claims barred by statute of limitations, with Musk saying he will appeal (Elon Musk, Sawyer Merritt, Max Zeff)
* Open Source LLMs
  * Cohere released Command A+, a 218B MoE model with 25B active parameters under Apache 2.0 (Cohere, Nick Frosst, HF W4A4, HF BF16)
  * Nous Research released Lighthouse Attention, a sparse attention method for long-context pretraining with major speedups (Blog, X, arXiv, GitHub)
* Tools & Agentic Engineering
  * Google launched Managed Agents in the Gemini API, letting developers spin up hosted Antigravity agents with Linux sandboxes and persistent state (Docs, X)
  * xAI launched Grok Build, an agentic CLI coding tool in beta for SuperGrok Heavy users (xAI CLI, X)
  * Hermes and OpenClaw can now use X subscription auth for semantic search and Grok tooling (Alex)
  * OpenAI Codex Mobile is now available in the ChatGPT mobile apps for remote agent workflows (OpenAI)
  * Anthropic doubled Claude usage outside peak hours for a limited period, including Claude Code and other Claude surfaces (Claude)
* This Week's Buzz - W&B / CoreWeave
  * Weights & Biases by CoreWeave is at ICRA 2026 in Vienna, with robotics and automation taking center stage (ICRA, W&B event page)
  * NVIDIA heads to ICRA 2026 with robotics work around generalist humanoids, physical AI and sim-to-real systems (NVIDIA Robotics, NVIDIA ICRA)
  * Wolfram is speaking about WolfBench at the AI Developer event in Cologne before heading to ICRA in Vienna (Wolfram)
* Other Topics
  * Data center water usage discourse came up again, including why comparisons need real scale and context rather than viral fear math
  * The broader theme of the week: coding agents are becoming general agents, and the major labs are now competing on the full stack of model, harness, tools, context and compute

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
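
To make the Composer 2.5 pricing quoted in the ThursdAI notes above a bit more concrete, here is a minimal sketch of the per-task arithmetic. The per-million-token prices ($0.50 / $2.50, and $3 / $15 for the faster variant) come from the notes; the tier labels and the token counts are hypothetical values chosen only for illustration, not measurements or official model identifiers.

```python
# Rough per-task cost comparison using the Composer 2.5 prices quoted above.
# Tier names and token counts are illustrative assumptions, not official values.

PRICES_USD_PER_MILLION = {
    "composer-2.5": (0.50, 2.50),        # (input, output) price per 1M tokens
    "composer-2.5-fast": (3.00, 15.00),  # faster variant, per the notes
}


def task_cost(input_tokens: int, output_tokens: int, tier: str) -> float:
    """Return the USD cost of one task for the given pricing tier."""
    price_in, price_out = PRICES_USD_PER_MILLION[tier]
    return (input_tokens / 1_000_000) * price_in + (output_tokens / 1_000_000) * price_out


if __name__ == "__main__":
    # Hypothetical agentic coding session: 2M tokens read, 0.5M tokens written.
    for tier in PRICES_USD_PER_MILLION:
        print(f"{tier}: ${task_cost(2_000_000, 500_000, tier):.2f}")
```

Under these assumed token counts the standard tier comes to $2.25 and the faster variant to $13.50. Agent loops tend to be input-heavy (file reads, tool results), so the input price often dominates the bill more than the headline output price suggests.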

Valuetainment
"26 Million Jobs GONE!" - Anthropic STEALS OpenAI's Best As AI War Gets UGLY

Valuetainment

May 21, 2026 17:46


OpenAI just lost one of its biggest brains to Anthropic, and the shockwave could hit 26 million jobs. Andrej Karpathy's stunning move, AI safety fears, white-collar job losses, and the future of small business all collide in this urgent breakdown of the AI war.

This Week in Google (MP3)
IM 871: CTRL-F Techno King - Google's Search Overhaul

This Week in Google (MP3)

May 21, 2026 173:48


Dashlane's CTO pulls back the curtain on how password managers are actually using AI, why it's more complicated than hype suggests, and what the rise of AI-powered code review means for the next wave of digital security.

* Nvidia Rides Blistering Chip Sales to Another Record Quarter
* Mind-Blowing Growth Is About to Propel Anthropic Into Its First Profitable Quarter
* SpaceX Filing Starts Countdown to Massive IPO
* Gemini 3.5 Flash: more expensive, but Google plan to use it for everything
* Google's Gemini Spark is an agentic AI assistant - Engadget
* Anthropic's Co-Founder to Launch Encyclical on AI With Pope Leo
* Andrej Karpathy on X: "Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time." / X
* Most U.S. doctors are quietly using this AI tool. Few patients know about it.
* Greg Brockman Officially Takes Control of OpenAI's Products in Latest Shakeup
* Amazon's Alexa+ Now Produces AI-Generated 'Podcasts' Featuring Chats Between Two Robot 'Co-Hosts'
* AI chatbots are giving out people's real phone numbers
* Geoffrey Fowler and the Launch of the Youth AI Safety Institute
* We let four AIs run radio stations. Here's what happened. | Andon Labs
* The last six months in LLMs in five minutes
* Lake Tahoe Power Crisis: How AI Data Centers Are Cutting Power to 50,000 Residents
* What happens when you post a real Monet and say it's AI? The coolest art social experiment I've seen in a while. Thank you @SHL0MS
* Book on Truth in the Age of A.I. Contains Quotes Made Up by A.I.
* OpenClaw's Peter Steinberger's tokenmaxxing
* 'Obvious markers of AI': doubts raised over winner of short story prize
* Man drives Cybertruck into Grapevine Lake
* Stewart Brand's Maintenance of Everything
* Sports Illustrated Just Deleted Every Article by One of Its Writers After Accusation of AI Plagiarism
* The great digital media valuation collapse
* Sperm racing

Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau
Guest: Frederic Rivain

Download or subscribe to Intelligent Machines at https://twit.tv/shows/intelligent-machines.

Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free audio and video feeds, a members-only Discord, and exclusive content. Join today: https://twit.tv/clubtwit

Sponsors: outsystems.com/twit, monarch.com with code IM, zscaler.com/security, XBOW.com

The AI Breakdown: Daily Artificial Intelligence News and Discussions

Anthropic delivered one of the most consequential weeks any AI lab has had yet: Andrej Karpathy joined to work on AI-accelerated pre-training research, new financials suggested the company is already profitable, and its deepening SpaceX compute partnership added fuel to the acceleration story. NLW breaks down why this is bigger than a lab horse race, why recursive research and compute constraints matter, and how Anthropic's momentum is forcing a reset in how markets understand the AI boom. In the headlines: OpenAI's IPO plans, Cursor's efficient coding model, and more.

Apply for our Growth Engineering role: https://jobs.aidailybrief.ai/
Enterprise Claw Cohort 3 Registration: https://enterpriseclaw.ai/

Brought to you by:
KPMG - Agentic AI is powering a potential $3 trillion productivity shift, and KPMG's new paper, Agentic AI Untangled, gives leaders a clear framework to decide whether to build, buy, or borrow. Download it at www.kpmg.us/Navigate
Granola - The AI notepad for people in back-to-back meetings. 100% off your first 3 months with code AIDAILY at http://granola.ai/aidaily
Scrunch - The AI customer experience platform - https://scrunch.com/
Mercury - Modern banking for business and now personal accounts. Learn more at https://mercury.com/personal-banking
Zenflow Work - Agents for knowledge work - https://zenflow.free/
Drata - The agentic trust management platform - https://drata.com/
Blitzy - Want to accelerate enterprise software development velocity by 5x? https://blitzy.com/
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
Robots & Pencils - Cloud-native AI solutions that power results - https://robotsandpencils.com/

The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Our Newsletter is BACK: https://aidailybrief.beehiiv.com/
Interested in sponsoring the show? sponsors@aidailybrief.ai

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
20VC: Andrej Karpathy Joins Anthropic & Anthropic Raises $30BN at $900BN Price | SpaceX Files S1: How Does it Trade | Cerebras Smashes Day 1: What it Means for IPOs | Why Mass Layoffs Are More Worrying Than Anyone Sees

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later May 21, 2026 81:53


AGENDA: 00:00 – Anthropic Eyes $900B Valuation & Andrej Karpathy's Shock Move 04:46 – Unpacking Anthropic's $30 Billion War Chest 10:52 – The True Cost of AI Tokens: Is Salesforce Spending Too Much? 15:59 – The Bear Case for Token Growth & Why Software Leaders Must Adapt 22:56 – Public Tech Rebound: Figma & Datadog Crush Expectations 26:59 – The Death of Traditional Web Builders? The Decline of Wix & Squarespace 36:59 – Compute Starvation: Is the Semiconductor & Hardware Boom Sustainable? 45:59 – Cerebras IPO Smashes Day One: The Biggest Tech Public Debut Since Snowflake 48:17 – SpaceX Sets Date For the Largest IPO in History 57:28 – Y Combinator's Mic Drop Deal & The Drama Behind Elon Musk's OpenAI Lawsuit 01:11:57 – The Looming Backlash: Mass Tech Layoffs and the Politics of AI

All TWiT.tv Shows (MP3)
Intelligent Machines 871: CTRL-F Techno King

All TWiT.tv Shows (MP3)

Play Episode Listen Later May 21, 2026 173:48


Dashlane's CTO pulls back the curtain on how password managers are actually using AI, why it's more complicated than hype suggests, and what the rise of AI-powered code review means for the next wave of digital security.
Nvidia Rides Blistering Chip Sales to Another Record Quarter
Mind-Blowing Growth Is About to Propel Anthropic Into Its First Profitable Quarter
SpaceX Filing Starts Countdown to Massive IPO
Gemini 3.5 Flash: more expensive, but Google plan to use it for everything
Google's Gemini Spark is an agentic AI assistant - Engadget
Anthropic's Co-Founder to Launch Encyclical on AI With Pope Leo
Andrej Karpathy on X: "Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time."
Most U.S. doctors are quietly using this AI tool. Few patients know about it.
Greg Brockman Officially Takes Control of OpenAI's Products in Latest Shakeup
Amazon's Alexa+ Now Produces AI-Generated 'Podcasts' Featuring Chats Between Two Robot 'Co-Hosts'
AI chatbots are giving out people's real phone numbers
Geoffrey Fowler and the Launch of the Youth AI Safety Institute
We let four AIs run radio stations. Here's what happened. | Andon Labs
The last six months in LLMs in five minutes
Lake Tahoe Power Crisis: How AI Data Centers Are Cutting Power to 50,000 Residents
What happens when you post a real Monet and say it's AI? The coolest art social experiment I've seen in a while. Thank you @SHL0MS
Book on Truth in the Age of A.I. Contains Quotes Made Up by A.I.
OpenClaw's Peter Steinberger's tokenmaxxing
'Obvious markers of AI': doubts raised over winner of short story prize
Man drives Cybertruck into Grapevine Lake
Stewart Brand's Maintenance of Everything
Sports Illustrated Just Deleted Every Article by One of Its Writers After Accusation of AI Plagiarism
The great digital media valuation collapse
Sperm racing
Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau
Guest: Frederic Rivain
Download or subscribe to Intelligent Machines at https://twit.tv/shows/intelligent-machines.
Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free audio and video feeds, a members-only Discord, and exclusive content. Join today: https://twit.tv/clubtwit
Sponsors: outsystems.com/twit monarch.com with code IM zscaler.com/security XBOW.com

AI Inside
Google Changes Search for the First Time in 25 Years

AI Inside

Play Episode Listen Later May 21, 2026 81:03


Jason Howell and Jeff Jarvis break down everything from Google I/O 2026, where the company made its strongest case yet for winning the AI race. Gemini 3.5 Flash and Gemini Spark were unveiled, AI agents are now doing the searching instead of returning links, and Google's reach extended into design, science, YouTube, and shopping. Jason also demos Genie World Models live.
Also in this episode: Andrej Karpathy joins Anthropic, Anthropic acquires a major dev tools startup, Amazon Alexa+ can now generate podcast episodes, Elon Musk's latest lawsuit drama, and a growing American rebellion against AI. Speed round includes the OpenAI IPO, xAI's coding agent, Meta's AR glasses, and more.
New episodes every Wednesday at aiinside.show
Note: Time codes subject to change depending on dynamic ad insertion by the distributor.
CHAPTERS:
0:04:31 - Everything announced at Google I/O 2026
  - Times: How Google Is Starting to Win the A.I. Race
0:22:42 - A new era for AI Search
  - Gemini 3.5: frontier intelligence with action
0:27:53 - Google Launches Gemini Spark: A 24/7 AI Agent That Wants to Make You Ditch OpenClaw
0:44:35 - OpenAI co-founder Andrej Karpathy joins Anthropic
0:46:32 - Anthropic has acquired the dev tools startup used by OpenAI, Google, and Cloudflare
0:55:20 - Amazon's new Alexa+ powered feature can generate podcast episodes
0:57:27 - The Art of War, Elon Musk Edition: How to Lose a Lawsuit and Still Claim Victory
0:59:30 - The American Rebellion Against AI Is Gaining Steam
1:01:46 - NextEra Energy to buy Dominion in deal that unites two key players in race to power AI data centers
1:04:42 - Pope Leo XIV will publish his first encyclical, Magnifica Humanitas, on May 25, with Anthropic co-founder Christopher Olah joining the launch panel at the Vatican
1:06:25 - Linus Torvalds says AI-powered bug hunters have made Linux security mailing list 'almost entirely unmanageable'
1:08:57 - Meta brings virtual writing to everyone with Meta Ray-Ban Display glasses
1:10:36 - Musk's xAI Unveils First Coding Agent in Bid to Rival Anthropic
1:10:59 - OpenAI is Preparing to File for an IPO Very Soon
Hosts: Jason Howell and Jeff Jarvis
Download and subscribe to AI Inside in audio and video: https://aiinside.show/
Support the podcast on Patreon for special perks: https://www.patreon.com/aiinsideshow. You'll get ad-free audio and video feeds, a members-only Discord, T-shirts and stickers, and exclusive content.
Learn more about your ad choices. Visit megaphone.fm/adchoices

The Future of Work With Jacob Morgan
Meta Cuts 8,000 Jobs, Top AI Researcher Leaves OpenAI for Anthropic, and 80% of Americans Are on Their Own With AI

The Future of Work With Jacob Morgan

Play Episode Listen Later May 20, 2026 33:14


May 20, 2026: Meta began notifying 8,000 employees of their layoffs this morning — while simultaneously redirecting $145 billion into AI infrastructure. Andrej Karpathy, one of the founding members of OpenAI and the architect of Tesla's self-driving brain, just joined Anthropic with a specific mission: use AI to make AI better. And a major new Milken-Harris Poll finds that 80% of Americans want government workforce transition programs now, 68% say they're navigating the AI shift entirely alone, and 88% of business leaders privately admit companies cannot solve this without a coordinated national response. 

AI For Humans
Google's New Gemini Omni AI Is Too Much. All The Good Stuff From Google I/O.

AI For Humans

Play Episode Listen Later May 20, 2026 27:58


Google I/O 2026 just dropped Gemini Omni, a world-model AI that simulates physics, edits video, and might be the biggest leap since Seedance 2. But it's not perfect. Gavin and Kevin break down everything from Google I/O 2026, including the launch of Gemini Omni (Google's new world model), Gemini 3.5 Flash benchmarks against GPT-5.5 and Opus 4.7, the Gemini Spark personal agent, AskYouTube, Docs Live, new AI glasses, the first search box redesign in 25 years, and the shocking news that Andrej Karpathy is joining Anthropic.
SHOW LINKS:
Google I/O 2026 Full Keynote: https://www.youtube.com/live/wYSncx9zLIU?si=Nb881MfGTlf1Q0II
Gemini Omni physics demos from Google DeepMind: https://x.com/GoogleDeepMind/status/2056786449312493669?s=20
Gemini Omni's incredible London knowledge (via fofrAI): https://x.com/fofrAI/status/2056789242274259242?s=20
Sundar Pichai and Demis Hassabis on Omni video editing: https://x.com/sundarpichai/status/2056524502746747048?s=20
Gavin's hands-on Gemini Omni experiments: https://x.com/gavinpurcell/status/2056762427879182692?s=20
Gemini Omni's character cameo feature (less impressive): https://x.com/gavinpurcell/status/2056772793539481830?s=20
Gemini Omni volleyball fail: https://x.com/flavioAd/status/2056771223359549645?s=20
Google's new Content Credentials Verification: https://x.com/Google/status/2056787498676658576?s=20
Genie 3 IRL - Google's world model now simulates real streets with Street View: https://techcrunch.com/2026/05/19/googles-genie-world-model-can-now-simulate-real-streets-with-street-view/
Bilawal Sidhu on Genie 3 IRL: https://x.com/bilawalsidhu/status/2056804315721843024?s=20
Gemini 3.5 Flash launches - official announcement: https://x.com/GeminiApp/status/2056788115893993701?s=20
Gemini Spark - Google's new personal coding agent: https://x.com/Google/status/2056791134295273554?s=20
Google's new AI glasses: https://x.com/backlon/status/2056807059707036050?s=20
Andrej Karpathy joins Anthropic to focus on recursive self-learning: https://www.axios.com/2026/05/19/anthropic-openai-karpathy-andrej-claude

Elon Musk Pod
OpenAI co-founder leaves for Anthropic - shakes up AI

Elon Musk Pod

Play Episode Listen Later May 20, 2026 16:44


An OpenAI co-founder has joined the competition. Andrej Karpathy, who helped start the high-profile artificial intelligence outfit in 2015, has now joined Anthropic, he announced Tuesday. Karpathy was instrumental in the development of Tesla's Autopilot system, working for the automaker for about five years in between two separate stints at OpenAI. In his new role at rival Anthropic, Karpathy will "get back to R&D," he wrote in a social media post, adding, "I think the next few years at the frontier of LLMs will be especially formative."

The Rundown
Google Revamps Search for the AI Era, Target's Turnaround Gains Steam as Sales Surge

The Rundown

Play Episode Listen Later May 20, 2026 10:40


Check out the Public app for incredible investing tools and to support the show (LINK)
Follow us on Instagram (@TheRundownDaily) for bonus content and instant reactions.
In today's episode:
Google unveils Gemini 3.5 Flash, an AI agent, and a major overhaul to Search at its annual I/O developer conference
Target crushes earnings with its first positive same-store sales in five quarters and raises its full-year outlook
Cava defies the restaurant slowdown with 9.7% same-store sales growth and a raised forecast
Lowe's beats on earnings but stock slips as comparable sales disappoint in a tough housing market
OpenAI co-founder Andrej Karpathy joins Anthropic

Techmeme Ride Home
Elon Loses

Techmeme Ride Home

Play Episode Listen Later May 19, 2026 20:06


The Musk v. Altman jury unanimously rejected Musk's claims on statute of limitations grounds. Andrej Karpathy joined Anthropic's pre-training team. Polymarket partners with Nasdaq on private company markets, Blackstone and Google form a TPU venture, and KPMG embeds Claude into tax advisory. Musk v. Altman: the jury unanimously rejects Elon Musk's claims against OpenAI and Sam Altman, as he filed them outside of a three-year statute of limitations (CNBC) Andrej Karpathy joins Anthropic to help launch a team focused on using Claude to accelerate pre-training research; he helped found OpenAI and worked at Tesla (Axios) Polymarket partners with Nasdaq to launch markets tied to private company milestones, including IPO timing, valuations, earnings, and secondary market activity (The Block) Blackstone announces a joint venture with Google to create a US company that will offer customers Google TPU access, and makes a $5B initial equity commitment (WSJ) KPMG partners with Anthropic to embed Claude into its tax and advisory platforms; KPMG's tax and legal services unit saw revenue grow ~8% YoY to $9.3B in 2025 (WSJ) Learn more about your ad choices. Visit megaphone.fm/adchoices

Marketing Against The Grain
This AI Second Brain Remembers Everything I Save (Codex)

Marketing Against The Grain

Play Episode Listen Later May 12, 2026 30:18


Free Guide to Build Your Personal AI Wiki: https://clickhubspot.com/fcmt Ep. 425 You can create a second brain in 15 minutes and for free. Kipp and Matt Wolfe (Future Tools) dive into how to build your own "second brain" using tools like Obsidian and Codex, so you can finally make use of all the information you're gathering. Learn more on setting up a personal wiki to organize your knowledge, harnessing AI-powered agents to automate and interlink content, and turning your information hoarding into proactive recommendations and smarter business decisions. Mentions Matt Wolfe https://www.linkedin.com/in/matt-wolfe-30841712/ Future Tools https://futuretools.io/ Obsidian https://obsidian.md/ Obsidian Web Clipper https://obsidian.md/clipper Codex https://openai.com/codex/ Claude Code https://code.claude.com/docs/en/overview Cursor https://cursor.com/ Andrej Karpathy https://karpathy.ai/ Visual Studio Code https://code.visualstudio.com/ Notion https://www.notion.com/ Get our guide to build your own Custom GPT: https://clickhubspot.com/customgpt Resource [Free] Steal our favorite AI Prompts featured on the show! Grab them here: https://clickhubspot.com/aip We're on Social Media! Follow us for everyday marketing wisdom straight to your feed YouTube: ​​https://www.youtube.com/channel/UCGtXqPiNV8YC0GMUzY-EUFg  Twitter: https://twitter.com/matgpod  TikTok: https://www.tiktok.com/@matgpod  Thank you for tuning into Marketing Against The Grain! Don't forget to hit subscribe and follow us on Apple Podcasts (so you never miss an episode)! https://podcasts.apple.com/us/podcast/marketing-against-the-grain/id1616700934   If you love this show, please leave us a 5-Star Review https://link.chtbl.com/h9_sjBKH and share your favorite episodes with friends. We really appreciate your support. 
Host Links: Kipp Bodnar, https://twitter.com/kippbodnar   Kieran Flanagan, https://twitter.com/searchbrat  ‘Marketing Against The Grain' is a HubSpot Original Podcast // Brought to you by Hubspot Media // Produced by Darren Clarke.

Supra Insider
#110: Why AI makes systems thinking the most valuable skill for PMs | Apurva Garware (ex-VP Product at Upwork, ex-Amazon)

Supra Insider

Play Episode Listen Later May 11, 2026 58:53


What if the most important skill for building AI products has nothing to do with evals, technical background, or knowing how to write a prompt? What if it is the ability to design systems that can handle what you never planned for?
In this episode of Supra Insider, Marc Baselga and Ben Erez sit down with Apurva Garware, who has built and scaled products across Amazon, Microsoft, and Upwork, to make the case that systems thinking is the defining skill of the next era of product management. Apurva explains why non-determinism forces PMs to stop thinking in features and start designing the guardrails, agent contracts, and escalation points that govern how a system behaves at runtime, when no one is watching. They explore a three-phase framework for governing AI systems across design, deployment, and production; heuristics for deciding what to hand to agents versus escalate to humans; and a sharp insight about the two products every AI-native company is actually building: the customer-facing product, and the internal operational system that drives margin and velocity. Marc and Ben also share their own experience calibrating an agentic workflow at Supra, grounding the conversation in practice.
If you are a PM trying to find your footing in the AI era without a deeply technical background, a founder wrestling with when to reach for AI versus simpler deterministic automation, or a product leader who wants to build more discipline into how your team ships AI products, this episode is for you.
All episodes of the podcast are also available on Spotify, Apple and YouTube.
New to the pod? Subscribe below to get the next episode in your inbox

da Brand a Friend
#423 - Beyond Second Brains

da Brand a Friend

Play Episode Listen Later May 10, 2026 22:45


#423 - Beyond Second Brains
In this episode I talk about a small but great idea from Andrej Karpathy: the LLM wiki. Instead of using AI to query a pile of documents every time, Karpathy proposes something different: using AI to turn those documents into a knowledge base that does not just store information, but grows, updates itself, and becomes more useful every time you interact with it or add new sources.
Enter The Curator, an open-source app created by Dr. Tali Režun, which tries to make this idea usable for people who work with information: experts, consultants, advisors.
The Curator takes PDFs, articles, notes, and transcripts, turns them into concepts, entities, summaries, and links, and builds a living map of your knowledge. You can visualize it, query it, correct it, edit it, and grow it, including from the conversations you have with it.
And no, it is not the same thing as NotebookLM. NotebookLM helps you talk to your sources. The Curator helps you build a private knowledge infrastructure where the sources talk to each other.
The competitive advantage that adopting a tool like this can bring to an expert is significant, especially when many other expert authors cover the same topic. An expert who starts building a private knowledge infrastructure today, adding sources methodically, querying it, and systematically adding new ideas and reflections, will in a year be able to see, understand, and reason in ways that other authors in the same field cannot.
The future is not publishing more. It is thinking better. Connecting better. Understanding better. And building a system that makes this depth visible, useful, and reusable over time.
What I Can Do for You
I help experts and consultants who are building a new career online gain the authority, credibility, and visibility they initially lack. I do this by teaching how to become trusted reference sources in their field. To this end I have created a free 55-minute video workshop (in English) that helps you discover:
Why curated content formats are so useful for building credibility and authority
When to use them and what they require
An up-to-date list of real examples of these formats at work
The specific tools needed for this kind of work
The key steps that turn research and writing into a reference source
The 11 typical mistakes made by people who are new to curating information and resources
The Curation-Based Authority System - Video Workshop.
Auditing Your Online Project
For experts and consultants who need a direct exchange and want to work with me directly to analyze the obstacles they are facing and improve their action plan.
Private one-to-one consulting is available at two levels:
a) for those who are just starting out, and
b) for those who have been publishing for more than a year but are not seeing the results they expected.
https://robingood.gumroad.com/l/one-to-one-audit-and-consulting
_______________
Useful Info
• Support this podcast: get in touch with me, get feedback, and receive advice on your online project: https://Patreon.com/Robin_Good
• Music in this episode: "Bittersweet" by baskaat, available on YouTube
• In the cover image: design and infomap by Robin Good - image generated by ChatGPT Image 2
• Listen to and share this podcast: https://www.spreaker.com/show/dabrandafriend
• Podcast archive organized by topic: https://start.me/p/kxENzk/da-brand-a-friend-archivio-podcast
• Follow me on Telegram: https://t.me/RobinGoodItalia
• Newsletter in Italian: https://robingooditalia.substack.com Being recognized as trusted reference points by curating instead of creating.
• Newsletter in English: https://robingood.substack.com How to be recognized as top trusted influencers by becoming top sector curators.

Keen On Democracy
Do We Really Want a No-Hands Job From Silicon Valley? Who Holds the Power in the Age of AGI

Keen On Democracy

Play Episode Listen Later May 2, 2026 48:42


“Anyone that's properly using AI now knows that you tell it what you want, it gives you a plan, carries out the work, and you judge and tweak. You're not a passive victim - you're an active user with outcomes in mind.” - Keith Teare
Do we really want a no-hands job from Silicon Valley? That Was the Week newsletter publisher Keith Teare, who thinks all tech innovation results in human progress, thinks we do. No hands, no problem, Keith says. But I'm not sure. Especially given the powers-that-be giving us that no-hands job. Keith welcomes the end of what he calls the “typed” and “touched” computing era: keyboards, mice, touchscreens, and all the manifold ways we have used our hands to interact with computers since the 1980s. That's the outcome, he predicts, of the race to AGI. So far so good. But what happens if our no-hands AI future is controlled by Google, Microsoft, Amazon, and Facebook? This week these four behemoths committed $700 billion to AI infrastructure investment in 2026 alone, equivalent to 2 percent of all US GDP. These companies are racing to build (and own) the foundational mechanics of AGI. That's always how it's been, Keith says, embracing our no-hands future. I'm less open-armed. What happens if we want our hands to fend off AGI? No, I'm not so keen on a no-hands job from Silicon Valley. Especially one couched in the altruism of human progress.
Five Takeaways
• The End of the Hand-Driven Computing Era: Andrej Karpathy's observation at Sequoia's AI Ascent: he no longer uses his hands to do his work. He speaks to the computer; the computer acts; he judges and refines. The keyboard, the mouse, the touchscreen: all the hand-driven interfaces that have defined computing since the 1980s are entering their twilight. Karpathy calls it “software 3.0”. Keith, two years ago, wrote an editorial called “eyes, hands, ears, and mouth” about the inclusion of other human attributes beyond hands. That prediction has arrived.
• $700 Billion: The CapEx Explosion: A post by @Signal framed the week's numbers: $700 billion in AI infrastructure spending in 2026, equivalent to 2 percent of all US GDP. This kind of spending, the post observes, usually happens via governments or wars. This time, it's four private companies (Microsoft, Amazon, Google, and Meta) racing to build the foundational mechanics of AGI. Meta was punished by Wall Street for overspending; Google was rewarded because its numbers were strong enough to justify it. The same bet, two different verdicts, depending on your quarterly earnings.
• Was the Internet Privately Built? The ARPANET Argument: Keith's claim: innovation waves have always been privately financed. The railways, the telephone, the electricity grid, the commercial internet. Andrew's counter: ARPANET was a massive government investment that created the protocols on which the internet runs. Keith's response: ARPANET was a university bulletin board that created the precedent, not the infrastructure. Andrew's response: that's not exactly what ARPANET was. They agree that government research matters. They disagree on how much credit it deserves for what became the commercial internet.
• The Revenge of the Idea Guy: Sam Altman's line of the week. In the past, an idea person came up with a concept and then needed expensive engineers to build it. Many ideas never saw the light of day because the engineering cost was prohibitive. Now, anyone can speak an idea into existence. AI builds the plan, executes the work, and you judge and refine. That changes the economics of creativity, advertising, software development, and anything else that used to require specialist execution. The specialist is not dead, but specialists will increasingly use AI to scale themselves, rather than being hired one at a time.
• Should Kids Use AI in Schools? A New Yorker piece asks what it would take to get AI out of schools. Keith's view: the premise misunderstands how AI works now. The fear is passive students asking chatbots for answers and having their brains atrophy. The reality is that proper AI use requires active judgment at every step: telling it what you want, refining the plan, evaluating the output. If schools understand that, they embrace AI. If they don't, they produce graduates unequipped for a world in which the idea guy with AI tools now has the power the engineering team used to have. Andrew's prediction: the kids whose parents ban AI will eventually sue them.
About the Guest
Keith Teare is a British-American entrepreneur, investor, and publisher of the That Was the Week newsletter, a daily curation of the most important stories at the intersection of technology, business, and culture. He is a co-founder of TechCrunch and a long-time interlocutor on Keen On America.
References:
• That Was the Week newsletter by Keith Teare, this week's editorial: “Hand Job?”
• Andrej Karpathy at Sequoia Capital AI Ascent 2026: the Karpathy interview on Software 3.0 and the end of typed input.
• @Signal, “$700 billion on AI infrastructure”: the post that framed the CapEx question.
• Jessica Winter, “What Will It Take to Get AI Out of Schools?” The New Yorker, 2026.
• Episode 2891: John Steele Gordon on how information technology knitted America together: the ARPANET backstory that feeds directly into this week's argument.
About Keen On America
Nobody asks more awkward questions than the Anglo-American writer and filmmaker Andrew Keen. In Keen On America, Andrew brings his pointed Transatlantic wit to making sense of the United States, hosting daily interviews about the history and future of this now venerable Republic. With nearly 2,900 episodes since the show launched on TechCrunch in 2010, Keen On America is the most prolific intellectual interview show in the history of podcasting.
Chapters: (00:31) - Keith leads with “Hand Job?” — explaining the headline (03:27) - Karpathy at Sequoia: the end of typed and touched input (04:30) - CapEx: the real story of the week (05:35) - $700 billion — 2% of US GDP on AI infrastructure (06:38) - Was the commercial internet privately built? (07:35) - ARPANET: pathetic bulletin board or foundational infrastructure? (09:08) - Keith and Andrew agree to disagree on government's role (11:00) - Big Tech earnings: Google up, Meta down, and why (17:00) - OpenAI's strategy: the long game

Training Data
Andrej Karpathy: From Vibe Coding to Agentic Engineering

Apr 30, 2026 29:48


Andrej Karpathy (co-founder of OpenAI, former head of AI at Tesla, and now founder of Eureka Labs) talks with Sequoia partner Stephanie Zhan at AI Ascent 2026 about what's changed in the year since he coined "vibe coding." He explains why he's never felt more behind as a programmer, why agentic engineering is the more serious discipline taking shape on top of vibe coding, and why we should think of LLMs not as animals but as ghosts: jagged, statistical, summoned entities that require a new kind of taste and judgment to direct. He also touches on Software 3.0, the limits of verifiability, and why you can outsource your thinking but never your understanding.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Shopify's AI Phase Transition: 2026 Usage Explosion, Unlimited Opus-4.6 Token Budget, Tangle, Tangent, SimGym — with Mikhail Parakhin, Shopify CTO

Apr 22, 2026 72:25


Early bird discounts for the San Francisco World's Fair, the biggest AIE gathering of the year, end today - prices will go up by ~$500 tonight so do please lock in ASAP!

From near-universal AI tool adoption inside Shopify to internal systems for ML experimentation, auto-research, customer simulation, and ultra-low-latency search, Mikhail Parakhin joins us for a deep dive into what it actually looks like when a 20-year-old, $200B software company goes all-in on AI. We cover why Shopify has become much more vocal about its internal stack, what changed after the December model-quality inflection, and why the real bottleneck in AI coding is no longer generation, but review, CI/CD, and deployment stability.

We also go inside Tangle, Tangent, SimGym, which are three major AI initiatives that Shopify is doing to make experimentation reproducible, optimization automatic, customer behavior simulatable, and search and catalog intelligence faster and cheaper at scale. Along the way, Mikhail explains UCP, Liquid AI, and why token budgets are directionally right but often measured badly, why AI-written code can still increase bugs in production, what makes Shopify's customer simulation defensible, and what he learned from the Sydney era at Bing.

We discuss:
* Mikhail's path from running a major Microsoft business unit spanning Windows, Edge, Bing, and ads to becoming CTO of Shopify
* Why Shopify is talking more publicly about AI now, and why staying at the frontier has become necessary for the company
* Shopify's internal AI adoption curve, the December inflection, and why CLI-style tools are rising faster than traditional IDE-based tools
* Why Jensen Huang is directionally right on token budgets, but raw token count is still the wrong way to evaluate engineering output
* Why the real unlock is not more agents in parallel, but better critique loops, stronger models, and spending more on review than generation
* Why AI coding can still lead to more bugs in production even if models write cleaner code on average than humans
* Why Shopify built its own PR review flow, and why Mikhail thinks most off-the-shelf review tools miss the point
* How PR volume, test failures, and deployment rollback are becoming the real bottlenecks in the agent era
* Why Git, pull requests, and CI/CD may need a new metaphor once code is written at machine speed
* What Tangle is, and how Shopify uses it to make ML and data workflows reproducible, collaborative, and production-ready from the start
* Why Tangle is different from Airflow, and why content-addressed caching creates network effects across teams
* What Tangent is, and how Shopify is using auto-research loops to optimize search, themes, prompt compression, storage, and more
* Why Tangent is becoming a democratizing tool for PMs and domain experts, not just ML engineers
* Why AutoML finally feels real in the LLM era, and where auto-research still falls short today
* Why Tangle, Tangent, and SimGym become much more powerful when combined into one system
* What SimGym is, why simulated customers only work if you have real historical behavior, and why Shopify's data gives it a moat
* How SimGym evolved from comparing A/B variants to telling merchants what to change on a single live storefront to raise conversions
* Why customer simulation is so expensive, from multimodal models to browser farms to serving and distillation costs
* How Shopify models merchant and buyer trajectories, runs counterfactuals, and thinks about interventions like discounts, campaigns, and notifications
* Why category-level behavior is so different across commerce, and why ideas like Chinese Restaurant Processes are showing up again in practice
* Shopify's new UCP and catalog work, including runtime product search, bulk lookups, and identity linking
* Why Shopify is using Liquid AI, and why Mikhail sees it as the first genuinely competitive non-transformer architecture he has used in practice
* Where Liquid already works inside Shopify today, from low-latency query understanding to large-scale catalog and Sidekick Pulse workloads
* Whether Liquid could become frontier-scale with enough compute, and why Shopify remains pragmatic and merit-based about model choice
* Who Shopify is hiring right now across ML, data science, and distributed databases
* The Sydney story at Bing, why its personality was not an accident, and what Mikhail learned from deliberately shaping AI character early on

Mikhail Parakhin
* LinkedIn: https://www.linkedin.com/in/mikhail-parakhin/
* X: https://x.com/MParakhin

Timestamps
00:00:00 Introduction: Mikhail Parakhin, Microsoft, and Shopify
00:01:16 Why Shopify Is Talking More About AI
00:02:29 Internal AI Adoption at Shopify and the December Inflection
00:06:54 Token Budgets, Jensen Huang, and Why Usage Metrics Can Mislead
00:10:55 Why Shopify Built Its Own AI PR Review System
00:12:38 AI Coding, More Bugs, and the Real Deployment Bottleneck
00:14:11 Why Git, PRs, and CI/CD May Need to Change for Agents
00:18:24 Tangle: Shopify's Reproducible ML and Data Workflow Engine
00:21:19 Why Tangle Is Different from Airflow
00:26:14 Tangent: Auto Research for Optimization and Experimentation
00:30:07 How Tangent Democratizes Experimentation Beyond ML Engineers
00:33:06 The Limits of Auto Research
00:36:36 Why Tangle, Tangent, and SimGym Compound Together
00:37:20 SimGym: Simulating Customers with Shopify's Historical Data
00:42:47 The Infra Behind SimGym
00:46:00 Why SimGym Gets Better with Real Customer History
00:47:30 Counterfactuals, HSTU, and Modeling Merchant Trajectories
00:51:55 CRPs, Clustering, and Category-Level Customer Behavior
00:53:30 UCP, Shopify Catalog, and Identity Linking
00:55:07 Liquid AI: Why Shopify Uses Non-Transformer Models
00:59:13 Real Shopify Use Cases for Liquid
01:03:00 Can Liquid Scale into a Frontier Model?
01:09:49 Hiring at Shopify: ML, Data Science, and Databases
01:10:43 Sydney at Bing: Personality Shaping and AI Character
01:13:32 Closing Thoughts

Transcript

[00:00:00] swyx: Okay. 
We're here in the studio, a remote studio, with Mikhail Parakhin, CTO of Shopify. Welcome.[00:00:08] Mikhail Parakhin: Thank you. Welcome.[00:00:10] swyx: I don't even know if I should introduce you as CTO of Shopify. I feel like you have many identities. Uh, you led sort of the, the Bing ML team, I guess, uh, uh, or ads team. I, I don't know, I don't know, uh, you know, it's, uh, people va-variously refer you as like CEO or, or, uh, I don't know what that, that, that said previous role at Microsoft was.[00:00:29] Mikhail Parakhin: Uh, that was... Yeah, my previous role w- at Microsoft was the-- I actually was the CEO of one of Microsoft's business units, which included, as I, you know, as we discussed, all the things that people like to laugh about, uh, including Windows and Edge and Bing and ads and everything.[00:00:47] swyx: Yeah, yeah. What a, what a, what a wild time.You've obviously, uh, done a lot since you landed at Shopify. Uh, one of the reasons I reached out was because you started promoting more sort of internal tooling, uh, primarily Tangle, but also a lot of people have seen and adopted Tobi's QMD, uh, and obviously, I think, uh, Shopify has always been sort of leading in terms of, uh, engineering.I think more-- it's just more recent that you guys have been more vocal about your sort of AI adoption. Is that, is that true?[00:01:16] Mikhail Parakhin: Well, I think AI tools in general are fairly recent development, uh, and we've-- Shopify, you know, at this stage of its development, we're developing AI in-in-house and other, uh, building tools that use AI and, you know, interfacing with the wider AI community, uh, you know, are on the sort of the, uh, runaway trajectory.So it just did by sort of natural byproduct. We, we talk about it more also. 
We just, uh, just even yesterday, Andrej Karpathy was famous in tweeting about, oh, are there some, uh, ways, uh, that, that you can organize your agents to store the data and then, uh, look up the data so that you don't have to research or, or lose context every- Yes ... time. And a little bit tongue in cheek, I tweeted that, “Hey, we've, we've done it much earlier, and we even have different approaches, Tobi and I.” Tobi, of course, is a big fan of QMD, and I'm more of a SQL, SQLite fan. But, uh, yeah, very similar things that we've already done here. The point is, yeah, we're very dynamic, you know, explosively growing company, and we have to be at the forefront of AI adoption, obviously.

[00:02:29] swyx: Yeah. Yeah. Um, you, your team kindly prepared some slides actually that we were gonna bring up on to, uh, the screen. I think I can, I can screen share, and then we can kind of go through some of the shocking stats that maybe, maybe put some numbers to what exactly is going on. So here we have, uh- An internal AI tool adoption chart. What are we looking at here? What ?

[00:02:54] Mikhail Parakhin: Yeah, this is very interesting statistics. Uh, this is number of daily active workers, you know, think of, uh, DAU, basically the active users of-

[00:03:05] swyx: Yeah ...

[00:03:05] Mikhail Parakhin: AI tool as a percentage of all the people in the company, right? And then- Yeah ... different AI tools. And, uh, you could see two things here is that one is the green is total. Uh, green is just total. So you could see that it approaches really % by now. It's hard not to do your job now without interacting deeply, at least with one tool. 
You could see another interesting thing is just as many people commented in December was the phase transition when suddenly models gotten good enough that, that everything took off and started growing.Uh, it, it was many people noticed that the thing is that small improvements accumulated into this big change in Sep- December roughly timeframe.[00:03:52] swyx: Yeah.[00:03:52] Mikhail Parakhin: The other thing I would claim you could see is that, uh, CLI-based tools and tools that don't require you to look at the code becoming more popular, and you could see, yeah, various versions of, uh, Cloud Code and Codex and Pi and internal development tools taking off.Uh, exactly, yeah, uh, and blue is our River, just internal agent for coding, where tools, uh, that require IDEs such as, uh, GitHub, Copilot or Cursor, they're not exactly shrinking, but they're not growing as fast. Like, uh, red, red line is, is the IDE kind of tools. So you could see that they're, they're not experiencing as, as fast of a growth.[00:04:37] swyx: As I understand it, basically, every employee has their choice, right? Of choose whatever tool you use, and then you're just kind of doing a, a daily sur-survey or something.[00:04:47] Mikhail Parakhin: Exactly. And, uh, we- Yeah ... the, the push is to get your job done, you can use any tool, and we effectively fund unlimited tokens for everybody.Uh, we, we do, we do try to control the models that, uh, people use, but from the bottom, not from top. Like we basically say, “Hey, please don't use anything less than Opus four point six.”[00:05:09] swyx: Oh .[00:05:10] Mikhail Parakhin: Some people, some people end up using GPT five point four extra high. Some people use Opus four point six. Um, uh, you know, uh, there are some, uh, there are plus and minuses in going for full one million context window versus not.But, uh, we try to discourage people from using anything less than that.[00:05:28] swyx: Yeah, yeah. Got it, got it. 
Uh, I mean, uh, that's, you know... The, the next chart here, it really kind of shows the expansion and the sort of December twenty twenty-five inflection, right? That, uh, people are using a lot of tokens. I think it's also really interesting that no one was kind of abusing it in twenty twenty-five.Like it was- Had comparatively, uh, to this year, there was almost no growth. I mean, it's still like, you know, probably, probably gave fifty percent.[00:05:56] Mikhail Parakhin: Yeah. This is just a different scale. It's still exponential- Yeah, yeah ...growth at just a different- ...rate of expansion. Uh, there was inflection point, and Sean, I would claim the, the super interesting part here is that you could see that the distribution becoming more and more skewed.Yes. The top percentiles grow faster. So that means- Yeah ...the people in the top ten percentile, they, their consumption grows faster than seventy-five and so forth. So, uh, the distribution skews more and more towards the highest users, which is... I don't know what it tells me. It's like it feels not ideal, to be honest.Or maybe it's okay. We'll see.[00:06:36] swyx: Why does it feel not ideal? Is, is it because of, um, quantity over quality, or what's the concern?[00:06:42] Mikhail Parakhin: Because take it to the limit. That means, you know, if, if this rate of separation continued- Ah, yes ...a year, there will be one person consuming all the tokens. So it's just, it's kinda strange.[00:06:54] swyx: Yeah, I mean, um, uh, I, I think internal like teaching and all that, uh, will, will help sort of distribute things more widely. But in, in the early days, of course, the people who are sort of more AI-pilled will obviously find more ways to use it than the people who are less AI-pilled. Maybe let's, let's call it that.I'll just, I'll just kinda quickly, uh, pause from the, the... 
You know, we will go back to the rest of the slides, but I just wanna, um, review, you know, there are a lot of CTOs of, of large companies like yourself where they're all considering some kind of token budget, right? Like I think it's something, something that Jensen Huang has been talking about, where like if your 200K engineer is not using 100K of tokens every year, like they're, they're underutilizing coding agents.Of course, Jensen Huang would say that, but like it seems a very quantity over quality approach and like some, some people are basically saying like, well, is this comparable to judging engineer quality by lines of code, right? Which we also know is like kind of flawed, but better than nothing. So I, I don't know if you have like a sort of management take here on, on how to view this kind of, uh, metrics.[00:08:02] Mikhail Parakhin: Well, I mean, you're, you're baiting me. I, I like... This is my favorite topic. Uh, if you let me, I'll probably talk for two hours on just this. I have a lot of things to say. Like I do think Jensen gotten a lot of bad press saying, “Oh, of course you're, you know, this, uh, the- ...the cake seller says you don't need enough cakes.”You know? Like, of course. Uh, but, uh, I actually, uh, think that's undeserved. I think he, he's actually right. Uh, I do think- He,[00:08:33] swyx: he's directionally correct.[00:08:35] Mikhail Parakhin: Yeah. Yeah. He's directionally correct for sure. Uh-[00:08:37] swyx: Who knows what the right number is? Yeah.[00:08:39] Mikhail Parakhin: The thing that I do Uh, want to say, and this is something that we learned through trial and error and very important is like two things.One is that it's not about just consuming tokens. Uh, you can consume tokens and, and in fact, the anti-pattern is running multiple agents, too many agents in parallel that don't communicate with each other. That's almost useless, uh, compared to just fewer agents and burns tokens very efficiently. 
Uh, setting up the right critique loop, especially with the high quality models, where one agent does something, the other one, ideally with a different model, critiques it, uh, suggests ways to improve it, the agent redoes it with this critique and, and so it takes much longer. So people don't like it because latency goes up. You know, they, they have to wait until this debate is happening. But, uh, the quality of the code is much higher. And another thing, just since you mentioned like, look, uh, uh, yeah, the overall budget is just like, uh, lines of code. Lines of code are exploding for everybody right now, or partially because AI is really more verbose, but partially just because AI can write a lot more code, you know, doesn't get tired. And so you have to have to have a very strong narrow waist during PR review. Otherwise, just the number of bugs will go through the roof. It's, uh, it's this unexpected consequence of the just volume trumping everything. I would claim by now good model writes code on average with fewer bugs than, than the average human. But since they write so much more of it, like more of it will make it into production. So you have to-

[00:10:26] swyx: You still have more bugs.

[00:10:26] Mikhail Parakhin: Yeah. Have to have a very rigorous PR reviews, also automated of course. But, uh, yeah, that to spend a lot budget there. Like this, this for me, for me, actually, the important metric is the ratio of budget spent during code generation versus, uh, spent, uh, expensive tokens like GPT, uh, five point four Pro or, uh, uh, Deep Think from Gemini, you know, checking on PR reviews.

[00:10:55] swyx: Yeah, totally. Uh, I noticed in your chart you didn't have any review tools. Do you just use like, like let's say a Claude Code to review tools? Or do you have another set of review tools like the Greptiles, the Code Rabbits, uh, Devin Reviews has a review tool. 
I don't know if you've had those specialist review tools.

[00:11:13] Mikhail Parakhin: You are a little bit jumping on my sore toe right now because the graphs I was only showing public tools. Uh, uh, the-- I haven't found a good PR review tool that, that does what I think should be done. And, uh, partially my, my thinking is because it's so... It just goes against both what people feel like emotionally they prefer and, uh, some of the, uh, you know, frankly, even business models that, that the companies run. At PR review, uh, time, you want to run the largest models. That means, I don't know, Codex or, or, uh, Claude Code is not gonna cut it. You need to have pro-level models if you really want to, uh, stem the tide of bugs from going into production. And you need us to spend a lot of time, the models taking turns, but you don't want, like, a big swarm of, uh, of, uh, agents. So in fact, you end up in a different dual-dualistic world where you generate not that many tokens. You, in fact, generate few tokens, but it takes f-a long time because these are expensive models taking turns rather than many, many agents trying to do many things in parallel. So that's, that's why I feel like I haven't found good tools, so we are using our own for PR review for now.

[00:12:33] swyx: Yeah. Yeah. I mean, uh, I think a lot of companies are building their own, uh, especially to their needs, right?

[00:12:38] Mikhail Parakhin: Mm-hmm.

[00:12:38] swyx: Um, I, uh, you also have a chart here going back to the slides on, uh, PR merge growth, where we're now at thirty percent, uh, month on month rather than ten percent. Uh, and also the, the estimated complexity is going up. You know, this is productivity, right? ‘Cause y- presumably there's more stuff going into the code base and more, more features getting worked on. I'm curious about the backlog, right? 
Like the, the, the-- I actually don't mind a pro-level model taking an hour or two hours to review my PR, because I've dealt with humans who take a week to review my PR, right?And I keep pinging them on Slack, “Hey, hey, review my PR.” So, you know, I think there's some trade-off here where, like, it still doesn't make sense.[00:13:18] Mikhail Parakhin: Exactly. That, that's exactly m-my point. Uh, that on one hand, you can tolerate longer latencies at, uh, PR. On the other hand, like right now, the real problem is not in spending time waiting for PR.It's real problem is since there's so much more code than- Yeah ... uh, probability of at least some tests failing going up, and then you, like, keep de-failing, then you have to find the offending PR, evict it, retest it without that PR, and so deployment cycle becomes much longer. Uh, so it actually, in terms of the overall time to deploy, it's total time savings if you spend more time on a longer model, like thinking for an hour, because then, then you, you don't have to spend all that time during testing and rolling, you know, rolling back the deployment.[00:14:03] swyx: Yeah, totally. That's still worth it. You know, you don't look at the individual, look at the aggregate, and look at the, the, the change in the aggregate system.[00:14:11] Mikhail Parakhin: Exactly.[00:14:11] swyx: I'm kind of curious if, like, there's this PR mentality and, like, c-- the, the, the CICD paradigm will be changed eventually. Some people are like, obviously a lot of people want new GitHub, but I even wonder if, like, Git is the problem, right?Like, is that the bottleneck? Is the concept of a PR a bottleneck? Do you guys use stack diffs? I don't know if, uh, that's a, like, a merge queue stack diff type of thing.[00:14:34] Mikhail Parakhin: We, we use, we use Stacks, we u- we use Graphite. We worked with, uh, Graphite a lot. Uh, so we use Stack, uh, PRs. 
I think, uh, like that's clearly the overall CICD in general, and the interaction with the code repository right now is the, clearly the sort of the, the main issue and the bottleneck for us, uh, and highest top of mind.I would say we probably need a different metaphor or different whole design of how to process it in new agentic world. I haven't seen anything dramatically better yet. I, I think everybody right now is just trying to keep their head above the water ‘cause, ‘cause there, there's so many PRs and then everybody's CICD pipelines start creaking, the, the times are increasing, the number of bugs slipping by increasing, and you have to, have to clap on down.And so we are a little bit in this situation when we need to first stabilize that story and then start thinking, hey, what, what it could be a completely different and new world, which I haven't... I know some people working on it. I haven't seen something, like anything super compelling yet, but clearly the old thing were designed for humans will need to be morphed into something new.[00:15:53] swyx: One of the thing that I, I think about is kind of like the merge conflict is basically a global mutex on the whole system, right? And in, in hu- in human organizations, we do have something like that. It's the company standup. But like, other than that, it's like it's actually fitting for us to be somewhat decentralized, somewhat plugged into one stream of information source, but somewhat lossy.Like it's okay, you know, that, that not every delivery is like atomic consistency. Like we're not dealing with a database sometimes.[00:16:27] Mikhail Parakhin: This is a very good point, uh, because since humans don't write code too fast, you know that global mutex is not too bad. Once you-[00:16:36] swyx: Yes ...[00:16:37] Mikhail Parakhin: start writing code at the speed of machine, it becomes the, you know, the bottleneck.Then what do you do? 
Maybe, and I can't believe I'm saying this because I, I'm long-- lifelong opponent of, uh, microservices, and I always thought that was, like, a really bad idea. And now that you're saying it, like, maybe in new guys like microservices will make a comeback, you know, because then you, you can ship things independently in tiny things and, and the managing all that complexity automatically will be much easier.I don't know. Like, we'll s-- we'll have to see.[00:17:10] swyx: Yeah. I mean, I don't know what the Microsoft or, or Shopify thing is, but I, I read this paper from Google where they have a monorepo that deploys into microservices, right? And then, uh, the other concept that I think about a lot is the Chaos Monkey concept from, from Netflix.Being able to create, like, this robust system where, um, uh, you know, you, you have the service discovery, you have the, uh, the independent, independent microservices discovery and, and, uh, you know, probably going to be a fair amount of duplication. That's how an organic system sort of scales, uh, that, that you have that...I don't know how you call it. Slack? Robustness? Depend-- uh, d-duplication. I, I, I forget the-- I, I'm-- And this-- those-- these are not exactly the terms- Hmm ... I'm looking for, but I c-can't really think of the words. Okay. I was gonna go into Tangent and Tangle. Uh, so, uh, we, we sort of discussed the overall stats that, uh, Shopify has.Uh, but, you know, I, I think some, some pretty cool stuff that you guys are working on is your ML experimentation, uh, and your, your sort of auto tr-research training pipeline. Presumably you're much closer to this one because it's, it's a sort of personal hobby of yours. How, how would you explain them in, together?I thought we have a slide that, like, uh, has the s- the system diagram.[00:18:24] Mikhail Parakhin: Yeah. Tangle first and then Tangent as a-[00:18:27] swyx: Yeah ...[00:18:28] Mikhail Parakhin: as a thing on top of Tangle. 
And, uh, Tangle is the third generation, I claim, of, uh, systems of, uh, running any data processing, but a bit with a skew for ML experiments, but not necessarily. Any sort of data processing tasks where you need to iterate, share, and you have scale so that you want maximum efficiency. You know how, like, normally you would work, you would-- Imagine you're a data scientist or an ML practitioner, you would get Jupyter notebooks or, or maybe you would get, uh, you know, Pyth- your Python scripts, and you would manage the data, and you produce those TSV files, and you put them in some JFS or something. Then you would notice that, oh, it has this, uh, weird missing values. You go and write another script that, uh, goes and replaces them with, uh-

[00:19:20] swyx: Ah ...

[00:19:21] Mikhail Parakhin: dashes. And then, then you, then you run some, some, uh, “Oh, I need to filter bots.” And so you run some LightGBM model that, uh, removes the bots. And then, then you like-- And then you, you kind of like get into shape, and then you start experimenting, and you run multiple experiments, and then you're like, “Oh my God,” like, “this experiment is worse.” You undo, and you cannot get to previous result. And like, “Ah, what did I do?” Like that. Again, then, then you finally like get everything working. Then you like start throwing it over the fence to production. You, you replicate it, those things don't work, and then sometimes you like don't notice that you forgot some feature naming and the, the features don't match. But then, like imagine you, you did everything, and then six months later you're like, have to repeat it because now there's more data, or you wanted to do another pass, and you're like, “What, what did I do?” Or like, or like, “This script crashes now,” or the, “the path has changed.” And then, then you're trying to, like you spend another month just doing ar- digital archeology on your own, you know, history, right? Now multiply that by many, many teams. 
Now imagine you got an intern that you wanna ramp up. Now you have to show that intern, “Oh, you know, look, here's the folder, there's the scripts, ask your cloud agent to figure it out.” And then the cloud agent does something, and then you're like, “Ah, yeah, right, it was the wrong folder. I forgot to tell you, I actually have this other thing I forgot myself.” And that's the daily life we all know if you're a data scientist, a machine learning practitioner, or even any data-managing person. [00:21:00] swyx: Yeah. So I used to do this on the quant finance side, in my hedge fund. So we did this before Airflow, and then obviously Airflow came along, and then more recently Dagster, I would say, is in my mind what I would use for that shape of problem, where you had to materialize assets and create a pipeline. [00:21:19] Mikhail Parakhin: And that's a very good segue because... So Airflow is great, but Airflow is more about: you have something and you wanna repeatedly run it in production on a schedule. It's less about you as a team developing things and being able to share, and you grabbing the standard pipeline and saying, “Hey, I wanna change this tiny little component in the huge sea of data processing, and I wanna run ten experiments on this, and I wanna do hyperparameter optimization.” All that is very hard to do with Airflow. It's very easy to do with Tangle. Tangle is everything about a group of people running experiments; it might be agents too nowadays. Running experiments cheaply, collaborating, sharing results. You don't need to understand fully. You clone somebody else's experiment or somebody else's pipeline, change a small piece, run it, get it to production state, and then ship in one click. So then the...
You don't have to port it into any other system to run in production. You can just run the same experiment. It's fully production ready. And it has lots of... Again, as I said, it's a third-generation system. The original one, I would claim, was Ether; at least in my career, Ether was the first that pioneered this type of approach. And then there was Nirvana at Yandex, which did kind of a second take on this. And now this one aggregates the learnings from all of those, and Airflow as well, to get to the state where, when you try it, it feels kind of magical. ‘Cause now everything is based on content hashes. So even if the version changed, but the output didn't change, nothing is being rerun. It's very efficient. If multiple people start experiments that need the same sort of data preprocessing, it's not repeated multiple times. It's automatically done only once. If you start ten experiments that all require some data preparation as the first step, you don't have to coordinate for that. You don't have to know that other people are starting it. Now, it's very easy composability, any language you wanna use, and it's very visual. So you can see immediately, you can edit it easily, you can assemble small things with even just mouse clicks if you want to, and share, clone. And everybody knows also it's fully kind of static in the sense that if we rerun it a second time, it will have exactly the same results. You will never have to do digital archeology. So full versioning and everything is also there. [00:24:06] swyx: So people can... It's open source. Go to the GitHub repo and check it out. And there is also a really good blog post about it. I think all this is really appealing.
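To make the content-hash idea above concrete, here is a minimal sketch of a content-addressed step cache. It is not Tangle's implementation; the `ContentCache` class, the step functions, and the sample data are all hypothetical, and a real system would hash files and container images rather than small Python values.

```python
import hashlib
import json


class ContentCache:
    """Toy content-addressed cache: a step reruns only when its input content changes."""

    def __init__(self):
        self.store = {}      # cache key -> step output
        self.executions = 0  # number of times a step body actually ran

    @staticmethod
    def digest(value):
        # Hash the JSON form of the value, so equal content gives equal hashes.
        payload = json.dumps(value, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def run(self, step_name, func, *inputs):
        # The key depends on the step name and on the content of its inputs,
        # not on who started the run or which file the data came from.
        key = self.digest([step_name, [self.digest(i) for i in inputs]])
        if key not in self.store:
            self.executions += 1
            self.store[key] = func(*inputs)
        return self.store[key]


def drop_missing(rows):
    # Hypothetical cleaning step: remove missing values.
    return [r for r in rows if r is not None]


def total(rows):
    # Hypothetical downstream step.
    return sum(rows)


cache = ContentCache()

# First person runs the pipeline.
clean_a = cache.run("drop_missing", drop_missing, [3, None, 4, None, 5])
total_a = cache.run("total", total, clean_a)

# Second person starts from a different raw file whose cleaned content is identical.
clean_b = cache.run("drop_missing", drop_missing, [None, 3, 4, 5])
total_b = cache.run("total", total, clean_b)  # cache hit: same input content

print(cache.executions, total_a, total_b)
```

The second `total` call never executes, because its input has the same hash as the first person's cleaned data, even though the upstream raw data differed.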
The thing that I think sells me the most about it is that sort of development-to-production transition, right? Which I think a lot of people haven't really solved strictly, right? Like, we develop really, really well in Python notebooks, but then, you know, that's obviously not a sort of production-ready process. I think that any way in which that is solved is very appealing. Then the other thing that you mentioned, which also raised my eyebrows, was content-based caching, which is very much a sort of efficiency measure about recalculation only on sort of content addressing, which I think makes sense. It surprised me that the savings could be this much, but maybe I just haven't worked at your scale where there's so much duplication, that people just rerun because they change a single ID upstream. [00:25:10] Mikhail Parakhin: It does, yeah. But it's not only you rerunning. The main savings are coming from the fact that you ran it, you got your job done, and you moved on. Then- Yeah ... somebody else in some department you didn't know existed runs the same task, but on a newer version. [00:25:27] swyx: Yeah. [00:25:27] Mikhail Parakhin: Right now, in most organizations, you can't even find out about it, so you can't even measure that you're spending that time twice, right? Here- Yeah ... if everybody's on Tangle, that's detected automatically, and it's detected that the output is the same. And then for that person, all it looks like is the experiment just suddenly jumped forward, right? Uh- Yeah ... so that's because there's a network effect of multiple people helping each other. [00:25:51] swyx: Yeah.
This is one of those things where it's designed to be a platform from the beginning rather than an individual developer's tool from the beginning, right? And everything streams down from there. That is the sort of Tangle orchestrator, and it manages jobs. We've seen a few versions of this, and this is obviously the sort of unique approach that you guys have figured out. And then there's Tangent. [00:26:14] Mikhail Parakhin: Yeah. And Tangent is basically an automatic auto research loop that can help and kind of do your work for you. Uh- ... you know, effectively, Andrej Karpathy recently popularized it with auto research. Yes. Remember he said he was speed running this... Yeah, you know the story. Here we're basically bringing the same capability into Tangle so that Tangent can analyze it. It's just an agent that can run multiple experiments, figure out what can be changed, and keep on rerunning it, keep on modifying, maximizing some goal, some loss function, whatever you need to achieve. And in general, I would say if you're not using an auto research-like approach in whatever you do, like literally whatever you do, then you're missing out. We saw that at Shopify taking off like wildfire; anything where you can put measurements can be done dramatically better. Our- [00:27:19] swyx: Mm-hmm ... [00:27:20] Mikhail Parakhin: speed of HTML templatization, completely new UX templatization, reducing latency for Liquid themes. Our search, recently we moved from... it's hard to even quote, from eight hundred QPS to forty-two hundred QPS with the same quality, just by pure optimizations in an auto research loop that kept running and changing code in our index serve, on the same number of machines, just increasing the throughput. We managed to improve the quality of gisting and machine learning process.
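The loop described above (propose a change, run the experiment, keep it only if the metric improves) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not Tangent's code: `propose` stands in for the LLM agent and here just cycles through fixed step sizes, and `evaluate` is a made-up objective that peaks at a batch size of 64.

```python
from itertools import cycle


def auto_research(evaluate, propose, baseline, budget):
    """Greedy experiment loop: keep a candidate only if it beats the best score so far."""
    best, best_score = baseline, evaluate(baseline)
    history = []
    for _ in range(budget):
        candidate = propose(best)    # in a real system, an agent edits code or config
        score = evaluate(candidate)  # in a real system, a full pipeline run
        accepted = score > best_score
        history.append((candidate, score, accepted))
        if accepted:
            best, best_score = candidate, score
    return best, best_score, history


def evaluate(config):
    # Hypothetical metric to maximize.
    return -(config["batch_size"] - 64) ** 2


steps = cycle([16, -8])  # hypothetical stand-in for the agent's ideas


def propose(config):
    return {**config, "batch_size": config["batch_size"] + next(steps)}


best, best_score, history = auto_research(evaluate, propose, {"batch_size": 16}, budget=8)
accepted = sum(1 for _, _, ok in history if ok)
print(best, best_score, f"{accepted} of {len(history)} experiments kept")
```

Most proposals are rejected, which matches the point made later in the conversation: the loop is cheap to run, so a low hit rate can still pay off.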
You know, gisting is the prompt compression technique that [00:27:59] swyx: allows for [00:28:00] Mikhail Parakhin: lower latency and actually slightly higher quality. So literally all different walks of life, and it doesn't have to be AI related. We had a reduction in storage because the agents would go and find data sets that clearly are derivative, and then you don't need to store things twice. We found, somewhat embarrassingly, that one of the largest tables was hashing random IDs into another random ID, and we literally- Oof ... put only one. So it was translating, yeah, two random IDs hashed [00:28:36] swyx: into [00:28:37] Mikhail Parakhin: each. So, so [00:28:37] swyx: it has access to the code as well, so it can check, like, what the hell is it doing? [00:28:42] Mikhail Parakhin: So it could be run at two levels. At the superficial level, it could just use existing components and reshuffle them. You know, like you can grab- Yeah ... XGBoost, and you can grab some PyTorch module, and then grab some other tools and combine them. At a deeper level, since Tangle is all sort of CLI-based underneath, every component is really a wrapped CLI call and a YAML file, it can analyze code and create new components and keep on iterating as well. So you can both have quick modifications of existing pipelines with the components that are already there, pre-baked, or you can create new components and- [00:29:29] swyx: Yeah ... [00:29:29] Mikhail Parakhin: keep iterating on those. So auto research is, again, probably the thing I was most excited about happening in the last two months, and we see it taking off totally like wildfire. Just, everybody, every day, every...
well, every day, every minute, I would have somebody Slack message saying, “Oh, look how much better I made it.” And it's all through auto research. [00:29:53] swyx: Is this democratized in some way, in the sense that, is it your ML engineers and researchers doing this, or do your regular PMs and software engineers also have the ability to use Tangent? [00:30:07] Mikhail Parakhin: This is an awesome question. Tangle in general and Tangent in particular are extremely democratizing. Like they- Yeah ... they are the main tools for- ‘Cause I don't [00:30:15] swyx: need the details. [00:30:16] Mikhail Parakhin: Yeah. Exactly. Initially used by ML and AI engineers, but then literally, as you said, PMs... the highest user right now is one of the PMs in our org, Sartak, and he was number one by usage of this, ‘cause they're just energetic and knowledgeable, and now it unlocks a lot of capability where you don't have to change code manually. [00:30:39] swyx: I mean, because it kind of cuts out the ML engineer from the process, because the PMs have the domain knowledge and the ability to think from first principles about, okay, what results do I want? And they even have the access to the data that needs to go in. So in some ways, this is the magic black box that we've always wanted for training and for, I guess, hill climbing, whatever. [00:31:04] Mikhail Parakhin: It's basically Claude Code for your AI development- ... situation, right? Now you don't have to know exactly how algorithms work. You can just bring your domain knowledge and expertise and product knowledge and iterate within Tangent until you've gotten the results that you need. [00:31:21] swyx: In my previous roles, every time that someone has pitched AutoML, you know, I've always been like, “Uh, this is not gonna work.
It's, you know, it's always gonna be a flop.” Somehow it's working now. I mean, presumably the answer is now we have LLMs and it's good enough, right? It's an emergent property that we can do auto research, but it doesn't feel that satisfying. How come we didn't do this before, right? Like, we just did parameter search and, I don't know. Maybe that's it. [00:31:48] Mikhail Parakhin: Yeah. Bayesian optimization and hyperparameter optimization was the one facet of AutoML that was used very actively, which incidentally is also built into Tangle. But, you know, I know Patrice Simard very well, and he was such a proponent of AutoML, and he literally spent careers trying to democratize it. Without LLMs, it just turned out to be very hard. You would have flexibility within a certain narrow domain, but it was hard at wider scale, and now with LLMs suddenly it's like a magic wand, and so suddenly everybody- ... is an AutoML expert. [00:32:28] swyx: Yeah, I think it's multiple things, right? I'm just gonna bring up the chart again, right? LLMs can do the monitoring very well, which is potentially unbounded, super unstructured. It can do the analysis very well, it can do the... Basically there is much more intelligence poured into every single step. There's maybe nothing structurally changed about AutoML, but this is just more intelligent and more unstructured. [00:32:53] Mikhail Parakhin: Exactly. [00:32:54] swyx: Any flaws that you've run into? Everyone is drinking the Kool-Aid, oh my God, time savings, you know, performance improvements. What issues have come up? [00:33:06] Mikhail Parakhin: This is really cool. It's not a solution to all the world's problems for sure.
The limitations are usually the ones I... And this is where we get into a bit of subjective territory. I can only share what I've seen so far, and I'm sure the situation is changing, and, you know, maybe after I say it, many people will reach out and say, “Hey, what about this? You don't know that,” and then they'll probably be right. But what I've seen is auto research is very good at doing kind of obvious things that you don't have bandwidth to do, or you didn't notice, or maybe you're not aware of some standard practices. It is not good at doing something completely out of distribution, something where you have to think for multiple days and do something like none of this. So I set up an experiment once on my sort of hobby thing, and I let it run for, it ended up being, several weeks, you know, it's full production kind of scale, so slow runs, and it performed in the end over four hundred experiments, and only one was successful. I'm like, “Okay, that's good.” But- [00:34:18] swyx: But it saved time. [00:34:19] Mikhail Parakhin: Yeah, I saved time. It was that thing. Yeah, if I were doing four hundred experiments myself, my batting average, as I said, would have been much higher, I'm sure. But also, first of all, it would take me like three years to do four hundred experiments. And I didn't have to do them. The machines just, for the price of electricity, did that. So I got one improvement. Honestly, when I was starting that experiment, my thinking was to go and show that, “Hey, Andrej, maybe you just don't know how to optimize.” And I was super smart because my problem had been optimized for many years, and it was fully improved. And I didn't expect auto research to find anything at all. Yet it did.
So instead of making fun of Andrej, I ended up a big, big supporter. Yeah, that's exactly the tweet. Yes. [00:35:10] swyx: You and Tobi really go back and forth online a lot, which is really funny. Think of it as an eval for the optimalness of the code it's running on. It almost reminds me of a Kolmogorov complexity thing, but I guess there's some optimal thing that you're trying to sort of reduce down to. And so, you know, you should congratulate yourself that you had ninety-nine percent optimality. [00:35:36] Mikhail Parakhin: Exactly, yeah. I think Andrej really deserves a lot of credit for popularizing this approach. This is, I think, incredibly powerful and cool, and, you know, even him just mentioning it led to a lot of gains in a lot of places in the industry, so we should be thankful. [00:35:56] swyx: Yeah. I think he also has just... I don't know what it is. You know, it is a simple self-contained project that people can take and apply to other things, which is one thing, but also just the name. Somehow no one managed to call their thing auto research. Naming things is very important. I think that is mostly our coverage of Tangle and Tangent. I think obviously there's a lot of ML infra at Shopify that people can dive into. We're about to go into SimGym, but before I do that, any other sort of broader comments around this whole effort? Where is it leading to? [00:36:36] Mikhail Parakhin: As a segue to SimGym, all those things start composing strongly. And you could see a huge unlock when you look at each one of the tools and you see, oh, they're extremely useful. Tangle is useful by itself. Auto research is useful by itself. SimGym is useful by itself.
If you combine all three, you create a synergetic effect. I think that's why we even wanted to cover them today, because this is something that, if you go back even five years, would've been unthinkable. Replicating that would be either incredibly costly or impossible, right? Probably thousands of people would be required. [00:37:20] swyx: Well, we have serverless human, uh, serverless intelligence, right? So yes, you do have thousands of intelligences, just not humans. And that's close enough, right? Even if they're not AGI, they're close enough to do the task that you need them to do. And, you know, that's plenty for a lot of routine knowledge work. Okay, let's get into SimGym. This is one of those things I was surprised to see; apparently it's one of your most popular launches, and I think something that... I think Sim AI, I think Joon Sung Park, who did the Smallville thing, there's a very small cottage industry of people trying to do the simulated customer thing. I think a lot of people maybe don't super trust this yet because they're like, well, obviously they would just do what you prompt them to do, right? But maybe tell us about the sort of inspiration or origin story. [00:38:10] Mikhail Parakhin: That's actually exactly the thing I wanted to cover, because if you don't have the historical data, all you can do is prompt agents in a vacuum, and they will do exactly what you prompt them to do. In fact, when I first proposed it, and this is a bit of my brainchild initially, if I can boast, even Tobi said, “But wouldn't they just repeat what you tell them?” And I'm like, “Yes, except Shopify has decades of history of how people made changes and what it resulted in, in terms of sales.” So now what we can do is, we have this...
It's noisy data. There are usually small websites, and, you know, things are never in isolation. It's almost never an AB experiment. It's always an AA experiment, which has two meanings, but basically, at different times you run two different things. But if you aggregate everything together, and you apply a denoising and collaborative filtering-like approach, you can extract a very clear signal. And then you can optimize your agents. And that's why it took so long. It took almost a year of that optimization, of just us sitting and fiddling, and we had this internal goal of correlation: the internal goal was to hit zero point seven correlation with add-to-cart events, for example. So that if we run a real AB test experiment, it should go and replicate the same sort of success that humans had, or lack thereof. And it took forever, and I don't think that's easily replicable because, who else would have that data? You have to have this historic, you know, decades' worth of data. And now the other thing you need is infrastructure and scale, right? Because, again, what we found is that for stat-sig results, you need to run a lot of simulations, a lot of agents, and those are expensive things. You're making actions in the browser because you want real friction. You want to be able to get the image of what humans will see because you wanna detect effects like, “Hey, if I make my images larger, will I have more sales or fewer sales?” And usually people's intuition here, by the way, is that if I increase my images, I will have more because they look nicer. You know, designers all like sparse and big images. Usually your sales tank, right? But, you know, from the HTML, all the characters look the same, only the size tag looks different, right? So it's very hard.
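The zero point seven target mentioned above can be read as a Pearson correlation between what the simulated shoppers predict and what real experiments showed. A minimal sketch of that check follows; the lift numbers are invented for illustration, and the source does not say which correlation measure Shopify actually uses.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical add-to-cart lift per store change: simulated agents vs. real shoppers.
simulated_lift = [0.02, -0.01, 0.05, 0.00, -0.03]
observed_lift = [0.03, -0.02, 0.04, 0.01, -0.01]

TARGET = 0.7  # the internal goal quoted in the conversation
r = pearson(simulated_lift, observed_lift)
print(f"correlation = {r:.2f}, meets target: {r >= TARGET}")
```

A check like this only means something when the observed side comes from real experiments, which is the point about needing historical data.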
So you have to take visual information, you have to run this in a simulated browser environment on a big farm, and of course you have to have a very, very expensive, good multimodal model. So all this is what's taken so long, and, to share my personal fail a little bit there, Shawn, you know, we always had this large-company bias. Whatever we do, we're like, “Hey, we'll run an experiment,” right? We make a change, and we run an experiment, and then see which one's better, or like, “No, this is worse,” and most of them are worse, so you discard it and keep iterating, hill climbing. And we're like, “Oh, smaller merchants, they cannot get stat-sig results. They cannot really run experiments simply because, you know, in a week there would not be enough data for them.” So we thought from this perspective. What we didn't realize is that most people don't have A and B, they just have one thing, and they need suggestions of what A and B should be. So we first built this: hey, we run a simulation on two separate themes and say, “Hey, which one is better?” We then morphed it, and very recently just released it: when you have just your site, your theme, we run over it and we say, “Hey, here's what the predicted values of conversions are, and here's how we think you should modify it to increase your conversions.” And then, circling back to what you started with, the proof is in the pudding. If we are not correlating with reality, people will not be using it. And thankfully, we see literally every day more users than the previous day. So right now- It's working. Yeah.
I'm... Right now my problem is how to pay for it all, so our major thing is how to optimize the LLMs, do distillation, how to run the headless browsers and headful browsers cheaper so that we can accommodate the increase in traffic. [00:42:47] swyx: Yeah. I understand that you published a lot of technical detail at GTC, so I was just gonna bring it up a little bit. I think this was in conjunction with some kind of GTC presentation? Or something like that, right? [00:42:59] Mikhail Parakhin: Well, yeah, we did it in several places, but yeah, we had the engineering- Yeah ... blog as well. Yeah. [00:43:05] swyx: Yeah. So you're running GPT OSS. [00:43:08] Mikhail Parakhin: This is an older version. You know, now we run a multimodal model. But yeah- Yeah ... GPT OSS, we still run GPT OSS as well for [00:43:15] swyx: And then you have the VMs, and you also have Browserbase. I really like this one where you said, “It violates almost every assumption that standard LLM serving is designed for.” And then you had, basically, orders-of-magnitude differences between everything. [00:43:29] Mikhail Parakhin: Exactly. Which was, you know, a bit of a challenge to implement, even simple things. Since it violates all the assumptions, for example, multi-instance GPUs, MIGs, don't work as well. But we needed to get MIG to work because otherwise it's way too expensive. And so we had to deal with lots of infrastructure and work with Fireworks and CentML, you know, to help with optimizations, and Browserbase, as you mentioned. Yeah, it takes a village. [00:44:04] swyx: Okay. So there's a lot of, I guess, experimentation in the infrastructure so far, and you've published more or less what you have here. I guess I'm less familiar with CentML.
I don't do that much work in this part of the stack. But why was it the sort of preferred inference platform? [00:44:22] Mikhail Parakhin: There are really three probably top companies. There used to be- Three top companies, at least that I was aware of, that did LLM optimization. You know, Together, Fireworks, and CentML, not necessarily in that order. CentML recently got acquired by NVIDIA. What they did is, if you have a model and you want to optimize it for a specific profile of usage, they would go and do it. And we worked with those companies, this work was particularly with CentML and NVIDIA, to get the best possible results out of it. And sometimes you have to retune, depending on... sometimes you want the maximum throughput, sometimes you want minimal latency, sometimes you want the cheapest, right? Or some combination. And so yeah, these are people who would come and help you. [00:45:14] swyx: I see. I see. Yeah, yeah. I'm familiar with these people for the LLM, you know, autoregressive stack. But the other interesting category of these optimizers is also the diffusion people, like Fal, and, you know, Pruna recently has come up a lot as well, which I think is really underappreciated, at least by myself, because I thought, oh, all the workload would be LLMs, but actually there's a lot of diffusion as well. [00:45:38] Mikhail Parakhin: Exactly. [00:45:38] swyx: There's a lot here, so it's hard to cover. But I do think people underappreciate the importance of customer simulation, basically. I think this is something that I'm candidly still coming to terms with.
You know, your team also prepared this really nice diagram. I assume this is AI generated. [00:46:00] Mikhail Parakhin: Yeah, it looks- [00:46:01] swyx: Maybe it's not. [00:46:01] Mikhail Parakhin: Yeah, it looks Gemini-ish. But honestly, I don't know where the hell they generated it. It looks like it's Google. But the interesting part, Shawn, that we haven't covered, but I wanted to mention, is if your store had previous customers, rather than it being a new store, a new merchant just launching things, it helps tremendously in correlation and forecast. We take your previous customers' behavior, and we create agents that replicate the specific distribution of customers that you get, and then we apply those to your changes, and that raises the correlation with add-to-cart events, or with conversion, or whatever it may be, quite dramatically. So replicating humans in general seems like an interesting, cool challenge. [00:46:58] swyx: As a shareholder, I think this is the... if people are Shopify shareholders, they should really deeply understand this, because this is basically the moat. The more you use Shopify, the more it will just automatically improve, right? You're doing the job for them. [00:47:13] Mikhail Parakhin: Yeah, that's what we started with. Uh- ... otherwise, if you're just a startup, I wouldn't do it, you know, if it was my startup, because without the data, as you said, it's exactly the case that whatever you say in the prompt, that's what the agents will be doing. [00:47:30] swyx: The statistician in me wants to really satisfy the sort of statistical intuition, I guess. To me, the word that comes to mind is ergodicity.
So let's say a customer takes this path, a customer takes this path, a customer takes this path, right? In my mind, the way I explain it is, okay, here's the ninety-fifth percentile, here's the fifth percentile, and here's the median, right? But to me, what SimGym is potentially doing is that it can sort of model the in-between journeys as well, that maybe are dependent on the previous states. This may be a very RL-type conclusion, where basically, if you only did naive AB testing, you only have the statistics at a certain point, and you only judge based on the sort of overall summary statistics. But here you can actually model trajectories. Does that make sense? Or- [00:48:31] Mikhail Parakhin: That makes total sense because, well, that makes even more sense than maybe even you realize, because- [00:48:38] swyx: Okay. Please, [00:48:38] Mikhail Parakhin: please. Yes ... we do... Yeah. So internally, we have this system, we talked about it briefly once at NeurIPS. We have a huge HSTU-based system that models whole companies and their possible paths. And like- Yeah ... what you are showing, actually, at any point in time, you can either model the user's behavior, or you can also think about the whole merchant as a company, as the entity that acts in the world. You can model that as well. And then you can do counterfactuals. In your graph, in your blue graph, imagine somewhere in the middle there you would have an intervention. I give that person a coupon, or, I don't know, I send a personal thank-you card, or give a discount somewhere. And then you can do forward rollouts from that counterfactual. So what would have happened with that intervention or without the intervention?
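As a toy illustration of forward rollouts from a counterfactual, the sketch below simulates a buyer journey step by step and compares conversion with no coupon, an early coupon, and a late coupon. Everything here is an assumption made for illustration: the per-step purchase probability, the coupon lift, and the simple random simulator stand in for the learned sequence model mentioned in the conversation.

```python
import random


def rollout(seed, steps, base_p, coupon_step=None, coupon_lift=0.0):
    """Simulate one buyer journey; return True if a purchase happens."""
    rng = random.Random(seed)  # same seed gives the same buyer in every scenario
    for t in range(steps):
        p = base_p
        if coupon_step is not None and t >= coupon_step:
            p += coupon_lift  # the intervention raises purchase probability afterwards
        if rng.random() < p:
            return True
    return False


def conversion_rate(n_buyers, **scenario):
    # Reusing seeds 0..n-1 across scenarios compares the same simulated buyers
    # with and without the intervention (common random numbers).
    return sum(rollout(seed, **scenario) for seed in range(n_buyers)) / n_buyers


baseline = conversion_rate(2000, steps=10, base_p=0.02)
early = conversion_rate(2000, steps=10, base_p=0.02, coupon_step=2, coupon_lift=0.03)
late = conversion_rate(2000, steps=10, base_p=0.02, coupon_step=8, coupon_lift=0.03)

print(f"no coupon: {baseline:.3f}, coupon at step 2: {early:.3f}, coupon at step 8: {late:.3f}")
```

Because every scenario replays the same simulated buyers, the differences between the three rates come only from the timing of the intervention, which is the quantity one would want to optimize.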
And you can even change where in time that intervention can happen, right? Like, where in this journey. So we do this at Shopify scale for our merchants, and then if we notice something that they can be fixing, like there's a strong counterfactual, like we have Shopify policy, they basically get a notification like, “Hey, we think something is wrong with your,” I don't know, “Canadian sales. It looks like it's misconfigured. Here's what you need to do.” Or, “We think you have to set up this campaign with these parameters.” And we do that at the buyer level, to literally offer discounts or cashback or things to buyers. So I'm getting very excited. This is my sort of area of interest, I guess, and hobby. But being able to model something as complex as human beings or companies, and model counterfactuals on it, where you can have interventions in the future and optimize when to make an intervention and what kind of intervention to make. It's such an unlock that previously was completely impossible. It was always dreamed of, but never... Like, how would you even simulate it without LLMs or HSTUs? I think very, very exciting times. [00:50:59] swyx: I just wanted to maybe illustrate this. I'm not the best illustrator, but I am a conceptual statistics guy. And, you know, you cannot just do this. This is a dimensionality an AB test doesn't have, right? Because it doesn't have the change-over-time, stochastic nature, and it doesn't have the sort of contextual, like, here's all the context to this point. Okay, cool. That's SimGym. You're gonna burn a lot of tokens on this thing. But you're one of the only scale platforms in the world that can do this across a huge variety of workloads, right?
I'm even curious on a sort of human, uh, research level of like, well, do, does retail behave d-differently from like clothing sales?
D-does that behave differently from electronic sales? I, I don't know. I don't know what else you guys... The Kardashian shoppers, do they differ from like people who buy, uh, I don't know, cars and, uh, whatever.
[00:51:55] Mikhail Parakhin: Well, very different, and different sensitivities and different modes of, uh, shopping and, and different levels of what's important.
Now, to-totally, you can do aggregations at, uh, at a store level. You can do aggregations at a different, uh, category level. I don't know if, uh, you know, for our statisticians among us, I couldn't believe, but we-- recently we're looking at it, and we had to bring back, uh, CRPs, you know, Chinese restaurant process.
It's a, like, way of aggregating and, like, naturally grow clustering. So across... Specifically to answer questions that, uh, like you were just posing on how, how if, if buyers behave different categories. And I'm like, “I haven't seen CRP since two thousand and one.” It's
[00:52:37] swyx: so What? It's so- What is... No, I haven't, I haven't seen this.
No. This is not in my training. Uh,
[00:52:44] Mikhail Parakhin: but, but yeah, it, uh, uh, it actually, like the, the-- there was a very popular kind of theory, popular in NeurIPS, ICML circles in the early two thousands, uh, kind of nice. And now, now it has practical applications, uh- Yeah ... that we were resurrecting.
[00:53:03] swyx: Yeah, amazing. Uh, I, I can see, I can see how this is like a, uh, a fun job for you where you get to apply all these things.
Um, yeah, yeah, so super cool. Super cool. So, okay, so, so anyone who, who knows what CRPs are and has always wanted to use them at work, uh, they should, they should definitely join Shopify. Okay, so w-we have a lot and but I, I'm, I'm being mindful of the time.
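[Editor's sketch] For readers who, like swyx, have not met the Chinese restaurant process mentioned above: a minimal textbook illustration in Python, assuming the standard formulation in which each new item joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter `alpha`. The function name, parameters, and seed handling are hypothetical, and nothing here reflects how Shopify actually applies CRPs.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Assign customers to tables (clusters) under a textbook CRP.

    alpha > 0 is the concentration parameter: larger values open
    new tables more readily. Hypothetical illustration only.
    """
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index of customer i
    for _ in range(n_customers):
        # An existing table k has weight table_sizes[k];
        # the extra last slot (weight alpha) means "open a new table".
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)      # new cluster
        else:
            table_sizes[k] += 1        # join existing cluster
        assignments.append(k)
    return assignments, table_sizes
```

The number of clusters is not fixed in advance; it grows as more data arrives, which is presumably the "naturally grow clustering" property referred to in the conversation.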
I, I did want to, to sort of cover some other things.
Um, I-I'll give you a choice, UCP or Liquid?
[00:53:30] Mikhail Parakhin: Liquid. I think, I think on UCP, you know, like UCP is very important for us and, and it just we are-- UCP, we have a structured, uh, discussions, and you can read about them, and we have, uh, blog posts, and we have a big release this week, in fact, like with our catalog. Oh,
[00:53:46] swyx: okay.
[00:53:46] Mikhail Parakhin: Uh, yeah,
[00:53:46] swyx: but- Le-I mean, we, we can, we can discuss the, the, the release briefly because we'll release this after the-- after it's already announced so whatever. There's a catalog that you guys are doing?
[00:53:55] Mikhail Parakhin: Yeah. So we are, we are- Okay ... we are bringing in capabilities of a whole, uh, Shopify catalog.
Basically, you now you can search for products, you can do lookups by specific ID, you can do bulk lookups when you need to bring m-multiple products. You don't need to know in ad-in advance what you're trying to show or to sell or check out. Like, you can now, you can now have this decided at, at runtime, and this big area for investment for us for both non-personalized and personalized searches, trying to provide basically a win-window into whole universe of products that are being sold everywhere in the world.
And Shopify is really not exactly, but almost like a super set of any-anything being sold. Now we are bringing it into UCP and, uh, and, uh, identity linking is another big thing for us, uh, so that you, you can use, uh, like Google or whatever, whatever identity you have, uh, they're minimizing friction.
[00:54:56] swyx: Yeah. So
[00:54:57] Mikhail Parakhin: yeah, big release for us.
But Liquid AI of course we never talk about, and the problem might be more, more aligned with what we d-discussed previously on this chat.
[00:55:07] swyx: Sure. The main thing that everyone understands about Liquid is that it is inspired by a worm, and I still don't know why.
I'm curious on your explanation. I think you, you, uh, you can make things very approachable.
And also I think like what is the potential of like the, the level of efficiency that you get out of Liquid?
[00:55:23] Mikhail Parakhin: You- we all familiar with transformer architectures. And, uh, for the longest time, there was a competing architecture, it's called the state space models. So, so SSMs, uh, you know, Chris, Chris Ré, one of the pioneers and, and lots of startups, uh, trying to make those realities.
They have, uh, significant benefits, the main being, uh, being much faster and, uh, lower footprint and not quadratic in length, you know, sort of, uh, linear in, in, uh, in your context length. But with state space models- They never quite made it. Like they're used-- They have, uh, certain niches when they thrive, their hybrid architectures are useful, but they never quite made it.
And liquid neural networks are, you can think of them as a next step, like, uh, sort of, uh, state-space model squared. It's non-transformer architecture that's more complicated than sta-state space and really difficult to code if you-- if I'm being honest. But it's, um, very efficient. It's, uh, subline-- sub, uh, quadratic in, in length of your context.
Uh, it's very compact way to represent things, and that's a Liquid AI company. They... Their goal is to productize it, and very often you have this need, uh, when you need to have long context and small model, and you want to have low latency. Like in general, it's basically on par with transformers, and if you do hybrids with transformers, it's, it's even better.
That's why we at Shopify, when we tried multiple and we constantly try multiple models, multiple companies, we found that for small, particularly with low latency applications, when you have low latency and/or if you need longer context lengths, Liquid was the best.
And so we still use the whole zoo and always like obviously test and use everything, uh, every open source model and, you know, it feels l
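
[Editor's sketch] The scaling contrast drawn in the excerpt above (state space models being linear in context length, attention being quadratic) can be made concrete with a toy example. This is a minimal sketch assuming a generic scalar linear recurrence `h_t = a*h_{t-1} + b*x_t`, `y_t = c*h_t`; it is not Liquid AI's architecture, and all names and default values are hypothetical.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state space recurrence over a sequence.

    One constant-cost state update per token, so total work grows
    linearly with the sequence length. Generic illustration only.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # fixed-size state carries all past context
        ys.append(c * h)
    return ys


def ssm_step_count(seq_len):
    """State updates needed by the recurrence: one per token."""
    return seq_len


def attention_pair_count(seq_len):
    """Token pairs scored by full (non-causal) self-attention."""
    return seq_len * seq_len
```

For a 1,000-token context the recurrence performs 1,000 state updates, while full self-attention scores 1,000,000 token pairs; that gap is the footprint and latency argument made in the conversation.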

Dev Interrupted
The self-authoring wiki, beating brain fry, and Obsidian as memory is a trap

Dev Interrupted

Play Episode Listen Later Apr 17, 2026 38:22


Have you or a loved one been afflicted by "brain fry" after managing too many autonomous agents? This week on the Friday Deploy, Andrew and Ben explore the cognitive toll of orchestrating AI swarms and share Kelly Vaughn's expert strategies for avoiding burnout. The hosts also discuss Google's new campaign to punish websites that hijack the back button, the breakthrough of running Gemma 4 natively on mobile devices, and a new 8-step maturity model for building agentic data pipelines. Finally, they dive into a heated debate over whether Obsidian flat-files are a scalable memory solution for AI, comparing the methodology to Andrej Karpathy's new agent-compiled wiki system.
Read the guide: The APEX Framework
Follow the show: Subscribe to our Substack, Follow us on LinkedIn, Subscribe to our YouTube Channel, Leave us a Review
Follow the hosts: Follow Andrew, Follow Ben, Follow Dan
Follow today's stories:
Introducing a new spam policy for "back button hijacking"
Google Gemma 4 Runs Natively on iPhone With Full Offline AI Inference
Water Town: The Agent Swarm Data Stack
Stop Calling It Memory: The Problem with Every "AI + Obsidian" Tutorial
The Wiki That Writes Itself
Breaking out of the "brain fry" spiral of AI
After Burnout by Kelly Vaughn
OFFERS
Start Free Trial: Get started with LinearB's AI productivity platform for free.
Book a Demo: Learn how you can ship faster, improve DevEx, and lead with confidence in the AI era.
LEARN ABOUT LINEARB
AI Code Reviews: Automate reviews to catch bugs, security risks, and performance issues before they hit production.
AI & Productivity Insights: Go beyond DORA with AI-powered recommendations and dashboards to measure and improve performance.
AI-Powered Workflow Automations: Use AI-generated PR descriptions, smart routing, and other automations to reduce developer toil.
MCP Server: Interact with your engineering data using natural language to build custom reports and get answers on the fly.

This Week in Startups
Bittensor Drama! TAO down 15%! | E2274

This Week in Startups

Play Episode Listen Later Apr 11, 2026 81:54


This Week In Startups is made possible by:Pilot - https://pilot.com/TWISTSquarespace -https://squarespace.com/TWISTNorthwest Registered Agent - https://northwestregisteredagent.com/TWISTPlaud - https://Plaud.ai/twistCrowdHealth - ⁠https://JoinCrowdHealth.com/twist⁠Today's show:TAO, the token associated with the Bittensor network, fell sharply last night after a major project declared it was taking its business elsewhere. Covenant AI had used Bittensor subnets 3, 39, and 89 before its abrupt and controversial decision to depart for other digital shores.Gareth Howells, a co-founder of Vidaio (Bittensor Subnet 85), joined Jason and Alex to share the community perspective regarding Covenant's decision to leave Bittensor, a choice that was especially bitter given that its recently-released Covenant-72B model trained using the network's decentralized capabilities was hailed at the time as a demonstration of the crypto-AI hybrid's potential. (Covenant's Sam Dare was a recent guest on TWiST.)Howells walked us through Vidaio's product (AI models for high-quality video work, like upscaling) Then it was time to bring new fan-favorite Ole Lehmann on. Ole took an Andrej Karpathy concept, and turned it into a skill that you can use in either OpenClaw or Claude Cowork. GuestsOle Lehmann: https://x.com/itsolelehmannThe AI Solopreneur newsletter: https://aisolo.beehiiv.com/Vidaio: https://vidaio.io/Vidaio on X: https://x.com/vidaio_Timestamps:0:00 Drama in the Bittensor space! Was there a rugpull?1:01 Plaud: If your work depends on conversations — interviews, meetings, calls — you need a Plaud NotePin. You can check it out at https://Plaud.ai/twist and use code TWIST for 10% off!1:47 Gareth Howells of subnet85 joins the show: https://x.com/GarethHowells9:56 Northwest Registered Agent - Get more when you start your business with Northwest. 
In 10 clicks and 10 minutes, you can form your company and walk away with a real business identity — Learn more at https://northwestregisteredagent.com/TWIST11:37 Gareth reacts to the Covenant situation… Did Sam sell tokens?14:33 How VidAIo (subnet85) optimizes video as cheaply as possible. https://vidaio.io/20:06 Squarespace - Use offer code TWIST to save 10% off your first purchase of a website or domain at https://squarespace.com/TWIST21:12 Demo: Gareth shows upscaled footage of our very own Lon Harris27:09 Who is actually doing the computing on Bittensor?28:15 How Bittensor recruits tech talent from emerging markets29:42 Pilot - Visit https://pilot.com/TWIST and get $1,200 off your first year.30:58 Jason's explains the Bittensor network. https://docs.bittensor.com/37:28 Jason addresses "pump and dump" allegations40:46 Ole Lehmann joins the show https://x.com/itsolelehmann45:31 Jason's $1,000 "bounty" for enhanced show notes49:35 Check out Ole's newsletter at aisolo.beehiiv.com !59:50 CrowdHealth - CrowdHealth lets you ditch the bureaucracy with a peer-to-peer funding platform for your healthcare. Get started for $99 per month for your first three months by using the code TWIST at https://JoinCrowdHealth.com/twist.1:00:46 Off Duty with Alex and J-Cal!1:04:46 Reacting to the Maul trailer https://www.youtube.com/watch?v=DkVepshZhGcSubscribe to the TWiST500 newsletter: https://ticker.thisweekinstartups.comCheck out the TWIST500: https://www.twist500.comSubscribe to This Week in Startups on Apple: https://rb.gy/v19fcpFollow Lon:X: https://x.com/lonsFollow Alex:X: https://x.com/alexLinkedIn: ⁠https://www.linkedin.com/in/alexwilhelmFollow Jason:X: https://twitter.com/JasonLinkedIn: https://www.linkedin.com/in/jasoncalacanisFollow TWiST:Twitter: https://twitter.com/TWiStartupsYouTube: https://www.youtube.com/thisweekinInstagram: https://www.instagram.com/thisweekinstartupsTikTok: https://www.tiktok.com/@thisweekinstartupsSubstack: https://twistartups.substack.com

Azeem Azhar's Exponential View
Karpathy's autoresearch could make scientists of us all

Azeem Azhar's Exponential View

Play Episode Listen Later Apr 1, 2026 21:02


Welcome to Exponential View, the show where I explore how exponential technologies such as AI are reshaping our future. I've been studying AI and exponential technologies at the frontier for over ten years. Each week, I share some of my analysis or speak with an expert guest to make light of a particular topic. To keep up with the Exponential transition, subscribe to this channel or to my newsletter: https://www.exponentialview.co/ ---- Published in early March 2026, Andrej Karpathy's autoresearch AI tool makes autonomous scientific experimentation cheap and easy — but it was designed to solve machine learning problems. I wanted to see if I could apply its loop architecture to my own work: refining my worldview, testing arguments, solving business problems. In this video, I share how I adapted Karpathy's autoresearch loops for problems that aren't easy to quantify, how to avoid the local minima trap, and the broader impact of these kinds of methods. I covered: (02:11) The Karpathy Loop: what is it and how does it work (07:54) Extending the loop into business and thinking (09:46) The local minima trap (12:20) The escape harness: getting beyond “good enough” (16:05) What I've learned after 30 days (18:47) The loop economy: from doing to judging ---- Where to find me: Exponential View newsletter: https://www.exponentialview.co/ Website: https://www.azeemazhar.com/ LinkedIn: https://www.linkedin.com/in/azeem/ Twitter/X: https://x.com/azeem Production by EPIIPLUS1. Production and research: Baba Films, Chantal Smith, Marija Gavrilov. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Risky Business
Risky Business #830 -- LiteLLM and security scanner supply chains compromised

Risky Business

Play Episode Listen Later Mar 25, 2026 63:53


On this week's show, Patrick Gray, Adam Boileau and James Wilson discuss the week's cybersecurity news. They talk through:
TeamPCP's supply chain attack on GitHub, and they threw in an anti-Iran wiper, because why not?!
Anthropic hooks up its models to just… use your whole computer
After Stryker's Very Bad Day, CISA says maybe add some more controls around your Intune?
Another iOS exploit kit shows up in the cyber bargain-bin
The FCC decides to ban… all new home routers?! U wot m8?!
Supermicro founder was personally sanction-busting Nvidia GPUs into China?!
This week's episode is sponsored by enterprise browser maker, Island. Chief Customer Officer Bradon Rogers joins Pat to explain how its customers are using Island to control the use of personal AI services in regulated industries. This episode is also available on Youtube.
Show notes
‘CanisterWorm' Springs Wiper Attack Targeting Iran
TeamPCP deploys CanisterWorm on NPM following Trivy compromise
Andrej Karpathy on X: "Software horror: litellm PyPI supply chain" attack
Checkmarx KICS GitHub Action Compromised: Malware Injected in All Git Tags
Felix Rieseberg on X: "Today, we're releasing a feature that allows Claude to control your computer"
A Top Google Search Result for Claude Plugins Was Planted by Hackers
Lockheed Martin targeted in alleged breach by pro-Iran hacktivist
CISA urges companies to secure Microsoft Intune systems after hackers mass-wipe Stryker devices
FBI seems to seize website tied to Iranian cyberattack on Stryker
Stryker confirms cyberattack is contained and restoration underway
Hundreds of Millions of iPhones Can Be Hacked With a New Tool Found in the Wild
Someone has publicly leaked an exploit kit that can hack millions of iPhones
Russia-linked hackers use advanced iPhone exploit to target Ukrainians
Apple rolls out first 'background security' update for iPhones, iPads, and Macs to fix Safari bug
Post by @wartranslated.bsky.social - Bluesky
Signal's Creator Is Helping Encrypt Meta AI
Hacker says they compromised millions of confidential police tips held by US company
Millions of 'anonymous' crime tips exposed in massive Crime Stoppers hack
Feds Disrupt IoT Botnets Behind Huge DDoS Attacks
FCC bans import of consumer-grade routers amid national security concerns
White House pours cold water on cyber ‘letters of marque' speculation
Google launches threat disruption unit, stops short of calling it ‘offensive'
Supermicro's cofounder was just arrested for allegedly smuggling $2.5 billion in GPUs to China
Cyberattack on vehicle breathalyzer company leaves drivers stranded across the US
Man pleads guilty to $8 million AI-generated music scheme
Two Israelis AI generated "intelligence" and sold it to Iran

Elon Musk Pod
AI UPDATE: What is Vibe Coding, and how to make money with it

Elon Musk Pod

Play Episode Listen Later Mar 24, 2026 12:19


Vibe coding is a modern software development approach in which users build applications by describing their goals in natural language to AI agents rather than writing code by hand. Coined by Andrej Karpathy, the term represents a shift toward intuitive, prompt-based creation that allows even non-technical users to generate functional prototypes in record time. While tools like Cursor, Replit, and Natively enable rapid innovation and lower the barrier to entry for creators, experts emphasize that this method differs from professional software engineering. Traditional development remains essential for ensuring security, scalability, and deep architectural understanding in complex or high-stakes environments. Consequently, the industry is moving toward a hybrid model that balances the creative speed of "vibes" with the rigorous structure of agentic engineering. This evolution suggests a future where AI handles repetitive implementation while humans transition into the roles of high-level orchestrators and strategic supervisors.
https://wilwaldon.com

Hashtag Trending
Agentic AI, Self-Improving AI Systems, Marcel Builds An Agent Network and Raccoon & Sovereign AI

Hashtag Trending

Play Episode Listen Later Mar 21, 2026 67:59


Agentic AI Goes Enterprise: NVIDIA's NeMo-CLAW, Small Local Models, and a Secure Alternative with Raccoon GPT. (https://raccoongpt.ca) Hashtag Trending would like to thank Meter for their support in bringing you this podcast. Meter delivers a complete networking stack, wired, wireless and cellular in one integrated solution that's built for performance and scale. You can find them at Meter.com/htt On Project Synapse, Jim Love, Marcel G, John Pinard, and guest Tony Kaye from Raccoon GPT  discuss a week of major AI developments, including: Jensen Huang's keynote highlighting NVIDIA's NeMo-CLAW security layer and an enterprise-focused, more secure approach to agent tools like Open CLAW.  They explore why agentic AI matters—moving from chat to automated actions—along with the risks of insecurity and shadow AI. The panel also covers the emergence of smaller models intended to run locally for secure workflows, Andrej Karpathy's auto-research self-learning experiment showing an 11% improvement, and Jim's week-long build of "Nexus," a personal secure agent with memory, journaling, Telegram and web access, voice via Whisper, digests, and a dashboard. Tony explains Raccoon GPT's Canada-based, privacy-first architecture for regulated organizations. 00:00 Show Intro Sponsor 00:18 Meet Tony K 00:57 Lightning Round Setup 01:51 Nvidia Nemo Claw 04:10 Why Agents Matter 05:36 Security And Shadow AI 12:04 Small Local Models 13:35 Open Source Model Shift 19:19 Self Learning Breakthroughs 26:07 Marcel Builds Nexus 34:01 Cutting Telegram Costs 34:58 Nexus Personal Workflow 37:24 Open Source Chaos 40:12 Raccoon Secure AI 46:34 Data Privacy Reality 50:30 Enterprise Agent Foundations 51:38 Models Architecture Costs 56:27 Sovereignty and Control 01:02:02 Agents Remake the Web 01:06:01 Wrap Up and Sponsor

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups
Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups

Play Episode Listen Later Mar 20, 2026 66:31


What happens when AI agents can design experiments, collect data, and improve — without a human in the loop? Andrej Karpathy joins Sarah Guo on the state of models, the future of engineering and education, thinking about impact on jobs, and his project AutoResearch: where agents close the loop on a piece of AI research (experimentation, training, and optimization, autonomously). 00:00 Andrej Karpathy Introduction 02:55 What Capability Limits Remain? 06:15 What Mastery of Coding Agents Looks Like 11:16 Second Order Effects of Natural Language Coding 15:51 Why AutoResearch  22:45 Relevant Skills in the AI Era 28:25 Model Speciation 32:30 Building More Collaboration Surfaces for Humans and AI 37:28 Analysis of Jobs Market Data 48:25 Open vs. Closed Source Models 53:51 Autonomous Robotics 1:00:59 MicroGPT and Agentic Education 1:05:40 Conclusion

The Marketing AI Show
#203: Anthropic vs. Pentagon Round 3, NYT AI vs. Humans Writing Test, Atlassian's AI-Era Layoffs & Grammarly's Expert Cloning Scandal

The Marketing AI Show

Play Episode Listen Later Mar 17, 2026 100:50


Anthropic has filed two federal lawsuits to block the Pentagon's supply chain risk designation and the back-and-forth on X between the Pentagon CTO and AI policy experts is revealing what this fight is really about. Paul and Mike unpack the politics, the implications, and why a deal is inevitable. Then: 86,000 people took the NYT's AI writing quiz and most preferred the machine. Paul shares his human-to-machine writing scale and asks the question that actually matters: not whether AI can write, but when should we let it? Plus Atlassian's 1,600 AI-driven layoffs, Amazon's AI-caused outages, McKinsey's chatbot getting hacked in two hours, and more. Show Notes: Access the show notes and show links here Click here to take this week's AI Pulse. Timestamps: 00:00:00 — Intro 00:03:11 — AI Pulse Survey Results 00:07:48 — Anthropic vs. Pentagon Round 3 00:30:02 — New York Times Releases Controversial "AI Writing Quality" Quiz 00:46:18 — Atlassian Layoffs and Job Loss Dashboard 00:58:49 — Adobe CEO Stepping Down 01:07:14 — Amazon AI-Related Outages and Engineering Struggles 01:14:28 — McKinsey AI Chatbot Hacked 01:19:49 — AI Politics Update 01:24:06 — Grammarly AI "Expert Review" Controversy 01:30:51 — Andrej Karpathy's Autoresearch Agent 01:34:47 — AI Product and Funding Updates This week's episode is sponsored by our 2026 State of AI Report. This year, we're going beyond marketing-specific research to uncover how AI is being adopted and utilized across the organization, and we need your help to create the most comprehensive report yet. It's a quick seven-minute lift. In return, you'll get the full report for free when it drops, plus a chance to win or extend a 12-month SmarterX AI Mastery Membership. Go to smarterx.ai/survey to share your input. That's smarterx.ai/survey  Visit our website Receive our weekly newsletter Join our community: Slack Community LinkedIn Twitter Instagram Facebook YouTube Looking for content and resources? 
Register for a free webinar Come to our next Marketing AI Conference Enroll in our AI Academy 

AI For Humans
AI Can Improve Itself Now. We're Sure That's Fine.

AI For Humans

Play Episode Listen Later Mar 13, 2026 59:21


AI just learned how to make itself smarter. That's not a hypothetical anymore. Recursive self-learning is here, and it's changing everything about how AI develops.   This week on AI For Humans, we break down Andrej Karpathy's new AutoResearch project and what recursive self-improvement actually means for the rest of us. Plus, Anthropic's massive Time magazine profile reveals just how fast Claude is writing its own code, Meta quietly acquired an AI agent social network called MoltBook, Replit drops V4, Perplexity launches computer use, Gemini finally shows up in Google Docs and Maps, Cloudflare does a full 180 on web scraping, Figure's robot cleans an entire living room, and there's a robot horse.    We're sure that's fine.   AI IS IMPROVING ITSELF AND WE'RE JUST SITTING HERE WATCHING.   #ai #artificialintelligence #aiforhumans  Come to our Discord: https://discord.gg/muD2TYgC8f Join our Patreon: https://www.patreon.com/AIForHumansShow AI For Humans Newsletter: https://aiforhumans.beehiiv.com/ Follow us for more on X @AIForHumansShow Join our TikTok @aiforhumansshow To book us for speaking, please visit our website: https://www.aiforhumans.show/ // Show Links // Karpathy's AutoResearch: Recursive Self-Learning https://x.com/karpathy/status/2031135152349524125?s=20 AutoResearch GitHub Repository https://github.com/karpathy/autoresearch Sam Altman on Multi-Day and Multi-Week AI Agent Work https://youtu.be/sTnl8O_BuuE?si=xaWYyqYbVJYzOvYZ HBR: When Using AI Leads to Brain Fry https://hbr.org/2026/03/when-using-ai-leads-to-brain-fry Anthropic's Big Time Magazine Profile: Claude, the Pentagon, and Disruption https://time.com/article/2026/03/11/anthropic-claude-disruptive-company-pentagon/ Claude's Rapid Shipping Pace https://x.com/claudeai/status/2032124273587077133?s=20 Paperclip Open Sourced: AI-Powered Company Management https://x.com/dotta/status/2029239759428780116?s=20 Meta Acquires MoltBook AI Agent Social Network 
https://www.axios.com/2026/03/10/meta-facebook-moltbook-agent-social-network Replit V4 Launch https://x.com/amasad/status/2031755113694679094?s=20 Perplexity Computer Use https://x.com/perplexity_ai/status/2031790180521427166?s=20 Claude Code Makes Videos Now https://x.com/josephdviviano/status/2031196768424132881?s=20 Gavin's Claude Code Video Experiment https://x.com/gavinpurcell/status/2031487595717226955?s=20 Gavin's Claude Code Bio Video https://x.com/gavinpurcell/status/2031620238689898770?s=20 Gemini Comes to Google Docs and More https://x.com/OfficialLoganK/status/2031374503599567113?s=20 Gemini in Google Maps: Ask Maps with Immersive Navigation https://blog.google/products-and-platforms/products/maps/ask-maps-immersive-navigation/ Gemini Embeddings https://x.com/OfficialLoganK/status/2031411916489298156?s=20 Runway Characters https://x.com/runwayml/status/2031028120971571687?s=20 Cloudflare Launches /Crawl So All Sites Can Be Scraped https://x.com/CloudflareDev/status/2031488099725754821?s=20 Figure Robot Does Full Autonomous Living Room Cleanup https://x.com/Figure_robot/status/2031038981333565949?s=20 Deep Robotics Robot Horse https://x.com/DeepRobotics_CN/status/2031910951465992535?s=20 Real-Time Skeletal Visualization with Three.js https://x.com/nick_bisesi/status/2031728629592289591?s=20 Taking Halo ISO and Getting It to Play on Mac https://x.com/JasonBotterill/status/2031855986303254926?s=20 AI Tennis Prediction https://x.com/phosphenq/status/2031400355167117498 Green Code YouTube Channel: AI Explainers https://www.youtube.com/@Green-Code LotR x Pawn Stars AI Video Mashup https://www.reddit.com/r/aivideo/comments/1rqgolw/wrong_universe_lotr_vs_pawn_stars_ai_mashup/  

Tank Talks
The Rundown 3/13/26: Canada's Defence Tech Push, Constellation's AI Test, and the Private Credit Mess

Tank Talks

Play Episode Listen Later Mar 13, 2026 24:40


In this episode of Tank Talks, Matt Cohen and John Ruffolo unpack a volatile moment across software, capital markets, AI, and Canadian industrial policy. The conversation opens with Constellation Software's AI-era challenge, as new president Mark Miller faces investor skepticism around whether legacy vertical market software can maintain its moat in a world increasingly shaped by AI-driven productivity, automation, and code generation.From there, Matt and John examine Salesforce's decision to raise billions in debt to fund share buybacks, questioning whether this is smart balance-sheet engineering or a red flag that large software companies are running out of offensive growth options. The episode then turns to the private credit market, where redemption gates, liquidity pressure, and fears around AI infrastructure lending raise deeper concerns about leverage, accounting, and systemic fragility.Back in Canada, the discussion shifts to the country's defence industrial strategy and why the real opportunity is not just traditional military spending, but dual-use investment across AI, quantum, satellites, aerospace, and strategic infrastructure. The episode closes with a look at Andrej Karpathy's open-source Auto Research project and what it signals about the speed of AI progress, the democratization of research capabilities, and the growing pressure on knowledge workers and software engineers to keep up.If software moats are weakening, private credit is wobbling, and defence dollars are becoming innovation dollars, where will the next real edge come from?Constellation Software, AI Pressure, and the Future of Vertical SaaS (00:43)Matt and John break down Constellation Software's latest numbers, the market's growing skepticism toward legacy software businesses, and the bigger question of whether mission-critical vertical SaaS can stay resilient as AI chips away at traditional moats. 
They explore why trusted workflows and proprietary data still matter, but also why even durable software businesses may face long-term pressure.Salesforce's $25 Billion Debt Bet and What It Really Signals (06:28)Matt and John unpack Salesforce's plan to raise massive debt for share buybacks, debating whether this is efficient capital structure management or a defensive move from a software giant with fewer compelling growth opportunities. The bigger issue is what this says about confidence, capital allocation, and the mood inside mature SaaS companies right now.Private Credit Redemption Gates and the Fear Beneath the Surface (10:49)A wave of redemption limits across major private credit funds becomes the next flashpoint. Matt and John explain why retail money flooded into the asset class, how managers were pushed into riskier lending, and why the underlying concern is no longer just liquidity management, but whether private credit has been pricing equity risk like it was safe debt.Canada's Defense Strategy Is Really a Dual-Use Tech Strategy (16:29)Matt and John shift to Canada's defense industrial strategy and the National Research Council's planned investment, arguing that the real opportunity is in dual-use innovation. Rather than thinking only in terms of tanks and submarines, John reframes defense spending as investment in AI, quantum, satellites, aerospace, and strategic infrastructure that can serve both government and enterprise customers.The AI Catch-Up Panic Is Real (21:26)Matt and John zoom out from markets and policy to the personal reality of AI acceleration. John admits he feels both energized and behind, capturing the exact tension many operators and investors feel as new tools emerge faster than most people can realistically absorb them.Andrej Karpathy Auto Research and the One-GPU Research Lab Moment (22:58)The episode closes with Andrej Karpathy's open-source Auto Research project and why it matters. 
Matt explains how autonomous research loops, overnight experimentation, and low-cost GPU access could dramatically speed up model tuning, product testing, and AI development, making advanced experimentation far more accessible than before.Connect with John Ruffolo on LinkedIn: https://ca.linkedin.com/in/joruffoloConnect with Matt Cohen on LinkedIn: https://ca.linkedin.com/in/matt-cohen1Visit the Ripple Ventures website: https://www.rippleventures.com/ This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit tanktalks.substack.com

Where It Happens
Autoresearch clearly explained (why it matters)

Where It Happens

Play Episode Listen Later Mar 11, 2026 24:21


I break down Andrej Karpathy's new open-source project, Autoresearch: what it is, how it works, and why some of the smartest people in tech are losing their minds over it. I walk through 10 concrete business ideas you can build on top of Autoresearch loops, from niche agent-in-a-box products to always-on A/B testing agencies. I also cover Karpathy's companion launch, Agent Hub, share community reactions, and show you step by step how to get started using Claude Code and a Colab GPU.

I'm hosting a free workshop so you can build your business in the age of AI. Sign up here: https://startup-ideas-pod.link/build-with-ai-2026

Links Mentioned:
Autoresearch GitHub: https://startup-ideas-pod.link/autoresearch

Timestamps
00:00 – Intro
00:45 – How Autoresearch Actually Works
02:40 – Visual Walkthrough of the Autoresearch Loop
03:37 – Mental Model: Your Research Bot That Runs While You Sleep
05:26 – Idea 1: Niche Agent-in-a-Box Products
06:48 – Idea 2: A/B Testing for Marketing (Landing Pages & Ads)
08:45 – Idea 3: Research as a Service
09:43 – Idea 4: Power Tool Inside Your Own SaaS
10:49 – Idea 5: Agency That Runs 100× More Tests
12:05 – Idea 6: Auto Quant for Trading Ideas
13:44 – Idea 7: Always-On Lead Qualification & Follow-Up
14:21 – Idea 8: Finance Ops Autopilot for Businesses
15:09 – Idea 9: Internal Productivity Lab for Your Org
15:53 – Idea 10: Done-for-You Research & Due Diligence Shop
16:41 – Non-business use cases
18:27 – Karpathy's Agent Hub Announcement
19:50 – How to Get Started with Autoresearch
22:21 – Final Thoughts

Key Points
• Autoresearch is an open-source AI agent that sets a goal, runs experiments in a loop on a GPU, keeps the winners, and discards the rest, all while you sleep.
• You need an NVIDIA GPU to run it (tested on H100), but you can rent one cheaply through Lambda Labs, Vast AI, RunPod, Google Cloud, or Google Colab.
• The fastest way to get started is to use Claude Code to walk you through installation, then run it on Google Colab with a T4 GPU runtime.
• Ten business ideas built on Autoresearch span niches like SaaS optimization, A/B testing agencies, trading backtests, CRM lead scoring, and done-for-you due diligence.
• Karpathy also launched Agent Hub, essentially a GitHub designed for agent swarms to collaborate on the same codebase.
• The project already has 25,000+ GitHub stars and is growing fast; early movers who tinker now build an unfair advantage.

The #1 tool to find startup ideas/trends - https://www.ideabrowser.com
LCA helps Fortune 500s and fast-growing startups build their future - from Warner Music to Fortnite to Dropbox. We turn 'what if' into reality with AI, apps, and next-gen products https://latecheckout.agency/
The Vibe Marketer - Resources for people into vibe marketing/marketing with AI: https://www.thevibemarketer.com/

FIND ME ON SOCIAL
X/Twitter: https://twitter.com/gregisenberg
Instagram: https://instagram.com/gregisenberg/
LinkedIn: https://www.linkedin.com/in/gisenberg/
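The loop described in the Key Points (run an experiment, keep the winner, discard the rest) can be illustrated with a minimal Python sketch. This is not Autoresearch's actual code: `run_experiment`, `propose_change`, and `research_loop` are hypothetical names, and the "experiment" is a toy scoring function standing in for a real GPU training run, assumed here only to show the keep-or-discard pattern.

```python
import random


def run_experiment(config):
    """Hypothetical stand-in for a training run; returns a score (lower is better)."""
    # Toy objective: how far the learning rate is from an arbitrary target value.
    return abs(config["lr"] - 0.003)


def propose_change(config, rng):
    """Hypothetical stand-in for the agent's edit: nudge one setting at random."""
    candidate = dict(config)
    candidate["lr"] = max(1e-5, config["lr"] * rng.uniform(0.5, 1.5))
    return candidate


def research_loop(initial, rounds, seed=0):
    """Repeatedly try a change, keep it only if the score improves."""
    rng = random.Random(seed)
    best = initial
    best_score = run_experiment(best)
    history = [best_score]
    for _ in range(rounds):
        candidate = propose_change(best, rng)
        score = run_experiment(candidate)
        if score < best_score:
            # Keep the winner.
            best, best_score = candidate, score
        # Otherwise the candidate is simply discarded.
        history.append(best_score)
    return best, history


if __name__ == "__main__":
    best, history = research_loop({"lr": 0.1}, rounds=50)
    print("best config:", best)
    print("first score:", history[0], "last score:", history[-1])
```

Because a candidate is only kept when it beats the current best, the recorded score can never get worse from one round to the next. In a real setup the expensive part would be the experiment itself, which is why the episode stresses GPU access.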

This Week in Startups
How agents will change banking forever | E2260

Mar 10, 2026 60:38


This Week In Startups is made possible by:
Northwest Registered Agent - northwestregisteredagent.com/twist
Quo - quo.com/TWiST
Gusto - Gusto.com/twist
Plaud - http://Plaud.ai/twist
Athena - https://www.athena.com/jcal

Today's show:
How long until AI models can improve AI models? Once possible, recursive self-improvement by AI technology could accelerate, forever. Thus far, humans (and their coding agents) are still driving AI progress. But a recent project by AI developer extraordinaire Andrej Karpathy, called 'autoresearcher', is turning heads as it shows that it is possible, in certain contexts, to allow AI agents to run successive coding experiments to improve specific elements of LLM performance. Call it an early demonstration of the future.

OpenClaw is exploding in China, while here in the United States, AI is polling somewhere underneath the basement. AI in the United States is about as popular as ICE, which could create a political issue for the technology in the coming elections.

Next? Three demos. First, NetXD's Suresh Ramamurthi showed off how he has built OpenClaw functionality to move money; then Rohan Arun showed off PhoneClaw automation on Android devices from an AR headset; and finally Eugene Stuckless gave us a taste of what Eir is building. Our takeaway? OpenClaw is still boring its way into our digital lives, one new skill or tool at a time!

GUESTS:
Suresh Ramamurthi: https://x.com/sureshr7
Rohan Arun: https://x.com/Viewforge/
Eugene Stuckless: https://x.com/eugene_eir_inc

Timestamps:
0:00 – 'Autoresearcher' and the future of AI improvements
6:52 – Why people around the world are flocking to OpenClaw
7:57 – Plaud - If your work depends on conversations (interviews, meetings, calls), you need a Plaud NotePin. You can check it out at Plaud.ai/twist and use code TWIST for 10% off!
9:46 – Gusto - Check out the online payroll and benefits experts with software built specifically for small business and startups. Try Gusto today and get three months FREE at Gusto.com/twist.
12:57 – The changing American social contract
20:15 – Quo - Quo (formerly OpenPhone) gives you a clean, modern way to handle every customer call, text, and thread all in one place. Try it free at quo.com/TWIST
23:50 – Why China is all-in on AI (and Europe isn't)
26:26 – How to keep your job in the AI era
28:05 – Northwest Registered Agent - Get more when you start your business with Northwest. In 10 clicks and 10 minutes, you can form your company and walk away with a real business identity. Learn more at www.northwestregisteredagent.com/twist
29:38 – Athena - Get $2,000 off your first EA at https://www.athena.com/jcal
34:42 – Demo: Suresh Ramamurthi of NetXD
42:47 – Demo: Rohan Arun of PhoneClaw
47:35 – Why bringing OpenClaw to your smartphone is what's next
49:49 – Demo: Eugene Stuckless of Eir
56:45 – How can we make smarter, more efficient agents?

Subscribe to the TWiST500 newsletter: https://ticker.thisweekinstartups.com
Check out the TWIST500: https://www.twist500.com
Subscribe to This Week in Startups on Apple: https://rb.gy/v19fcp

Follow Lon:
X: https://x.com/lons
Follow Alex:
X: https://x.com/alex
LinkedIn: https://www.linkedin.com/in/alexwilhelm
Follow Jason:
X: https://twitter.com/Jason
LinkedIn: https://www.linkedin.com/in/jasoncalacanis

Great TWIST interviews: Will Guidara, Eoghan McCabe, Steve Huffman, Brian Chesky, Bob Moesta, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarland
Check out Jason's suite of newsletters: https://substack.com/@calacanis

Follow TWiST:
Twitter: https://twitter.com/TWiStartups
YouTube: https://www.youtube.com/thisweekin
Instagram: https://www.instagram.com/thisweekinstartups
TikTok: https://www.tiktok.com/@thisweekinstartups
Substack: https://twistartups.substack.com

The AI Breakdown: Daily Artificial Intelligence News and Discussions
Autoresearch, Agent Loops and the Future of Work

Mar 9, 2026 25:43


Andrej Karpathy released autoresearch this weekend: a system where an AI agent runs experiments to improve a language model overnight, keeping what works and discarding what doesn't, while the human sleeps. The project itself is fascinating, but what's more interesting is what it shares with the Ralph Wiggum coding loop pattern and a broader shift happening across domains, from software to sales to finance, where the human's job becomes writing the strategy document and defining "better," and the agent does the iterating.

Brought to you by:
KPMG - Agentic AI is powering a potential $3 trillion productivity shift, and KPMG's new paper, Agentic AI Untangled, gives leaders a clear framework to decide whether to build, buy, or borrow. Download it at www.kpmg.us/Navigate
Mercury - Modern banking for business and now personal accounts. Learn more at https://mercury.com/personal-banking
Rackspace Technology - Build, test and scale intelligent workloads faster with Rackspace AI Launchpad - http://rackspace.com/ailaunchpad
Blitzy - Want to accelerate enterprise software development velocity by 5x? https://blitzy.com/
Optimizely Agents in Action - Join the virtual event (with me!) free March 4 - https://www.optimizely.com/insights/agents-in-action/
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
LandfallIP - AI to Navigate the Patent Process - https://landfallip.com/
Robots & Pencils - Cloud-native AI solutions that power results - https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.

The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Our Newsletter is BACK: https://aidailybrief.beehiiv.com/
Interested in sponsoring the show? sponsors@aidailybrief.ai

Lenny's Podcast: Product | Growth | Career
Getting paid to vibe code: Inside the new AI-era job | Lazar Jovanovic (Professional Vibe Coder)

Feb 8, 2026 102:30


Lazar Jovanovic is a full-time professional vibe coder at Lovable. His job is to build both internal tools and customer-facing products purely using AI, while not having a coding background. In this conversation, he breaks down the tactics, workflows, and framework that let him ship production-quality products using only AI.

We discuss:
1. Why having no coding background can be an advantage when building with AI
2. Why most of your time should go to planning and chat mode, not prompting
3. What to do when you get stuck: his 4x4 debugging workflow
4. The PRD and Markdown file system that keeps AI agents aligned across complex builds
5. Why kicking off four or five parallel prototypes is the best way to clarify your thinking
6. Why design skills and taste are going to be the most important skills in the future
7. His “genie and three wishes” mental model for making the most of AI's limitations
8. How product, engineering, and design roles are converging, and what that means for your career

Brought to you by:
Strella - The AI-powered customer research platform: https://strella.io/lenny
Samsara - Saving lives with AI built for physical operations: https://samsara.com/lenny
WorkOS - Modern identity platform for B2B SaaS, free up to 1 million MAUs: https://workos.com/lenny

Episode transcript: https://www.lennysnewsletter.com/p/getting-paid-to-vibe-code

Archive of all Lenny's Podcast transcripts: https://www.dropbox.com/scl/fo/yxi4s2w998p1gvtpu4193/AMdNPR8AOw0lMklwtnC0TrQ?rlkey=j06x0nipoti519e0xgm23zsn9&st=ahz0fj11&dl=0

Where to find Lazar Jovanovic:
• X: https://x.com/lakikentaki
• LinkedIn: https://www.linkedin.com/in/lazar-jovanovic
• YouTube: https://www.youtube.com/@50in50challenge
• Starter Story course: https://build.starterstory.com/build/ai-build-accelerator?via=lazar (code LAZAR15 for 15% off)

Where to find Lenny:
• Newsletter: https://www.lennysnewsletter.com
• X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/

In this episode, we cover:
(00:00) Introduction to Lazar and professional vibe coding
(04:53) What a professional vibe coder actually does day-to-day
(09:26) Why non-technical backgrounds can be an advantage
(12:24) The importance of self-awareness
(14:42) His “genie and three wishes” mental model
(17:43) Developing taste and judgment in the age of AI
(21:46) The parallel project approach for better outcomes
(29:30) Creating dynamic context windows with PRDs
(36:56) Why elite vibe coders focus on planning, not coding
(44:43) Creating MD files to guide AI development
(50:57) Why prototyping still matters
(56:50) Why “good enough” is no longer good enough
(01:00:53) The future of engineering in an AI world
(01:05:14) What to do when you get stuck: his 4x4 debugging workflow
(01:14:27) Helping agents learn from their mistakes
(01:15:35) Why watching agent output is more important than code
(01:19:08) The incredible pace of AI development
(01:22:55) Why emotional intelligence will become more valuable
(01:28:30) How to become a professional vibe coder
(01:30:10) Why building in public is the fastest path to opportunities
(01:37:03) Final thoughts on focusing on quality over tech stack

Referenced:
• The new AI growth playbook for 2026: How Lovable hit $200M ARR in one year | Elena Verna (Head of Growth): https://www.lennysnewsletter.com/p/the-new-ai-growth-playbook-for-2026-elena-verna
• Elena Verna on how B2B growth is changing, product-led growth, product-led sales, why you should go freemium not trial, what features to make free, and much more: https://www.lennysnewsletter.com/p/elena-verna-on-why-every-company
• The ultimate guide to product-led sales | Elena Verna: https://www.lennysnewsletter.com/p/the-ultimate-guide-to-product-led
• 10 growth tactics that never work | Elena Verna (Amplitude, Miro, Dropbox, SurveyMonkey): https://www.lennysnewsletter.com/p/10-growth-tactics-that-never-work-elena-verna
• Lovable: https://lovable.dev
• Lovable + Shopify: https://lovable.dev/shopify
• Everyone's an engineer now: Inside v0's mission to create a hundred million builders | Guillermo Rauch (founder and CEO of Vercel, creators of v0 and Next.js): https://www.lennysnewsletter.com/p/everyones-an-engineer-now-guillermo-rauch
• Mobbin: https://mobbin.com
• Dribbble: https://dribbble.com
• 21st.dev: https://21st.dev
• Lovable base prompt generator: https://chatgpt.com/g/g-67e1da2c9c988191b52b61084438e8ee-lovable-base-prompt
• Lovable PRD generator: https://chatgpt.com/g/g-67e1e85fbeac8191a69b95c6d5c42ef6-lovable-prd-generator
• Felix Haas's newsletter: https://designplusai.com
• Bauhaus: https://en.wikipedia.org/wiki/Bauhaus
• Glassmorphism: https://www.figma.com/community/plugin/1197106608665398190/glassmorphism
• UI style guide: http://uistyle.lovable.app
• Cloudflare: https://www.cloudflare.com
• Ben Tossell on X: https://x.com/bentossell
• The rise of Cursor: The $300M ARR AI tool that engineers can't stop using | Michael Truell (co-founder and CEO): https://www.lennysnewsletter.com/p/the-rise-of-cursor-michael-truell
• Peter Thiel says AI will be 'worse' for math nerds than for writers: https://www.businessinsider.com/peter-thiel-ai-worse-for-math-professionals-than-writers-2024-4
• Andrej Karpathy on X: https://x.com/karpathy
• The 100-person AI lab that became Anthropic and Google's secret weapon | Edwin Chen (Surge AI): https://www.lennysnewsletter.com/p/surge-ai-edwin-chen
• Why experts writing AI evals is creating the fastest-growing companies in history | Brendan Foody (CEO of Mercor): https://www.lennysnewsletter.com/p/experts-writing-ai-evals-brendan-foody
• Slumdog Millionaire: https://www.imdb.com/title/tt1010048

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com.

Lenny may be an investor in the companies discussed. To hear more, visit www.lennysnewsletter.com

This Week in Tech (Audio)
TWiT 1065: AI Action Park - DeepSeek's mHC Model Training Breakthrough!

Jan 5, 2026 167:46


Happy New Year! NVIDIA just spent $20 billion to hollow out an AI company for its brains, while Meta and Google scramble to scoop up fresh talent before AI gets "too weird to manage." Who's winning, who's left behind, and what do these backroom deals mean for the future of artificial intelligence?

Andrej Karpathy admits programmers cannot keep pace with AI advances
Economic uncertainty in AI despite massive stock market influence
Google, Anthropic, and Microsoft drive AI productization for business and consumers
OpenAI, Claude, and Gemini battle for consumer AI dominance
Journalism struggles to keep up with AI realities and misinformation tools
Concerns mount over AI energy, water, and environmental impact narratives
Meta buys Manus, expands AI agent ambitions with Llama model
OpenAI posts high-stress "Head of Preparedness" job worth $555K+
Training breakthroughs: DeepSeek's mHC and comparisons to Action Park
U.S. lawmakers push broad, controversial internet censorship bills
Age verification and bans spark state laws, VPN workaround explosion
U.S. drone ban labeled protectionist as industry faces tech shortages
FCC security initiatives falter; Cyber Trust Mark program scrapped
Waymo robotaxis stall in blackouts, raising AV urban planning issues
School cellphone bans expose kids' struggle with analog clocks
MetroCard era ends in NYC as tap-to-pay takes over subway access
RAM, VRAM, and GPU prices soar as AI and gaming squeeze supply
CES preview: Samsung QD-OLED TV, Sony AFEELA car, gadget show hype
Remembering Stewart Cheifet and Computer Chronicles' legacy

Host: Leo Laporte
Guests: Dan Patterson and Joey de Villa

Download or subscribe to This Week in Tech at https://twit.tv/shows/this-week-in-tech

Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free audio and video feeds, a members-only Discord, and exclusive content. Join today: https://twit.tv/clubtwit

Sponsors:
zscaler.com/security
canary.tools/twit - use code: TWIT
monarch.com with code TWIT
Melissa.com/twit
redis.io