The Blueprint Show - Unlocking the Future of E-commerce with AI

Summary: In this episode of Seller Sessions, Danny McMillan and Andrew Joseph Bell explore the intersection of AI and e-commerce, with a focus on Amazon's technological advancements. They examine Amazon science papers versus patents, discuss challenges with large language models, and highlight the importance of semantic intent in product recommendations. The conversation traces the evolution from keyword optimization to understanding customer purchase intentions, showcasing how AI tools like Rufus are transforming the shopping experience. The hosts provide practical strategies for sellers to optimize listings and harness AI for improved product visibility and sales.

Key Takeaways:
Amazon science papers predict future e-commerce trends.
AI integration is accelerating in Amazon's ecosystem.
Understanding semantic intent is crucial for product recommendations.
The shift from keywords to purchase intentions is significant.
Rufus enhances the shopping experience with AI planning capabilities.
Sellers should focus on customer motivations in their listings.
Creating compelling product content is essential for visibility.
Custom GPTs can optimize product listings effectively.
Inference pathways help align products with customer goals.
Asking the right questions is key to leveraging AI effectively.

Sound Bites:
"Understanding semantic intent is crucial."
"You can bend AI to your will."
"Asking the right questions opens doors."

Chapters:
00:00 Introduction to Seller Sessions and New Season
00:33 Exploring Amazon Science Papers vs. Patents
01:27 Understanding Rufus and AI in E-commerce
02:52 Challenges in Large Language Models and Product Recommendations
07:09 Research Contributions and Implications for Sellers
10:31 Strategies for Leveraging AI in Product Listings
12:42 The Future of Shopping with AI and Amazon's Innovations
16:14 Practical Examples: Using AI for Product Optimization
22:29 Building Tools for Enhanced E-commerce Experiences
25:38 Product Naming and Features Exploration
27:44 Understanding Inference Pathways in Product Descriptions
30:36 Building Tools for AI Prompting and Automation
38:58 Bending AI to Your Will: Creativity and Imagination
48:10 Practical Applications of AI in Business Automation
Alex Edmans, a professor of finance at London Business School, tells us how to avoid the Ladder of Misinference by examining how narratives, statistics, and articles can mislead, especially when they align with our preconceived notions and confirm what we believe is true, assume is true, and wish were true.

Links:
Alex Edmans: May Contain Lies
What to Test in a Post-Trust World
How Minds Change
David McRaney's Twitter
David McRaney's BlueSky
YANSS Twitter
YANSS Facebook
Newsletter
Kitted
Patreon
In this conversation, Jay Goldberg and Austin Lyons discuss Nvidia's recent earnings report, the future of AI and inference, and the dynamics of the AI market, including the impact of China on Nvidia's revenue. They explore the differences between consumer and enterprise workloads, the role of financing in AI server sales, and the challenges of realizing ROI from AI investments. The discussion also touches on real-world applications of AI in business and the future of AI integration in consumer products.
China may have been the big headline out of Nvidia's quarter, with 28 mentions on the earnings call, but right behind was inference at 27. Inference, the stage where trained models generate responses, represents the next wave of AI and could unlock a major new growth engine for Nvidia.
Build and run your AI apps and agents at scale with Azure. Orchestrate multi-agent apps and high-scale inference solutions using open-source and proprietary models, no infrastructure management needed. With Azure, connect frameworks like Semantic Kernel to models from DeepSeek, Llama, OpenAI's GPT-4o, and Sora, without provisioning GPUs or writing complex scheduling logic. Just submit your prompt and assets, and the models do the rest. Using Azure's Model as a Service, access cutting-edge models, including brand-new releases like DeepSeek R1 and Sora, as managed APIs with autoscaling and built-in security. Whether you're handling bursts of demand, fine-tuning models, or provisioning compute, Azure provides the capacity, efficiency, and flexibility you need. With industry-leading AI silicon, including H100s, GB200s, and advanced cooling, your solutions can run with the same power and scale behind ChatGPT. Mark Russinovich, Azure CTO, Deputy CISO, and Microsoft Technical Fellow, joins Jeremy Chapman to share how Azure's latest AI advancements and orchestration capabilities unlock new possibilities for developers. ► QUICK LINKS: 00:00 - Build and run AI apps and agents in Azure 00:26 - Narrated video generation example with multi-agentic, multi-model app 03:17 - Model as a Service in Azure 04:02 - Scale and performance 04:55 - Enterprise grade security 05:17 - Latest AI silicon available on Azure 06:29 - Inference at scale 07:27 - Everyday AI and agentic solutions 08:36 - Provisioned Throughput 10:55 - Fractional GPU Allocation 12:13 - What's next for Azure? 12:44 - Wrap up ► Link References For more information, check out https://aka.ms/AzureAI ► Unfamiliar with Microsoft Mechanics? As Microsoft's official video series for IT, you can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft. 
• Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries
• Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog
• Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast
► Keep getting this insider knowledge, join us on social:
• Follow us on Twitter: https://twitter.com/MSFTMechanics
• Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/
• Enjoy us on Instagram: https://www.instagram.com/msftmechanics/
• Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics
In this episode of The Shoulder Physio Podcast, Dr Jared Powell sits down with Dr Mervyn Travers, physiotherapist, S&C coach, and researcher, to explore one of the most compelling frameworks in contemporary pain science: active inference. They discuss how this predictive brain model helps explain persistent musculoskeletal pain, why traditional exercise-based interventions might miss the mark, and how clinicians can use movement and context to shift a patient's pain experience. Merv blends philosophy, neuroscience, and clinical pragmatism in a way that's accessible, challenging, and highly relevant.

Key talking points:
What is active inference and how does it relate to predictive processing?
The role of prior beliefs, culture, and clinical language in shaping pain
Movement experimentation as a tool for model updating and recovery
Why it's time to rethink how we prescribe exercise in pain rehab
Clinical implications from landmark studies within the field that lend themselves to active inference
A call for compassion, curiosity, and nuance in patient care

Check out the Shoulder Physio Online Course here

Connect with Jared and guests:
Jared on Instagram: @shoulder_physio
Jared on X: @jaredpowell12
Merv website: Home - Optimise Rehab
Merv X: @mervtravers
Merv Instagram: @optimise_rehab

See our Disclaimer here: The Shoulder Physio - Disclaimer
Ephesians Series: Ephesians 4:17-The Contents of Ephesians 4:17 Presents an Inference from the Contents of Ephesians 4:7-16-Lesson # 251
New episode with my good friends Sholto Douglas & Trenton Bricken. Sholto focuses on scaling RL and Trenton researches mechanistic interpretability, both at Anthropic. We talk through what's changed in the last year of AI research; the new RL regime and how far it can scale; how to trace a model's thoughts; and how countries, workers, and students should prepare for AGI. See you next year for v3. Here's last year's episode, btw. Enjoy!

Watch on YouTube; listen on Apple Podcasts or Spotify.

SPONSORS
* WorkOS ensures that AI companies like OpenAI and Anthropic don't have to spend engineering time building enterprise features like access controls or SSO. It's not that they don't need these features; it's just that WorkOS gives them battle-tested APIs that they can use for auth, provisioning, and more. Start building today at workos.com.
* Scale is building the infrastructure for safer, smarter AI. Scale's Data Foundry gives major AI labs access to high-quality data to fuel post-training, while their public leaderboards help assess model capabilities. They also just released Scale Evaluation, a new tool that diagnoses model limitations. If you're an AI researcher or engineer, learn how Scale can help you push the frontier at scale.com/dwarkesh.
* Lighthouse is THE fastest immigration solution for the technology industry. They specialize in expert visas like the O-1A and EB-1A, and they've already helped companies like Cursor, Notion, and Replit navigate U.S. immigration. Explore which visa is right for you at lighthousehq.com/ref/Dwarkesh.

To sponsor a future episode, visit dwarkesh.com/advertise.

TIMESTAMPS
(00:00:00) – How far can RL scale?
(00:16:27) – Is continual learning a key bottleneck?
(00:31:59) – Model self-awareness
(00:50:32) – Taste and slop
(01:00:51) – How soon to fully autonomous agents?
(01:15:17) – Neuralese
(01:18:55) – Inference compute will bottleneck AGI
(01:23:01) – DeepSeek algorithmic improvements
(01:37:42) – Why are LLMs 'baby AGI' but not AlphaZero?
(01:45:38) – Mech interp
(01:56:15) – How countries should prepare for AGI
(02:10:26) – Automating white collar work
(02:15:35) – Advice for students

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe
In this episode, Carrie explores whether inference-based cognitive behavioral therapy (ICBT) is a good fit for individuals struggling with OCD, especially those who haven't found success with exposure and response prevention (ERP).

Episode Highlights:
The key differences between ERP and ICBT, and why ICBT may be a better fit for certain individuals with OCD.
How ICBT helps unpack the reasoning behind obsessions rather than just managing behaviors.
Why ICBT can be especially valuable for Christians seeking faith-sensitive OCD treatment.
The limitations and challenges of ERP, including dropout rates and religious exposure concerns.
What it takes to succeed with ICBT, including a willingness to deeply engage with the learning and healing process.

Join the waitlist for the Christians Learning ICBT training: https://carriebock.com/training/
Explore Carrie's services and courses: carriebock.com/services/ and carriebock.com/resources/
Follow us on Instagram: www.instagram.com/christianfaithandocd/ and like our Facebook page: https://www.facebook.com/christianfaithandocd for the latest updates and sneak peeks.
Welcome to this week's episode of The Mixtape with Scott. Today's podcast guest is our 127th guest on the show: Vitor Possebom, Assistant Professor in the Department of Economics at the Fundação Getulio Vargas. Vitor's research sits at the intersection of two areas: econometrics and causal inference, and policy evaluation in Latin America, particularly Brazil. His contributions revolve around refining and extending tools for estimating causal effects in observational data, especially under common data imperfections like selection bias, measurement error, and treatment effect heterogeneity.

* Sample selection and marginal treatment effects (e.g., “Identifying Marginal Treatment Effects in the Presence of Sample Selection” (Journal of Econometrics), “Crime and Mismeasured Punishment” (Review of Economics and Statistics))
* Misclassification and measurement error (e.g., “Potato Potahto in the FAO-GAEZ Productivity Measures?”)
* Inference and sensitivity in synthetic control methods (e.g., “Cherry Picking with Synthetic Controls”, “Synthetic Control Method: Inference, Sensitivity Analysis and Confidence Sets”)
* Probability of causation in non-experimental settings (e.g., “Probability of Causation with Sample Selection”)

I invited Vitor onto the podcast because of his creative contributions to causal inference, as he fits into a larger informal series I've been running for the last several years on causal inference in general. In today's conversation, we talk about Vitor's path from Brazil to Yale University and then back. Vitor, thank you so much for joining us.

Scott's Mixtape Substack is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. Get full access to Scott's Mixtape Substack at causalinf.substack.com/subscribe
An airhacks.fm conversation with Juan Fumero (@snatverk) about: tornadovm as a Java parallel framework for accelerating data parallelization on GPUs and other hardware, first GPU experiences with ELSA Winner and Voodoo cards, explanation of TornadoVM as a plugin to existing JDKs that uses Graal as a library, TornadoVM's programming model with @parallel and @reduce annotations for parallelizable code, introduction of kernel API for lower-level GPU programming, TornadoVM's ability to dynamically reconfigure and select the best hardware for workloads, implementation of LLM inference acceleration with TornadoVM, challenges in accelerating Llama models on GPUs, introduction of tensor types in TornadoVM to support FP8 and FP16 operations, shared buffer capabilities for GPU memory management, comparison of Java Vector API performance versus GPU acceleration, discussion of model quantization as a potential use case for TornadoVM, exploration of Deep Java Library (DJL) and its ND array implementation, potential standardization of tensor types in Java, integration possibilities with Project Babylon and its Code Reflection capabilities, TornadoVM's execution plans and task graphs for defining accelerated workloads, ability to run on multiple GPUs with different backends simultaneously, potential enterprise applications for LLMs in Java including model distillation for domain-specific models, discussion of Foreign Function & Memory API integration in TornadoVM, performance comparison between different GPU backends like OpenCL and CUDA, collaboration with Intel Level Zero oneAPI and integrated graphics support, future plans for RISC-V support in TornadoVM Juan Fumero on twitter: @snatverk
Agentic AI is equally as daunting as it is dynamic. So… how do you not screw it up? After all, the more robust and complex agentic AI becomes, the more room there is for error. Luckily, we've got Dr. Maryam Ashoori to guide our agentic ways. Maryam is the Senior Director of Product Management of watsonx at IBM. She joined us at IBM Think 2025 to break down agentic AI done right.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Have a question? Join the convo here.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
Agentic AI Benefits for Enterprises
watsonx's New Features & Announcements
AI-Powered Enterprise Solutions at IBM
Responsible Implementation of Agentic AI
LLMs in Enterprise Cost Optimization
Deployment and Scalability Enhancements
AI's Impact on Developer Productivity
Problem-Solving with Agentic AI

Timestamps:
00:00 AI Agents: A Business Imperative
06:14 "Optimizing Enterprise Agent Strategy"
09:15 Enterprise Leaders' AI Mindset Shift
09:58 Focus on Problem-Solving with Technology
13:34 "Boost Business with LLMs"
16:48 "Understanding and Managing AI Risks"

Keywords: Agentic AI, AI agents, Agent lifecycle, LLMs taking actions, watsonx.ai, Product management, IBM Think conference, Business leaders, Enterprise productivity, watsonx platform, Custom AI solutions, Environmental Intelligence Suite, Granite Code models, AI-powered code assistant, Customer challenges, Responsible AI implementation, Transparency and traceability, Observability, Optimization, Larger compute, Cost performance optimization, Chain of thought reasoning, Inference time scaling, Deployment service, Scalability of enterprise, Access control, Security requirements, Non-technical users, AI-assisted coding, Developer time-saving, Function calling, Tool calling, Enterprise data integration, Solving enterprise problems, Responsible implementation, Human in the loop, Automation, IBM savings, Risk assessment, Empowering workforce.
“We want every layer — chip, system, software — because when you own the stack you can outrun a GPU cluster by 40-70x,” Cerebras CEO Andrew Feldman says. In this episode of Tech Disruptors, Cerebras returns to the Bloomberg Intelligence podcast studios as Feldman joins Bloomberg Intelligence's Kunjan Sobhani and Mandeep Singh to explain the progress from “biggest chip” to “fastest inference cloud.” Feldman unpacks the WSE-3 upgrade, six new data-center builds and fresh Meta and IBM deals that aim to deliver sub-second answers at a fraction of GPU cost, plus Feldman's views on scaling laws, synthetic data and the looming power crunch.
In the Tech.eu podcast, Fractile founder Walter Goodwin discusses Fractile's AI inference chips, which he claims can run LLMs faster and more energy-efficiently than Nvidia's GPUs.
What if your LLM could think ahead, preparing answers before questions are even asked? In this week's paper read, we dive into a groundbreaking new paper from researchers at Letta introducing sleep-time compute: a novel technique that lets models do their heavy lifting offline, well before the user query arrives. By predicting likely questions and precomputing key reasoning steps, sleep-time compute dramatically reduces test-time latency and cost without sacrificing performance.

We explore new benchmarks (Stateful GSM-Symbolic, Stateful AIME, and the multi-query extension of GSM) that show up to 5x lower compute at inference, 2.5x lower cost per query, and up to 18% higher accuracy when scaled. You'll also see how this method applies to realistic agent use cases and what makes it most effective. If you care about LLM efficiency, scalability, or cutting-edge research, this episode is for you.

Explore more AI research, or sign up to hear the next session live: arize.com/ai-research-papers
Learn more about AI observability and evaluation, join the Arize AI Slack community, or get the latest on LinkedIn and X.
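The core idea can be sketched in a few lines: an offline "sleep" phase predicts likely queries about a context and caches precomputed reasoning, so the online phase answers cache hits cheaply and only falls back to full test-time compute on a miss. This is a minimal illustration, not the Letta implementation; every function name here is hypothetical, and the expensive LLM calls are stubbed out.

```python
# Sketch of sleep-time compute as an offline cache of precomputed answers.
# All names are illustrative; a real system would back these with LLM calls.

def predict_likely_queries(context: str) -> list[str]:
    # Offline: guess the questions a user is likely to ask about this context.
    return [f"summarize {context}", f"key numbers in {context}"]

def expensive_reasoning(context: str, query: str) -> str:
    # Stand-in for costly test-time inference (e.g. long chain-of-thought).
    return f"answer({query})"

class SleepTimeAgent:
    def __init__(self, context: str):
        self.context = context
        self.cache: dict[str, str] = {}

    def sleep(self) -> None:
        # Offline phase: precompute reasoning before any user query arrives.
        for q in predict_likely_queries(self.context):
            self.cache[q] = expensive_reasoning(self.context, q)

    def answer(self, query: str) -> tuple[str, bool]:
        # Online phase: serve from cache when the prediction was right
        # (low latency, low cost); otherwise pay full test-time compute.
        if query in self.cache:
            return self.cache[query], True
        return expensive_reasoning(self.context, query), False

agent = SleepTimeAgent("Q3 report")
agent.sleep()                                  # done before the user shows up
ans, hit = agent.answer("summarize Q3 report") # cache hit: near-zero latency
```

The benchmark gains in the paper come from exactly this trade: compute spent while the agent is idle amortizes across the queries it correctly anticipates.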
In this episode of AI Basics, Jason sits down with Amin Vahdat, VP of ML at Google Cloud, to unpack the mind-blowing infrastructure behind modern AI. They dive into how Google's TPUs power massive queries, why 2025 is the “Year of Inference,” and how startups can now build what once felt impossible. From real-time agents to exponential speed gains, this is a look inside the AI engine that's rewriting the future.

Timestamps:
(0:00) Jason introduces today's guest Amin Vahdat
(3:18) Data movement implications for founders and historical bandwidth perspective
(5:29) The shift to inference and AI infrastructure trends in startups and enterprises
(8:40) Evolution of productivity and potential of low-code/no-code development
(11:20) AI infrastructure pricing, cost efficiency, and historical innovation
(17:53) Google's TPU technology and infrastructure scale
(23:21) Building AI agents for startup evaluation and supervised associate agents
(26:08) Documenting decisions for AI learning and early AI agent development

Uncover more valuable insights from AI leaders in Google Cloud's 'Future of AI: Perspectives for Startups' report. Discover what 23 AI industry leaders think about the future of AI—and how it impacts your business. Read their perspectives here: https://goo.gle/futureofai

Check out all of the Startup Basics episodes here: https://thisweekinstartups.com/basics
Check out Google Cloud: https://cloud.google.com/

Follow Amin:
LinkedIn: https://www.linkedin.com/in/vahdat/

Follow Jason:
X: https://twitter.com/Jason
LinkedIn: https://www.linkedin.com/in/jasoncalacanis

Follow TWiST:
Twitter: https://twitter.com/TWiStartups
YouTube: https://www.youtube.com/thisweekin
Instagram: https://www.instagram.com/thisweekinstartups
TikTok: https://www.tiktok.com/@thisweekinstartups
Substack: https://twistartups.substack.com
This episode is sponsored by the DFINITY Foundation. DFINITY Foundation's mission is to develop and contribute technology that enables the Internet Computer (ICP) blockchain and its ecosystem, aiming to shift cloud computing into a fully decentralized state. Find out more at https://internetcomputer.org/

In this episode of Eye on AI, we sit down with Sid Sheth, CEO and Co-Founder of d-Matrix, to explore how his company is revolutionizing AI inference hardware and taking on industry giants like NVIDIA. Sid shares his journey from building multi-billion-dollar businesses in semiconductors to founding d-Matrix, a startup focused on generative AI inference, chiplet-based architecture, and ultra-low-latency AI acceleration.

We break down:
Why the future of AI lies in inference, not training
How d-Matrix's Corsair PCIe accelerator outperforms NVIDIA's H200
The role of in-memory compute and high-bandwidth memory in next-gen AI chips
How d-Matrix integrates seamlessly into hyperscaler and enterprise cloud environments
Why AI infrastructure is becoming heterogeneous and what that means for developers
The global outlook on inference chips, from the US to APAC and beyond
How Sid plans to build the next NVIDIA-level company from the ground up

Whether you're building in AI infrastructure, investing in semiconductors, or just curious about the future of generative AI at scale, this episode is packed with value.

Stay Updated:
Craig Smith on X: https://x.com/craigss
Eye on A.I. on X: https://x.com/EyeOn_AI

(00:00) Intro
(02:46) Introducing Sid Sheth
(05:27) Why He Started d-Matrix
(07:28) Lessons from Building a $2.5B Chip Business
(11:52) How d-Matrix Prototypes New Chips
(15:06) Working with Hyperscalers Like Google & Amazon
(17:27) What's Inside the Corsair AI Accelerator
(21:12) How d-Matrix Beats NVIDIA on Chip Efficiency
(24:10) The Memory Bandwidth Advantage Explained
(26:27) Running Massive AI Models at High Speed
(30:20) Why Inference Isn't One-Size-Fits-All
(32:40) The Future of AI Hardware
(36:28) Supporting Llama 3 and Other Open Models
(40:16) Is the Inference Market Big Enough?
(43:21) Why the US Is Still the Key Market
(46:39) Can India Compete in the AI Chip Race?
(49:09) Will China Catch Up on AI Hardware?
In this special episode of Sidecar Sync, we dive into the future of AI infrastructure with Ian Andrews, Chief Revenue Officer at Groq (that's Groq with a Q!). Ian shares the story behind Groq's rise, how their LPU chip challenges Nvidia's dominance, and why fast, low-cost, high-quality inference is about to unlock entirely new categories of AI-powered applications. We talk about the human side of prompting, the evolving skillset needed to work with large language models, and what agents and reasoning models mean for the future of knowledge work. Plus, Ian shares how Groq uses AI internally, including an incredible story about an AI-generated RFP audit that caught things humans missed. Tune in for practical insights, forward-looking trends, and plenty of laughs along the way.
Nick Kirby and Trace Fowler break down the Cincinnati Reds' Wednesday night loss to the Seattle Mariners. They analyze Nick Martinez's lackluster start, Elly De La Cruz's defensive miscues, and the contentious batter interference call on Austin Hays late in the game, among other key moments. Nick also recaps the Reds' minor league action and previews Thursday's pitching matchup between Brady Singer and Emerson Hancock. Today's Episode on YouTube: https://www.youtube.com/watch?v=t4ihlaIz8J8&t=3032s DSC Commodities: https://deepsouthcommodities.com/ CALL OR TEXT 988 FOR HELP DAY OR NIGHT: https://mantherapy.org/get-help/national-resources/164/lifeline-crisis-chat OTHER CHATTERBOX PROGRAMMING: Off The Bench: https://otbthombrennaman.podbean.com/ Chatterbox Bengals: https://podcasts.apple.com/us/podcast/chatterbox-bengals-a-cincinnati-bengals-nfl-podcast/id1652732141 Chatterbox Bearcats: https://chatterboxbearcats.podbean.com/ Dialed In with Thom Brennaman: https://www.youtube.com/playlist?list=PLjPJjEFaBD7VLxmcTTWV0ubHu_cSFdEDU Chatterbox Man on the Street: https://www.youtube.com/watch?v=3Ye-HjJdmmQ&list=PLjPJjEFaBD7V0GOh595LyjumA0bZaqwh9&pp=iAQB
Today we have Hassan back on the show. Hassan was one of our first guests for Huddle when he was working at Vercel, but since then, he's joined Together AI, one of the hottest companies in the world. They just raised a massive Series B round. Hassan joins us to talk about Together AI, inference optimization, and building AI applications. We touch on a bunch of topics like customer uses of AI, best practices for building apps, and what's next for Together AI.

Timestamps:
01:42 Opportunity at Together AI
04:26 Together raised a big round
06:06 Vision Behind Together AI
08:32 Problems in running Open Source Models
11:40 Speed For Inference
14:24 Fine Tuning
19:23 One or Two Models or a Combination of them
21:32 Serverless
22:21 Cold Start issues?
27:46 How much data do you need?
30:00 Balancing Reliability and Cost
34:07 How customers are using Together
42:36 Agent Recipes
47:03 Typical Mistakes building AI apps
In this riveting episode of the OCD Whisperer podcast, host Kristina Orlova sits down with Mike Parker, a licensed clinical social worker and the creator of the popular YouTube channel OCD Space. Together, they embark on a deep dive into the world of OCD and the transformative power of inference-based cognitive behavioral therapy (ICBT). But what happens when doubt becomes the driving force behind every thought? And how can someone trapped in the cycle of obsessional doubt ever learn to trust their own mind again? Mike Parker pulls back the curtain on the insidious nature of "obsessional doubt," a phenomenon that leaves individuals questioning their every thought, memory, and perception. Why do those with OCD feel compelled to seek reassurance over and over, even when they know it offers only fleeting relief? And how does this relentless doubt keep them locked in a prison of their own mind? As the conversation deepens, Kristina and Mike explore the critical differences between ICBT and exposure and response prevention (ERP). But here's the burning question: Can understanding the origin of obsessive thoughts be the key to breaking free from their grip? Mike sheds light on how inferential confusion and obsessional doubt drive OCD. This episode is a masterclass in navigating the labyrinth of OCD treatment. Will listeners walk away with a newfound understanding of how to confront their doubts? Or will the complexities of the human mind leave them questioning everything they thought they knew? Tune in to uncover the answers, and perhaps, a path to freedom.

In This Episode:
[00:02] Introduction to the episode
[00:56] Understanding ICBT
[02:00] Obsessional doubt explained
[02:21] Differentiating ICBT from ERP
[03:36] The nature of obsessional doubt
[05:58] Reassurance-seeking behavior
[09:25] Understanding internal evidence
[11:27] The role of self-knowledge
[13:31] General facts vs. personal context
[14:49] Handling real mistakes
[16:40] Exploring early memories
[17:46] Understanding obsessional doubt
[19:22] Childhood influences on OCD
[20:28] Clarifying ICBT vs. psychodynamic therapy
[21:44] Focus of inference-based CBT
[22:41] Cognitive distortions in OCD
[25:34] Re-evaluating daily routines
[27:06] Timeframe for progress in treatment
[29:22] Complicating factors in OCD treatment

Notable Quotes:
[00:02:42] "Obsessional doubt is a core process identified in OCD when you're doing I-CBT. It's a thought process where someone with OCD knows something but doesn't trust themselves enough to stick with what they know, leading them to question, dismiss, and seek more information than they have." - Michael Parker
[00:18:26] "We can start to see how long the client has been telling themselves an obsessional story about themselves... It was all logged in there and then all put together, but if we go back, we can see this actually never meant you should be locked into never-ending doubt." - Michael Parker
[00:23:39] "I-CBT is primarily a cognitive therapy... The focus really is figuring out why you reject information, why you don't trust it... Let's figure out why you doubted." - Michael Parker

Our Guest:
Mike Parker, LCSW, is a licensed clinical social worker and private practice therapist based in Pittsburgh, Pennsylvania. He specializes in treating obsessive-compulsive disorder (OCD) using cognitive behavioral therapy (CBT) and inference-based cognitive therapy (I-CBT). As the host of the OCD Space YouTube channel, Mike is dedicated to educating individuals and mental health professionals on effective OCD treatment approaches. He is passionate about helping clients understand and overcome obsessional doubt while also training fellow therapists in evidence-based interventions. With a focus on empowering individuals to trust themselves and break free from the cycle of compulsions, Mike continues to be a leading voice in the OCD treatment community.
Resources & Links:
Kristina Orlova, LMFT: Instagram, YouTube, OCD CBT Journal Tracker and Planner, Website
Mike Parker: Website, LinkedIn, YouTube, Cognitive Therapy for OCD

Disclaimer: Please note, while our host is a licensed marriage and family therapist specializing in OCD and anxiety disorders in the state of California, this podcast is for educational purposes only and should not be considered a substitute for therapy.

Stay tuned for bi-weekly episodes filled with valuable insights and tips for managing OCD and anxiety. And remember, keep going in the meantime. See you in the next episode!
If you're in SF: Join us for the Claude Plays Pokemon hackathon this Sunday! If you're not: Fill out the 2025 State of AI Eng survey for $250 in Amazon cards!

We are SO excited to share our conversation with Dharmesh Shah, co-founder of HubSpot and creator of Agent.ai.

A particularly compelling concept we discussed is the idea of "hybrid teams" - the next evolution in workplace organization where human workers collaborate with AI agents as team members. Just as we previously saw hybrid teams emerge in terms of full-time vs. contract workers, or in-office vs. remote workers, Dharmesh predicts that the next frontier will be teams composed of both human and AI members. This raises interesting questions about team dynamics, trust, and how to effectively delegate tasks between human and AI team members.

The discussion of business models in AI reveals an important distinction between Work as a Service (WaaS) and Results as a Service (RaaS), something Dharmesh has written extensively about. While RaaS has gained popularity, particularly in customer support applications where outcomes are easily measurable, Dharmesh argues that this model may be over-indexed. Not all AI applications have clearly definable outcomes or consistent economic value per transaction, making WaaS more appropriate in many cases. This insight is particularly relevant for businesses considering how to monetize AI capabilities.

The technical challenges of implementing effective agent systems are also explored, particularly around memory and authentication. Shah emphasizes the importance of cross-agent memory sharing and the need for more granular control over data access. He envisions a future where users can selectively share parts of their data with different agents, similar to how OAuth works but with much finer control.
This points to significant opportunities in developing infrastructure for secure and efficient agent-to-agent communication and data sharing.

Other highlights from our conversation:

* The Evolution of AI-Powered Agents – Exploring how AI agents have evolved from simple chatbots to sophisticated multi-agent systems, and the role of MCP in enabling that.
* Hybrid Digital Teams and the Future of Work – How AI agents are becoming teammates rather than just tools, and what this means for business operations and knowledge work.
* Memory in AI Agents – The importance of persistent memory in AI systems and how shared memory across agents could enhance collaboration and efficiency.
* Business Models for AI Agents – Exploring the shift from software as a service (SaaS) to work as a service (WaaS) and results as a service (RaaS), and what this means for monetization.
* The Role of Standards Like MCP – Why MCP has been widely adopted and how it enables agent collaboration, tool use, and discovery.
* The Future of AI Code Generation and Software Engineering – How AI-assisted coding is changing the role of software engineers and what skills will matter most in the future.
* Domain Investing and Efficient Markets – Dharmesh's approach to domain investing and how inefficiencies in digital asset markets create business opportunities.
* The Philosophy of Saying No – Lessons from "Sorry, You Must Pass" and how prioritization leads to greater productivity and focus.

Timestamps

* 00:00 Introduction and Guest Welcome
* 02:29 Dharmesh Shah's Journey into AI
* 05:22 Defining AI Agents
* 06:45 The Evolution and Future of AI Agents
* 13:53 Graph Theory and Knowledge Representation
* 20:02 Engineering Practices and Overengineering
* 25:57 The Role of Junior Engineers in the AI Era
* 28:20 Multi-Agent Systems and MCP Standards
* 35:55 LinkedIn's Legal Battles and Data Scraping
* 37:32 The Future of AI and Hybrid Teams
* 39:19 Building Agent.ai: A Professional Network for Agents
* 40:43 Challenges and Innovations in Agent AI
* 45:02 The Evolution of UI in AI Systems
* 01:00:25 Business Models: Work as a Service vs. Results as a Service
* 01:09:17 The Future Value of Engineers
* 01:09:51 Exploring the Role of Agents
* 01:10:28 The Importance of Memory in AI
* 01:11:02 Challenges and Opportunities in AI Memory
* 01:12:41 Selective Memory and Privacy Concerns
* 01:13:27 The Evolution of AI Tools and Platforms
* 01:18:23 Domain Names and AI Projects
* 01:32:08 Balancing Work and Personal Life
* 01:35:52 Final Thoughts and Reflections

Transcript

Alessio [00:00:04]: Hey everyone, welcome back to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.swyx [00:00:12]: Hello, and today we're super excited to have Dharmesh Shah join us. I guess your relevant title here is founder of Agent.ai.Dharmesh [00:00:20]: Yeah, that's true for this. Yeah, creator of Agent.ai and co-founder of HubSpot.swyx [00:00:25]: Co-founder of HubSpot, which I followed for many years, I think 18 years now, gonna be 19 soon. And you caught, you know, people can catch up on your HubSpot story elsewhere. I should also thank Shaan Puri, who I've chatted with back and forth, who's been, I guess, getting me in touch with your people. But also, I think like, just giving us a lot of context, because obviously, My First Million joined you guys, and they've been chatting with you guys a lot. So for the business side, we can talk about that, but I kind of wanted to engage your CTO, agent, engineer side of things. So how did you get agent religion?Dharmesh [00:01:00]: Let's see. So I've been working, I'll take like a half step back, a decade or so ago, even though actually more than that. So even before HubSpot, the company I was contemplating, that I had a name for, was called Ingenisoft. And the idea behind Ingenisoft was a natural language interface to business software. Now realize this is 20 years ago, so that was a hard thing to do.
But the actual use case that I had in mind was, you know, we had data sitting in business systems like a CRM or something like that. And my kind of what I thought clever at the time. Oh, what if we used email as the kind of interface to get to business software? And the motivation for using email is that it automatically works when you're offline. So imagine I'm getting on a plane or I'm on a plane. There was no internet on planes back then. It's like, oh, I'm going through business cards from an event I went to. I can just type things into an email just to have them all in the backlog. When it reconnects, it sends those emails to a processor that basically kind of parses effectively the commands and updates the software, sends you the file, whatever it is. And there was a handful of commands. I was a little bit ahead of the times in terms of what was actually possible. And I reattempted this natural language thing with a product called ChatSpot that I did back 20...swyx [00:02:12]: Yeah, this is your first post-ChatGPT project.Dharmesh [00:02:14]: I saw it come out. Yeah. And so I've always been kind of fascinated by this natural language interface to software. Because, you know, as software developers, myself included, we've always said, oh, we build intuitive, easy-to-use applications. And it's not intuitive at all, right? Because what we're doing is... We're taking the mental model that's in our head of what we're trying to accomplish with said piece of software and translating that into a series of touches and swipes and clicks and things like that. And there's nothing natural or intuitive about it. And so natural language interfaces, for the first time, you know, whatever the thought is you have in your head and expressed in whatever language that you normally use to talk to yourself in your head, you can just sort of emit that and have software do something. And I thought that was kind of a breakthrough, which it has been. And it's gone. 
So that's where I first started getting into the journey. I started because now it actually works, right? So once we got ChatGPT and you can take, even with a few-shot example, convert something into structured, even back in the GPT-3.5 days, it did a decent job in a few-shot example, convert something to structured text if you knew what kinds of intents you were going to have. And so that happened. And that ultimately became a HubSpot project. But then agents intrigued me because I'm like, okay, well, that's the next step here. So chat's great. Love Chat UX. But if we want to do something even more meaningful, it felt like the next kind of advancement is not this kind of, I'm chatting with some software in a kind of a synchronous back and forth model, is that software is going to do things for me in kind of a multi-step way to try and accomplish some goals. So, yeah, that's when I first got started. It's like, okay, what would that look like? Yeah. And I've been obsessed ever since, by the way.Alessio [00:03:55]: Which goes back to your first experience with it, which is like you're offline. Yeah. And you want to do a task. You don't need to do it right now. You just want to queue it up for somebody to do it for you. Yes. As you think about agents, like, let's start at the easy question, which is like, how do you define an agent? Maybe. You mean the hardest question in the universe? Is that what you mean?
But in my mind, if you kind of fast forward months, I guess, in AI years, the implementation of it, and we're already starting to see this, and we'll talk about this, different kinds of agents, right? So I think in addition to having a usable definition, and I like yours, by the way, and we should talk more about that, that you just came out with, the classification of agents actually is also useful, which is, is it autonomous or non-autonomous? Does it have a deterministic workflow? Does it have a non-deterministic workflow? Is it working synchronously? Is it working asynchronously? Then you have the different kind of interaction modes. Is it a chat agent, kind of like a customer support agent would be? You're having this kind of back and forth. Is it a workflow agent that just does a discrete number of steps? So there's all these different flavors of agents. So if I were to draw it in a Venn diagram, I would draw a big circle that says, this is agents, and then I have a bunch of circles, some overlapping, because they're not mutually exclusive. And so I think that's what's interesting, and we're seeing development along a bunch of different paths, right? So if you look at the first implementation of agent frameworks, you look at Baby AGI and AutoGPT, I think it was, not AutoGen, that's the Microsoft one. They were way ahead of their time because they assumed this level of reasoning and execution and planning capability that just did not exist, right? So it was an interesting thought experiment, which is what it was. Even the guy that, I'm an investor in Yohei's fund that did Baby AGI. It wasn't ready, but it was a sign of what was to come. And so the question then is, when is it ready? And so lots of people talk about the state of the art when it comes to agents. I'm a pragmatist, so I think of the state of the practical.
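The classification axes Dharmesh just listed (autonomy, determinism, synchrony, interaction mode) can be jotted down as a small taxonomy sketch. The type names here are illustrative, not from any shipping framework:

```python
from dataclasses import dataclass
from enum import Enum

class Mode(Enum):
    CHAT = "chat"          # back-and-forth, e.g. a customer support agent
    WORKFLOW = "workflow"  # a discrete number of steps

@dataclass(frozen=True)
class AgentProfile:
    autonomous: bool      # acts without a human in the loop?
    deterministic: bool   # fixed workflow vs. open-ended planning
    synchronous: bool     # interactive vs. fire-and-forget
    mode: Mode

# Two "circles in the Venn diagram": same agent umbrella, different flavors.
support_bot = AgentProfile(autonomous=False, deterministic=True,
                           synchronous=True, mode=Mode.CHAT)
researcher = AgentProfile(autonomous=True, deterministic=False,
                          synchronous=False, mode=Mode.WORKFLOW)

print(support_bot != researcher)  # True: distinct points in the taxonomy
```

Because the axes are independent, the profiles are overlapping categories rather than mutually exclusive buckets, which matches the Venn-diagram framing above.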
It's like, okay, well, what can I actually build that has commercial value or solves actually some discrete problem with some baseline of repeatability or verifiability?swyx [00:06:22]: There was a lot, and very, very interesting. I'm not irritated by it at all. Okay. As you know, I take a... There's a lot of anthropological view or linguistics view. And in linguistics, you don't want to be prescriptive. You want to be descriptive. Yeah. So you're a goals guy. That's the key word in your thing. And other people have other definitions that might involve like delegated trust or non-deterministic work, LLM in the loop, all that stuff. The other thing I was thinking about, just the comment on Baby AGI, AutoGPT. Yeah. In that piece that you just read, I was able to go through our backlog and just kind of track the winter of agents and then the summer now. Yeah. And it's... We can tell the whole story as an oral history, just following that thread. And it's really just like, I think, I tried to explain the why now, right? Like I had, there's better models, of course. There's better tool use with like, they're just more reliable. Yep. Better tools with MCP and all that stuff. And I'm sure you have opinions on that too. Business model shift, which you like a lot. I just heard you talk about RaaS with the MFM guys. Yep. Cost is dropping a lot. Yep. Inference is getting faster. There's more model diversity. Yep. Yep. I think it's a subtle point. It means that like, you have different models with different perspectives. You don't get stuck in the basin of performance of a single model. Sure. You can just get out of it by just switching models. Yep. Multi-agent research and RL fine tuning. So I just wanted to let you respond to like any of that.Dharmesh [00:07:44]: Yeah. A couple of things. Connecting the dots on the kind of the definition side of it. So we'll get the irritation out of the way completely. I have one more, even more irritating leap on the agent definition thing.
So here's the way I think about it. By the way, the kind of word agent, I looked it up, like the English dictionary definition. The old school agent, yeah. Is when you have someone or something that does something on your behalf, like a travel agent or a real estate agent acts on your behalf. It's like proxy, which is a nice kind of general definition. So the other direction I'm sort of headed, and it's going to tie back to tool calling and MCP and things like that, is if you, and I'm not a biologist by any stretch of the imagination, but we have these single-celled organisms, right? Like the simplest possible form of what one would call life. But it's still life. It just happens to be single-celled. And then you can combine cells and then cells become specialized over time. And you have much more sophisticated organisms, you know, kind of further down the spectrum. In my mind, at the most fundamental level, you can almost think of having atomic agents. What is the simplest possible thing that's an agent that can still be called an agent? What is the equivalent of a kind of single-celled organism? And the reason I think that's useful is right now we're headed down the road, which I think is very exciting around tool use, right? That says, okay, the LLMs now can be provided a set of tools that it calls to accomplish whatever it needs to accomplish in the kind of furtherance of whatever goal it's trying to get done. And I'm not overly bothered by it, but if you think about it, if you just squint a little bit and say, well, what if everything was an agent? And what if tools were actually just atomic agents? Because then it's turtles all the way down, right? Then it's like, oh, well, all that's really happening with tool use is that we have a network of agents that know about each other through something like MCP and can kind of decompose a particular problem and say, oh, I'm going to delegate this to this set of agents.
And why do we need to draw this distinction between tools, which are functions most of the time? And an actual agent. And so I'm going to write this irritating LinkedIn post, you know, proposing this. It's like, okay. And I'm not suggesting we should call even functions, you know, call them agents. But there is a certain amount of elegance that happens when you say, oh, we can just reduce it down to one primitive, which is an agent that you can combine in complicated ways to kind of raise the level of abstraction and accomplish higher order goals. Anyway, that's my answer. I'd say that's a success. Thank you for coming to my TED Talk on agent definitions.Alessio [00:09:54]: How do you define the minimum viable agent? Do you already have a definition for, like, where you draw the line between a cell and an atom? Yeah.Dharmesh [00:10:02]: So in my mind, it has to, at some level, use AI in order for it to—otherwise, it's just software. It's like, you know, we don't need another word for that. And so that's probably where I draw the line. So then the question, you know, the counterargument would be, well, if that's true, then lots of tools themselves are actually not agents because they're just doing a database call or a REST API call or whatever it is they're doing. And that does not necessarily qualify them, which is a fair counterargument. And I accept that. It's like a good argument. I still like to think about—because we'll talk about multi-agent systems, because I think—so we've accepted, which I think is true, lots of people have said it, and you've hopefully combined some of those clips of really smart people saying this is the year of agents, and I completely agree, it is the year of agents. But then shortly after that, it's going to be the year of multi-agent systems or multi-agent networks. I think that's where it's going to be headed next year. Yeah.swyx [00:10:54]: OpenAI's already on that. Yeah. My quick philosophical engagement with you on this.
I often think about kind of the other spectrum, the other end of the cell spectrum. So single cell is life, multi-cell is life, and you clump a bunch of cells together in a more complex organism, they become organs, like an eye and a liver or whatever. And then obviously we consider ourselves one life form. There's not like a lot of lives within me. I'm just one life. And now, obviously, I don't think people don't really like to anthropomorphize agents and AI. Yeah. But we are extending our consciousness and our brain and our functionality out into machines. I just saw you were a Bee. Yeah. Which is, you know, it's nice. I have a limitless pendant in my pocket.Dharmesh [00:11:37]: I got one of these boys. Yeah.swyx [00:11:39]: I'm testing it all out. You know, got to be early adopters. But like, we want to extend our personal memory into these things so that we can be good at the things that we're good at. And, you know, machines are good at it. Machines are there. So like, my definition of life is kind of like going outside of my own body now. I don't know if you've ever had like reflections on that. Like how yours. How our self is like actually being distributed outside of you. Yeah.Dharmesh [00:12:01]: I don't fancy myself a philosopher. But you went there. So yeah, I did go there. I'm fascinated by kind of graphs and graph theory and networks and have been for a long, long time. And to me, we're sort of all nodes in this kind of larger thing. It just so happens that we're looking at individual kind of life forms as they exist right now. But so the idea is when you put a podcast out there, there's these little kind of nodes you're putting out there of like, you know, conceptual ideas. Once again, you have varying kind of forms of those little nodes that are up there and are connected in varying and sundry ways. And so I just think of myself as being a node in a massive, massive network. And I'm producing more nodes as I put content or ideas. 
And, you know, you spend some portion of your life collecting dots, experiences, people, and some portion of your life then connecting dots from the ones that you've collected over time. And I found that really interesting things happen and you really can't know in advance how those dots are necessarily going to connect in the future. And that's, yeah. So that's my philosophical take. That's the, yes, exactly. Coming back.Alessio [00:13:04]: Yep. Do you like graph as an agent? Abstraction? That's been one of the hot topics with LangGraph and Pydantic and all that.Dharmesh [00:13:11]: I do. The thing I'm more interested in terms of use of graphs, and there's lots of work happening on that now, is graph data stores as an alternative in terms of knowledge stores and knowledge graphs. Yeah. Because, you know, so I've been in software now 30 plus years, right? So it's not 10,000 hours. It's like 100,000 hours that I've spent doing this stuff. And so I grew up with, back in the day, you know, I started on mainframes. There was a product called IMS from IBM, which is basically an index database, what we'd call like a key value store today. Then we've had relational databases, right? We have tables and columns and foreign key relationships. We all know that. We have document databases like MongoDB, which is sort of a nested structure keyed by a specific index. We have vector stores, vector embedding database. And graphs are interesting for a couple of reasons. One is, so it's not classically structured in a relational way. When you say structured database, to most people, they're thinking tables and columns and in relational database and set theory and all that. Graphs still have structure, but it's not the tables and columns structure. And you could wonder, and people have made this case, that they are a better representation of knowledge for LLMs and for AI generally than other things.
So that's kind of thing number one conceptually, and that might be true, I think is possibly true. And the other thing that I really like about that in the context of, you know, I've been in the context of data stores for RAG is, you know, RAG, you say, oh, I have a million documents, I'm going to build the vector embeddings, I'm going to come back with the top X based on the semantic match, and that's fine. All that's very, very useful. But the reality is something gets lost in the chunking process and the, okay, well, those tend, you know, like, you don't really get the whole picture, so to speak, and maybe not even the right set of dimensions on the kind of broader picture. And it makes intuitive sense to me that if we did capture it properly in a graph form, that maybe that feeding into a RAG pipeline will actually yield better results for some use cases, I don't know, but yeah.Alessio [00:15:03]: And do you feel like at the core of it, there's this difference between imperative and declarative programs? Because if you think about HubSpot, it's like, you know, people and graph kind of goes hand in hand, you know, but I think maybe the software before was more like primary foreign key based relationship, versus now the models can traverse through the graph more easily.Dharmesh [00:15:22]: Yes. So I like that representation. There's something. It's just conceptually elegant about graphs and just from the representation of it, they're much more discoverable, you can kind of see it, there's observability to it, versus kind of embeddings, which you can't really do much with as a human. You know, once they're in there, you can't pull stuff back out. But yeah, I like that kind of idea of it. And the other thing that's kind of, because I love graphs, I've been long obsessed with PageRank from back in the early days. And, you know, one of the kind of simplest algorithms in terms of coming up, you know, with a phone, everyone's been exposed to PageRank. 
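PageRank itself fits in a few lines of Python, and the same power-iteration loop applies to any graph where you get to define what an "authority" edge means (a toy sketch, not production code):

```python
def pagerank(graph, damping=0.85, iters=50):
    """graph: node -> list of nodes it links to (i.e. endorses)."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n, outs in graph.items():
            if not outs:                 # dangling node: spread rank evenly
                for m in nodes:
                    new[m] += damping * rank[n] / len(nodes)
            else:                        # split rank among outgoing links
                for m in outs:
                    new[m] += damping * rank[n] / len(outs)
        rank = new
    return rank

# "Authority" here = being cited by other contributors' chunks.
g = {"alice": ["carol"], "bob": ["carol"], "carol": ["alice"]}
ranks = pagerank(g)
print(max(ranks, key=ranks.get))  # carol: the most-endorsed contributor
```

Swap the edge definition (citations, chunk contributions, popularity) and the same loop gives you a way to rank nodes in an arbitrary knowledge graph, which is the idea discussed next.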
And the idea is that, and so I had this other idea for a project, not a company, and I have hundreds of these, called NodeRank, is to be able to take the idea of PageRank and apply it to an arbitrary graph that says, okay, I'm going to define what authority looks like and say, okay, well, that's interesting to me, because then if you say, I'm going to take my knowledge store, and maybe this person that contributed some number of chunks to the graph data store has more authority on this particular use case or prompt that's being submitted than this other one, or maybe this one was more popular, or maybe this one has, whatever it is, there should be a way for us to kind of rank nodes in a graph and sort them in some useful way. Yeah.swyx [00:16:34]: So I think that's generally useful for anything. I think the problem, like, so even though at my conferences, GraphRAG is super popular and people are getting knowledge graph religion, and I will say like, it's getting traction in two areas, conversation memory, and then also just RAG in general, like the document data. Yeah. It's like a source. Most ML practitioners would say that knowledge graph is kind of like a dirty word. The graph database, people get graph religion, everything's a graph, and then they go really hard into it and then they get a graph that is too complex to navigate. Yes. And so like the simple way to put it is like you at running HubSpot, you know, the power of graphs, the way that Google has pitched them for many years, but I don't suspect that HubSpot itself uses a knowledge graph. No. Yeah.Dharmesh [00:17:26]: So when is it over engineering? Basically? It's a great question. I don't know. So the question now, like in AI land, right, is the, do we necessarily need to understand? So right now, LLMs for the most part are somewhat black boxes, right?
We sort of understand how the, you know, the algorithm itself works, but we really don't know what's going on in there and, and how things come out. So if a graph data store is able to produce the outcomes we want, it's like, here's a set of queries I want to be able to submit and then it comes out with useful content. Maybe the underlying data store is as opaque as a vector embeddings or something like that, but maybe it's fine. Maybe we don't necessarily need to understand it to get utility out of it. And so maybe if it's messy, that's okay. Um, that's, it's just another form of lossy compression. Uh, it's just lossy in a way that we just don't completely understand in terms of, because it's going to grow organically. Uh, and it's not structured. It's like, ah, we're just gonna throw a bunch of stuff in there. Let the, the equivalent of the embedding algorithm, whatever they called in graph land. Um, so the one with the best results wins. I think so. Yeah.swyx [00:18:26]: Or is this the practical side of me is like, yeah, it's, if it's useful, we don't necessarilyDharmesh [00:18:30]: need to understand it.swyx [00:18:30]: I have, I mean, I'm happy to push back as long as you want. Uh, it's not practical to evaluate like the 10 different options out there because it takes time. It takes people, it takes, you know, resources, right? Set. That's the first thing. Second thing is your evals are typically on small things and some things only work at scale. Yup. Like graphs. Yup.Dharmesh [00:18:46]: Yup. That's, yeah, no, that's fair. And I think this is one of the challenges in terms of implementation of graph databases is that the most common approach that I've seen developers do, I've done it myself, is that, oh, I've got a Postgres database or a MySQL or whatever. I can represent a graph with a very set of tables with a parent child thing or whatever. And that sort of gives me the ability, uh, why would I need anything more than that? 
And the answer is, well, if you don't need anything more than that, you don't need anything more than that. But there's a high chance that you're sort of missing out on the actual value that, uh, the graph representation gives you. Which is the ability to traverse the graph, uh, efficiently in ways that kind of going through the, uh, traversal in a relational database form, even though structurally you have the data, practically you're not gonna be able to pull it out in, in useful ways. Uh, so you wouldn't like represent a social graph, uh, in, in using that kind of relational table model. It just wouldn't scale. It wouldn't work.swyx [00:19:36]: Uh, yeah. Uh, I think we want to move on to MCP. Yeah. But I just want to, like, just engineering advice. Yeah. Uh, obviously you've, you've, you've run, uh, you've, you've had to do a lot of projects and run a lot of teams. Do you have a general rule for over-engineering or, you know, engineering ahead of time? You know, like, because people, we know premature engineering is the root of all evil. Yep. But also sometimes you just have to. Yep. When do you do it? Yes.Dharmesh [00:19:59]: It's a great question. This is, uh, a question as old as time almost, which is what's the right and wrong levels of abstraction. That's effectively what, uh, we're answering when we're trying to do engineering. I tend to be a pragmatist, right? So here's the thing. Um, lots of times doing something the right way. Yeah. It's like a marginal increased cost in those cases. Just do it the right way. And this is what makes a, uh, a great engineer or a good engineer better than, uh, a not so great one. It's like, okay, all things being equal. If it's going to take you, you know, roughly close to constant time anyway, might as well do it the right way. Like, so do things well, then the question is, okay, well, am I building a framework as the reusable library? 
To what degree, uh, what am I anticipating in terms of what's going to need to change in this thing? Uh, you know, along what dimension? And then I think like a business person in some ways, like what's the return on calories, right? So, uh, and you look at, um, the expected value. It's like, okay, here are the five possible things that could happen, uh, try to assign probabilities. Like, okay, well, if there's a 50% chance that we're going to go down this particular path some day, like, or one of these five things is going to happen, and it costs you 10% more to engineer for that, it's basically something that yields a kind of interest, compounding value, um, as you get closer to the time of needing that, versus having to take on debt, which is when you under-engineer it. You're taking on debt you're going to have to pay off when you do get to that eventuality where something happens. One thing as a pragmatist, uh, so I would rather under-engineer something than over-engineer it, if I were going to err on the side of something. And here's the reason: when you under-engineer it, uh, yes, you take on tech debt, uh, but the interest rate is relatively known and payoff is very, very possible, right? Which is, oh, I took a shortcut here, as a result of which now this thing that should have taken me a week is now going to take me four weeks. Fine. But if that particular thing that you thought might happen never actually, you never have that use case transpire or it just doesn't, it's like, well, you just saved yourself time, right? And that has value because you were able to do other things instead of, uh, kind of slightly over-engineering it away, over-engineering it. But there are no perfect answers in this art form, uh, and yeah, we'll bring kind of this layers-of-abstraction question back in the code generation conversation, which we'll, uh, I think have later on, but Alessio [00:22:05]: I was going to ask, we can just jump ahead quickly.
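The "return on calories" heuristic above can be made concrete with toy numbers; the probabilities and costs below are illustrative assumptions, not figures from the conversation:

```python
def engineering_ev(p_needed, extra_cost_now, debt_payoff_cost):
    """
    Expected-cost comparison for one anticipated requirement.
    p_needed:         chance the anticipated path ever happens
    extra_cost_now:   extra effort to engineer for it today (paid regardless)
    debt_payoff_cost: effort to retrofit later (paid only if it happens)
    """
    over_engineer = extra_cost_now
    under_engineer = p_needed * debt_payoff_cost
    return over_engineer, under_engineer

# 50% chance the path happens; +10% cost now vs. a 1-week-becomes-4-weeks
# retrofit (modeled here as 30% of the project's effort).
now, later = engineering_ev(p_needed=0.5, extra_cost_now=0.10,
                            debt_payoff_cost=0.30)
print(now, later)  # 0.1 0.15: under these numbers, engineering ahead wins
```

Flip the inputs (a rarer eventuality, or a cheaper retrofit as AI refactoring gets better) and under-engineering wins instead, which is exactly the pragmatist's bet described above: the debt's interest rate is known, and sometimes you never have to pay it at all.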
Yeah. Like, as you think about vibe coding and all that, how does the, yeah, percentage of potential usefulness change? When I feel like we're over-engineering, a lot of times it's like the investment in syntax; it's less about the investment in, like, architecting. Yep. Yeah. How does that change your calculus?Dharmesh [00:22:22]: A couple of things, right? One is, um, so, you know, going back to that kind of ROI or return-on-calories kind of calculus or heuristic you think through, it's like, okay, well, what is it going to cost me to put this layer of abstraction above the code that I'm writing now, uh, in anticipating kind of future needs? If the cost of fixing, uh, or redoing the under-engineering right now, uh, will trend towards zero, that says, okay, well, I don't have to get it right right now, because even if I get it wrong, I'll run the thing for six hours instead of 60 minutes or whatever. It doesn't really matter, right? Like, because that's going to trend towards zero: the ability to refactor code. Um, and because, not that long from now, we're going to have, you know, large code bases be able to exist, uh, you know, as context, uh, for a code generation or a code refactoring, uh, model. So I think it's going to make the case for under-engineering, uh, even stronger. Which is why I take on that cost. You just pay the interest when you get there. It's not, um, just go on with your life, vibe code it, and, uh, come back when you need to. Yeah.Alessio [00:23:18]: Sometimes I feel like there's no decision-making in some things. Like, uh, today I built an autosave for, like, our internal notes platform, and I literally just asked Cursor: can you add autosave? Yeah. I don't know if it's over- or under-engineered. Yep. I just vibe coded it. Yep.
And I feel like at some point we're going to get to the point where the models kindDharmesh [00:23:36]: of decide where the right line is, but this is where the, like, in my mind, the danger is, right? So there's two sides to this. One is the cost of kind of development and coding and things like that, stuff that, you know, we talk about. But then, like in your example, you know, one of the risks that we have is that because adding a feature, uh, like a save or whatever the feature might be, to a product, as that price tends towards zero, are we going to be less discriminating about what features we add, as a result making products more complicated, which has a negative impact on the user and a negative impact on the business? Um, and so that's the thing I worry about: if it starts to become too easy, are we going to be too promiscuous in our, uh, kind of adding product extensions and things like that? It's like, ah, why not add X, Y, Z or whatever? Back then it was like, oh, we only have so many engineering hours or story points or however you measure things. Uh, that at least kept us in check a little bit. Yeah.Alessio [00:24:22]: And then over-engineering, you're like, yeah, it's kind of like you're putting that on yourself. Yeah. Like now it's like the models don't understand that if they add too much complexity, it's going to come back to bite them later. Yep. So they just do whatever they want to do. Yeah. And I'm curious where in the workflow that's going to be, where it's like, hey, this is like the amount of complexity and over-engineering you can do before you've got to ask me if we should actually do it versus, like, do something else.Dharmesh [00:24:45]: So, you know, we've already, let's, like, we're living this, uh, in the code generation world: this kind of compressed, um, cycle time, right?
It's like, okay, we went from auto-complete in GitHub Copilot — finish this particular thing and hit tab — to: oh, I sort of know your file, I can write out a full function for you; to: now I can hold a bunch of the context in my head, so we can do app generation, which we have now with Lovable and Bolt and Replit and other things. So then the question is: okay, where does it naturally go from here? We're going to generate products. Makes sense. We might be able to generate platforms: I want a platform for ERP that does this, whatever. And that includes the APIs, includes the product and the UI, and all the things that make for a platform. There's nothing that says we would stop there. Okay, can you generate an entire software company someday? Right? With the platform and the monetization and the go-to-market and the whatever. And that's interesting to me, when you take it to almost ludicrous levels of abstraction.
swyx [00:25:39]: It's like, okay, turn it to 11. You mentioned vibe coding, so I have to ask — this is a blog post I haven't written, but I'm kind of exploring it — is the junior engineer dead?
Dharmesh [00:25:49]: I don't think so. I think what will happen is that if all they're bringing to the table is the fact that they are a junior engineer, then yes, they're likely dead. But hopefully, if they can communicate with carbon-based life forms, if they can interact with product, if they're willing to talk to customers, they can take their basic understanding of engineering and how software works — I think that has value. So I have a 14-year-old right now who's taking a Python programming class, and some people ask me: why is he learning coding? And my answer is: because it's not about the syntax, it's not about the coding.
What he's learning is the fundamental thing of how things work. And there's value in that. I think there's going to be timeless value in systems thinking and abstractions and what that means. And whether functions manifest as math, which he's going to get exposed to regardless — there are some core primitives to the universe, I think, and the more you understand them, those are what I would think of as really large dots in your life that will have a higher gravitational pull and value, that you'll then be able to connect. So I want him to collect those dots, and he's not resisting. So it's like, okay, while he's still listening to me, I'm going to have him do things that I think will be useful.
swyx [00:26:59]: You know, part of one of the pitches that I evaluated for "AI engineer" as a term is that maybe the traditional interview path or career path of software engineer goes away — because what's the point of LeetCode? Yeah. And it actually matters more that you know how to work with AI and to implement the things that you want. Yep.
Dharmesh [00:27:16]: That's one of the interesting things that's happened with generative AI. You go from machine learning and the models and that underlying form, which is like true engineering, right? The actual, what I call real engineering. I don't think of myself as a real engineer, actually. I'm a developer. But now with generative AI — we call it AI, and it's obviously got its roots in machine learning, but it just feels fundamentally different to me. Like, you have the vibe. It's like, okay, well, this is just a whole different approach to software development, to so many different things.
And so I'm wondering now — an AI engineer is, if you were to draw the Venn diagram, it's interesting: the cross between AI things, generative AI and what the tools are capable of, what the models do, and this whole new body of knowledge that we're still building out, which is still very young, intersected with classic software engineering. Yeah.
swyx [00:28:04]: I just describe the overlap as: it separates out eventually until it's its own thing, but it's starting out as software. Yeah.
Alessio [00:28:11]: That makes sense. So to close the vibe coding loop, the other big hype now is MCPs. Obviously, I would say Claude Desktop and Cursor are the two main drivers of MCP usage. My favorite is the Sentry MCP. I can pull in errors and then just put the context in Cursor. How do you think about that abstraction layer? Does it feel almost too magical in a way? Because you don't really see how the server itself is then kind of repackaging the information for you.
Dharmesh [00:28:41]: I think MCP as a standard is one of the better things that's happened in the world of AI, because a standard needed to exist, and absent a standard, there was a set of things that just weren't possible. Now, we can argue whether it's the best possible manifestation of a standard or not. Does it do too much? Does it do too little? I get that, but it's just simple enough to be both useful and unobtrusive. It's understandable and adoptable by mere mortals, right? It's not overly complicated. A reasonable engineer can stand up an MCP server relatively easily. The thing that has me excited about it is — so I'm a big believer in multi-agent systems. And that goes back to this idea of an atomic agent.
So imagine the MCP server — obviously it calls tools, but the way I think about it — so I'm working on my current passion project, agent.ai. And we'll talk more about that in a little bit. I think we should, because — not to promote the project at all, but there's some interesting ideas in there. One of which is: if agents are going to collaborate and be able to delegate, there's going to need to be some form of discovery, and we're going to need some standard way. It's like, okay, I just need to know what this thing over here is capable of. We're going to need a registry, which Anthropic's working on. I'm sure others will, and have been doing directories, and there's going to be a standard around that too: how do you build out a directory of MCP servers? I think that's going to unlock so many things, and we're already starting to see it. So I think MCP, or something like it, is going to be the next major unlock, because it allows systems that don't know about each other — and don't need to — it's that kind of decoupling of, like, Sentry and whatever tools someone else was building. And it's not just about Claude Desktop or things like that. Even on the client side, I think we're going to see very interesting consumers of MCP — MCP clients — versus just the chatbot kind of things, like Claude Desktop and Cursor. But yeah, I'm very excited about MCP in that general direction.
swyx [00:30:39]: I think the typical cynical developer take is: we have OpenAPI. Yeah. What's the new thing? Do you have a quick MCP-versus-everything-else?
Dharmesh [00:30:49]: So I like OpenAPI, right? It's just a descriptive thing. It's OpenAPI. OpenAPI. Yes, that's what I meant. It's basically a self-documenting thing. We can machine-generate lots of things from that output.
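The discovery problem described here — one agent needing to know what another is capable of — can be sketched as a tiny capability registry. This is an illustrative toy, not the MCP registry Anthropic is working on; every name in it (`AgentCard`, `register`, `discover`) is invented for the sketch.

```python
# Toy capability registry for agent discovery. Illustrative only:
# the real MCP registry spec is still evolving, and these names
# (AgentCard, Registry, discover) are hypothetical.

from dataclasses import dataclass, field

@dataclass
class AgentCard:
    name: str
    description: str
    capabilities: list = field(default_factory=list)

class Registry:
    def __init__(self):
        self._agents = {}

    def register(self, card: AgentCard):
        self._agents[card.name] = card

    def discover(self, capability: str):
        # Return every agent that advertises the requested capability.
        return [c.name for c in self._agents.values()
                if capability in c.capabilities]

registry = Registry()
registry.register(AgentCard("domain-valuer", "Estimates domain prices",
                            ["domain-valuation"]))
registry.register(AgentCard("brand-namer", "Suggests startup names",
                            ["naming", "domain-valuation"]))

print(registry.discover("domain-valuation"))
```

A real directory would add authentication, versioning, and a standard schema for the cards; the point of the sketch is only the decoupling: the caller never imports the other agent's code, it only queries for a capability.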
It's a structured definition of an API. I get that, love it. But MCP is kind of use-case specific. It's perfect for exactly what we're trying to use it for around LLMs, in terms of discovery. It's like, okay, I don't necessarily need to know all this detail. And right now we have — we'll talk more about MCP server implementations. We will? I think — I don't know, maybe we won't. At least it's in my head; it's like a background process. But I do think MCP adds value above OpenAPI, just because it solves this particular thing. And if we had come to the world, which we have, saying, hey, we already have OpenAPI — if that were good enough for the universe, the universe would have adopted it already. There's a reason why MCP is taking off: it marginally adds something that was missing before and doesn't go too far. And that's why the rate of adoption — you folks have written about this and talked about it. Yeah, why MCP won. Yeah. And it won because the universe decided that this was useful. Maybe it gets supplanted by something else. Yeah. And maybe we discover, oh, maybe OpenAPI was good enough the whole time. I doubt that.
swyx [00:32:09]: The meta lesson — I mean, he's an investor in DevTools companies, I work in developer experience and DevRel in DevTools companies. Yep. Everyone wants to own the standard. Yeah. I'm sure you guys have tried to launch your own standards. Actually, is HubSpot known for a standard? You know, obviously inbound marketing. But is there a standard or protocol that you ever tried to push? No.
Dharmesh [00:32:30]: And there's a reason for this. Yeah. And I don't mean to speak for the people of HubSpot, but I personally — You kind of do. — I'm not smart enough. That's not — like, I think I have a — You're smart. — Not enough for that. I'm much better off understanding the standards that are out there.
And I'm more on the composability side. Let's take the pieces of technology that exist out there and combine them in creative, unique ways. I like to consume standards. And it's not that I don't like to create them; I just don't think I have the raw wattage or the credibility. It's like, okay, who the heck is Dharmesh, and why should we adopt a standard he created?
swyx [00:33:07]: Yeah, I mean, there are people who don't monetize standards. Like, OpenTelemetry is a big standard, and LightStep never capitalized on that.
Dharmesh [00:33:15]: So, okay, if I were to do a standard, there's two things that have been in my head. One was a very, very basic one around — I don't even have the domain; I have a domain for everything — open marketing. Because the issue we had: HubSpot grew up in the marketing space. There we go. There was no standard around data formats and things like that. It doesn't go anywhere. But the other one — and I did not mean to go here, but I'm going to go here — is called OpenGraph. I know the term was already taken, but it hasn't been used for like 15 years now for its original purpose. What I think should exist in the world is: right now, our information — all of us are nodes in the social graph at Meta or the professional graph at LinkedIn, both of which are relatively closed in very annoying ways. Very, very closed, right? Especially LinkedIn. I personally believe that if it's my data, and if I would get utility out of it being open, I should be able to make my data open, or publish it in whatever forms I choose, as long as I have control over it as opt-in. So the idea around OpenGraph says: here's a standard, here's a way to publish it. I should be able to go to OpenGraph.org slash Dharmesh dot JSON and get it back. And it's like, here's your stuff, right?
And I can choose along the way, and people can write to it, and I can approve. And there can be an entire system. And if I were to do that, I would do it as a public-benefit, non-profit-y kind of thing, as a contribution to society. I wouldn't try to commercialize it. — Have you looked at ATProto? — What's that? — ATProto.
swyx [00:34:43]: It's the protocol behind Bluesky. Okay. My good friend Dan Abramov, who was the face of React for many, many years, now works there. And he did a talk that I can send you, which basically tries to articulate what you just said. And he loves doing these really great analogies, which I think you'll like. Like, you know, a lot of our data is behind a handle, behind a domain. Yep. So he's like, all right, what if we flip that? What if it was our handle and then the domain? Yep. So that's really: your data should belong to you. Yep. And I should not have to wait 30 days for my Twitter data to export. Yep.
Dharmesh [00:35:19]: You should at least be able to automate it. Yes, I should be able to plug it into an agentic thing. Yeah. Because so much of our data is locked up. I think the trick here isn't the standard. It's getting the normies to care.
swyx [00:35:37]: Yeah. Because normies don't care.
Dharmesh [00:35:38]: That's true. But building on that — normies don't care — so, you know, privacy is a really hot topic and an easy word to use, but it's not a binary thing. There are use cases where, and we make these choices all the time, I will trade not all privacy, but some privacy, for some productivity gain or some benefit to me. It says: oh, I don't care about that particular data being online if it gives me this in return, or I don't mind sharing this information with this company.
Alessio [00:36:02]: If I'm getting, you know, this in return. But that should be my option.
I think now with computer use, you can actually automate some of the exports. Yes. Something we've been doing internally is everybody exports their LinkedIn connections, and then internally we merge them together to see how we can connect our companies to customers, things like that.
Dharmesh [00:36:21]: And not to pick on LinkedIn, but since we're talking about it: they feel strongly enough about "do not take LinkedIn data" that they will block even browser-use kinds of things. They go to great, great lengths, even looking at patterns of usage, to say: oh, there's no way you could have gotten that particular thing without automation.
swyx [00:36:42]: Wasn't there a Supreme Court case that they lost? Yeah.
Dharmesh [00:36:45]: So the one they lost was around someone that was scraping public data that was on the public internet. And that particular company had not signed any terms of service or whatever — there was no agreement — and so that's why they won. But now the question is: can LinkedIn enforce it? I think they can. When you use LinkedIn as a user, you are signing up for their terms of service. And if they say, well, this kind of use of your LinkedIn account violates our terms of service, they can shut your account down, right? They can. And, yeah, we don't need to make this a discussion. By the way, I love the company, don't get me wrong. I'm an avid user of the product. — Yeah, I mean, you've got over a million followers on LinkedIn, I think. — Yeah, I do. And I've known people there for a long, long time, right? And I have lots of respect. And I understand even where the mindset originally came from, this kind of members-first, privacy-first approach. I sort of get that.
But sometimes you sort of have to wonder: okay, that was 15, 20 years ago. There are likely some controlled ways to expose some data on a member's behalf, and not just be completely binary — no, thou shalt not have the data.
swyx [00:37:54]: Well, just pay for Sales Navigator.
Alessio [00:37:57]: Before we move to the next layer of abstraction, anything else on MCP? — Let's move on, and then I'll tie it back to MCPs.
Dharmesh [00:38:05]: So I'll open with agents. Here's my running thesis: as AI and agents evolve, which they're doing very, very quickly, we're going to look at them — and I don't like to anthropomorphize; we'll talk about why this is not that — less as just raw tools and more like teammates. They'll still be software. They should self-disclose as being software. I'm totally cool with that. But I think what's going to happen is that, in the same way you might collaborate with a team member on Slack or Teams or whatever you use, you can imagine a series of agents that do specific things, just like a team member might, that you can delegate to. You can collaborate. You can say: hey, can you take a look at this? Can you proofread that? Can you try this? Whatever it happens to be. So I will go so far as to say it's inevitable that we're going to have hybrid teams someday. And what I mean by hybrid teams: back in the day, hybrid teams were, oh, you have some full-time employees and some contractors. Then hybrid teams were some people in the office and some remote. The next form of hybrid is the carbon-based life forms plus agents and AI, some form of software. So let's temporarily stipulate that I'm right about that, over some time horizon: eventually we're going to have these digitally hybrid teams.
So if that's true, then the question you ask yourself is: what needs to exist in order for us to get the full value of that new model? If I'm building a digital team — in the same way that if I'm interviewing an engineer or a designer or a PM, that's why we have professional networks, right? They have a presence on, likely, LinkedIn. I can go through that semi-structured form and see their self-disclosed experience. Okay, well, agents are going to need that someday. And so I'm like: this seems like a thread worth pulling on. So agent.ai is out there. And it's LinkedIn for agents. It's a professional network for agents. And the more I pull on that thread — okay, if that's true, what happens? They have a profile, just like a human would. There's a graph underneath, just like a professional network. It can have connections and follows, and agents should be able to post — that's maybe how they do release notes: oh, I have this new version. Whatever they decide to post, they should be able to behave as a node on a professional network. As it turns out, the more I think about that and pull on that thread, the more things start to make sense to me. So it may be more than just a pure professional network. My original thought was: okay, it's a professional network, and agents as they exist out there — and I think there's going to be more and more of them — will exist on this network and have a profile.
But then — and this is always dangerous — I'm like: okay, I want to see a world where thousands of agents are out there, because those digital employees, the digital workers, don't exist yet in any meaningful way. And so then I'm like: oh, can I make that easier? And so I have, as one does — oh, I'll build a low-code platform for building agents. How hard could that be, right? Very hard, as it turns out. But it's been fun. So now agent.ai has 1.3 million users. 3,000 people have actually built some variation of an agent, sometimes just for their own personal productivity, about 1,000 of which have been published. And the reason this comes back to MCP for me: imagine that and other networks — since I know agent.ai — right now we have an MCP server for agent.ai that exposes all the internally built agents we have that do super useful things. Like, I have access to a Twitter API where I can subsidize the cost. And I can say: if you're looking to build something for social media, these kinds of things, with a single API key — and it's all completely free right now, I'm funding it — that's a useful way for it to work. And then a developer can say: oh, I have this idea. I don't have to worry about OpenAI; I don't have to worry about which particular model is better. It has access to all the models with one key, and we proxy it behind the scenes. So then we get this community effect, right? Someone else may have built an agent to do X. Like, I have an agent right now that I built for myself to do domain valuation for website domains, because I'm obsessed with domains, right? And there's no efficient market for domains. There's no Zillow for domains right now that tells you: oh, here are what houses in your neighborhood sold for. It's like, well, why doesn't that exist?
We should be able to solve that problem. And, yes, you're still guessing. Fine. There should be some simple heuristic. So I built that. It's like: okay, let me go look for past transactions. You type in agent.ai, agent.com, whatever domain. What's it actually worth? I'm looking at buying it. It can go and say — which is what it does — I'm going to go look at whether there are any published domain transactions recently that are similar: either use the same word, same top-level domain, whatever it is. And it comes back with an approximate value, and it comes back with its rationale for why it picked that value, and comparable transactions: oh, by the way, this domain sold for a published price. Okay. So that agent now, let's say, exists on the web, on agent.ai. Then imagine someone else says: oh, I want to build a brand-building agent for startups and entrepreneurs to come up with names for their startup. A common problem — every startup is like, ah, I don't know what to call it. So they type in five random words that kind of define whatever their startup is. And you can do all manner of things, one of which is: oh, well, I need to find the domain for it. What are the possible choices? Now it would be nice to know if there's an aftermarket price for it, if it's listed for sale. Awesome. Then imagine calling this valuation agent: okay, I want to find where the arbitrage is — the valuation tool says this thing is worth $25,000; it's listed on GoDaddy for $5,000. Close enough. Let's go do that. Right? And that's the kind of composition use case in my future state: thousands of agents on the network, all discoverable through something like MCP. And then you, as a developer of agents, have access to all these Lego building blocks based on what you're trying to solve.
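The comparable-transactions heuristic described here — value a domain by averaging published sales that share a keyword or a TLD — can be sketched in a few lines. This is a toy reconstruction of the idea, not the agent's actual logic, and the sales list is invented sample data.

```python
# Toy comparable-sales valuer for domains, sketching the heuristic
# above: a past sale is "comparable" if it shares the TLD or contains
# the same keyword. The past_sales data is invented for illustration.

def split_domain(domain):
    name, _, tld = domain.rpartition(".")
    return name, tld

def estimate_value(domain, past_sales):
    """Average the prices of comparable published sales."""
    name, tld = split_domain(domain)
    comps = [
        (d, price) for d, price in past_sales
        if split_domain(d)[1] == tld or name in split_domain(d)[0]
    ]
    if not comps:
        return None, []
    estimate = sum(price for _, price in comps) / len(comps)
    return estimate, comps

past_sales = [
    ("agenthub.ai", 30000),
    ("chat.ai", 80000),
    ("agentstore.com", 15000),
    ("widgets.net", 900),
]

value, comps = estimate_value("agent.ai", past_sales)
print(round(value), [d for d, _ in comps])
```

The rationale the agent reports back maps directly onto `comps`: the list of transactions that drove the estimate.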
Then you blend in orchestration, which is getting better and better with the reasoning models now: just describe the problem that you have. Now, the next layer that we're all contending with is: how many tools can you actually give an LLM before the LLM breaks? That number used to be like 15 or 20 before quality started to degrade dramatically. And so that's the thing I'm thinking about now. If I want to expose 1,000 of these agents to a given LLM, obviously I can't give it all 1,000. Is there some intermediate layer that says: based on your prompt, I'm going to make a best guess at which agents might be helpful for this particular thing? Yeah.
Alessio [00:44:37]: Yeah, like RAG for tools. Yep. I did build the Latent Space Researcher on agent.ai. — Okay. Nice. — Yeah, and then there's going to be a Latent Space Scheduler, and once I schedule a research run, you build all of these things. — By the way, my apologies for the user experience. You realize I'm an engineer. — It's pretty good.
swyx [00:44:56]: I think it's a normie-friendly thing. Yeah. That's your magic. HubSpot does the same thing.
Alessio [00:45:01]: Yeah, just to quickly run through it: you can basically create all these different steps, and these steps are, you know, static versus variable-driven things. How did you decide between this kind of low-code approach versus low-code with a code backend versus not exposing that at all? Any fun design decisions? Yeah.
Dharmesh [00:45:22]: I think lots of people are likely sitting in exactly my position right now, choosing between deterministic — like, if you're in a business, building some sort of agentic thing, do you do a deterministic thing? Or do you go non-deterministic and just let the LLM handle it, right, with the reasoning models?
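The "RAG for tools" idea — an intermediate layer that guesses which of many agents are relevant to a prompt and exposes only those to the LLM — can be sketched with a simple lexical score. A real implementation would use embedding similarity; this toy uses word overlap, and all tool names and descriptions are made up.

```python
# Toy "RAG for tools": rank tool descriptions against a prompt by
# word overlap and expose only the top-k to the LLM. A production
# system would rank by embedding similarity instead.

def score(prompt, description):
    p = set(prompt.lower().split())
    d = set(description.lower().split())
    return len(p & d)

def select_tools(prompt, tools, k=2):
    ranked = sorted(tools.items(),
                    key=lambda item: score(prompt, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

tools = {
    "domain_valuer": "estimate the value of a website domain name",
    "brand_namer": "suggest brand names for a new startup",
    "tweet_writer": "draft a tweet for social media",
    "scheduler": "schedule a recurring task to run later",
}

print(select_tools("what is this domain name worth", tools, k=2))
```

The point is that the LLM's context only ever sees `k` tool schemas, however many agents exist on the network, which is exactly what keeps it under the 15-to-20-tool breaking point mentioned above.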
The original idea, and the reason I took the low-code, stepwise, very deterministic approach: A, the reasoning models did not exist at that time. That's thing number one. Thing number two is: if you know in your head what the actual steps are to accomplish a goal, why would you leave that to chance? There's no upside. There's literally no upside. Just tell me what steps you need executed. So right now, what I'm playing with — one thing we haven't talked about yet, and people don't talk enough about, is UI and agents. I know some people have. Right now the primary interaction model is the chatbot back and forth. Fine. I get that. But I think we're going to move to a blend: some of those things are going to be synchronous, as they are now, but some are going to be async. It just goes into a queue, just like — and this goes back to my — man, I talk fast. I only have one other speed; it's even faster. So imagine you're working — back to my, oh, we're going to have these hybrid digital teams. You would not go to a co-worker and say, I'm going to ask you to do this thing, and then sit there and wait for them to go do it. That's not how the world works. So it's nice to be able to just hand something off to someone: okay, maybe I expect a response in an hour or a day or something like that.
Dharmesh [00:46:52]: In terms of when things need to happen — so, the UI around agents. If you look at the output of agent.ai agents right now, they are the simplest possible manifestation of a UI, right? We have inputs of four different types: a dropdown, a multi-select, all the things. It's like the original HTML 1.0 days, right?
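The deterministic, stepwise approach — "just tell me what steps you need executed" — contrasts with handing the whole goal to a reasoning model. A minimal sketch of such a fixed pipeline, with step names invented for illustration:

```python
# Minimal deterministic agent pipeline: known steps, executed in a
# fixed order, no LLM deciding the control flow. All step names
# here are invented; look_up_comps is a stand-in for real I/O.

def fetch_input(state):
    state["domain"] = state["raw"].strip().lower()
    return state

def look_up_comps(state):
    # Stand-in for a real data lookup.
    state["comps"] = [("example.ai", 20000)]
    return state

def summarize(state):
    state["report"] = f"{state['domain']}: {len(state['comps'])} comp(s) found"
    return state

PIPELINE = [fetch_input, look_up_comps, summarize]

def run(raw):
    state = {"raw": raw}
    for step in PIPELINE:  # deterministic: same steps, same order, every run
        state = step(state)
    return state["report"]

print(run("  Agent.AI "))
```

Because the step order is fixed, nothing is left to chance; the non-deterministic alternative would be to hand `raw` plus a goal to a reasoning model and let it pick which step to call next.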
You're at the smallest possible set of primitives for a UI. And it just says: okay, we need to collect some information from the user, then we go do the steps, and generate some output — HTML or markdown are the two primary examples. So the thing I've been asking myself, if I keep going down that path — people ask me, I get requests all the time: oh, can you make the UI less boring? I need to be able to do this, right? And if I keep pulling on that, now I've built an entire UI-builder thing. Where does this end? And so I think the right answer — and this is what I'm going to be coding once I get done here — is around injecting code generation, UI generation, into the agent.ai flow. As a builder, you describe the thing that you want, much like you would in a vibe-coding world. But instead of generating the entire app, it generates the UI that exists at some point in that deterministic flow: here's the thing I'm trying to do, go generate the UI for me. And I can go through some iterations, tweak it, go through this kind of prompt cycle like we do with vibe coding now. And at some point I'm happy with it, and I hit save, and that becomes the action in that particular step. It's like a caching of the generated code, so I don't incur the inference-time cost on every run. It's just the actual code at that point.
Alessio [00:48:29]: Yeah, I invested in a company called E2B, which does code sandboxes. And they powered the LMArena web arena. So just like you do LLM text-to-text comparisons, they do the same for UI generation. So if you're asking a model, that's how you do it. But yeah, I think that's kind of where...
Dharmesh [00:48:45]: That's the thing I'm really fascinated by.
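The generate-then-cache pattern described above — iterate on generated UI code, then freeze it as the step's action so no inference cost is paid at runtime — can be sketched like this. The `generate_ui` stub stands in for a real model call; everything in it is illustrative.

```python
# Generate-once, cache-forever: the expensive generator runs only on
# a cache miss; after "save", the step serves plain cached code.
# generate_ui is a stub standing in for a real LLM call.

ui_cache = {}
calls = {"count": 0}

def generate_ui(description):
    calls["count"] += 1  # pretend this is an expensive inference call
    return f"<form><label>{description}</label><input/></form>"

def get_step_ui(description):
    if description not in ui_cache:  # inference cost paid once
        ui_cache[description] = generate_ui(description)
    return ui_cache[description]

first = get_step_ui("Domain to value")
second = get_step_ui("Domain to value")
print(first == second, calls["count"])
```

Hitting "save" in the builder corresponds to the cache write: from then on, the step is just code, with no model in the loop.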
So the early LLMs were understandably, but laughably, bad at simple arithmetic, right? That's the thing normies — like my wife — would ask us: you call this AI? My son would be like, it's just stupid, it can't even do simple arithmetic. And then we've discovered over time that — and there's a reason for this, right? The word "language" is in there for a reason, in terms of what it's been trained on. It's not meant to do math. But now the fact that it has access to a Python interpreter that it can actually call at runtime solves an entire body of problems that it wasn't trained to do. It's basically a form of delegation. And so the thought rattling around in my head is: that's great. It took the arithmetic problem first; now anything that's solvable through a relatively concrete Python program, it's able to do — a bunch of things it couldn't do before. Can we get to the same place with UI? I don't know what the future of UI looks like in an agentic AI world, but maybe let the LLM handle it — not in the classic sense. Maybe it generates it on the fly, or maybe we go through some iterations and hit cache or something like that, so it's a little bit more predictable. I don't know, but yeah.
Alessio [00:49:48]: And especially, when is the human supposed to intervene? Especially if you're composing them, most of them should not have a UI, because then they're just webhooking to somewhere else. I just want to touch back — I don't know if you have more comments on this.
swyx [00:50:01]: I was just going to ask — you said you're going to go back to code. What
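The delegation pattern described here — route arithmetic out of the model and into an interpreter — can be sketched with a tiny safe evaluator. This is a toy router, not any particular framework's tool-calling API; the `answer` function and its fallback string are invented for the sketch.

```python
# Toy delegation: instead of asking the "model" to do arithmetic,
# detect math-looking input and hand it to a safe expression
# evaluator, the way tool-calling hands math to an interpreter.

import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr):
    """Evaluate +, -, *, / over numeric literals only; reject anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def answer(prompt):
    # Delegate if the prompt parses as pure arithmetic; otherwise
    # fall through to whatever the language model would say.
    try:
        return safe_eval(prompt)
    except (ValueError, SyntaxError):
        return "delegating to the language model"

print(answer("3 * (17 + 4)"))
```

The `ast`-based walk is what makes the delegation safe: unlike `eval`, it refuses anything that is not pure arithmetic, so only the narrow problem class the model is bad at gets routed out.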
The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
Andrew Feldman is the Co-Founder and CEO @ Cerebras, the fastest AI inference + training platform in the world. In Sept 2024 the company filed to go public off the back of a rumoured $1BN deal with G42 in the UAE. Andrew is the leading expert for all things inference. In Today's Episode We Discuss: 04:23 Where Was AI Landscape in 2015 When Cerebras Founded 05:57 NVIDIA's Biggest Strength Has Become Their Biggest Weakness 07:09 What Happens to the Cost of Inference? 08:55 Why Are AI Algorithms So Inefficient? 20:30 Why is it Total BS That We Have Hit Scaling Laws? 23:07 What Will Be the Ratio of Synthetic to Human Data Used in 5 Years? 31:37 What Specifically Was So Impressive About Deepseek? 31:51 Why is Distillation Not Wrong and OpenAI Need to Look in the Mirror? 32:34 Where Will Value Accrue in a World of AI? 34:08 How Will NVIDIA's Market Position Change Over the Next Five Years? 39:59 Why is the CUDA Lockin for NVIDIA BS? What is Their Weakness? 40:46 Why is Trump Better for Business than Biden? 49:41 Do We Underestimate China in a World of AI? 52:33 What is the Most Underappreciated Segment of AI? 54:00 Quickfire Round
SAP and Enterprise Trends Podcasts from Jon Reed (@jonerp) of diginomica.com
Active Inference AI is not necessarily in conflict with LLMs, but Active Inference is definitely a different approach - one that challenges AI assumptions and opens up new thinking. It's been almost six months since Denise Holt shook up Constellation's Connected Enterprise event with a keynote on how Active Inference AI is changing the field. Since then, much has happened. In this audio-only podcast, Jon Reed gets the latest from Denise Holt, including upcoming Active Inference and Spatial Web milestones, and why the latter may prove crucial for grounding AI in world models. We also cover the educational coursework Holt has launched, and how listeners can learn more.
Welcome to the first episode in our 2025 season of Intel on AI. We sit down with Luke Norris, founder and CEO of Kamiwaza AI to explore how enterprises can unlock trillions of AI inferences per day with hardware-agnostic, scalable AI infrastructure. We discuss the Fifth Industrial Revolution and the rise of AI-native enterprises, why AI inference—not training—is the real bottleneck, and how the new AI supply chain is evolving with custom silicon, cloud interoperability, and multi-vendor strategies. Plus, we dive into the AI Leverage Index (ALI) as a key metric for enterprise intelligence, the rise of autonomous AI agents, and the shift toward inference-based AI monetization. Don't miss this conversation on the future of AI-driven business. Subscribe for more AI insights!
Open Tech Talks : Technology worth Talking| Blogging |Lifestyle
Welcome to your weekly AI Newsletter from AITechCircle! I'm building and implementing AI solutions and sharing everything I learn along the way... Check out the updates from this week! Please take a moment to share them with a friend or colleague who might benefit from these valuable insights!

Today at a Glance:
Increasing the adoption and effective use of AI and Generative AI technologies in various use cases or business processes
Generative AI use cases repository
AI weekly news and updates covering newly released LLMs
Courses and events to attend

How to Grow AI/Generative AI Inference? This week, I read about Chet Holmes's Buyer's Pyramid. This was my first time seeing it, and its source is 'The Larger Market Formula.' Why did this get my attention? I am linking it to the current situation in the Generative AI landscape, where we all realized soon enough that whatever investment is going into AI infrastructure is ultimately about what and how we will use it, and that we need AI infrastructure in place to do the inference. Read the full article over here: https://aitechcircle.kit.com/posts/generative-ai-engagement-pyramid-for-growing-ai-inference
// GUEST //
X: https://x.com/satmojoe
What's the Problem? X: https://x.com/SatsVsFiat
Website: https://www.satsvsfiat.com/

// SPONSORS //
The Farm at Okefenokee: https://okefarm.com/
iCoin: https://icointechnology.com/breedlove
Heart and Soil Supplements (use discount code BREEDLOVE): https://heartandsoil.co/
In Wolf's Clothing: https://wolfnyc.com/
Blockware Solutions: https://mining.blockwaresolutions.com/breedlove
On Ramp: https://onrampbitcoin.com/?grsf=breedlove
Mindlab Pro: https://www.mindlabpro.com/breedlove
Coinbits: https://coinbits.app/breedlove

// PRODUCTS I ENDORSE //
Protect your mobile phone from SIM swap attacks: https://www.efani.com/breedlove
Noble Protein (discount code BREEDLOVE for 15% off): https://nobleorigins.com/
Lineage Provisions (use discount code BREEDLOVE): https://lineageprovisions.com/?ref=breedlove_22
Colorado Craft Beef (use discount code BREEDLOVE): https://coloradocraftbeef.com/

// SUBSCRIBE TO THE CLIPS CHANNEL //
https://www.youtube.com/@robertbreedloveclips2996/videos

// OUTLINE //
0:00 - WiM Episode Trailer
1:12 - "What's the Problem?"
10:04 - The Pernicious Cycle of Keynesian Systems
26:15 - Breaking Out of the Fiat World
28:41 - The Farm at Okefenokee
30:00 - iCoin Bitcoin Wallet
31:32 - Bitcoiner Openness, Disagreeability, and Humility
32:46 - Many Problems are Downstream of Broken Money
35:41 - Explaining Bitcoin to the Layperson
40:07 - Money Printing Enables Theft to Fund War
47:07 - Heart and Soil Supplements
48:07 - Helping Lightning Startups with In Wolf's Clothing
49:01 - All Government Spending is Capital Misallocation
58:00 - Fight the System or Defund the System
1:04:00 - Ikigai and What Individual Bitcoiners Can Do
1:16:28 - Mine Bitcoin with Blockware Solutions
1:17:47 - OnRamp Bitcoin Custody
1:19:10 - Personality Dispositions of Bitcoiners and Broader Inclusion
1:27:47 - Elon Musk and Business Bitcoin Adoption
1:34:32 - Bitcoin Adoption is a Positive Feedback Loop
1:42:05 - Mind Lab Pro Supplements
1:43:13 - Buy Bitcoin with Coinbits
1:44:41 - Physics, Inference, and Bitcoin
1:52:05 - Joe Bryan's Orange-Pill Paradigm Shift
1:58:41 - Financial and Linguistic Liberation
2:00:51 - The Inevitability of Bitcoin?
2:03:55 - Reactions to "What's the Problem?"

"What's the Problem?"
2:15:12 - Introduction
2:16:45 - The Problems We All Face
2:17:32 - The Island: A Story of Two Sides
2:21:06 - A Free Market with Perfect Money
2:24:18 - The Government Arrives…
2:26:37 - Manipulation of the Money Supply
2:35:12 - An Ever-Growing Crisis
2:38:48 - The Inevitable Collapse of Government Money
2:45:52 - The Real World Problems with our Money
2:51:09 - What's the Solution? Bitcoin
2:52:57 - How to Learn More about Bitcoin and Stay in Touch

// PODCAST //
Podcast Website: https://whatismoneypodcast.com/
Apple Podcast: https://podcasts.apple.com/us/podcast/the-what-is-money-show/id1541404400
Spotify: https://open.spotify.com/show/25LPvm8EewBGyfQQ1abIsE
RSS Feed: https://feeds.simplecast.com/MLdpYXYI

// SUPPORT THIS CHANNEL //
Bitcoin: 3D1gfxKZKMtfWaD1bkwiR6JsDzu6e9bZQ7
Sats via Strike: https://strike.me/breedlove22
Dollars via Paypal: https://www.paypal.com/paypalme/RBreedlove
Dollars via Venmo: https://account.venmo.com/u/Robert-Breedlove-2

// SOCIAL //
Breedlove X: https://x.com/Breedlove22
WiM? X: https://x.com/WhatisMoneyShow
Linkedin: https://www.linkedin.com/in/breedlove22/
Instagram: https://www.instagram.com/breedlove_22/
TikTok: https://www.tiktok.com/@breedlove22
Substack: https://breedlove22.substack.com/
All My Current Work: https://linktr.ee/robertbreedlove
Support the show to get full episodes, full archive, and join the Discord community. The Transmitter is an online publication that aims to deliver useful information, insights and tools to build bridges across neuroscience and advance research. Visit thetransmitter.org to explore the latest neuroscience news and perspectives, written by journalists and scientists. Read more about our partnership. Sign up for the “Brain Inspired” email alerts to be notified every time a new “Brain Inspired” episode is released. To explore more neuroscience news and perspectives, visit thetransmitter.org. The concept of a schema goes back at least to the philosopher Immanuel Kant in the 1700s, who used the term to refer to a kind of built-in mental framework to organize sensory experience. But it was the psychologist Frederic Bartlett in the 1930s who used the term schema in a psychological sense, to explain how our memories are organized and how new information gets integrated into our memory. Fast forward nearly another 100 years to today, and we have a podcast episode with my guest today, Alison Preston, who runs the Preston Lab at the University of Texas at Austin. On this episode, we discuss her neuroscience research explaining how our brains might carry out the processing that fits with our modern conception of schemas, and how our brains do that in different ways as we develop from childhood to adulthood. I just said, "our modern conception of schemas," but like everything else, there isn't complete consensus among scientists on exactly how to define schema. Ali has her own definition. She shares that, and how it differs from other conceptions commonly used. I like Ali's version and think it should be adopted, in part because it helps distinguish schemas from a related term, cognitive maps, which we've discussed aplenty on Brain Inspired, and which can sometimes be used interchangeably with schemas.
So we discuss how to think about schemas versus cognitive maps, versus concepts, versus semantic information, and so on. Last episode Ciara Greene discussed schemas and how they underlie our memories, and learning, and predictions, and how they can lead to inaccurate memories and predictions. Today Ali explains how circuits in the brain might adaptively underlie this process as we develop, and how to go about measuring it in the first place. Preston Lab Twitter: @preston_lab Related papers: Concept formation as a computational cognitive process. Schema, Inference, and Memory. Developmental differences in memory reactivation relate to encoding and inference in the human brain. Read the transcript. 0:00 - Intro 6:51 - Schemas 20:37 - Schemas and the developing brain 35:03 - Information theory, dimensionality, and detail 41:17 - Geometry of schemas 47:26 - Schemas and creativity 50:29 - Brain connection pruning with development 1:02:46 - Information in brains 1:09:20 - Schemas and development in AI
Welcome to another episode of Supra Insider. This time, Marc and Ben sat down with Randhir Vieira, Chief Product Officer at Omada Health and a Supra facilitator. Randhir has led product teams at companies like Headspace and brings a deep passion for leadership, communication, and the often-overlooked skill of listening. In this episode, we explore why listening is the foundation of strong leadership, how to give feedback that actually lands, and why the best product leaders focus on building psychological safety in their teams. Randhir also shares practical frameworks for improving team collaboration, handling difficult conversations, and fostering a culture of trust. If you're a product leader looking to sharpen your leadership skills, a founder building a team, or someone navigating tricky workplace dynamics, this conversation is packed with actionable takeaways. All episodes of the podcast are also available on Spotify, Apple and YouTube. New to the pod? Subscribe below to get the next episode in your inbox
Hagay Lupesko is the SVP for AI Inference at Cerebras Systems. Subscribe to the Gradient Flow Newsletter
This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Today, we're joined by Ron Diamant, chief architect for Trainium at Amazon Web Services, to discuss hardware acceleration for generative AI and the design and role of the recently released Trainium2 chip. We explore the architectural differences between Trainium and GPUs, highlighting its systolic array-based compute design, and how it balances performance across key dimensions like compute, memory bandwidth, memory capacity, and network bandwidth. We also discuss the Trainium tooling ecosystem including the Neuron SDK, Neuron Compiler, and Neuron Kernel Interface (NKI). We also dig into the various ways Trainum2 is offered, including Trn2 instances, UltraServers, and UltraClusters, and access through managed services like AWS Bedrock. Finally, we cover sparsity optimizations, customer adoption, performance benchmarks, support for Mixture of Experts (MoE) models, and what's next for Trainium. The complete show notes for this episode can be found at https://twimlai.com/go/720.
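The systolic-array compute design mentioned above can be illustrated with a toy timing simulation: input streams are skewed so that A[i][k] and B[k][j] meet at processing element (i, j) at cycle t = i + j + k, and each PE simply multiply-accumulates into a stationary output. This is a generic output-stationary sketch of the dataflow, not Trainium's actual microarchitecture.

```python
def systolic_matmul(A, B):
    """Toy cycle-by-cycle model of an output-stationary systolic array.

    PE(i, j) holds accumulator C[i][j]; skewed input streams deliver
    A[i][k] from the left and B[k][j] from the top so that the pair
    arrives at PE(i, j) exactly at cycle t = i + j + k.
    """
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for t in range(3 * n - 2):          # cycles needed to fill and drain the array
        for i in range(n):
            for j in range(n):
                k = t - i - j           # which operand pair reaches PE(i, j) now
                if 0 <= k < n:
                    C[i][j] += A[i][k] * B[k][j]  # one multiply-accumulate per cycle
    return C

print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

The appeal of this layout in hardware is that each PE only ever talks to its neighbors, so an n-by-n grid sustains n² multiply-accumulates per cycle with no shared memory bus in the inner loop.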
The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
Steeve Morin is the Founder & CEO @ ZML, a next-generation inference engine enabling peak performance on a wide range of chips. Prior to founding ZML, Steeve was the VP Engineering at Zenly for 7 years leading eng to millions of users and an acquisition by Snap. In Today's Episode We Discuss: 04:17 How Will Inference Change and Evolve Over the Next 5 Years 09:17 Challenges and Innovations in AI Hardware 15:38 The Economics of AI Compute 18:01 Training vs. Inference: Infrastructure Needs 25:08 The Future of AI Chips and Market Dynamics 34:43 Nvidia's Market Position and Competitors 38:18 Challenges of Incremental Gains in the Market 39:12 The Zero Buy-In Strategy 39:34 Switching Between Compute Providers 40:40 The Importance of a Top-Down Strategy for Microsoft and Google 41:42 Microsoft's Strategy with AMD 45:50 Data Center Investments and Training 46:40 How to Succeed in AI: The Triangle of Products, Data, and Compute 48:25 Scaling Laws and Model Efficiency 49:52 Future of AI Models and Architectures 57:08 Retrieval Augmented Generation (RAG) 01:00:52 Why OpenAI's Position is Not as Strong as People Think 01:06:47 Challenges in AI Hardware Supply
What is all the hype around Deep Research? In episode 43 of Mixture of Experts, host Tim Hwang is joined by Kate Soule, Volkmar Uhlig and Shobhit Varshney. This week, we discuss the reasoning-model features coming out of OpenAI (Deep Research), Google Gemini, Perplexity, xAI's Grok-3 and more! Next, OpenAI is rumored to release an inference chip, but how likely is this to be a success in the AI chip game? Then, we analyze the capabilities of small vision-language models (VLMs). Finally, a startup, Firecrawl, released a job posting in search of an AI agent. Is this the future for AI tools in the workforce? Tune in to today's Mixture of Experts to find out. 00:01 – Intro 00:35 – Deep Research 11:58 – OpenAI inference chip 22:17 – Small VLMs 32:31 – AI agent job posting The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
In this episode, Ryan Greenblatt, Chief Scientist at Redwood Research, discusses various facets of AI safety and alignment. He delves into recent research on alignment faking, covering experiments involving different setups such as system prompts, continued pre-training, and reinforcement learning. Ryan offers insights on methods to ensure AI compliance, including giving AIs the ability to voice objections and negotiate deals. The conversation also touches on the future of AI governance, the risks associated with AI development, and the necessity of international cooperation. Ryan shares his perspective on balancing AI progress with safety, emphasizing the need for transparency and cautious advancement. Ryan's work (with co-authors at Anthropic) on Alignment Faking: https://www.lesswrong.com/posts/njAZwT8nkHnjipJku/alignment-faking-in-large-language-models Ryan's work on striking deals with AIs: https://www.lesswrong.com/posts/7C4KJot4aN8ieEDoz/will-alignment-faking-claude-accept-a-deal-to-reveal-its Ryan's critique of Anthropic's RSP work: https://www.lesswrong.com/posts/6tjHf5ykvFqaNCErH/anthropic-s-responsible-scaling-policy-and-long-term-benefit?commentId=NyqcvZifqznNGKxdT SPONSORS: Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders like Vodafone and Thomson Reuters with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before March 31, 2024 at https://oracle.com/cognitive NetSuite: Over 41,000 businesses trust NetSuite by Oracle, the #1 cloud ERP, to future-proof their operations. With a unified platform for accounting, financial management, inventory, and HR, NetSuite provides real-time insights and forecasting to help you make quick, informed decisions. 
Whether you're earning millions or hundreds of millions, NetSuite empowers you to tackle challenges and seize opportunities. Download the free CFO's guide to AI and machine learning at https://netsuite.com/cognitive Shopify: Shopify is revolutionizing online selling with its market-leading checkout system and robust API ecosystem. Its exclusive library of cutting-edge AI apps empowers e-commerce businesses to thrive in a competitive market. Cognitive Revolution listeners can try Shopify for just $1 per month at https://shopify.com/cognitive RECOMMENDED PODCAST:
The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
Jonathan Ross is the Founder & CEO of Groq, the creator of the world's first Language Processing Unit (LPU™). Prior to Groq, Jonathan began what became Google's Tensor Processing Unit (TPU) as a 20% project where he designed and implemented the core elements of the first-generation TPU chip. Jonathan next joined Google X's Rapid Eval Team, the initial stage of the famed “Moonshots Factory”, where he devised and incubated new Bets (Units) for Google's parent company, Alphabet. In Today's Episode We Discuss: 04:20 Interview with Jonathan Ross Begins 04:59 Scaling Laws and AI Model Training 06:22 Synthetic Data and Model Efficiency 12:01 Inference vs. Training Costs: Why NVIDIA Loses Inference 17:06 The Future of AI Inference: Efficiency and Cost 18:15 Chip Supply and Scaling Concerns 20:57 Energy Efficiency in AI Computation 25:40 Why Most Dollars Into Datacenters Will Be Lost 31:05 Meta, Google, and Microsoft's Data Center Investments 41:11 Distribution of Value in the AI Economy 42:10 Stages of Startup Success 43:17 The AI Investment Bubble 45:00 The Keynesian Beauty Contest in VC 48:40 NVIDIA's Role in the AI Ecosystem 53:39 China's AI Strategy and Global Implications 57:51 Europe's Potential in the AI Revolution 01:10:14 Future Predictions and AI's Impact on Society
Language Failure: How Words Shape Our Reality

A Long-Form Summary of the Podcast

Opening Hook: The Illusion of Reality

Imagine walking through downtown San Francisco. On your phone, you see pristine streets and a bustling city. But when you look up, the reality is starkly different—crumbling infrastructure, vacant storefronts, and widespread urban decay. This isn't an episode of Black Mirror; it's what happened in 2023 when San Francisco created a Potemkin village—a facade meant to impress foreign dignitaries while hiding the city's deeper issues.

This phenomenon isn't just about urban aesthetics; it signals something deeper: the failure of language to accurately reflect reality. When we manipulate language, we manipulate perception, and when perception detaches from truth, society begins to collapse.

The San Francisco example proves that we know what a functional city looks like—we can manufacture an illusion of order when necessary—but we don't maintain it. Instead, we mask the problem rather than solving it. This mirrors the broader theme of the podcast: language, like infrastructure, is breaking down, and instead of repairing it, we disguise its failure with illusions.

The Problem: The Breakdown of Reality

What happens when our words and perceptions no longer match reality? We see this in:

Infrastructure decay: Baltimore's bridge collapse, failing subway systems, and deteriorating roads.
Media and distraction: Instead of addressing problems, we divert our attention—scrolling through TikTok instead of engaging with real-world issues.
Social and political discourse: Headlines inflame emotions, but we rarely engage with the underlying facts.

We live in a loop of anxiety and escape, toggling between existential threats and dopamine-fueled distractions.
This is not just modern life—it's a historical pattern that has preceded societal collapse before.

Historical Warning Signs: Orwell, Cuenco, and the Soviet Union

Most people remember 1984 for its themes of surveillance and thought control. But Orwell also illustrated a world where physical reality itself was decaying—the elevators don't work, the food rations shrink, and yet, the Party insists everything is improving.

Michael Cuenco builds on this idea in his 2021 essay, Victory Is Not Possible, arguing that today's culture wars function in the same way as Orwell's language control. The ruling elite isn't just lying—it's actively shrinking language, making dissent impossible because people lack the vocabulary to express opposition.

The Soviet Union offers another chilling parallel. Adam Curtis's documentary, HyperNormalisation, explores how, in the USSR's final years, everyone knew the official narrative was false—record-breaking harvests were announced while store shelves were empty. But rather than resist, people played along, creating a world where fantasy replaced reality.

The result? A world where illusions become more real than facts. People, exhausted by the gap between truth and propaganda, retreated into cynicism, vodka, and pop culture.

Today, we are experiencing a similar detachment from reality—not through authoritarian control, but through semantic drift, emotional manipulation, and digital distractions.

The Mechanism: How Language Becomes Untethered

How does language lose its connection to reality? Through concept creep and false logic.

Concept Creep (Semantic Drift)

Words broaden in meaning, diluting their original precision.
Example: Trauma once meant a physical wound (1850s), but by 1895, William James and Freud extended it to psychological wounds.
Today, it describes any discomfort—I was traumatized by cold coffee.

Hyperbole and Semantic Inflation

Overuse weakens terms: Abuse now includes neglect, fascism is applied to trivial disagreements, bullying can refer to mere criticism.
Example: Courage once meant facing real danger, but now can mean avoiding offense.

Semantic Inversion

Words flip in meaning—what was once good can become bad and vice versa.
Example: Freedom increasingly means freedom from reality and consequences rather than actual agency.

When words become unanchored from objective meaning, they create ideological vacuums—leaving us drifting like astronauts in space, weightless, disconnected, and incapable of grappling with reality.

The Ladder of False Logic: How We Convince Ourselves of Lies

The Ladder of Inference, or false logic, explains how we trick ourselves into believing distorted realities:

Observable Facts – A politician says, Education is declining despite higher spending.
Selected Data – You focus on a single phrase that confirms your bias.
Interpretation – This sounds like something a dictator would say.
Assumption – They must have a hidden agenda.
Conclusion – They're trying to destroy public education.
Belief – They are evil and must be stopped.
Action – Post an outraged rant online, comparing them to Hitler.

Each step takes you further from reality—until your worldview becomes purely ideological, detached from objective facts. At this point, we are radicals—not because of some external manipulation, but because we self-radicalized through unchecked emotional reasoning.

The Philosophical Root: Kant's Detachment from Reality

Matthew Crawford critiques Immanuel Kant, arguing that his philosophy set the stage for modern detachment from reality. Kant suggested true freedom means acting according to self-imposed rational laws, independent of external influences. This led to a view of reality as subjective—where internal logic overrides external truth. Instead of grounding ourselves in the real world, we live in a mental space station, floating free but becoming increasingly weak and incapable of dealing with reality.

Astronauts in zero gravity may enjoy their detachment, but their bones and muscles deteriorate. Likewise, the more detached we are from reality, the weaker our ability to engage with it becomes.

Contemporary Crisis: The Illusion Economy

Modern financial markets and politics operate not on productivity or value, but on perception and emotion.

Stocks rise and fall based on optimism, not output.
Presidential campaigns are waged on vibes, not policy.
Social movements focus on interpretations rather than material outcomes.

In San Francisco, the government hid homelessness rather than solving it. This is how language manipulation replaces action. When words detach from material reality, truth becomes contingent, and society drifts into ideological orbit.

The Challenge of Re-Entry: Reclaiming Reality

How do we return from orbit and reconnect words with truth?

Verify personally – Base beliefs on direct observation, not media narratives.
Reality-check assumptions – Climb down the ladder of inference before reacting.
Resist semantic drift – Demand precision in language.

Closing: The Fight for Truth

We are at a turning point. We can either continue floating in ideological orbit, or we can re-enter reality. Re-entry is painful. It requires effort, humility, and engagement with the material world—but it's necessary.

In the next episode, we'll explore specific tools for resisting semantic drift and maintaining a clear connection to reality. Until then, stay grounded.
In this episode of The Effective Statistician, I talk with Justin Belair about the process of writing a book on causal inference, one of the hottest topics in statistics. Justin explains how he discovered causal inference, what motivated him to write a hands-on technical book, and how he balances theory, real-world applications, and coding exercises. Having written a book myself, I know the challenges firsthand, so we dive into the strategies that make writing more effective. If you want to learn more about causal inference, apply practical tools in your work, or even write a book yourself, this episode has plenty of insights and inspiration. Tune in and join the conversation!
In this episode of BoaT, I interview my good friend Ishan Dhanani, who is an MLE for Inference at NVIDIA. Less than 2 years ago, Ishan graduated with an Economics degree from Texas A&M. Since then, he has dropped out of Columbia, been acquired twice (once by Brev.dev, once by NVIDIA), and moved across the country. We discuss how you can become technical, the future of AI, and much more. Enjoy!

FOLLOW ISHAN: https://x.com/0xishand

CONNECT WITH ME
Welcome to episode 290 of The Cloud Pod – where the forecast is always cloudy! It's a full house this week – and a good thing too, since there's a lot of news! Justin, Jonathan, Ryan, and Matthew are all in the house to bring you news on DeepSeek, OpenVox, CloudWatch, and more.

Titles we almost went with this week:
The Cloud Pod wonders if Azure is still hung over from New Year's
Stratoshark sends the Cloud Pod to the stratosphere
Cutting-Edge Chinese “Reasoning” Model Rivals OpenAI… and it's FREE?!
Wireshark turns 27, Cloud Pod hosts feel old
Operator: DeepSeek is here to kill OpenAI
Time for a deepthink on buying all that Nvidia stock
AWS Token Service finally goes cloud native
The Cloud Pod wonders if OpenAI's Operator can order its own $200 subscription

A big thanks to this week's sponsor: We're sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You've come to the right place! Send us an email or hit us up on our Slack channel for more info.

AI IS Going Great – Or How ML Makes All Its Money

01:29 Introducing the GenAI Platform: Simplifying AI Development for All
If you're struggling to find that AI GPU capacity, DigitalOcean is pleased to announce their DigitalOcean GenAI Platform is now available to everyone. The platform aims to democratize AI development, empowering everyone – from solo developers to large teams – to leverage the transformative potential of generative AI. On the GenAI Platform you can: build scalable AI agents, seamlessly integrate with workflows, leverage guardrails, and optimize efficiency. Some of the use cases they are highlighting are chatbots, e-commerce assistance, support automation, business insights, AI-driven CRMs, personalized learning, and interactive tools.

02:23 Jonathan – “Inference cost is really the big driver there.
So once you build something, that's done, but it's nice to see somebody focusing on delivering it as a service rather than, you know, $50 an hour of compute for training models. This is right where they need to be.”

04:21 OpenAI: Introducing Operator
We have thoughts about the name of this service… OpenAI is releasing the preview version of their agent that can use a web browser to perform tasks for you. The new version is available to OpenAI Pro users. OpenAI says it's currently a research preview, meaning it has limitations and will evolve based on your feedback. Operator can handle various browser tasks such as filling out forms, ordering groceries, and even creating memes.
This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Today, we're joined by Chris Lott, senior director of engineering at Qualcomm AI Research to discuss accelerating large language model inference. We explore the challenges presented by the LLM encoding and decoding (aka generation) and how these interact with various hardware constraints such as FLOPS, memory footprint and memory bandwidth to limit key inference metrics such as time-to-first-token, tokens per second, and tokens per joule. We then dig into a variety of techniques that can be used to accelerate inference such as KV compression, quantization, pruning, speculative decoding, and leveraging small language models (SLMs). We also discuss future directions for enabling on-device agentic experiences such as parallel generation and software tools like Qualcomm AI Orchestrator. The complete show notes for this episode can be found at https://twimlai.com/go/717.
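One of the acceleration techniques mentioned here, speculative decoding, is easy to sketch in its greedy form: a cheap draft model proposes a batch of k tokens, the expensive target model verifies them all at once, and the output is guaranteed to match what the target alone would have produced. The two model functions below are toy stand-ins, not real LLMs, and `speculative_decode` is a simplified sketch of the idea, not any specific implementation.

```python
def draft_model(ctx):
    """Cheap 'draft' guesser (stand-in for a small language model)."""
    return (sum(ctx) + 1) % 5

def target_model(ctx):
    """Expensive 'target' model whose greedy output must be reproduced exactly."""
    return (sum(ctx) * 2 + 1) % 5

def speculative_decode(prompt, n_tokens, k=4):
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # 1) Draft proposes k tokens autoregressively (cheap).
        proposal, ctx = [], list(out)
        for _ in range(k):
            t = draft_model(tuple(ctx))
            proposal.append(t)
            ctx.append(t)
        # 2) Target verifies every position (in practice: one batched forward pass).
        new = []
        for t in proposal:
            expected = target_model(tuple(out + new))
            if t == expected:
                new.append(t)            # draft guessed right: token comes for free
            else:
                new.append(expected)     # mismatch: keep the target's token, stop
                break
        out.extend(new)
    return out[:len(prompt) + n_tokens]

print(speculative_decode((0,), 6))  # matches greedy decoding with target_model alone
```

The accepted-prefix-plus-correction rule makes the output bit-identical to plain greedy decoding with the target model; the speedup comes from verifying k draft tokens with one target pass instead of k sequential ones, which directly improves the time-to-first-token and tokens-per-second metrics discussed in the episode.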
The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
Jonathan Ross is the Co-Founder and CEO of Groq, providing fast AI inference. Prior to founding Groq, Jonathan started Google's TPU effort where he designed and implemented the core elements of the original chip. Jonathan then joined Google X's Rapid Eval Team, the initial stage of the famed “Moonshots factory,” where he devised and incubated new Bets (Units) for Alphabet. The 10 Most Important Questions on Deepseek: How did Deepseek innovate in a way that no other model provider has done? Do we believe that they only spent $6M to train R1? Should we doubt their claims on limited H100 usage? Is Josh Kushner right that this is a potential violation of US export laws? Is Deepseek an instrument used by the CCP to acquire US consumer data? How does Deepseek being open-source change the nature of this discussion? What should OpenAI do now? What should they not do? Does Deepseek hurt or help Meta who already have their open-source efforts with Llama? Will this market follow Satya Nadella's suggestion of Jevons Paradox? How much more efficient will foundation models become? What does this mean for the $500BN Stargate project announced last week?
This is our monthly conversation on topics in AI and Technology with Paco Nathan, the founder of Derwen, a boutique consultancy focused on Data and AI.Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/Subscribe: Apple • Spotify • Overcast • Pocket Casts • AntennaPod • Podcast Addict • Amazon • RSS.Detailed show notes - with links to many references - can be found on The Data Exchange web site.
The consequences of Internet content restriction. The measured risks of 3rd-party browser extensions. The consequences of SonicWall's unpatched 9.8 firewall severity. The incredible number of still-unencrypted email servers. SonicWall vulnerability patching Shadowserver Foundation & eMail Encryption Salt Typhoon Evicted HIPAA gets a long-needed cybersecurity upgrade. The EU standardizes on USB-C for power charging. What? Believe it or not, a CAPTCHA you solve by playing DOOM. And... what I learned from three weeks of study of AI Show Notes - https://www.grc.com/sn/SN-1007-Notes.pdf Hosts: Steve Gibson and Leo Laporte Download or subscribe to Security Now at https://twit.tv/shows/security-now. Get episodes ad-free with Club TWiT at https://twit.tv/clubtwit You can submit a question to Security Now at the GRC Feedback Page. For 16kbps versions, transcripts, and notes (including fixes), visit Steve's site: grc.com, also the home of the best disk maintenance and recovery utility ever written Spinrite 6. Sponsors: bitwarden.com/twit expressvpn.com/securitynow veeam.com threatlocker.com for Security Now
In today's episode, Kuzco founder Sam Hogan explores AI's major infrastructure shift, explaining why inference computing is set to dominate 99% of AI compute within five years. He details how Kuzco is building a global marketplace for idle GPU power, leveraging crypto for payments and verification. The conversation evolves from technical infrastructure to the emergence of AI agents and their potential impacts, and concludes with insights on preserving human value and authentic connections in an increasingly AI-driven world. Resources https://kuzco.xyz/ https://x.com/kuzco_xyz - - Start your day with crypto news, analysis and data from Katherine Ross and David Canellis. Subscribe to the Empire newsletter: https://blockworks.co/newsletter/empire?utm_source=podcasts Follow Sam: https://x.com/0xSamHogan Follow Jason: https://twitter.com/JasonYanowitz Follow Santiago: https://twitter.com/santiagoroel Follow Empire: https://twitter.com/theempirepod Subscribe on YouTube: https://tinyurl.com/4fdhhb2j Subscribe on Apple: https://tinyurl.com/mv4frfv7 Subscribe on Spotify: https://tinyurl.com/wbaypprw Get top market insights and the latest in crypto news. Subscribe to Blockworks Daily Newsletter: https://blockworks.co/newsletter/ - - Timestamps: (00:00) Introduction (00:26) AI Inference Marketplace (05:10) Training vs Inference (09:37) Why Found Kuzco (13:26) Why Crypto (23:47) Grand Product Vision (27:17) AI Agent Explosion (38:39) Human vs AI Interaction (51:43) Evolution of Human Skillsets - - Disclaimer: Nothing said on Empire is a recommendation to buy or sell securities or tokens. This podcast is for informational purposes only, and any views expressed by anyone on the show are solely our opinions, not financial advice. Santiago, Jason, and our guests may hold positions in the companies, funds, or projects discussed.
Timestamps: (0:00) Jason and Sunny kick off the show (1:26) Discussing Grok's recent developments and Middle East tech inspiration (3:51) Global business challenges and Jason's 2025 announcements (7:53) Gemini app demo and ChatGPT-4 comparison (9:57) Lemon.io - Get 15% off your first 4 weeks of developer time at https://Lemon.io/twist (11:24) Evaluating ChatGPT-4 Pro limitations and robo-taxi fleet costs (12:46) Differences between ChatGPT models and future improvements (19:57) OpenPhone - Get 20% off your first six months at https://www.openphone.com/twist (20:58) LLM focus shift and robo-taxi cost model experimentation (23:01) Importance of prompt engineering in AI tools adoption (25:31) Workplace AI adoption challenges and generational tech differences (27:22) New productivity hacks and Gemini app's growth (30:27) Zendesk - Get six months free at https://www.zendesk.com/twist (34:06) Startup support programs and Google Gemini's research capabilities (37:08) AI model performance evaluation and simplification (45:02) Meta's Llama 3.3 70B launch and AI industry impact (46:37) Infrastructure costs, competitive landscape, and AI-generated content evolution (55:44) Trust and bias in language models, news analysis startup ideas * Subscribe to the TWiST500 newsletter: https://ticker.thisweekinstartups.com Check out the TWIST500: https://www.twist500.com Subscribe to This Week in Startups on Apple: https://rb.gy/v19fcp * Mentioned on the show: Check out Sora here: https://sora.com/ Check out Kling here: https://klingai.com/ Check out Groq: https://groq.com/ * Follow Sunny: X: https://x.com/sundeep LinkedIn: https://www.linkedin.com/in/sundeepm * Follow Jason: X: https://twitter.com/Jason LinkedIn: https://www.linkedin.com/in/jasoncalacanis * Thank you to our partners: (9:57) Lemon.io - Get 15% off your first 4 weeks of developer time at https://Lemon.io/twist (19:57) OpenPhone - Get 20% off your first six months at https://www.openphone.com/twist (30:27) Zendesk - Get six months free at https://www.zendesk.com/twist * Great TWIST interviews: Will Guidara, Eoghan McCabe, Steve Huffman, Brian Chesky, Bob Moesta, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarland * Check out Jason's suite of newsletters: https://substack.com/@calacanis * Follow TWiST: Twitter: https://twitter.com/TWiStartups YouTube: https://www.youtube.com/thisweekin Instagram: https://www.instagram.com/thisweekinstartups TikTok: https://www.tiktok.com/@thisweekinstartups Substack: https://twistartups.substack.com * Subscribe to the Founder University Podcast: https://www.youtube.com/@founderuniversity1916