Podcasts about langchain

  • 110 PODCASTS
  • 185 EPISODES
  • 50m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • LATEST: Jun 2, 2025

POPULARITY

[Popularity chart by year, 2017–2024]


Best podcasts about langchain

Latest podcast episodes about langchain

Talk Python To Me - Python conversations for passionate developers
#507: Agentic AI Workflows with LangGraph

Talk Python To Me - Python conversations for passionate developers

Jun 2, 2025 · 63:59 · Transcription Available


If you want to leverage the power of LLMs in your Python apps, you would be wise to consider an agentic framework. An agentic approach empowers the LLM to use tools and take further action based on what it has learned so far, and a framework provides the building blocks to weave these capabilities into your apps, with features like long-term memory and durable resumability. I'm excited to have Sydney Runkle back on the podcast to dive into building Python apps with LangChain and LangGraph.

Episode sponsors: Posit, Auth0, Talk Python Courses

Links from the show:
  • Sydney Runkle: linkedin.com
  • LangGraph: github.com
  • LangChain: langchain.com
  • LangGraph Studio: github.com
  • LangGraph (Web): langchain.com
  • LangGraph Tutorials Introduction: langchain-ai.github.io
  • How to Think About Agent Frameworks: blog.langchain.dev
  • Human in the Loop Concept: langchain-ai.github.io
  • GPT-4 Prompting Guide: cookbook.openai.com
  • Watch this episode on YouTube: youtube.com
  • Episode transcripts: talkpython.fm

--- Stay in touch with us ---
  • Subscribe to Talk Python on YouTube: youtube.com
  • Talk Python on Bluesky: @talkpython.fm at bsky.app
  • Talk Python on Mastodon: talkpython
  • Michael on Bluesky: @mkennedy.codes at bsky.app
  • Michael on Mastodon: mkennedy
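For readers who want a feel for what a LangGraph workflow looks like in code, here is a minimal sketch, assuming the langgraph package is installed; the state fields, node name, and stub logic are illustrative stand-ins, not taken from the episode.

```python
# Minimal LangGraph-style graph: one node that produces an answer.
# Assumes `pip install langgraph`; swap the node body for a real LLM call
# (e.g. via langchain) in an actual app.
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class AgentState(TypedDict):
    question: str
    answer: str


def answer_node(state: AgentState) -> dict:
    # Placeholder logic so the sketch runs without an API key;
    # a real agent node would call an LLM and possibly tools here.
    return {"answer": f"(stub) You asked: {state['question']}"}


builder = StateGraph(AgentState)
builder.add_node("answer", answer_node)
builder.add_edge(START, "answer")
builder.add_edge("answer", END)
graph = builder.compile()

print(graph.invoke({"question": "Why use an agentic framework?"}))
```

Real workflows add more nodes, conditional edges, and a checkpointer for the long-term memory and resumability discussed in the episode.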

5 Minutes Podcast with Ricardo Vargas
The Rise of AI Agents in Project Work

5 Minutes Podcast with Ricardo Vargas

May 25, 2025 · 3:30


In this episode, Ricardo discusses how AI Agents are transforming project management. Unlike traditional tools, these agents are autonomous, understand context, make decisions, and interact with people and systems to deliver value. With the advancement of models like ChatGPT and platforms such as LangChain, Crew AI, and Google NotebookLM, building smart agents has become much easier. They can update schedules, write meeting notes, draft emails, generate reports, and monitor risks—all integrated with tools like Notion, Slack, Trello, and Google Docs. This shift changes the project manager's role to that of an “AI orchestrator.” However, caution is needed due to potential errors, hallucinations, and data security concerns. AI isn't here to replace project managers but to empower them to focus on what truly matters. Listen to the podcast to learn more!

5 Minutes Podcast com Ricardo Vargas
A Ascensão dos Agentes de IA na Gestão de Projetos

5 Minutes Podcast com Ricardo Vargas

May 25, 2025 · 4:13


In this episode, Ricardo shows how artificial intelligence agents (AI Agents) are revolutionizing project management. Unlike traditional automations, these agents are autonomous: they interpret context, make decisions, and interact with tools such as Notion, Slack, Trello, and Google Docs. With the advance of models like ChatGPT and platforms like LangChain, Crew AI, and NotebookLM, it has become much easier to create agents that understand natural language and act autonomously. They can update schedules, generate meeting minutes, write emails, and suggest actions. The manager's role shifts from executor to AI orchestrator. There are risks, however, such as errors and hallucinations, which require human oversight. AI does not replace the project manager, but it frees up time for more strategic decisions. Listen to the podcast to learn more!

The New Stack Podcast
Google Cloud Next Wrap-Up

The New Stack Podcast

May 22, 2025 · 18:22


At the close of this year's Google Cloud Next, The New Stack's Alex Williams, AI editor Frederic Lardinois, and analyst Janakiram MSV discussed the event's dominant theme: AI agents. The conversation focused heavily on agent frameworks, noting a shift from last year's third-party tools like Langchain, CrewAI, and Microsoft's Autogen, to first-party offerings from model providers themselves. Google's newly announced Agent Development Kit (ADK) highlights this trend, following closely on the heels of OpenAI's agent SDK. MSV emphasized the significance of this shift, calling it a major milestone as Google joins the race alongside Microsoft and OpenAI. Despite the buzz, Lardinois pointed out that many companies are still exploring how AI agents can fit into real-world workflows. The panel also highlighted how Google now delivers a full-stack AI development experience — from models to deployment platforms like Vertex AI. New enterprise tools like Agent Space and Agent Garden further signal Google's commitment to making agents a core part of modern software development.

Learn more from The New Stack about the latest in AI agents:
  • How AI Agents Will Change the Web for Users and Developers
  • AI Agents: A Comprehensive Introduction for Developers
  • AI Agents Are Coming for Your SaaS Stack

Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

MLOps.community
A Candid Conversation Around MCP and A2A // Rahul Parundekar and Sam Partee // #316 SF Live

MLOps.community

May 21, 2025 · 64:42


Demetrios, Sam Partee, and Rahul Parundekar unpack the chaos of AI agent tools and the evolving world of MCP (Model Context Protocol). With sharp insights and plenty of laughs, they dig into tool permissions, security quirks, agent memory, and the messy path to making agents actually useful.

// Bio

Sam Partee
Sam Partee is the CTO and Co-Founder of Arcade AI. Previously a Principal Engineer leading the Applied AI team at Redis, Sam led the effort in creating the ecosystem around Redis as a vector database. He is a contributor to multiple OSS projects including Langchain, DeterminedAI, LlamaIndex and Chapel amongst others. While at Cray/HPE he created the SmartSim AI framework which is now used at national labs around the country to integrate HPC simulations like climate models with AI.

Rahul Parundekar
Rahul Parundekar is the founder of AI Hero. He graduated with a Master's in Computer Science from USC Los Angeles in 2010, and embarked on a career focused on Artificial Intelligence. From 2010-2017, he worked as a Senior Researcher at Toyota ITC working on agent autonomy within vehicles. His journey continued as the Director of Data Science at FigureEight (later acquired by Appen), where he and his team developed an architecture supporting over 36 ML models and managing over a million predictions daily. Since 2021, he has been working on AI Hero, aiming to democratize AI access, while also consulting on LLMOps (Large Language Model Operations) and AI system scalability. Other than his full-time role as a founder, he is also passionate about community engagement, and actively organizes MLOps events in SF, and contributes educational content on RAG and LLMOps at learn.mlops.community.

// Related Links
Websites: arcade.dev | aihero.studio

~~~~~~~~ ✌️ Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TY
Explore MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Rahul on LinkedIn: /rparundekar
Connect with Sam on LinkedIn: /sampartee

Timestamps:
[00:00] Agents & Tools, Explained (Without Melting Your Brain)
[09:51] MVP Servers: Why Everything's on Fire (and How to Fix It)
[13:18] Can We Actually Trust the Protocol?
[18:13] KYC, But Make It AI (and Less Painful)
[25:25] Web Automation Tests: The Bugs Strike Back
[28:18] MCP Dev: What Went Wrong (and What Saved Us)
[33:53] Social Login: One Button to Rule Them All
[39:33] What Even Is an AI-Native Developer?
[42:21] Betting Big on Smarter Models (High Risk, High Reward)
[51:40] Harrison's Bold New Tactic (With Real-Life Magic Tricks)
[55:31] Async Task Handoffs: Herding Cats, But Digitally
[1:00:37] Getting AI to Actually Help Your Workflow
[1:03:53] The Infamous Varma System Error (And How We Dodge It)

Generation AI
Model Context Protocol (MCP): Making AI Agents Talk to Your Data

Generation AI

May 13, 2025 · 31:07


In this insightful episode of Generation AI, hosts Ardis Kadiu and JC Bonilla tackle Model Context Protocol (MCP), a new standardization that's gaining rapid adoption across the AI industry. They explain how MCP functions as a universal adapter between AI models and data sources, solving the "Frankenstein middleware" problem that makes building AI agents so complex today. The hosts break down why this matters for higher education professionals, how it reduces hallucinations by improving data access, and why major players like OpenAI, Google, and HubSpot are already implementing it. This episode offers critical insight into how standardization will make AI tools more useful and less complex for everyone.

What is Model Context Protocol (MCP)? (00:01:00)
  • Introduction to MCP as a standardization protocol for AI agents
  • Hosts explain MCP as a way to help AI access context and data
  • The "three-legged stool" of AI agents: intelligence, context, and action
  • MCP provides the standard for how agents communicate with data sources

MCP as the Universal AI Adapter (00:04:42)
  • JC compares MCP to standardized protocols like TCP/IP and USB-C
  • MCP sits between models like Claude or Gemini and various data sources
  • It eliminates the need for custom connectors between each tool and AI model
  • The protocol's simplicity as a minimal viable product (MVP) is key to its success

How MCP Works (00:07:03)
  • MCP is a protocol, not an API, that describes format and flow
  • "Discovery first" approach where AI asks "what can you do?"
  • Uses JSON format for tools and data exchange
  • Works both locally and remotely over HTTP

The Technical Benefits of MCP (00:13:14)
  • Solves the "m by n headache" of needing separate connectors for each model-tool pair
  • Reduces hallucinations by providing AI with reliable data sources
  • Gives AI models access to specialized tools for tasks they struggle with
  • Enables "grounding" in real data rather than making things up

Industry Adoption and Momentum (00:17:14)
  • OpenAI, Google, HubSpot, LangChain and others already implementing MCP
  • HubSpot's beta MCP server allows for direct CRM data access in Claude
  • Growing availability for tools like Slack, Teams, and Zapier
  • Discussion of how MCP layers on top of existing APIs

Practical Applications (00:20:36)
  • Higher education examples: connecting LMS, advisor notes, financial aid systems
  • Sales use case: AI agents accessing CRM data through MCP for follow-ups
  • DevOps: AI monitoring logs, creating tickets, and managing communication
  • Analytics: Connecting data sources, models, and reporting tools seamlessly

Challenges and Considerations (00:23:17)
  • MCP requires widespread adoption to be truly effective
  • Product teams must be convinced to implement it alongside existing APIs
  • Possibility that another protocol might eventually win out
  • Current technical hurdles for implementation that are being addressed

Call to Action for Listeners (00:26:03)
  • Experiment with MCP servers that connect to Claude desktop
  • For AI product builders: write MCP servers for your applications now
  • Ask AI vendors: "Do you speak MCP?" as a signal of cutting-edge capability
  • MCP as the new standard, comparable to asking "Do you have an API?"

Conclusion: The Future of AI Integration (00:29:14)
  • MCP's architectural implications for more open, modular AI systems
  • The need for agents to speak a common language across platforms
  • Invitation for listeners to share which workflows they'll connect once MCP goes mainstream

- - - -

Connect With Our Co-Hosts:
Ardis Kadiu: https://www.linkedin.com/in/ardis/ | https://twitter.com/ardis
Dr. JC Bonilla: https://www.linkedin.com/in/jcbonilla/ | https://twitter.com/jbonillx

About The Enrollify Podcast Network: Generation AI is a part of the Enrollify Podcast Network. If you like this podcast, chances are you'll like other Enrollify shows too! Enrollify is made possible by Element451 — the next-generation AI student engagement platform helping institutions create meaningful and personalized interactions with students. Learn more at element451.com.

Attend the 2025 Engage Summit! The Engage Summit is the premier conference for forward-thinking leaders and practitioners dedicated to exploring the transformative power of AI in education. Explore the strategies and tools to step into the next generation of student engagement, supercharged by AI. You'll leave ready to deliver the most personalized digital engagement experience every step of the way. Register now to secure your spot in Charlotte, NC, on June 24-25, 2025! Early bird registration ends February 1st -- https://engage.element451.com/register
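To make the "discovery first" idea above concrete, here is a hedged sketch of the request and response shapes involved. The messages follow the JSON-RPC style of the published MCP spec as best understood here, and the tool name and schema are hypothetical; check field names against the current specification before relying on them.

```python
# A sketch of MCP-style discovery and tool invocation: the client asks a
# server what it can do, then calls one of the advertised tools.
# All names below are illustrative placeholders.
import json

discover_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",  # "what can you do?"
}

# A server might answer with a catalogue of the tools it exposes.
discover_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "lookup_student_record",  # hypothetical tool
                "description": "Fetch a student record from the SIS",
                "inputSchema": {
                    "type": "object",
                    "properties": {"student_id": {"type": "string"}},
                    "required": ["student_id"],
                },
            }
        ]
    },
}

# The client then invokes a tool it discovered.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "lookup_student_record",
        "arguments": {"student_id": "12345"},
    },
}

print(json.dumps(call_request, indent=2))
```

The same exchange works over a local connection or remotely over HTTP, which is what lets one connector replace the custom "m by n" integrations the hosts describe.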

Convergence
AI-Driven Development: Driving adoption on your product teams, Team Culture, and AI-Native Engineering Practices

Convergence

May 7, 2025 · 44:22


How do you move from dabbling with AI and vibe coding to building real, production-grade software with it? In this episode, Austin Vance, CEO of Focused, returns, and we transition the conversation from building AI-enabled applications to fostering AI-native engineering teams. Austin shares how generative AI isn't just a shortcut—it's reshaping how we architect, code, and lead. We also get to hear Austin's thoughts on the leaked 'AI Mandate' memo from Shopify's CEO, Tobi Lutke. We cover what Austin refers to as 'AI-driven development', how to win over the skeptics on your teams, and why traditional patterns of software engineering might not be the best fit for LLM-driven workflows. Whether you're an engineer, product leader, or startup founder, this episode will give you a practical lens on what AI-native software development actually requires—and how to foster adoption on your teams quickly and safely to get the benefits of using AI in product delivery.

Unlock the full potential of your product team with Integral's player coaches, experts in lean, human-centered design. Visit integral.io/convergence for a free Product Success Lab workshop to gain clarity and confidence in tackling any product design or engineering challenge.

Inside the episode...
  • Why Shopify's leaked AI memo was a "permission slip" for your own team
  • The three personas in AI adoption: advocates, skeptics, and holdouts
  • How AI-driven development (AIDD) differs from "AI-assisted" workflows
  • Tools and practices Focused uses to ship faster and cheaper with AI
  • Pair programming vs. pairing with an LLM: similarities and mindset shifts
  • How teams are learning to prompt effectively—without prompt engineering training
  • Vibe coding vs. integrating with entrenched systems: what's actually feasible
  • Scaling engineering culture around non-determinism and experimentation
  • Practical tips for onboarding dev teams to tools like Cursor, Windsurf, and Vercel AI SDK
  • Using LLMs for deep codebase exploration, not just code generation

Mentioned in this episode: Cursor, Windsurf, LangChain, Claude, GPT-4 / ChatGPT, V0.dev, GitHub Copilot, Focused (focused.io), Shopify internal AI memo

Unlock the full potential of your product team with Integral's player coaches, experts in lean, human-centered design. Visit integral.io/convergence for a free Product Success Lab workshop to gain clarity and confidence in tackling any product design or engineering challenge.

Subscribe to the Convergence podcast wherever you get podcasts including video episodes to get updated on the other crucial conversations that we'll post on YouTube at youtube.com/@convergencefmpodcast. Learn something? Give us a 5 star review and like the podcast on YouTube. It's how we grow.

Follow the Pod
  • LinkedIn: https://www.linkedin.com/company/convergence-podcast/
  • X: https://twitter.com/podconvergence
  • Instagram: @podconvergence

Angular Master Podcast
AMP 72: The Future Is AI + Angular – Here's Why with Nir Kaufman

Angular Master Podcast

May 7, 2025 · 25:46


Just dropped a fresh episode of the Angular Master Podcast – and it's a must-listen for every frontend developer thinking about the future. This time I'm joined by the one and only Nir Kaufman — Google Developer Expert, international speaker, tech lead at Tikal, and the brilliant mind behind our newest initiative:

Convergence
Building Agentic Apps With Craft: Field Stories from Austin Vance, CEO, Co-Founder of Focused

Convergence

May 1, 2025 · 56:49


What does it actually take to build agentic AI applications that hold up in the real world? In this episode, Ashok sits down with Austin, founder of Focused, to share field stories and hard-won lessons from building AI systems that go beyond flashy demos. From legal assistants to government transparency tools, Austin breaks down the concrete criteria for identifying where AI makes sense — and where it doesn't. They unpack how to find the right starting point for your first agentic app, why integration with legacy systems is the real hurdle, and the engineering must-haves that keep AI behavior safe and reliable. You'll hear practical guidance on designing eval frameworks, using abstraction layers like LangChain, and how observability can shape your development roadmap just like in traditional software. Whether you're a product leader or a CTO, this conversation will help you distinguish hype from real opportunity in AI.

Unlock the full potential of your product team with Integral's player coaches, experts in lean, human-centered design. Visit integral.io/convergence for a free Product Success Lab workshop to gain clarity and confidence in tackling any product design or engineering challenge.

Inside the episode...
  • A practical checklist for identifying your first AI-powered app
  • The hidden cost of "AI for AI's sake" and where traditional software is better
  • Why repetitive knowledge work is prime territory for automation
  • How Focused helped Hamlet build an AI for parsing government meeting data
  • Where read-only data access gives you a safe starting point
  • Why integration is often more complex than the AI itself
  • The importance of eval frameworks and test-driven LLM development
  • How to use observability to continuously improve AI agent behavior
  • Speed vs. believability: surprising lessons from Groq-powered inference
  • Using multiple models in one system and LLMs to QA each other

Mentioned in this episode:
  • Hamlet (government transparency startup) - https://www.myhamlet.com/?convergence
  • LangChain - https://www.langchain.com/?convergence
  • Groq - https://groq.com/?convergence
  • Claude (Anthropic) - https://claud.ai/?convergence
  • Dspy Prompting framework - https://dspy.ai/?convergence
  • Shopify AI memo (referenced) - https://convergence.fm/episode/shopifys-leaked-ai-mandate-explained-6-takeaways-for-your-product-team?convergence
  • Amazon Bedrock / SageMaker - https://aws.amazon.com/bedrock/?convergence

Unlock the full potential of your product team with Integral's player coaches, experts in lean, human-centered design. Visit integral.io/convergence for a free Product Success Lab workshop to gain clarity and confidence in tackling any product design or engineering challenge.

Subscribe to the Convergence podcast wherever you get podcasts including video episodes to get updated on the other crucial conversations that we'll post on YouTube at youtube.com/@convergencefmpodcast. Learn something? Give us a 5 star review and like the podcast on YouTube. It's how we grow.

Follow the Pod
  • LinkedIn: https://www.linkedin.com/company/convergence-podcast/
  • X: https://twitter.com/podconvergence
  • Instagram: @podconvergence
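As a companion to the eval-framework and test-driven LLM development points above, here is a minimal, framework-free sketch of the idea; run_agent is a hypothetical stand-in for whatever agent or chain you are evaluating, and the cases and checks are deliberately simple placeholders.

```python
# A tiny eval harness: run the agent over fixed cases and report pass/fail.
# `run_agent` is a placeholder for your own LangChain chain, LangGraph app,
# or raw API call; the cases and checks here are illustrative only.
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # returns True if the output is acceptable
    label: str


def run_agent(prompt: str) -> str:
    # Stub so the harness runs end to end; replace with a real agent call.
    return f"Echo: {prompt}"


CASES = [
    EvalCase("Summarize the May 5 council meeting",
             lambda out: "council" in out.lower(),
             "mentions the meeting"),
    EvalCase("List agenda items as bullets",
             lambda out: len(out) > 0,
             "returns something"),
]


def run_evals() -> None:
    failures = 0
    for case in CASES:
        output = run_agent(case.prompt)
        ok = case.check(output)
        failures += 0 if ok else 1
        print(f"[{'PASS' if ok else 'FAIL'}] {case.label}")
    print(f"{len(CASES) - failures}/{len(CASES)} checks passed")


if __name__ == "__main__":
    run_evals()
```

Running a harness like this on every change, and feeding failures back from production observability, is the workflow the episode argues for.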

DigitalFeeling
Episode 114- Agents IA : ce que l'IA peut (vraiment) faire pour vous

DigitalFeeling

Apr 26, 2025 · 12:14


In this 114th episode, I explore a timely question: how can we use AI concretely today, beyond chatbots and buzzwords? Whether you are a trainer, marketer, entrepreneur, or simply curious about technology, this episode offers a clear, practical dive into how autonomous, intelligent AI systems work.

The MongoDB Podcast
EP. 263 Building Agents with Natural Language with guest

The MongoDB Podcast

Apr 25, 2025 · 71:04


**(Note: Spotify listeners can also watch the screen sharing video accompanying the audio. Other podcast platforms offer the audio-only version.)**

In this episode of MongoDB Podcast Live, host Shane McAllister is joined by Sachin Hejip from Dataworkz. Sachin will showcase "Dataworkz Agent Builder", which is built with MongoDB Atlas Vector Search, and demonstrate how it can use Natural Language to create Agents and, in turn, automate and simplify the creation of Agentic RAG applications. Sachin will demo the MongoDB Leafy Portal Chatbot Agent, which combines operational data with unstructured data for personalised customer experience and support, built using Dataworkz and MongoDB.

Struggling with millions of unstructured documents, legacy records, or scattered data formats? Discover how AI, Large Language Models (LLMs), and MongoDB are revolutionizing data management in this episode of the MongoDB Podcast.

Join host Shane McAllister and the team as they delve into tackling complex data challenges using cutting-edge technology. Learn how MongoDB Atlas Vector Search enables powerful semantic search and Retrieval Augmented Generation (RAG) applications, transforming chaotic information into valuable insights. Explore integrations with popular frameworks like Langchain and Llama Index.

Find out how to efficiently process and make sense of your unstructured data, potentially saving significant costs and unlocking new possibilities. Ready to dive deeper?

#MongoDB #AI #LLM #LargeLanguageModels #VectorSearch #AtlasVectorSearch #UnstructuredData #Podcast #DataManagement #Dataworkz #Observability #Developer #BigData #RAG
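For a sense of what the Atlas Vector Search pattern mentioned above looks like in code, here is a hedged Python sketch using pymongo's $vectorSearch aggregation stage; the connection string, database, collection, index name, and embedding function are all placeholders rather than details from the episode.

```python
# Semantic retrieval with MongoDB Atlas Vector Search via an aggregation
# pipeline. Assumes a vector index named "vector_index" over an "embedding"
# field; all names and the connection string below are placeholders.
from pymongo import MongoClient


def embed(text: str) -> list[float]:
    # Placeholder embedding; in practice call an embedding model
    # (OpenAI, Cohere, Voyage, ...) and return its vector.
    return [0.0] * 1536


client = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")
collection = client["support"]["articles"]

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",       # name of the Atlas vector index
            "path": "embedding",           # field holding document vectors
            "queryVector": embed("How do I reset my password?"),
            "numCandidates": 100,
            "limit": 5,
        }
    },
    {"$project": {"_id": 0, "title": 1, "body": 1}},
]

for doc in collection.aggregate(pipeline):
    print(doc.get("title"))
```

In a RAG setup, the returned documents would be appended to the LLM prompt, whether you wire that up by hand or through an integration such as LangChain or LlamaIndex.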

Oracle University Podcast
Integrating APEX with OCI AI Services

Oracle University Podcast

Apr 22, 2025 · 20:01


Discover how Oracle APEX leverages OCI AI services to build smarter, more efficient applications. Hosts Lois Houston and Nikita Abraham interview APEX experts Chaitanya Koratamaddi, Apoorva Srinivas, and Toufiq Mohammed about how key services like OCI Vision, Oracle Digital Assistant, and Document Understanding integrate with Oracle APEX.   Packed with real-world examples, this episode highlights all the ways you can enhance your APEX apps.   Oracle APEX: Empowering Low Code Apps with AI: https://mylearn.oracle.com/ou/course/oracle-apex-empowering-low-code-apps-with-ai/146047/ Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode.   ---------------------------------------------------------------   Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started! 00:25 Lois: Hello and welcome to the Oracle University Podcast. I'm Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Team Lead: Editorial Services. Nikita: Hi everyone! Last week, we looked at how generative AI powers Oracle APEX and in today's episode, we're going to focus on integrating APEX with OCI AI Services. Lois: That's right, Niki. We're going to look at how you can use Oracle AI services like OCI Vision, Oracle Digital Assistant, Document Understanding, OCI Generative AI, and more to enhance your APEX apps. 01:03 Nikita: And to help us with it all, we've got three amazing experts with us, Chaitanya Koratamaddi, Director of Product Management at Oracle, and senior product managers, Apoorva Srinivas and Toufiq Mohammed. In today's episode, we'll go through each Oracle AI service and look at how it interacts with APEX. Apoorva, let's start with you. Can you explain what the OCI Vision service is? Apoorva: Oracle Cloud Infrastructure Vision is a serverless multi-tenant service accessible using the console or REST APIs. You can upload images to detect and classify objects in them. With prebuilt models available, developers can quickly build image recognition into their applications without machine learning expertise. OCI Vision service provides a fully managed model infrastructure. With complete integration with OCI Data Labeling, you can build custom models easily. OCI Vision service provides pretrained models-- Image Classification, Object Detection, Face Detection, and Text Recognition. You can build custom models for Image Classification and Object Detection. 02:24 Lois: Ok. What about its use cases? How can OCI Vision make APEX apps more powerful? Apoorva: Using OCI Vision, you can make images and videos discoverable and searchable in your APEX app.  You can use OCI Vision to detect and classify objects in the images. OCI Vision also highlights the objects using a red rectangular box. This comes in handy in use cases such as detecting vehicles that have violated the rules in traffic images. You can use OCI Vision to identify visual anomalies in your data. This is a very popular use case where you can detect anomalies in cancer X-ray images to detect cancer. These are some of the most popular use cases of using OCI Vision with your APEX app. 
But the possibilities are endless and you can use OCI Vision for any of your image analysis. 03:29 Nikita: Let's shift gears to Oracle Digital Assistant. Chaitanya, can you tell us what it's all about? Chaitanya: Oracle Digital Assistant is a low-code conversational AI platform that allows businesses to build and deploy AI assistants. It provides natural language understanding, automatic speech recognition, and text-to-speech capabilities to enable human-like interactions with customers and employees. Oracle Digital Assistant comes with prebuilt templates for you to get started.  04:00 Lois: What are its key features and benefits, Chaitanya? How does it enhance the user experience? Chaitanya: Oracle Digital Assistant provides conversational AI capabilities that include generative AI features, natural language understanding and ML, AI-powered voice, and analytics and insights. Integration with enterprise applications become easier with unified conversational experience, prebuilt chatbots for Oracle Cloud applications, and chatbot architecture frameworks. Oracle Digital Assistant provides advanced conversational design tools, conversational designer, dialogue and domain trainer, and native multilingual support. Oracle Digital Assistant is open, scalable, and secure. It provides multi-channel support, automated bot-to-agent transfer, and integrated authentication profile. 04:56 Nikita: And what about the architecture? What happens at the back end? Chaitanya: Developers assemble digital assistants from one or more skills. Skills can be based on prebuilt skills provided by Oracle or third parties, custom developed, or based on one of the many skill templates available. 05:16 Lois: Chaitanya, what exactly are “skills” within the Oracle Digital Assistant framework?  Chaitanya: Skills are individual chatbots that are designed to interact with users and fulfill specific type of tasks. Each skill helps a user complete a task through a combination of text messages and simple UI elements like select list. When a user request is submitted through a channel, the Digital Assistant routes the user's request to the most appropriate skill to satisfy the user's request. Skills can combine multilingual NLP deep learning engine, a powerful dialogflow engine, and integration components to connect to back-end systems.  Skills provide a modular way to build your chatbot functionality. Now users connect with a chatbot through channels such as Facebook, Microsoft Teams, or in our case, Oracle APEX chatbot, which is embedded into an APEX application. 06:21 Nikita: That's fascinating. So, what are some use cases of Oracle Digital Assistant in APEX apps? Chaitanya: Digital assistants streamline approval processes by collecting information, routing requests, and providing status updates. Digital assistants offer instant access to information and documentation, answering common questions and guiding users. Digital assistants assist sales teams by automating tasks, responding to inquiries, and guiding prospects through the sales funnel. Digital assistants facilitate procurement by managing orders, tracking deliveries, and handling supplier communication. Digital assistants simplify expense approvals by collecting reports, validating receipts, and routing them for managerial approval. Digital assistants manage inventory by tracking stock levels, reordering supplies, and providing real-time inventory updates. Digital assistants have become a common UX feature in any enterprise application. 
07:28 Want to learn how to design stunning, responsive enterprise applications directly from your browser with minimal coding? The new Oracle APEX Developer Professional learning path and certification enables you to leverage AI-assisted development, including generative AI and Database 23ai, to build secure, scalable web and mobile applications with advanced AI-powered features. From now through May 15, 2025, we're waiving the certification exam fee (valued at $245). So, what are you waiting for? Visit mylearn.oracle.com to get started today. 08:09 Nikita: Welcome back! Thanks for that, Chaitanya. Toufiq, let's talk about the OCI Document Understanding service. What is it? Toufiq: Using this service, you can upload documents to extract text, tables, and other key data. This means the service can automatically identify and extract relevant information from various types of documents, such as invoices, receipts, contracts, etc. The service is serverless and multitenant, which means you don't need to manage any servers or infrastructure. You can access this service using the console, REST APIs, SDK, or CLI, giving you multiple ways to integrate. 08:55 Nikita: What do we use for APEX apps?  Toufiq: For APEX applications, we will be using REST APIs to integrate the service. Additionally, you can process individual files or batches of documents using the ProcessorJob API endpoint. This flexibility allows you to handle different volumes of documents efficiently, whether you need to process a single document or thousands at once. With these capabilities, the OCI Document Understanding service can significantly streamline your document processing tasks, saving time and reducing the potential for manual errors. 09:36 Lois: Ok.  What are the different types of models available? How do they cater to various business needs? Toufiq: Let us start with pre-trained models. These are ready-to-use models that come right out of the box, offering a range of functionalities. The available models are Optical Character Recognition (OCR) enables the service to extract text from documents, allowing you to digitize, scan the documents effortlessly. You can precisely extract text content from documents. Key-value extraction, useful in streamlining tasks like invoice processing. Table extraction can intelligently extract tabular data from documents. Document classification automatically categorizes documents based on their content. OCR PDF enables seamless extraction of text from PDF files. Now, what if your business needs go beyond these pre-trained models. That's where custom models come into play. You have the flexibility to train and build your own models on top of these foundational pre-trained models. Models available for training are key value extraction and document classification. 10:50 Nikita: What does the architecture look like for OCI Document Understanding? Toufiq: You can ingest or supply the input file in two different ways. You can upload the file to an OCI Object Storage location. And in your request, you can point the Document Understanding service to pick the file from this Object Storage location.  Alternatively, you can upload a file directly from your computer. Once the file is uploaded, the Document Understanding service can process the file and extract key information using the pre-trained models. You can also customize models to tailor the extraction to your data or use case. 
After processing the file, the Document Understanding service stores the results in JSON format in the Object Storage output bucket. Your Oracle APEX application can then read the JSON file from the Object Storage output location, parse the JSON, and store useful information at local table or display it on the screen to the end user. 11:52 Lois: And what about use cases? How are various industries using this service? Toufiq: In financial services, you can utilize Document Understanding to extract data from financial statements, classify and categorize transactions, identify and extract payment details, streamline tax document management. Under manufacturing, you can perform text extraction from shipping labels and bill of lading documents, extract data from production reports, identify and extract vendor details. In the healthcare industry, you can automatically process medical claims, extract patient information from forms, classify and categorize medical records, identify and extract diagnostic codes. This is not an exhaustive list, but provides insights into some industry-specific use cases for Document Understanding. 12:50 Nikita: Toufiq, let's switch to the big topic everyone's excited about—the OCI Generative AI Service. What exactly is it? Toufiq: OCI Generative AI is a fully managed service that provides a set of state of the art, customizable large language models that cover a wide range of use cases. It provides enterprise grade generative AI with data governance and security, which means only you have access to your data and custom-trained models. OCI Generative AI provides pre-trained out-of-the-box LLMs for text generation, summarization, and text embedding. OCI Generative AI also provides necessary tools and infrastructure to define models with your own business knowledge. 13:37 Lois: Generally speaking, how is OCI Generative AI useful?  Toufiq: It supports various large language models. New models available from Meta and Cohere include Llama2 developed by Meta, and Cohere's Command model, their flagship text generation model. Additionally, Cohere offers the Summarize model, which provides high-quality summaries, accurately capturing essential information from documents, and the Embed model, converting text to vector embeddings representation. OCI Generative AI also offers dedicated AI clusters, enabling you to host foundational models on private GPUs. It integrates LangChain and open-source framework for developing new interfaces for generative AI applications powered by language models. Moreover, OCI Generative AI facilitates generative AI operations, providing content moderation controls, zero downtime endpoint model swaps, and endpoint deactivation and activation capabilities. For each model endpoint, OCI Generative AI captures a series of analytics, including call statistics, tokens processed, and error counts. 14:58 Nikita: What about the architecture? How does it handle user input? Toufiq: Users can input natural language, input/output examples, and instructions. The LLM analyzes the text and can generate, summarize, transform, extract information, or classify text according to the user's request. The response is sent back to the user in the specified format, which can include raw text or formatting like bullets and numbering, etc. 15:30 Lois: Can you share some practical use cases for generative AI in APEX apps?  Toufiq: Some of the OCI generative AI use cases for your Oracle APEX apps include text summarization. 
Generative AI can quickly summarize lengthy documents such as articles, transcripts, doctor's notes, and internal documents. Businesses can utilize generative AI to draft marketing copy, emails, blog posts, and product descriptions efficiently. Generative AI-powered chatbots are capable of brainstorming, problem solving, and answering questions. With generative AI, content can be rewritten in different styles or languages. This is particularly useful for localization efforts and catering to diverse audience. Generative AI can classify intent in customer chat logs, support tickets, and more. This helps businesses understand customer needs better and provide tailored responses and solutions. By searching call transcripts, internal knowledge sources, Generative AI enables businesses to efficiently answer user queries. This enhances information retrieval and decision-making processes. 16:47 Lois: Before we let you go, can you explain what Select AI is? How is it different from the other AI services? Toufiq: Select AI is a feature of Autonomous Database. This is where Select AI differs from the other AI services. Be it OCI Vision, Document Understanding, or OCI Generative AI, these are all freely managed standalone services on Oracle Cloud, accessible via REST APIs. Whereas Select AI is a feature available in Autonomous Database. That means to use Select AI, you need Autonomous Database.  17:26 Nikita: And what can developers do with Select AI? Toufiq: Traditionally, SQL is the language used to query the data in the database. With Select AI, you can talk to the database and get insights from the data in the database using human language. At the very basic, what Select AI does is it generates SQL queries using natural language, like an NL2SQL capability.  17:52 Nikita: How does it actually do that? Toufiq: When a user asks a question, the first step Select AI does is look into the AI profile, which you, as a developer, define. The AI profile holds crucial information, such as table names, the LLM provider, and the credentials needed to authenticate with the LLM service. Next, Select AI constructs a prompt. This prompt includes information from the AI profile and the user's question.  Essentially, it's a packet of information containing everything the LLM service needs to generate SQL. The next step is generating SQL using LLM. The prompt prepared by Select AI is sent to the available LLM services via REST. Which LLM to use is configured in the AI profile. The supported providers are OpenAI, Cohere, Azure OpenAI, and OCI Generative AI. Once the SQL is generated by the LLM service, it is returned to the application. The app can then handle the SQL query in various ways, such as displaying the SQL results in a report format or as charts, etc.  19:05 Lois: This has been an incredible discussion! Thank you, Chaitanya, Apoorva, and Toufiq, for walking us through all of these amazing AI tools. If you're ready to dive deeper, visit mylearn.oracle.com and search for the Oracle APEX: Empowering Low Code Apps with AI course. You'll find step-by-step guides and demos for everything we covered today.  Nikita: Until next week, this is Nikita Abraham… Lois: And Lois Houston signing off! 19:31 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.  
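As a small illustration of the Document Understanding flow Toufiq describes (results written as JSON to an Object Storage output bucket, then read by the application), here is a hedged Python sketch using the OCI SDK; the namespace, bucket, and object names are hypothetical, and an APEX app would typically do the equivalent step with REST calls rather than Python.

```python
# Read a Document Understanding result file from an OCI Object Storage
# output bucket and inspect it. Assumes `pip install oci` and a configured
# ~/.oci/config; all names below are placeholders.
import json

import oci

config = oci.config.from_file()  # default profile
object_storage = oci.object_storage.ObjectStorageClient(config)

namespace = "my-namespace"                # placeholder
bucket = "doc-understanding-output"       # placeholder
object_name = "results/invoice-123.json"  # placeholder

response = object_storage.get_object(namespace, bucket, object_name)
result = json.loads(response.data.content)

# The exact JSON layout depends on which models you ran (OCR, key-value,
# table extraction, ...); this just shows the generic parse-and-use step
# a client performs before storing fields in a local table or displaying them.
print(json.dumps(result, indent=2)[:500])
```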

The MongoDB Podcast
EP. 262 Solving Unstructured Data Challenges with AI & Vector Search

The MongoDB Podcast

Apr 16, 2025 · 56:30


**(Note: Spotify listeners can also watch the screen sharing video accompanying the audio. Other podcast platforms offer the audio-only version.)**

Struggling with millions of unstructured documents, legacy records, or scattered data formats? Discover how AI, Large Language Models (LLMs), and MongoDB are revolutionizing data management in this episode of the MongoDB Podcast.

Join host Shane McAllister and the team as they delve into tackling complex data challenges using cutting-edge technology. Learn how MongoDB Atlas Vector Search enables powerful semantic search and Retrieval Augmented Generation (RAG) applications, transforming chaotic information into valuable insights. Explore integrations with popular frameworks like Langchain and Llama Index.

Find out how to efficiently process and make sense of your unstructured data, potentially saving significant costs and unlocking new possibilities. Ready to dive deeper?

#MongoDB #AI #LLM #LargeLanguageModels #VectorSearch #AtlasVectorSearch #UnstructuredData #Podcast #DataManagement #RAG #SemanticSearch #Langchain #LlamaIndex #Developer #BigData

Data Gen
#196 - Pigment : Monter l'équipe GenAI appliquée au Produit (Licorne, +230 millions levés)

Data Gen

Apr 14, 2025 · 29:46


Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

If you're in SF: Join us for the Claude Plays Pokemon hackathon this Sunday! If you're not: Fill out the 2025 State of AI Eng survey for $250 in Amazon cards!

We are SO excited to share our conversation with Dharmesh Shah, co-founder of HubSpot and creator of Agent.ai.

A particularly compelling concept we discussed is the idea of "hybrid teams" - the next evolution in workplace organization where human workers collaborate with AI agents as team members. Just as we previously saw hybrid teams emerge in terms of full-time vs. contract workers, or in-office vs. remote workers, Dharmesh predicts that the next frontier will be teams composed of both human and AI members. This raises interesting questions about team dynamics, trust, and how to effectively delegate tasks between human and AI team members.

The discussion of business models in AI reveals an important distinction between Work as a Service (WaaS) and Results as a Service (RaaS), something Dharmesh has written extensively about. While RaaS has gained popularity, particularly in customer support applications where outcomes are easily measurable, Dharmesh argues that this model may be over-indexed. Not all AI applications have clearly definable outcomes or consistent economic value per transaction, making WaaS more appropriate in many cases. This insight is particularly relevant for businesses considering how to monetize AI capabilities.

The technical challenges of implementing effective agent systems are also explored, particularly around memory and authentication. Shah emphasizes the importance of cross-agent memory sharing and the need for more granular control over data access. He envisions a future where users can selectively share parts of their data with different agents, similar to how OAuth works but with much finer control.
This points to significant opportunities in developing infrastructure for secure and efficient agent-to-agent communication and data sharing.

Other highlights from our conversation:
  • The Evolution of AI-Powered Agents – Exploring how AI agents have evolved from simple chatbots to sophisticated multi-agent systems, and the role of MCPs in enabling that.
  • Hybrid Digital Teams and the Future of Work – How AI agents are becoming teammates rather than just tools, and what this means for business operations and knowledge work.
  • Memory in AI Agents – The importance of persistent memory in AI systems and how shared memory across agents could enhance collaboration and efficiency.
  • Business Models for AI Agents – Exploring the shift from software as a service (SaaS) to work as a service (WaaS) and results as a service (RaaS), and what this means for monetization.
  • The Role of Standards Like MCP – Why MCP has been widely adopted and how it enables agent collaboration, tool use, and discovery.
  • The Future of AI Code Generation and Software Engineering – How AI-assisted coding is changing the role of software engineers and what skills will matter most in the future.
  • Domain Investing and Efficient Markets – Dharmesh's approach to domain investing and how inefficiencies in digital asset markets create business opportunities.
  • The Philosophy of Saying No – Lessons from "Sorry, You Must Pass" and how prioritization leads to greater productivity and focus.

Timestamps:
  • 00:00 Introduction and Guest Welcome
  • 02:29 Dharmesh Shah's Journey into AI
  • 05:22 Defining AI Agents
  • 06:45 The Evolution and Future of AI Agents
  • 13:53 Graph Theory and Knowledge Representation
  • 20:02 Engineering Practices and Overengineering
  • 25:57 The Role of Junior Engineers in the AI Era
  • 28:20 Multi-Agent Systems and MCP Standards
  • 35:55 LinkedIn's Legal Battles and Data Scraping
  • 37:32 The Future of AI and Hybrid Teams
  • 39:19 Building Agent AI: A Professional Network for Agents
  • 40:43 Challenges and Innovations in Agent AI
  • 45:02 The Evolution of UI in AI Systems
  • 01:00:25 Business Models: Work as a Service vs. Results as a Service
  • 01:09:17 The Future Value of Engineers
  • 01:09:51 Exploring the Role of Agents
  • 01:10:28 The Importance of Memory in AI
  • 01:11:02 Challenges and Opportunities in AI Memory
  • 01:12:41 Selective Memory and Privacy Concerns
  • 01:13:27 The Evolution of AI Tools and Platforms
  • 01:18:23 Domain Names and AI Projects
  • 01:32:08 Balancing Work and Personal Life
  • 01:35:52 Final Thoughts and Reflections

Transcript

Alessio [00:00:04]: Hey everyone, welcome back to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Small AI.swyx [00:00:12]: Hello, and today we're super excited to have Dharmesh Shah to join us. I guess your relevant title here is founder of Agent AI.Dharmesh [00:00:20]: Yeah, that's true for this. Yeah, creator of Agent.ai and co-founder of HubSpot.swyx [00:00:25]: Co-founder of HubSpot, which I followed for many years, I think 18 years now, gonna be 19 soon. And you caught, you know, people can catch up on your HubSpot story elsewhere. I should also thank Sean Puri, who I've chatted with back and forth, who's been, I guess, getting me in touch with your people. But also, I think like, just giving us a lot of context, because obviously, My First Million joined you guys, and they've been chatting with you guys a lot. So for the business side, we can talk about that, but I kind of wanted to engage your CTO, agent, engineer side of things.
So how did you get agent religion?Dharmesh [00:01:00]: Let's see. So I've been working, I'll take like a half step back, a decade or so ago, even though actually more than that. So even before HubSpot, the company I was contemplating that I had named for was called Ingenisoft. And the idea behind Ingenisoft was a natural language interface to business software. Now realize this is 20 years ago, so that was a hard thing to do. But the actual use case that I had in mind was, you know, we had data sitting in business systems like a CRM or something like that. And my kind of what I thought clever at the time. Oh, what if we used email as the kind of interface to get to business software? And the motivation for using email is that it automatically works when you're offline. So imagine I'm getting on a plane or I'm on a plane. There was no internet on planes back then. It's like, oh, I'm going through business cards from an event I went to. I can just type things into an email just to have them all in the backlog. When it reconnects, it sends those emails to a processor that basically kind of parses effectively the commands and updates the software, sends you the file, whatever it is. And there was a handful of commands. I was a little bit ahead of the times in terms of what was actually possible. And I reattempted this natural language thing with a product called ChatSpot that I did back 20...swyx [00:02:12]: Yeah, this is your first post-ChatGPT project.Dharmesh [00:02:14]: I saw it come out. Yeah. And so I've always been kind of fascinated by this natural language interface to software. Because, you know, as software developers, myself included, we've always said, oh, we build intuitive, easy-to-use applications. And it's not intuitive at all, right? Because what we're doing is... We're taking the mental model that's in our head of what we're trying to accomplish with said piece of software and translating that into a series of touches and swipes and clicks and things like that. And there's nothing natural or intuitive about it. And so natural language interfaces, for the first time, you know, whatever the thought is you have in your head and expressed in whatever language that you normally use to talk to yourself in your head, you can just sort of emit that and have software do something. And I thought that was kind of a breakthrough, which it has been. And it's gone. So that's where I first started getting into the journey. I started because now it actually works, right? So once we got ChatGPT and you can take, even with a few-shot example, convert something into structured, even back in the ChatGP 3.5 days, it did a decent job in a few-shot example, convert something to structured text if you knew what kinds of intents you were going to have. And so that happened. And that ultimately became a HubSpot project. But then agents intrigued me because I'm like, okay, well, that's the next step here. So chat's great. Love Chat UX. But if we want to do something even more meaningful, it felt like the next kind of advancement is not this kind of, I'm chatting with some software in a kind of a synchronous back and forth model, is that software is going to do things for me in kind of a multi-step way to try and accomplish some goals. So, yeah, that's when I first got started. It's like, okay, what would that look like? Yeah. And I've been obsessed ever since, by the way.Alessio [00:03:55]: Which goes back to your first experience with it, which is like you're offline. Yeah. And you want to do a task. 
You don't need to do it right now. You just want to queue it up for somebody to do it for you. Yes. As you think about agents, like, let's start at the easy question, which is like, how do you define an agent? Maybe. You mean the hardest question in the universe? Is that what you mean?Dharmesh [00:04:12]: You said you have an irritating take. I do have an irritating take. I think, well, some number of people have been irritated, including within my own team. So I have a very broad definition for agents, which is it's AI-powered software that accomplishes a goal. Period. That's it. And what irritates people about it is like, well, that's so broad as to be completely non-useful. And I understand that. I understand the criticism. But in my mind, if you kind of fast forward months, I guess, in AI years, the implementation of it, and we're already starting to see this, and we'll talk about this, different kinds of agents, right? So I think in addition to having a usable definition, and I like yours, by the way, and we should talk more about that, that you just came out with, the classification of agents actually is also useful, which is, is it autonomous or non-autonomous? Does it have a deterministic workflow? Does it have a non-deterministic workflow? Is it working synchronously? Is it working asynchronously? Then you have the different kind of interaction modes. Is it a chat agent, kind of like a customer support agent would be? You're having this kind of back and forth. Is it a workflow agent that just does a discrete number of steps? So there's all these different flavors of agents. So if I were to draw it in a Venn diagram, I would draw a big circle that says, this is agents, and then I have a bunch of circles, some overlapping, because they're not mutually exclusive. And so I think that's what's interesting, and we're seeing development along a bunch of different paths, right? So if you look at the first implementation of agent frameworks, you look at Baby AGI and AutoGBT, I think it was, not Autogen, that's the Microsoft one. They were way ahead of their time because they assumed this level of reasoning and execution and planning capability that just did not exist, right? So it was an interesting thought experiment, which is what it was. Even the guy that, I'm an investor in Yohei's fund that did Baby AGI. It wasn't ready, but it was a sign of what was to come. And so the question then is, when is it ready? And so lots of people talk about the state of the art when it comes to agents. I'm a pragmatist, so I think of the state of the practical. It's like, okay, well, what can I actually build that has commercial value or solves actually some discrete problem with some baseline of repeatability or verifiability?swyx [00:06:22]: There was a lot, and very, very interesting. I'm not irritated by it at all. Okay. As you know, I take a... There's a lot of anthropological view or linguistics view. And in linguistics, you don't want to be prescriptive. You want to be descriptive. Yeah. So you're a goals guy. That's the key word in your thing. And other people have other definitions that might involve like delegated trust or non-deterministic work, LLM in the loop, all that stuff. The other thing I was thinking about, just the comment on Baby AGI, LGBT. Yeah. In that piece that you just read, I was able to go through our backlog and just kind of track the winter of agents and then the summer now. Yeah. And it's... We can tell the whole story as an oral history, just following that thread. 
And it's really just like, I think, I tried to explain the why now, right? Like I had, there's better models, of course. There's better tool use with like, they're just more reliable. Yep. Better tools with MCP and all that stuff. And I'm sure you have opinions on that too. Business model shift, which you like a lot. I just heard you talk about RAS with MFM guys. Yep. Cost is dropping a lot. Yep. Inference is getting faster. There's more model diversity. Yep. Yep. I think it's a subtle point. It means that like, you have different models with different perspectives. You don't get stuck in the basin of performance of a single model. Sure. You can just get out of it by just switching models. Yep. Multi-agent research and RL fine tuning. So I just wanted to let you respond to like any of that.Dharmesh [00:07:44]: Yeah. A couple of things. Connecting the dots on the kind of the definition side of it. So we'll get the irritation out of the way completely. I have one more, even more irritating leap on the agent definition thing. So here's the way I think about it. By the way, the kind of word agent, I looked it up, like the English dictionary definition. The old school agent, yeah. Is when you have someone or something that does something on your behalf, like a travel agent or a real estate agent acts on your behalf. It's like proxy, which is a nice kind of general definition. So the other direction I'm sort of headed, and it's going to tie back to tool calling and MCP and things like that, is if you, and I'm not a biologist by any stretch of the imagination, but we have these single-celled organisms, right? Like the simplest possible form of what one would call life. But it's still life. It just happens to be single-celled. And then you can combine cells and then cells become specialized over time. And you have much more sophisticated organisms, you know, kind of further down the spectrum. In my mind, at the most fundamental level, you can almost think of having atomic agents. What is the simplest possible thing that's an agent that can still be called an agent? What is the equivalent of a kind of single-celled organism? And the reason I think that's useful is right now we're headed down the road, which I think is very exciting around tool use, right? That says, okay, the LLMs now can be provided a set of tools that it calls to accomplish whatever it needs to accomplish in the kind of furtherance of whatever goal it's trying to get done. And I'm not overly bothered by it, but if you think about it, if you just squint a little bit and say, well, what if everything was an agent? And what if tools were actually just atomic agents? Because then it's turtles all the way down, right? Then it's like, oh, well, all that's really happening with tool use is that we have a network of agents that know about each other through something like an MMCP and can kind of decompose a particular problem and say, oh, I'm going to delegate this to this set of agents. And why do we need to draw this distinction between tools, which are functions most of the time? And an actual agent. And so I'm going to write this irritating LinkedIn post, you know, proposing this. It's like, okay. And I'm not suggesting we should call even functions, you know, call them agents. But there is a certain amount of elegance that happens when you say, oh, we can just reduce it down to one primitive, which is an agent that you can combine in complicated ways to kind of raise the level of abstraction and accomplish higher order goals. 
Anyway, that's my answer. I'd say that's a success. Thank you for coming to my TED Talk on agent definitions.Alessio [00:09:54]: How do you define the minimum viable agent? Do you already have a definition for, like, where you draw the line between a cell and an atom? Yeah.Dharmesh [00:10:02]: So in my mind, it has to, at some level, use AI in order for it to—otherwise, it's just software. It's like, you know, we don't need another word for that. And so that's probably where I draw the line. So then the question, you know, the counterargument would be, well, if that's true, then lots of tools themselves are actually not agents because they're just doing a database call or a REST API call or whatever it is they're doing. And that does not necessarily qualify them, which is a fair counterargument. And I accept that. It's like a good argument. I still like to think about—because we'll talk about multi-agent systems, because I think—so we've accepted, which I think is true, lots of people have said it, and you've hopefully combined some of those clips of really smart people saying this is the year of agents, and I completely agree, it is the year of agents. But then shortly after that, it's going to be the year of multi-agent systems or multi-agent networks. I think that's where it's going to be headed next year. Yeah.swyx [00:10:54]: OpenAI's already on that. Yeah. My quick philosophical engagement with you on this. I often think about kind of the other spectrum, the other end of the cell spectrum. So single cell is life, multi-cell is life, and you clump a bunch of cells together in a more complex organism, they become organs, like an eye and a liver or whatever. And then obviously we consider ourselves one life form. There's not like a lot of lives within me. I'm just one life. And now, obviously, I don't think people don't really like to anthropomorphize agents and AI. Yeah. But we are extending our consciousness and our brain and our functionality out into machines. I just saw you wear a Bee. Yeah. Which is, you know, it's nice. I have a Limitless pendant in my pocket.Dharmesh [00:11:37]: I got one of these boys. Yeah.swyx [00:11:39]: I'm testing it all out. You know, got to be early adopters. But like, we want to extend our personal memory into these things so that we can be good at the things that we're good at. And, you know, machines are good at it. Machines are there. So like, my definition of life is kind of like going outside of my own body now. I don't know if you've ever had like reflections on that. Like how yours. How our self is like actually being distributed outside of you. Yeah.Dharmesh [00:12:01]: I don't fancy myself a philosopher. But you went there. So yeah, I did go there. I'm fascinated by kind of graphs and graph theory and networks and have been for a long, long time. And to me, we're sort of all nodes in this kind of larger thing. It just so happens that we're looking at individual kind of life forms as they exist right now. But so the idea is when you put a podcast out there, there's these little kind of nodes you're putting out there of like, you know, conceptual ideas. Once again, you have varying kind of forms of those little nodes that are up there and are connected in varying and sundry ways. And so I just think of myself as being a node in a massive, massive network. And I'm producing more nodes as I put content or ideas. 
And, you know, you spend some portion of your life collecting dots, experiences, people, and some portion of your life then connecting dots from the ones that you've collected over time. And I found that really interesting things happen and you really can't know in advance how those dots are necessarily going to connect in the future. And that's, yeah. So that's my philosophical take. That's the, yes, exactly. Coming back.Alessio [00:13:04]: Yep. Do you like graph as an agent abstraction? That's been one of the hot topics with LangGraph and Pydantic and all that.Dharmesh [00:13:11]: I do. The thing I'm more interested in terms of use of graphs, and there's lots of work happening on that now, is graph data stores as an alternative in terms of knowledge stores and knowledge graphs. Yeah. Because, you know, so I've been in software now 30 plus years, right? So it's not 10,000 hours. It's like 100,000 hours that I've spent doing this stuff. And so I grew up with, so back in the day, you know, I started on mainframes. There was a product called IMS from IBM, which is basically an index database, what we'd call like a key value store today. Then we've had relational databases, right? We have tables and columns and foreign key relationships. We all know that. We have document databases like MongoDB, which is sort of a nested structure keyed by a specific index. We have vector stores, vector embedding databases. And graphs are interesting for a couple of reasons. One is, so it's not classically structured in a relational way. When you say structured database, to most people, they're thinking tables and columns and in relational database and set theory and all that. Graphs still have structure, but it's not the tables and columns structure. And you could wonder, and people have made this case, that they are a better representation of knowledge for LLMs and for AI generally than other things. So that's kind of thing number one conceptually, and that might be true, I think is possibly true. And the other thing that I really like about that in the context of, you know, I've been in the context of data stores for RAG is, you know, RAG, you say, oh, I have a million documents, I'm going to build the vector embeddings, I'm going to come back with the top X based on the semantic match, and that's fine. All that's very, very useful. But the reality is something gets lost in the chunking process and the, okay, well, those tend, you know, like, you don't really get the whole picture, so to speak, and maybe not even the right set of dimensions on the kind of broader picture. And it makes intuitive sense to me that if we did capture it properly in a graph form, that maybe that feeding into a RAG pipeline will actually yield better results for some use cases, I don't know, but yeah.Alessio [00:15:03]: And do you feel like at the core of it, there's this difference between imperative and declarative programs? Because if you think about HubSpot, it's like, you know, people and graph kind of goes hand in hand, you know, but I think maybe the software before was more like primary and foreign key based relationships, versus now the models can traverse through the graph more easily.Dharmesh [00:15:22]: Yes. So I like that representation. There's something. It's just conceptually elegant about graphs and just from the representation of it, they're much more discoverable, you can kind of see it, there's observability to it, versus kind of embeddings, which you can't really do much with as a human. 
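A minimal sketch of the graph-plus-RAG idea above: retrieve chunks semantically as usual, then walk a knowledge graph around the entities those chunks mention to recover relationships that chunking loses. The graph contents and the `vector_search` / `entities_in` helpers are hypothetical, and networkx stands in for a real graph store.

```python
import networkx as nx

# Toy knowledge graph: nodes are entities, edges carry a relation label.
kg = nx.DiGraph()
kg.add_edge("HubSpot", "Dharmesh Shah", relation="founded_by")
kg.add_edge("Dharmesh Shah", "agent.ai", relation="created")
kg.add_edge("agent.ai", "MCP server", relation="exposes_tools_via")

def expand_with_graph(chunks, entities_in):
    """Add one-hop graph facts for every entity mentioned in the retrieved chunks."""
    facts = []
    for chunk in chunks:
        for entity in entities_in(chunk):
            if entity not in kg:
                continue
            for _, neighbor, data in kg.out_edges(entity, data=True):
                facts.append(f"{entity} -[{data['relation']}]-> {neighbor}")
    return chunks + facts

# Hypothetical usage: vector_search and entities_in would come from your RAG stack.
# context = expand_with_graph(vector_search("who built agent.ai?", k=3), entities_in)
```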
You know, once they're in there, you can't pull stuff back out. But yeah, I like that kind of idea of it. And the other thing that's kind of, because I love graphs, I've been long obsessed with PageRank from back in the early days. And, you know, it's one of the kind of simplest algorithms, and everyone's been exposed to PageRank. And the idea is that, and so I had this other idea for a project, not a company, and I have hundreds of these, called NodeRank, is to be able to take the idea of PageRank and apply it to an arbitrary graph that says, okay, I'm going to define what authority looks like and say, okay, well, that's interesting to me, because then if you say, I'm going to take my knowledge store, and maybe this person that contributed some number of chunks to the graph data store has more authority on this particular use case or prompt that's being submitted than this other one that may, or maybe this one was more popular, or maybe this one has, whatever it is, there should be a way for us to kind of rank nodes in a graph and sort them in some, some useful way. Yeah.swyx [00:16:34]: So I think that's generally useful for, for anything. I think the, the problem, like, so even though at my conferences, GraphRAG is super popular and people are getting knowledge graph religion, and I will say like, it's getting traction in two areas, conversation memory, and then also just RAG in general, like the, the, the document data. Yeah. It's like a source. Most ML practitioners would say that knowledge graph is kind of like a dirty word. The graph database, people get graph religion, everything's a graph, and then they, they go really hard into it and then they get a, they get a graph that is too complex to navigate. Yes. And so like the, the, the simple way to put it is like you, running HubSpot, know the power of graphs, the way that Google has pitched them for many years, but I don't suspect that HubSpot itself uses a knowledge graph. No. Yeah.Dharmesh [00:17:26]: So when is it over engineering? Basically? It's a great question. I don't know. So the question now, like in AI land, right, is the, do we necessarily need to understand? So right now, LLMs for, for the most part are somewhat black boxes, right? We sort of understand how the, you know, the algorithm itself works, but we really don't know what's going on in there and, and how things come out. So if a graph data store is able to produce the outcomes we want, it's like, here's a set of queries I want to be able to submit and then it comes out with useful content. Maybe the underlying data store is as opaque as a vector embeddings or something like that, but maybe it's fine. Maybe we don't necessarily need to understand it to get utility out of it. And so maybe if it's messy, that's okay. Um, that's, it's just another form of lossy compression. Uh, it's just lossy in a way that we just don't completely understand in terms of, because it's going to grow organically. Uh, and it's not structured. It's like, ah, we're just gonna throw a bunch of stuff in there. Let the, the equivalent of the embedding algorithm, whatever they call it in graph land. Um, so the one with the best results wins. I think so. Yeah.swyx [00:18:26]: Or is this the practical side of me is like, yeah, it's, if it's useful, we don't necessarilyDharmesh [00:18:30]: need to understand it.swyx [00:18:30]: I have, I mean, I'm happy to push back as long as you want. 
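The "NodeRank" idea, PageRank applied to an arbitrary knowledge graph with a chosen notion of authority, maps fairly directly onto personalized PageRank. A sketch with networkx; the graph, node names, and weighting are made up for illustration.

```python
import networkx as nx

# Edges point from an "endorser" (an author or a topic) to the chunk it lends authority to.
g = nx.DiGraph()
g.add_edges_from([
    ("author:alice", "chunk:pricing-faq"),
    ("topic:billing", "chunk:pricing-faq"),
    ("author:alice", "chunk:refund-policy"),
    ("topic:billing", "chunk:refund-policy"),
    ("author:bob", "chunk:old-draft"),
])

# "Define what authority looks like": restart the random walk from nodes relevant to the prompt.
scores = nx.pagerank(g, personalization={"topic:billing": 1.0})

for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {node}")
```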
Uh, it's not practical to evaluate like the 10 different options out there because it takes time. It takes people, it takes, you know, resources, right? Set. That's the first thing. Second thing is your evals are typically on small things and some things only work at scale. Yup. Like graphs. Yup.Dharmesh [00:18:46]: Yup. That's, yeah, no, that's fair. And I think this is one of the challenges in terms of implementation of graph databases is that the most common approach that I've seen developers do, I've done it myself, is that, oh, I've got a Postgres database or a MySQL or whatever. I can represent a graph with a very set of tables with a parent child thing or whatever. And that sort of gives me the ability, uh, why would I need anything more than that? And the answer is, well, if you don't need anything more than that, you don't need anything more than that. But there's a high chance that you're sort of missing out on the actual value that, uh, the graph representation gives you. Which is the ability to traverse the graph, uh, efficiently in ways that kind of going through the, uh, traversal in a relational database form, even though structurally you have the data, practically you're not gonna be able to pull it out in, in useful ways. Uh, so you wouldn't like represent a social graph, uh, in, in using that kind of relational table model. It just wouldn't scale. It wouldn't work.swyx [00:19:36]: Uh, yeah. Uh, I think we want to move on to MCP. Yeah. But I just want to, like, just engineering advice. Yeah. Uh, obviously you've, you've, you've run, uh, you've, you've had to do a lot of projects and run a lot of teams. Do you have a general rule for over-engineering or, you know, engineering ahead of time? You know, like, because people, we know premature engineering is the root of all evil. Yep. But also sometimes you just have to. Yep. When do you do it? Yes.Dharmesh [00:19:59]: It's a great question. This is, uh, a question as old as time almost, which is what's the right and wrong levels of abstraction. That's effectively what, uh, we're answering when we're trying to do engineering. I tend to be a pragmatist, right? So here's the thing. Um, lots of times doing something the right way. Yeah. It's like a marginal increased cost in those cases. Just do it the right way. And this is what makes a, uh, a great engineer or a good engineer better than, uh, a not so great one. It's like, okay, all things being equal. If it's going to take you, you know, roughly close to constant time anyway, might as well do it the right way. Like, so do things well, then the question is, okay, well, am I building a framework as the reusable library? To what degree, uh, what am I anticipating in terms of what's going to need to change in this thing? Uh, you know, along what dimension? And then I think like a business person in some ways, like what's the return on calories, right? So, uh, and you look at, um, energy, the expected value of it's like, okay, here are the five possible things that could happen, uh, try to assign probabilities like, okay, well, if there's a 50% chance that we're going to go down this particular path at some day, like, or one of these five things is going to happen and it costs you 10% more to engineer for that. It's basically, it's something that yields a kind of interest compounding value. Um, as you get closer to the time of, of needing that versus having to take on debt, which is when you under engineer it, you're taking on debt. 
You're going to have to pay off when you do get to that eventuality where something happens. One thing as a pragmatist, uh, so I would rather under engineer something than over engineer it. If I were going to err on the side of something, and here's the reason is that when you under engineer it, uh, yes, you take on tech debt, uh, but the interest rate is relatively known and payoff is very, very possible, right? Which is, oh, I took a shortcut here as a result of which now this thing that should have taken me a week is now going to take me four weeks. Fine. But if that particular thing that you thought might happen, never actually, you never have that use case transpire or just doesn't, it's like, well, you just save yourself time, right? And that has value because you were able to do other things instead of, uh, kind of slightly over-engineering it away, over-engineering it. But there's no perfect answers in this art form in terms of, uh, and yeah, we'll, we'll bring kind of this layers of abstraction back on the code generation conversation, which we'll, uh, I think I have later on, butAlessio [00:22:05]: I was going to ask, we can just jump ahead quickly. Yeah. Like, as you think about vibe coding and all that, how does the. Yeah. Percentage of potential usefulness change when I feel like we're over-engineering a lot of times it's like the investment in syntax, it's less about the investment in like architecting. Yep. Yeah. How does that change your calculus?Dharmesh [00:22:22]: A couple of things, right? One is, um, so, you know, going back to that kind of ROI or a return on calories, kind of calculus or heuristic you think through, it's like, okay, well, what is it going to cost me to put this layer of abstraction above the code that I'm writing now, uh, in anticipating kind of future needs. If the cost of fixing, uh, or doing under engineering right now, uh, will trend towards zero, that says, okay, well, I don't have to get it right right now because even if I get it wrong, I'll run the thing for six hours instead of 60 minutes or whatever. It doesn't really matter, right? Like, because that's going to trend towards zero, the ability to refactor code. Um, and because we're going to not that long from now, we're going to have, you know, large code bases be able to exist, uh, you know, as, as context, uh, for a code generation or a code refactoring, uh, model. So I think it's going to make it, uh, make the case for under engineering, uh, even stronger. Which is why I take on that cost. You just pay the interest when you get there, it's not, um, just go on with your life, vibe code it and, uh, come back when you need to. Yeah.Alessio [00:23:18]: Sometimes I feel like there's no decision-making in some things like, uh, today I built an autosave for like our internal notes platform and I literally just asked Cursor. Can you add autosave? Yeah. I don't know if it's over under engineer. Yep. I just vibe coded it. Yep. And I feel like at some point we're going to get to the point where the models kindDharmesh [00:23:36]: of decide where the right line is, but this is where the, like the, in my mind, the danger is, right? So there's two sides to this. One is the cost of kind of development and coding and things like that stuff that, you know, we talk about. 
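The "return on calories" heuristic above is essentially an expected-value comparison between paying a little now and maybe paying more later; a toy version with invented numbers:

```python
# Toy expected-value check for "generalize now vs. take on the debt", all numbers invented.
p_need_it_later = 0.2        # chance the anticipated use case ever materializes
extra_cost_now = 0.10        # +10% effort today to engineer ahead
rework_cost_later = 0.40     # +40% effort later if we under-engineered and it does happen

engineer_ahead = extra_cost_now
take_the_debt = p_need_it_later * rework_cost_later  # the expected "interest" on the shortcut

print(f"engineer ahead now: {engineer_ahead:.2f}")
print(f"under-engineer    : {take_the_debt:.2f}")
print("cheaper in expectation:", "engineer ahead" if engineer_ahead < take_the_debt else "under-engineer")
```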
But then like in your example, you know, one of the risks that we have is that because adding a feature, uh, like a save or whatever the feature might be to a product as that price tends towards zero, are we going to be less discriminant about what features we add as a result of making products more complicated, which has a negative impact on the user and a negative impact on the business. Um, and so that's the thing I worry about if it starts to become too easy, are we going to be. Too promiscuous in our, uh, kind of extension, adding product extensions and things like that. It's like, ah, why not add X, Y, Z or whatever back then it was like, oh, we only have so many engineering hours or story points or however you measure things. Uh, that at least kept us in check a little bit. Yeah.Alessio [00:24:22]: And then over engineering, you're like, yeah, it's kind of like you're putting that on yourself. Yeah. Like now it's like the models don't understand that if they add too much complexity, it's going to come back to bite them later. Yep. So they just do whatever they want to do. Yeah. And I'm curious where in the workflow that's going to be, where it's like, Hey, this is like the amount of complexity and over-engineering you can do before you got to ask me if we should actually do it versus like do something else.Dharmesh [00:24:45]: So you know, we've already, let's like, we're leaving this, uh, in the code generation world, this kind of compressed, um, cycle time. Right. It's like, okay, we went from auto-complete, uh, in GitHub Copilot to like, oh, finish this particular thing and hit tab to a, oh, I sort of know your file or whatever. I can write out a full function to you to now I can like hold a bunch of the context in my head. Uh, so we can do app generation, which we have now with Lovable and Bolt and Replit Agent. Yeah. And other things. So then the question is, okay, well, where does it naturally go from here? So we're going to generate products. Make sense. We might be able to generate platforms as though I want a platform for ERP that does this, whatever. And that includes the APIs, includes the product and the UI, and all the things that make for a platform. There's nothing that says we would stop like, okay, can you generate an entire software company someday? Right. Uh, with the platform and the monetization and the go-to-market and the whatever. And you know, that that's interesting to me in terms of, uh, you know, what, when you take it to almost ludicrous levels of abstraction.swyx [00:25:39]: It's like, okay, turn it to 11. You mentioned vibe coding, so I have to, this is a blog post I haven't written, but I'm kind of exploring it. Is the junior engineer dead?Dharmesh [00:25:49]: I don't think so. I think what will happen is that the junior engineer will be able to, if all they're bringing to the table is the fact that they are a junior engineer, then yes, they're likely dead. But hopefully if they can communicate with carbon-based life forms, they can interact with product, if they're willing to talk to customers, they can take their kind of basic understanding of engineering and how kind of software works. I think that has value. So I have a 14-year-old right now who's taking a Python programming class, and some people ask me, it's like, why is he learning coding? And my answer is, is because it's not about the syntax, it's not about the coding. What he's learning is like the fundamental thing of like how things work. 
And there's value in that. I think there's going to be timeless value in systems thinking and abstractions and what that means. And whether functions manifested as math, which he's going to get exposed to regardless, or there are some core primitives to the universe, I think, that the more you understand them, those are what I would kind of think of as like really large dots in your life that will have a higher gravitational pull and value to them that you'll then be able to. So I want him to collect those dots, and he's not resisting. So it's like, okay, while he's still listening to me, I'm going to have him do things that I think will be useful.swyx [00:26:59]: You know, part of one of the pitches that I evaluated for AI engineer is a term. And the term is that maybe the traditional interview path or career path of software engineer goes away, which is because what's the point of LeetCode? Yeah. And, you know, it actually matters more that you know how to work with AI and to implement the things that you want. Yep.Dharmesh [00:27:16]: That's one of the like interesting things that's happened with generative AI. You know, you go from machine learning and the models and just that underlying form, which is like true engineering, right? Like the actual, what I call real engineering. I don't think of myself as a real engineer, actually. I'm a developer. But now with generative AI. We call it AI and it's obviously got its roots in machine learning, but it just feels like fundamentally different to me. Like you have the vibe. It's like, okay, well, this is just a whole different approach to software development to so many different things. And so I'm wondering now, it's like an AI engineer is like, if you were like to draw the Venn diagram, it's interesting because the cross between like AI things, generative AI and what the tools are capable of, what the models do, and this whole new kind of body of knowledge that we're still building out, it's still very young, intersected with kind of classic engineering, software engineering. Yeah.swyx [00:28:04]: I just described the overlap as it separates out eventually until it's its own thing, but it's starting out as software. Yeah.Alessio [00:28:11]: That makes sense. So to close the vibe coding loop, the other big hype now is MCPs. Obviously, I would say Claude Desktop and Cursor are like the two main drivers of MCP usage. I would say my favorite is the Sentry MCP. I can pull in errors and then you can just put the context in Cursor. How do you think about that abstraction layer? Does it feel... Does it feel almost too magical in a way? Do you think it's like you get enough? Because you don't really see how the server itself is then kind of like repackaging the information for you?Dharmesh [00:28:41]: I think MCP as a standard is one of the better things that's happened in the world of AI because a standard needed to exist and absent a standard, there was a set of things that just weren't possible. Now, we can argue whether it's the best possible manifestation of a standard or not. Does it do too much? Does it do too little? I get that, but it's just simple enough to both be useful and unobtrusive. It's understandable and adoptable by mere mortals, right? It's not overly complicated. You know, a reasonable engineer can stand up an MCP server relatively easily. The thing that has me excited about it is like, so I'm a big believer in multi-agent systems. And so that's going back to our kind of this idea of an atomic agent. 
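On the point that "a reasonable engineer can stand up an MCP server relatively easily", here is a minimal sketch, assuming the official Python MCP SDK's FastMCP helper; the tool itself and its valuation heuristic are invented for illustration.

```python
# Minimal MCP server sketch, assuming the Python MCP SDK (pip install "mcp").
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("domain-tools")

@mcp.tool()
def estimate_domain_value(domain: str) -> str:
    """Return a rough, made-up valuation for a domain name."""
    # Placeholder heuristic only: short .com names score higher.
    score = 1000 + (5000 if domain.endswith(".com") else 0) + max(0, 15 - len(domain)) * 500
    return f"{domain}: estimated ${score:,}"

if __name__ == "__main__":
    # Serves over stdio, so MCP clients such as Claude Desktop or Cursor can connect.
    mcp.run()
```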
So imagine the MCP server, like obviously it calls tools, but the way I think about it, so I'm working on my current passion project is agent.ai. And we'll talk more about that in a little bit. More about the, I think we should, because I think it's interesting not to promote the project at all, but there's some interesting ideas in there. One of which is around, we're going to need a mechanism for, if agents are going to collaborate and be able to delegate, there's going to need to be some form of discovery and we're going to need some standard way. It's like, okay, well, I just need to know what this thing over here is capable of. We're going to need a registry, which Anthropic's working on. I'm sure others will and have been doing directories of, and there's going to be a standard around that too. How do you build out a directory of MCP servers? I think that's going to unlock so many things just because, and we're already starting to see it. So I think MCP or something like it is going to be the next major unlock because it allows systems that don't know about each other, don't need to, it's that kind of decoupling of like Sentry and whatever tools someone else was building. And it's not just about, you know, Claude Desktop or things like, even on the client side, I think we're going to see very interesting consumers of MCP, MCP clients versus just the chatbot-y kind of things. Like, you know, Claude Desktop and Cursor and things like that. But yeah, I'm very excited about MCP in that general direction.swyx [00:30:39]: I think the typical cynical developer take, it's like, we have OpenAPI. Yeah. What's the new thing? I don't know if you have a, do you have a quick MCP versus everything else? Yeah.Dharmesh [00:30:49]: So it's, so I like OpenAPI, right? So just a descriptive thing. It's OpenAPI. OpenAPI. Yes, that's what I meant. So it's basically a self-documenting thing. We can do machine-generated, lots of things from that output. It's a structured definition of an API. I get that, love it. But MCPs sort of are kind of use case specific. They're perfect for exactly what we're trying to use them for around LLMs in terms of discovery. It's like, okay, I don't necessarily need to know kind of all this detail. And so right now we have, we'll talk more about like MCP server implementations, but We will? I think, I don't know. Maybe we won't. At least it's in my head. It's like a back processor. But I do think MCP adds value above OpenAPI. It's, yeah, just because it solves this particular thing. And if we had come to the world, which we have, like, it's like, hey, we already have OpenAPI. It's like, if that were good enough for the universe, the universe would have adopted it already. There's a reason why MCP is taking off, because it marginally adds something that was missing before and doesn't go too far. And so that's why the kind of rate of adoption, you folks have written about this and talked about it. Yeah, why MCP won. Yeah. And it won because the universe decided that this was useful and maybe it gets supplanted by something else. Yeah. And maybe we discover, oh, maybe OpenAPI was good enough the whole time. I doubt that.swyx [00:32:09]: The meta lesson, this is, I mean, he's an investor in DevTools companies. I work in developer experience at DevRel in DevTools companies. Yep. Everyone wants to own the standard. Yeah. I'm sure you guys have tried to launch your own standards. Actually, HubSpot is known for a standard, you know, obviously inbound marketing. 
But is there a standard or protocol that you ever tried to push? No.Dharmesh [00:32:30]: And there's a reason for this. Yeah. Is that? And I don't mean to speak for the people of HubSpot, but I personally. You kind of do. I'm not smart enough. That's not the, like, I think I have a. You're smart. Not enough for that. I'm much better off understanding the standards that are out there. And I'm more on the composability side. Let's, like, take the pieces of technology that exist out there, combine them in creative, unique ways. And I like to consume standards. I don't like to, and that's not that I don't like to create them. I just don't think I have the, both the raw wattage or the credibility. It's like, okay, well, who the heck is Dharmesh, and why should we adopt a standard he created?swyx [00:33:07]: Yeah, I mean, there are people who don't monetize standards, like OpenTelemetry is a big standard, and LightStep never capitalized on that.Dharmesh [00:33:15]: So, okay, so if I were to do a standard, there's two things that have been in my head in the past. I was one around, a very, very basic one around, I don't even have the domain, I have a domain for everything, for open marketing. Because the issue we had in HubSpot grew up in the marketing space. There we go. There was no standard around data formats and things like that. It doesn't go anywhere. But the other one, and I did not mean to go here, but I'm going to go here. It's called OpenGraph. I know the term was already taken, but it hasn't been used for like 15 years now for its original purpose. But what I think should exist in the world is right now, our information, all of us, nodes are in the social graph at Meta or the professional graph at LinkedIn. Both of which are actually relatively closed in actually very annoying ways. Like very, very closed, right? Especially LinkedIn. Especially LinkedIn. I personally believe that if it's my data, and if I would get utility out of it being open, I should be able to make my data open or publish it in whatever forms that I choose, as long as I have control over it as opt-in. So the idea is around OpenGraph that says, here's a standard, here's a way to publish it. I should be able to go to OpenGraph.org slash Dharmesh dot JSON and get it back. And it's like, here's your stuff, right? And I can choose along the way and people can write to it and I can approve. And there can be an entire system. And if I were to do that, I would do it as a... Like a public benefit, non-profit-y kind of thing, as this is a contribution to society. I wouldn't try to commercialize that. Have you looked at atproto? What's that? Atproto.swyx [00:34:43]: It's the protocol behind Bluesky. Okay. My good friend, Dan Abramov, who was the face of React for many, many years, now works there. And he actually did a talk that I can send you, which basically kind of tries to articulate what you just said. But he does, he loves doing these like really great analogies, which I think you'll like. Like, you know, a lot of our data is behind a handle, behind a domain. Yep. So he's like, all right, what if we flip that? What if it was like our handle and then the domain? Yep. So, and that's really like your data should belong to you. Yep. And I should not have to wait 30 days for my Twitter data to export. Yep.Dharmesh [00:35:19]: you should be able to at least be able to automate it or do like, yes, I should be able to plug it into an agentic thing. Yeah. Yes. I think we're... Because so much of our data is... 
Locked up. I think the trick here isn't that standard. It is getting the normies to care.swyx [00:35:37]: Yeah. Because normies don't care.Dharmesh [00:35:38]: That's true. But building on that, normies don't care. So, you know, privacy is a really hot topic and an easy word to use, but it's not a binary thing. Like there are use cases where, and we make these choices all the time, that I will trade, not all privacy, but I will trade some privacy for some productivity gain or some benefit to me that says, oh, I don't care about that particular data being online if it gives me this in return, or I don't mind sharing this information with this company.Alessio [00:36:02]: If I'm getting, you know, this in return, but that sort of should be my option. I think now with computer use, you can actually automate some of the exports. Yes. Like something we've been doing internally is like everybody exports their LinkedIn connections. Yep. And then internally, we kind of merge them together to see how we can connect our companies to customers or things like that.Dharmesh [00:36:21]: And not to pick on LinkedIn, but since we're talking about it, but they feel strongly enough on the, you know, do not take LinkedIn data that they will block even browser use kind of things or whatever. They go to great, great lengths, even to see patterns of usage. And it says, oh, there's no way you could have, you know, gotten that particular thing or whatever without, and it's, so it's, there's...swyx [00:36:42]: Wasn't there a Supreme Court case that they lost? Yeah.Dharmesh [00:36:45]: So the one they lost was around someone that was scraping public data that was on the public internet. And that particular company had not signed any terms of service or whatever. It's like, oh, I'm just taking data that's on, there was no, and so that's why they won. But now, you know, the question is around, can LinkedIn... I think they can. Like, when you use, as a user, you use LinkedIn, you are signing up for their terms of service. And if they say, well, this kind of use of your LinkedIn account that violates our terms of service, they can shut your account down, right? They can. And they, yeah, so, you know, we don't need to make this a discussion. By the way, I love the company, don't get me wrong. I'm an avid user of the product. You know, I've got... Yeah, I mean, you've got over a million followers on LinkedIn, I think. Yeah, I do. And I've known people there for a long, long time, right? And I have lots of respect. And I understand even where the mindset originally came from of this kind of members-first approach to, you know, a privacy-first. I sort of get that. But sometimes you sort of have to wonder, it's like, okay, well, that was 15, 20 years ago. There's likely some controlled ways to expose some data on some member's behalf and not just completely be a binary. It's like, no, thou shalt not have the data.swyx [00:37:54]: Well, just pay for sales navigator.Alessio [00:37:57]: Before we move to the next layer of instruction, anything else on MCP you mentioned? Let's move back and then I'll tie it back to MCPs.Dharmesh [00:38:05]: So I think the... Open this with agent. Okay, so I'll start with... Here's my kind of running thesis, is that as AI and agents evolve, which they're doing very, very quickly, we're going to look at them more and more. I don't like to anthropomorphize. We'll talk about why this is not that. Less as just like raw tools and more like teammates. They'll still be software. 
They should self-disclose as being software. I'm totally cool with that. But I think what's going to happen is that in the same way you might collaborate with a team member on Slack or Teams or whatever you use, you can imagine a series of agents that do specific things just like a team member might do, that you can delegate things to. You can collaborate. You can say, hey, can you take a look at this? Can you proofread that? Can you try this? You can... Whatever it happens to be. So I think it is... I will go so far as to say it's inevitable that we're going to have hybrid teams someday. And what I mean by hybrid teams... So back in the day, hybrid teams were, oh, well, you have some full-time employees and some contractors. Then it was like hybrid teams are some people that are in the office and some that are remote. That's the kind of form of hybrid. The next form of hybrid is like the carbon-based life forms and agents and AI and some form of software. So let's say we temporarily stipulate that I'm right about that over some time horizon that eventually we're going to have these kind of digitally hybrid teams. So if that's true, then the question you sort of ask yourself is that then what needs to exist in order for us to get the full value of that new model? It's like, okay, well... You sort of need to... It's like, okay, well, how do I... If I'm building a digital team, like, how do I... Just in the same way, if I'm interviewing for an engineer or a designer or a PM, whatever, it's like, well, that's why we have professional networks, right? It's like, oh, they have a presence on likely LinkedIn. I can go through that semi-structured, structured form, and I can see the experience of whatever, you know, self-disclosed. But, okay, well, agents are going to need that someday. And so I'm like, okay, well, this seems like a thread that's worth pulling on. That says, okay. So I... So agent.ai is out there. And it's LinkedIn for agents. It's LinkedIn for agents. It's a professional network for agents. And the more I pull on that thread, it's like, okay, well, if that's true, like, what happens, right? It's like, oh, well, they have a profile just like anyone else, just like a human would. It's going to be a graph underneath, just like a professional network would be. It's just that... And you can have its, you know, connections and follows, and agents should be able to post. That's maybe how they do release notes. Like, oh, I have this new version. Whatever they decide to post, it should just be able to... Behave as a node on the network of a professional network. As it turns out, the more I think about that and pull on that thread, the more and more things, like, start to make sense to me. So it may be more than just a pure professional network. So my original thought was, okay, well, it's a professional network and agents as they exist out there, which I think there's going to be more and more of, will kind of exist on this network and have the profile. But then, and this is always dangerous, I'm like, okay, I want to see a world where thousands of agents are out there in order for the... Because those digital employees, the digital workers don't exist yet in any meaningful way. And so then I'm like, oh, can I make that easier for, like... And so I have, as one does, it's like, oh, I'll build a low-code platform for building agents. How hard could that be, right? Like, very hard, as it turns out. But it's been fun. So now, agent.ai has 1.3 million users. 
3,000 people have actually, you know, built some variation of an agent, sometimes just for their own personal productivity. About 1,000 of which have been published. And the reason this comes back to MCP for me, so imagine that and other networks, since I know agent.ai. So right now, we have an MCP server for agent.ai that exposes all the internally built agents that we have that do, like, super useful things. Like, you know, I have access to a Twitter API that I can subsidize the cost. And I can say, you know, if you're looking to build something for social media, these kinds of things, with a single API key, and it's all completely free right now, I'm funding it. That's a useful way for it to work. And then we have a developer to say, oh, I have this idea. I don't have to worry about open AI. I don't have to worry about, now, you know, this particular model is better. It has access to all the models with one key. And we proxy it kind of behind the scenes. And then expose it. So then we get this kind of community effect, right? That says, oh, well, someone else may have built an agent to do X. Like, I have an agent right now that I built for myself to do domain valuation for website domains because I'm obsessed with domains, right? And, like, there's no efficient market for domains. There's no Zillow for domains right now that tells you, oh, here are what houses in your neighborhood sold for. It's like, well, why doesn't that exist? We should be able to solve that problem. And, yes, you're still guessing. Fine. There should be some simple heuristic. So I built that. It's like, okay, well, let me go look for past transactions. You say, okay, I'm going to type in agent.ai, agent.com, whatever domain. What's it actually worth? I'm looking at buying it. It can go and say, oh, which is what it does. It's like, I'm going to go look at are there any published domain transactions recently that are similar, either use the same word, same top-level domain, whatever it is. And it comes back with an approximate value, and it comes back with its kind of rationale for why it picked the value and comparable transactions. Oh, by the way, this domain sold for published. Okay. So that agent now, let's say, existed on the web, on agent.ai. Then imagine someone else says, oh, you know, I want to build a brand-building agent for startups and entrepreneurs to come up with names for their startup. Like a common problem, every startup is like, ah, I don't know what to call it. And so they type in five random words that kind of define whatever their startup is. And you can do all manner of things, one of which is like, oh, well, I need to find the domain for it. What are possible choices? Now it's like, okay, well, it would be nice to know if there's an aftermarket price for it, if it's listed for sale. Awesome. Then imagine calling this valuation agent. It's like, okay, well, I want to find where the arbitrage is, where the agent valuation tool says this thing is worth $25,000. It's listed on GoDaddy for $5,000. It's close enough. Let's go do that. Right? And that's a kind of composition use case that in my future state. Thousands of agents on the network, all discoverable through something like MCP. And then you as a developer of agents have access to all these kind of Lego building blocks based on what you're trying to solve. Then you blend in orchestration, which is getting better and better with the reasoning models now. Just describe the problem that you have. 
Now, the next layer that we're all contending with is that how many tools can you actually give an LLM before the LLM breaks? That number used to be like 15 or 20 before performance kind of started to vary dramatically. And so that's the thing I'm thinking about now. It's like, okay, if I want to... If I want to expose 1,000 of these agents to a given LLM, obviously I can't give it all 1,000. Is there some intermediate layer that says, based on your prompt, I'm going to make a best guess at which agents might be able to be helpful for this particular thing? Yeah.Alessio [00:44:37]: Yeah, like RAG for tools. Yep. I did build the Latent Space Researcher on agent.ai. Okay. Nice. Yeah, that seems like, you know, then there's going to be a Latent Space Scheduler. And then once I schedule a research, you know, and you build all of these things. By the way, my apologies for the user experience. You realize I'm an engineer. It's pretty good.swyx [00:44:56]: I think it's a normie-friendly thing. Yeah. That's your magic. HubSpot does the same thing.Alessio [00:45:01]: Yeah, just to like quickly run through it. You can basically create all these different steps. And these steps are like, you know, static versus like variable-driven things. How did you decide between this kind of like low-code-ish versus doing, you know, low-code with code backend versus like not exposing that at all? Any fun design decisions? Yeah. And this is, I think...Dharmesh [00:45:22]: I think lots of people are likely sitting in exactly my position right now, coming through the choosing between deterministic and non-deterministic. Like if you're like in a business or building, you know, some sort of agentic thing, do you decide to do a deterministic thing? Or do you go non-deterministic and just let the LLM handle it, right, with the reasoning models? The original idea and the reason I took the low-code stepwise, a very deterministic approach. A, the reasoning models did not exist at that time. That's thing number one. Thing number two is if you can get... If you know in your head... If you know in your head what the actual steps are to accomplish whatever goal, why would you leave that to chance? There's no upside. There's literally no upside. Just tell me, like, what steps do you need executed? So right now what I'm playing with... So one thing we haven't talked about yet, and people don't talk about UI and agents. Right now, the primary interaction model... Or they don't talk enough about it. I know some people have. But it's like, okay, so we're used to the chatbot back and forth. Fine. I get that. But I think we're going to move to a blend of... Some of those things are going to be synchronous as they are now. But some are going to be... Some are going to be async. It's just going to put it in a queue, just like... And this goes back to my... Man, I talk fast. But I have this... I only have one other speed. It's even faster. So imagine it's like if you're working... So back to my, oh, we're going to have these hybrid digital teams. Like, you would not go to a co-worker and say, I'm going to ask you to do this thing, and then sit there and wait for them to go do it. Like, that's not how the world works. So it's nice to be able to just, like, hand something off to someone. It's like, okay, well, maybe I expect a response in an hour or a day or something like that.Dharmesh [00:46:52]: In terms of when things need to happen. So the UI around agents. 
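The "intermediate layer" raised at the top of this exchange, shortlisting a handful of likely-relevant agents per prompt rather than exposing all 1,000, is essentially retrieval over tool descriptions ("RAG for tools"). A sketch with a stand-in `embed` function; a real system would use an embedding model and a vector index, and the UI discussion picks back up right after.

```python
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def shortlist_tools(prompt, tool_catalog, embed, k=5):
    """Return the k tool names whose descriptions sit closest to the prompt.

    tool_catalog maps tool name -> natural-language description; embed is any
    text-embedding function (hypothetical here). Only the shortlist is then
    handed to the LLM as callable tools.
    """
    q = embed(prompt)
    scored = [(cosine(q, embed(desc)), name) for name, desc in tool_catalog.items()]
    return [name for _, name in sorted(scored, reverse=True)[:k]]

# Hypothetical usage:
# tools_for_llm = shortlist_tools("value this domain before I buy it", catalog, embed, k=5)
```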
So if you look at the output of agent.ai agents right now, they are the simplest possible manifestation of a UI, right? That says, oh, we have inputs of, like, four different types. Like, we've got a dropdown, we've got multi-select, all the things. It's like back in HTML, the original HTML 1.0 days, right? Like, you're the smallest possible set of primitives for a UI. And it just says, okay, because we need to collect some information from the user, and then we go do steps and do things. And generate some output, HTML or Markdown are the two primary examples. So the thing I've been asking myself, if I keep going down that path. So people ask me, I get requests all the time. It's like, oh, can you make the UI sort of boring? I need to be able to do this, right? And if I keep pulling on that, it's like, okay, well, now I've built an entire UI builder thing. Where does this end? And so I think the right answer, and this is what I'm going to be back coding once I get done here, is around injecting code generation, UI generation, into the agent.ai flow, right? As a builder, you're like, okay, I'm going to describe the thing that I want, much like you would do in a vibe coding world. But instead of generating the entire app, it's going to generate the UI that exists at some point in either that deterministic flow or something like that. It says, oh, here's the thing I'm trying to do. Go generate the UI for me. And I can go through some iterations. And what I think of it as a, so it's like, I'm going to generate the code, generate the code, tweak it, go through this kind of prompt style, like we do with vibe coding now. And at some point, I'm going to be happy with it. And I'm going to hit save. And that's going to become the action in that particular step. It's like a caching of the generated code, so that I don't then, like, incur any inference-time costs. It's just the actual code at that point.Alessio [00:48:29]: Yeah, I invested in a company called E2B, which does code sandboxes. And they powered the LM Arena web arena. So it's basically the, just like you do LMs, like text to text, they do the same for like UI generation. So if you're asking a model, how do you do it? But yeah, I think that's kind of where.Dharmesh [00:48:45]: That's the thing I'm really fascinated by. So the early LLMs, you know, were understandably, but laughably, bad at simple arithmetic, right? That's the thing, like my wife, normies, would ask us, like, you call this AI, like it can't, my son would be like, it's just stupid. It can't even do like simple arithmetic. And then like we've discovered over time that, and there's a reason for this, right? It's like, it's a large, there's, you know, the word language is in there for a reason in terms of what it's been trained on. It's not meant to do math, but now it's like, okay, well, the fact that it has access to a Python interpreter that I can actually call at runtime, that solves an entire body of problems that it wasn't trained to do. And it's basically a form of delegation. And so the thought that's kind of rattling around in my head is that that's great. So it's, it's like took the arithmetic problem and took it first. Now, like anything that's solvable through a relatively concrete Python program, it's able to do a bunch of things that I couldn't do before. Can we get to the same place with UI? I don't know what the future of UI looks like in an agentic AI world, but maybe let the LLM handle it, but not in the classic sense. 
Maybe it generates it on the fly, or maybe we go through some iterations and hit cache or something like that. So it's a little bit more predictable. Uh, I don't know, but yeah.Alessio [00:49:48]: And especially when is the human supposed to intervene? So, especially if you're composing them, most of them should not have a UI because then they're just web hooking to somewhere else. I just want to touch back. I don't know if you have more comments on this.swyx [00:50:01]: I was just going to ask when you, you said you got, you're going to go back to code. What
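The "hit save and cache the generated code as the step's action" idea from a few turns back is, mechanically, content-addressed caching of generated code; a toy sketch, where `generate_ui_code` stands in for the LLM call and is hypothetical:

```python
import hashlib
from pathlib import Path

CACHE_DIR = Path("ui_cache")
CACHE_DIR.mkdir(exist_ok=True)

def cached_ui_for_step(step_description: str, generate_ui_code) -> str:
    """Return UI code for a step, generating it at most once.

    After the builder is happy and hits "save", later runs serve the cached
    code directly and pay no further inference cost.
    """
    key = hashlib.sha256(step_description.encode()).hexdigest()
    path = CACHE_DIR / f"{key}.html"
    if path.exists():
        return path.read_text()
    code = generate_ui_code(step_description)  # the expensive, one-time LLM call
    path.write_text(code)
    return code
```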

ZD Tech : tout comprendre en moins de 3 minutes avec ZDNet
What does the internet of AI agents look like?

ZD Tech : tout comprendre en moins de 3 minutes avec ZDNet

Play Episode Listen Later Mar 11, 2025 3:02


Today we're talking about a second "Cambrian explosion." Not, of course, the one that happened 500 million years ago, but the comparison is an interesting one. Back then, on the sea floor, a huge number of species of worms, shellfish, crustaceans, and mollusks, along with groups of animals that have since disappeared, such as trilobites and radiodonts, suddenly developed. The second Cambrian explosion promises just as much diversity: it is the explosion of artificial intelligence agents. And this sudden evolution of AI, expected as early as this year, could well redefine how we use artificial intelligence. Here's how and why. First, we are going to see a lightning-fast multiplication of AI agents. Generative AI has certainly had a meteoric rise, but today experts are talking about a Cambrian explosion of AI agents. These autonomous entities are designed to carry out specific tasks, often crossing different systems and computing infrastructures. A consortium of experts, including Cisco, LangChain, and Galileo, is nevertheless worried about a lack of interoperability between these agents. In other words, each one works in its own corner, and that could become a problem over time. To address the risk of fragmentation, these players have just set up an initiative with an almost unpronounceable name: the AGNTCY initiative. The idea is ultimately to create an internet of AI agents. This open-source collective aims to structure collaboration between AI agents, so that different AIs can communicate and cooperate effectively. Three tools have just been launched to get the work started: a directory of agents, a framework for building open agents, and a communication protocol. Ultimately, the founders want to reproduce a model inspired by the WordPress ecosystem. That platform's success rests in part on its huge directory of plugins, which let any website be enriched with new features in a few clicks. This new consortium wants to follow the same logic by creating an ecosystem where AI agents can be easily identified, installed, and interconnected. ZD Tech is available on every podcast platform. Subscribe! Hosted by Ausha. Visit ausha.co/politique-de-confidentialite for more information.

Microsoft Mechanics Podcast
Introducing Azure AI Foundry - Everything you need for AI development

Microsoft Mechanics Podcast

Play Episode Listen Later Mar 10, 2025 12:54 Transcription Available


Create agentic solutions quickly and efficiently with Azure AI Foundry. Choose the right models, ground your agents with knowledge, and seamlessly integrate AI into your development workflow—from early experimentation to production. Test, optimize, and deploy with built-in evaluation and management tools. See how to leverage the Azure AI Foundry SDK to code and orchestrate intelligent agents, monitor performance with tracing and assessments, and streamline DevOps with production-ready management. Yina Arenas, from the Azure AI Foundry team, shares its extensive capabilities as a unified platform that supports you throughout the entire AI development lifecycle. ► QUICK LINKS:  00:00 - Create agentic solutions with Azure AI Foundry 00:20 - Model catalog in Azure AI Foundry 02:15 - Experiment in the Azure AI Foundry playground 03:10 - Create and customize agents 04:13 - Assess and improve agents 05:58 - Monitor and manage apps 06:50 - Create a multi-agentic app in code 09:26 - Create a Sender agent 10:39 - How to connect orchestration logic 11:25 - Watch agents work 12:26 - Wrap up ► Link References Get started with Azure AI Foundry at https://ai.azure.com ► Unfamiliar with Microsoft Mechanics? As Microsoft's official video series for IT, you can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft. • Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries • Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog • Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast ► Keep getting this insider knowledge, join us on social: • Follow us on Twitter: https://twitter.com/MSFTMechanics • Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/ • Enjoy us on Instagram: https://www.instagram.com/msftmechanics/ • Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics  

PodRocket - A web development podcast from LogRocket
LLMs for web developers with Roy Derks

PodRocket - A web development podcast from LogRocket

Play Episode Listen Later Mar 6, 2025 28:45


Roy Derks, Developer Experience at IBM, talks about the integration of Large Language Models (LLMs) in web development. We explore practical applications such as building agents, automating QA testing, and the evolving role of AI frameworks in software development. Links https://www.linkedin.com/in/gethackteam https://www.youtube.com/@gethackteam https://x.com/gethackteam https://hackteam.io We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Emily, at emily.kochanekketner@logrocket.com (mailto:emily.kochanekketner@logrocket.com), or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod). Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers! What does LogRocket do? LogRocket provides AI-first session replay and analytics that surface the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at LogRocket.com. Try LogRocket for free today: https://logrocket.com/signup/?pdr Special Guest: Roy Derks.

This Week in Startups
AI Agents & the Future of Work with LangChain's Harrison Chase | AI Basics with Google Cloud

This Week in Startups

Play Episode Listen Later Mar 4, 2025 19:58


In this episode: Jason sits down with Harrison Chase, CEO of LangChain, to explore how AI-powered agents are transforming the way startups operate. They discuss the shift from traditional entry-level roles to AI-driven automation, the importance of human-in-the-loop systems, and the future of AI-powered assistants in business. Harrison shares insights on how companies like Replit, Klarna, and GitLab are leveraging AI agents to streamline operations, plus a look ahead at what's next for AI-driven workflows. Brought to you in partnership with Google Cloud.*Timestamps:(0:00) Introduction to Startup Basics series & Importance of AI in startups(2:04) Partnership with Google Cloud & Introducing Harrison Chase from LangChain(4:38) Evolution of entry-level jobs & Examples of AI agents in startups(8:00) Challenges & Future of AI agents in startups(14:24) AI agents in collaborative spaces & Non-developers creating AI agents(18:40) Closing remarks and where to learn more*Uncover more valuable insights from AI leaders in Google Cloud's 'Future of AI: Perspectives for Startups' report. Discover what 23 AI industry leaders think about the future of AI—and how it impacts your business. Read their perspectives here: https://goo.gle/futureofai*Check out all of the Startup Basics episodes here: https://thisweekinstartups.com/basics Check out Google Cloud: https://cloud.google.com/ Check out LangChain: https://www.langchain.com/*Follow Harrison: LinkedIn: https://www.linkedin.com/in/harrison-chase-961287118/ X: https://x.com/hwchase17*Follow Jason: X: https://twitter.com/Jason LinkedIn: https://www.linkedin.com/in/jasoncalacanis*Follow TWiST: Twitter: https://twitter.com/TWiStartups YouTube: https://www.youtube.com/thisweekin Instagram: https://www.instagram.com/thisweekinstartups TikTok: https://www.tiktok.com/@thisweekinstartups Substack: https://twistartups.substack.com

Machine Learning Street Talk
Sakana AI - Chris Lu, Robert Tjarko Lange, Cong Lu

Machine Learning Street Talk

Play Episode Listen Later Mar 1, 2025 97:54


We speak with Sakana AI, who are building nature-inspired methods that could fundamentally transform how we develop AI systems.

The guests include Chris Lu, a researcher who recently completed his DPhil at Oxford University under Prof. Jakob Foerster's supervision, where he focused on meta-learning and multi-agent systems. Chris is the first author of the DiscoPOP paper, which demonstrates how language models can discover and design better training algorithms. Also joining is Robert Tjarko Lange, a founding member of Sakana AI who specializes in evolutionary algorithms and large language models. Robert leads research at the intersection of evolutionary computation and foundation models, and is completing his PhD at TU Berlin on evolutionary meta-learning. The discussion also features Cong Lu, currently a Research Scientist at Google DeepMind's Open-Endedness team, who previously helped develop The AI Scientist and Intelligent Go-Explore.

SPONSOR MESSAGES:
CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting! https://centml.ai/pricing/
Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Go to https://tufalabs.ai/

* DiscoPOP - A framework where language models discover their own optimization algorithms
* EvoLLM - Using language models as evolution strategies for optimization
* The AI Scientist - A fully automated system that conducts scientific research end-to-end
* Neural Attention Memory Models (NAMMs) - Evolved memory systems that make transformers both faster and more accurate

TRANSCRIPT + REFS:
https://www.dropbox.com/scl/fi/gflcyvnujp8cl7zlv3v9d/Sakana.pdf?rlkey=woaoo82943170jd4yyi2he71c&dl=0

Robert Tjarko Lange: https://roberttlange.com/
Chris Lu: https://chrislu.page/
Cong Lu: https://www.conglu.co.uk/
Sakana: https://sakana.ai/blog/

TOC:
1. LLMs for Algorithm Generation and Optimization
[00:00:00] 1.1 LLMs generating algorithms for training other LLMs
[00:04:00] 1.2 Evolutionary black-box optim using neural network loss parameterization
[00:11:50] 1.3 DiscoPOP: Non-convex loss function for noisy data
[00:20:45] 1.4 External entropy injection for preventing model collapse
[00:26:25] 1.5 LLMs for black-box optimization using abstract numerical sequences

2. Model Learning and Generalization
[00:31:05] 2.1 Fine-tuning on teacher algorithm trajectories
[00:31:30] 2.2 Transformers learning gradient descent
[00:33:00] 2.3 LLM tokenization biases towards specific numbers
[00:34:50] 2.4 LLMs as evolution strategies for black box optimization
[00:38:05] 2.5 DiscoPOP: LLMs discovering novel optimization algorithms

3. AI Agents and System Architectures
[00:51:30] 3.1 ARC challenge: Induction vs. transformer approaches
[00:54:35] 3.2 LangChain / modular agent components
[00:57:50] 3.3 Debate improves LLM truthfulness
[01:00:55] 3.4 Time limits controlling AI agent systems
[01:03:00] 3.5 Gemini: Million-token context enables flatter hierarchies
[01:04:05] 3.6 Agents follow own interest gradients
[01:09:50] 3.7 Go-Explore algorithm: archive-based exploration
[01:11:05] 3.8 Foundation models for interesting state discovery
[01:13:00] 3.9 LLMs leverage prior game knowledge

4. AI for Scientific Discovery and Human Alignment
[01:17:45] 4.1 Encoding Alignment & Aesthetics via Reward Functions
[01:20:00] 4.2 AI Scientist: Automated Open-Ended Scientific Discovery
[01:24:15] 4.3 DiscoPOP: LLM for Preference Optimization Algorithms
[01:28:30] 4.4 Balancing AI Knowledge with Human Understanding
[01:33:55] 4.5 AI-Driven Conferences and Paper Review

Training Data
MongoDB's Sahir Azam: Vector Databases and the Data Structure of AI

Training Data

Play Episode Listen Later Feb 13, 2025 44:26


MongoDB product leader Sahir Azam explains how vector databases have evolved from semantic search to become the essential memory and state layer for AI applications. He describes his view of how AI is transforming software development generally, and how combining vectors, graphs and traditional data structures enables the high-quality retrieval needed for mission-critical enterprise AI use cases. Drawing from MongoDB's successful cloud transformation, Azam shares his vision for democratizing AI development by making sophisticated capabilities accessible to mainstream developers through integrated tools and abstractions.

Hosted by: Sonya Huang and Pat Grady, Sequoia Capital

Mentioned in this episode:
Introducing ambient agents: Blog post by LangChain on a new UX pattern where AI agents can listen to an event stream and act on it
Google Gemini Deep Research: Sahir enjoys its amazing product experience
Perplexity: AI search app that Sahir admires for its product craft
Snipd: AI-powered podcast app Sahir likes

Business of Tech
AI Performance Declines Under Pressure, UK Orders Apple Backdoor, and IT Jobs Face Automation Crisis

Business of Tech

Play Episode Listen Later Feb 12, 2025 15:06


Artificial intelligence agents are facing significant challenges when tasked with complex responsibilities, as highlighted by a recent study from LangChain. The research indicates that AI performance deteriorates under cognitive overload, with one model's effectiveness dropping to just 2% when managing more than seven domains. This finding emphasizes the need for businesses to design AI systems that can effectively manage complexity rather than assuming that AI can scale to handle human-like multitasking. The implications are particularly relevant for industries reliant on automation, such as customer service and IT operations.

In a related development, researchers from Stanford and the University of Washington have introduced a new AI reasoning model called S1, which can be trained at a fraction of the cost of existing high-end models. This innovation raises concerns about the commoditization of AI, as smaller teams can replicate sophisticated models with minimal resources. Meanwhile, Hugging Face has quickly developed an open-source AI research agent that aims to compete with OpenAI's offerings, showcasing the rapid advancements in AI capabilities and the importance of community contributions in this space.

The podcast also discusses the ongoing legal battles surrounding AI and copyright, particularly the recent ruling in favor of Thomson Reuters against Ross Intelligence. This case underscores the complexities of how AI tools are trained using copyrighted material and sets a precedent that could impact future AI developments. As AI continues to evolve, the legal landscape surrounding its use and the rights of content creators remains a critical area of concern.

Finally, the episode touches on the rising unemployment rate in the IT sector, attributed to the increasing influence of AI and automation. The data reveals a significant jump in unemployment among IT workers, with many routine jobs being automated rather than replaced. This shift highlights the need for IT professionals to upskill and adapt to the changing job market, focusing on areas such as AI integration and cybersecurity, as traditional software development roles decline.

Four things to know today:
00:00 AI Agents Struggle Under Pressure—More Complexity Means Less Accuracy
03:23 Big AI, Big Trouble? Low-Cost Models Challenge Industry Leaders
07:17 Apple Faces UK Encryption Fight, CISA's Role Strengthens, and AI Copyright Battle Heats Up
10:38 AI Disrupts IT Hiring: Fewer Software Jobs, More Automation, and Higher Unemployment

Supported by: https://www.huntress.com/mspradio/
Event: https://nerdiocon.com/
All our Sponsors: https://businessof.tech/sponsors/

Do you want the show on your podcast app or the written versions of the stories? Subscribe to the Business of Tech: https://www.businessof.tech/subscribe/
Looking for a link from the stories? The entire script of the show, with links to articles, is posted in each story on https://www.businessof.tech/
Support the show on Patreon: https://patreon.com/mspradio/
Want to be a guest on Business of Tech: Daily 10-Minute IT Services Insights? Send Dave Sobel a message on PodMatch, here: https://www.podmatch.com/hostdetailpreview/businessoftech

Want our stuff? Cool Merch? Wear "Why Do We Care?" - Visit https://mspradio.myspreadshop.com

Follow us on:
LinkedIn: https://www.linkedin.com/company/28908079/
YouTube: https://youtube.com/mspradio/
Facebook: https://www.facebook.com/mspradionews/
Instagram: https://www.instagram.com/mspradio/
TikTok: https://www.tiktok.com/@businessoftech
Bluesky: https://bsky.app/profile/businessof.tech

AI DAILY: Breaking News in AI
VANCE PUSHES BACK ON AI

AI DAILY: Breaking News in AI

Play Episode Listen Later Feb 12, 2025 3:25


Plus: AI Gets Confused By Too Many Tools

Like this? Get AIDAILY, delivered to your inbox, every weekday. Subscribe to our newsletter at https://aidaily.us

Vance Pushes Back Against Europe's AI Regulations
At the Paris AI Summit, Vice President JD Vance emphasized the U.S. commitment to leading in artificial intelligence, urging European nations to avoid heavy-handed regulations that could stifle innovation. He highlighted the need for international cooperation that fosters technological advancement rather than hindering it. This stance reflects the administration's "America First" approach to tech policy.

AI Agents: Present Yet Untrusted in Business
At the Wall Street Journal's CIO Network Summit, 61% of tech leaders said they're testing AI agents, but only 21% have fully adopted them. The main hang-up? Reliability concerns. Vendors like OpenAI and Microsoft are pushing for quicker adoption, but businesses are still on the fence.

AI Accelerationists vs. Big Tech: The Battle Over AI's Future
AI accelerationists are pushing for rapid AI development, arguing that slowing down could let other countries, like China, take the lead. On the flip side, big tech companies and some experts are calling for more guardrails and regulations to ensure AI's safe and ethical use. This debate is heating up as AI becomes more integrated into our daily lives.

LangChain's AI Agents Overloaded by Too Many Tools
LangChain tested its AI agents and found that piling on too many tools and instructions makes them glitchy. When agents were tasked with multiple responsibilities, their performance took a nosedive. This shows that current AI agents can't handle overload and need better design to juggle complex tasks.

Google's Super Assistant: The AI Revolution That Could Change Everything
Google's latest AI projects, Astra and Mariner, are pushing the boundaries of what's possible with autonomous assistants. These systems are designed to see, understand, and act in both digital and physical worlds, potentially transforming how we interact with technology in our daily lives.

AI Revolutionizes Material Discovery, Paving the Way for Next-Gen Super Materials
AI is transforming how we discover new materials by predicting and generating compounds with desired properties. Tools like Microsoft's MatterGen and MatterSim can design and validate novel materials, potentially leading to breakthroughs in energy storage, electronics, and medical devices. This shift accelerates innovation, making the search for advanced materials more efficient than ever.

Software Engineering Daily
LangChain and Agentic AI Engineering with Erick Friis

Software Engineering Daily

Play Episode Listen Later Feb 11, 2025 41:50


LangChain is a popular open-source framework for building applications that integrate LLMs with external data sources like APIs, databases, or custom knowledge bases. It's commonly used for chatbots, question-answering systems, and workflow automation. Its flexibility and extensibility have made it something of a standard for creating sophisticated AI-driven software. Erick Friis is a Founding Engineer at LangChain and joins the show to discuss the framework and agentic AI engineering.
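(Editor's note: here is a minimal sketch of the kind of pipeline described above: a prompt, a model, and an output parser chained together. It assumes the langchain-core and langchain-openai packages and an illustrative model name; it is an editorial illustration, not code from the episode.)

```python
# Minimal LangChain sketch: prompt -> model -> string output.
# Assumes `pip install langchain-core langchain-openai` and an
# OPENAI_API_KEY in the environment; the model name is illustrative.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question using only the provided context."),
    ("human", "Context: {context}\n\nQuestion: {question}"),
])
llm = ChatOpenAI(model="gpt-4o-mini")

# LCEL: compose runnables with the | operator.
chain = prompt | llm | StrOutputParser()

answer = chain.invoke({
    "context": "LangChain is an open-source framework for LLM applications.",
    "question": "What is LangChain?",
})
print(answer)
```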

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Did you know that adding a simple Code Interpreter took o3 from 9.2% to 32% on FrontierMath? The Latent Space crew is hosting a hack night Feb 11th in San Francisco focused on CodeGen use cases, co-hosted with E2B and Edge AGI; watch E2B's new workshop and RSVP here! We're happy to announce that today's guest Samuel Colvin will be teaching his very first Pydantic AI workshop at the newly announced AI Engineer NYC Workshops day on Feb 22! 25 tickets left.

If you're a Python developer, it's very likely that you've heard of Pydantic. Every month, it's downloaded >300,000,000 times, making it one of the top 25 PyPI packages. OpenAI uses it in its SDK for structured outputs, it's at the core of FastAPI, and if you've followed our AI Engineer Summit conference, Jason Liu of Instructor has given two great talks about it: “Pydantic is all you need” and “Pydantic is STILL all you need”. Now, Samuel Colvin has raised $17M from Sequoia to turn Pydantic from an open source project into a full stack AI engineer platform with Logfire, their observability platform, and PydanticAI, their new agent framework.

Logfire: bringing OTEL to AI

OpenTelemetry recently merged Semantic Conventions for LLM workloads, which provides standard definitions to track performance like gen_ai.server.time_per_output_token. In Sam's view at least 80% of new apps being built today have some sort of LLM usage in them, and just like web observability platforms got replaced by cloud-first ones in the 2010s, Logfire wants to do the same for AI-first apps. If you're interested in the technical details, Logfire migrated away from Clickhouse to Datafusion for their backend. We spent some time on the importance of picking open source tools you understand and that you can actually contribute to upstream, rather than the more popular ones; listen in ~43:19 for that part.

Agents are the killer app for graphs

Pydantic AI is their attempt at taking a lot of the learnings that LangChain and the other early LLM frameworks had, and putting Python best practices into it. At an API level, it's very similar to the other libraries: you can call LLMs, create agents, do function calling, do evals, etc. They define an “Agent” as a container with a system prompt, tools, structured result, and an LLM. Under the hood, each Agent is now a graph of function calls that can orchestrate multi-step LLM interactions. You can start simple, then move toward fully dynamic graph-based control flow if needed.

“We were compelled enough by graphs once we got them right that our agent implementation [...]
is now actually a graph under the hood.”

Why Graphs?
* More natural for complex or multi-step AI workflows.
* Easy to visualize and debug with mermaid diagrams.
* Potential for distributed runs, or “waiting days” between steps in certain flows.

In parallel, you see folks like Emil Eifrem of Neo4j talk about GraphRAG as another place where graphs fit really well in the AI stack, so it might be time for more people to take them seriously.

Full Video Episode
Like and subscribe!

Chapters
* 00:00:00 Introductions
* 00:00:24 Origins of Pydantic
* 00:05:28 Pydantic's AI moment
* 00:08:05 Why build a new agents framework?
* 00:10:17 Overview of Pydantic AI
* 00:12:33 Becoming a believer in graphs
* 00:24:02 God Model vs Compound AI Systems
* 00:28:13 Why not build an LLM gateway?
* 00:31:39 Programmatic testing vs live evals
* 00:35:51 Using OpenTelemetry for AI traces
* 00:43:19 Why they don't use Clickhouse
* 00:48:34 Competing in the observability space
* 00:50:41 Licensing decisions for Pydantic and LogFire
* 00:51:48 Building Pydantic.run
* 00:55:24 Marimo and the future of Jupyter notebooks
* 00:57:44 London's AI scene

Show Notes
* Sam Colvin
* Pydantic
* Pydantic AI
* Logfire
* Pydantic.run
* Zod
* E2B
* Arize
* Langsmith
* Marimo
* Prefect
* GLA (Google Generative Language API)
* OpenTelemetry
* Jason Liu
* Sebastian Ramirez
* Bogomil Balkansky
* Hood Chatham
* Jeremy Howard
* Andrew Lamb

Transcript

Alessio [00:00:03]: Hey, everyone. Welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.

Swyx [00:00:12]: Good morning. And today we're very excited to have Sam Colvin join us from Pydantic AI. Welcome. Sam, I heard that Pydantic is all we need. Is that true?

Samuel [00:00:24]: I would say you might need Pydantic AI and Logfire as well, but it gets you a long way, that's for sure.

Swyx [00:00:29]: Pydantic almost basically needs no introduction. It's almost 300 million downloads in December. And obviously, in the previous podcasts and discussions we've had with Jason Liu, he's been a big fan and promoter of Pydantic and AI.

Samuel [00:00:45]: Yeah, it's weird because obviously I didn't create Pydantic originally for uses in AI, it predates LLMs. But it's like we've been lucky that it's been picked up by that community and used so widely.

Swyx [00:00:58]: Actually, maybe we'll hear it. Right from you, what is Pydantic and maybe a little bit of the origin story?

Samuel [00:01:04]: The best name for it, which is not quite right, is a validation library. And we get some tension around that name because it doesn't just do validation, it will do coercion by default. We now have strict mode, so you can disable that coercion. But by default, if you say you want an integer field and you get in a string of 1, 2, 3, it will convert it to 123 and a bunch of other sensible conversions. And as you can imagine, the semantics around it. Exactly when you convert and when you don't, it's complicated, but because of that, it's more than just validation. Back in 2017, when I first started it, the different thing it was doing was using type hints to define your schema. That was controversial at the time. It was genuinely disapproved of by some people. I think the success of Pydantic and libraries like FastAPI that build on top of it means that today that's no longer controversial in Python. And indeed, lots of other people have copied that route, but yeah, it's a data validation library.
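(Editor's note: a minimal sketch of the coercion and strict-mode behaviour Samuel describes, assuming Pydantic v2; it is an illustration, not code from the episode.)

```python
# Pydantic v2: lax mode coerces "123" to 123; strict mode rejects it.
from pydantic import BaseModel, ConfigDict, ValidationError

class Order(BaseModel):
    quantity: int          # the type hint defines the schema

print(Order(quantity="123").quantity)   # -> 123 (coerced from str)

class StrictOrder(BaseModel):
    model_config = ConfigDict(strict=True)
    quantity: int

try:
    StrictOrder(quantity="123")
except ValidationError as exc:
    print(exc)                          # strict mode: int expected, got str
```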
It uses type hints for the for the most part and obviously does all the other stuff you want, like serialization on top of that. But yeah, that's the core.Alessio [00:02:06]: Do you have any fun stories on how JSON schemas ended up being kind of like the structure output standard for LLMs? And were you involved in any of these discussions? Because I know OpenAI was, you know, one of the early adopters. So did they reach out to you? Was there kind of like a structure output console in open source that people were talking about or was it just a random?Samuel [00:02:26]: No, very much not. So I originally. Didn't implement JSON schema inside Pydantic and then Sebastian, Sebastian Ramirez, FastAPI came along and like the first I ever heard of him was over a weekend. I got like 50 emails from him or 50 like emails as he was committing to Pydantic, adding JSON schema long pre version one. So the reason it was added was for OpenAPI, which is obviously closely akin to JSON schema. And then, yeah, I don't know why it was JSON that got picked up and used by OpenAI. It was obviously very convenient for us. That's because it meant that not only can you do the validation, but because Pydantic will generate you the JSON schema, it will it kind of can be one source of source of truth for structured outputs and tools.Swyx [00:03:09]: Before we dive in further on the on the AI side of things, something I'm mildly curious about, obviously, there's Zod in JavaScript land. Every now and then there is a new sort of in vogue validation library that that takes over for quite a few years and then maybe like some something else comes along. Is Pydantic? Is it done like the core Pydantic?Samuel [00:03:30]: I've just come off a call where we were redesigning some of the internal bits. There will be a v3 at some point, which will not break people's code half as much as v2 as in v2 was the was the massive rewrite into Rust, but also fixing all the stuff that was broken back from like version zero point something that we didn't fix in v1 because it was a side project. We have plans to move some of the basically store the data in Rust types after validation. Not completely. So we're still working to design the Pythonic version of it, in order for it to be able to convert into Python types. So then if you were doing like validation and then serialization, you would never have to go via a Python type we reckon that can give us somewhere between three and five times another three to five times speed up. That's probably the biggest thing. Also, like changing how easy it is to basically extend Pydantic and define how particular types, like for example, NumPy arrays are validated and serialized. But there's also stuff going on. And for example, Jitter, the JSON library in Rust that does the JSON parsing, has SIMD implementation at the moment only for AMD64. So we can add that. We need to go and add SIMD for other instruction sets. So there's a bunch more we can do on performance. I don't think we're going to go and revolutionize Pydantic, but it's going to continue to get faster, continue, hopefully, to allow people to do more advanced things. We might add a binary format like CBOR for serialization for when you'll just want to put the data into a database and probably load it again from Pydantic. So there are some things that will come along, but for the most part, it should just get faster and cleaner.Alessio [00:05:04]: From a focus perspective, I guess, as a founder too, how did you think about the AI interest rising? 
And then how do you kind of prioritize, okay, this is worth going into more, and we'll talk about Pydantic AI and all of that. What was maybe your early experience with LLAMP, and when did you figure out, okay, this is something we should take seriously and focus more resources on it?Samuel [00:05:28]: I'll answer that, but I'll answer what I think is a kind of parallel question, which is Pydantic's weird, because Pydantic existed, obviously, before I was starting a company. I was working on it in my spare time, and then beginning of 22, I started working on the rewrite in Rust. And I worked on it full-time for a year and a half, and then once we started the company, people came and joined. And it was a weird project, because that would never go away. You can't get signed off inside a startup. Like, we're going to go off and three engineers are going to work full-on for a year in Python and Rust, writing like 30,000 lines of Rust just to release open-source-free Python library. The result of that has been excellent for us as a company, right? As in, it's made us remain entirely relevant. And it's like, Pydantic is not just used in the SDKs of all of the AI libraries, but I can't say which one, but one of the big foundational model companies, when they upgraded from Pydantic v1 to v2, their number one internal model... The metric of performance is time to first token. That went down by 20%. So you think about all of the actual AI going on inside, and yet at least 20% of the CPU, or at least the latency inside requests was actually Pydantic, which shows like how widely it's used. So we've benefited from doing that work, although it didn't, it would have never have made financial sense in most companies. In answer to your question about like, how do we prioritize AI, I mean, the honest truth is we've spent a lot of the last year and a half building. Good general purpose observability inside LogFire and making Pydantic good for general purpose use cases. And the AI has kind of come to us. Like we just, not that we want to get away from it, but like the appetite, uh, both in Pydantic and in LogFire to go and build with AI is enormous because it kind of makes sense, right? Like if you're starting a new greenfield project in Python today, what's the chance that you're using GenAI 80%, let's say, globally, obviously it's like a hundred percent in California, but even worldwide, it's probably 80%. Yeah. And so everyone needs that stuff. And there's so much yet to be figured out so much like space to do things better in the ecosystem in a way that like to go and implement a database that's better than Postgres is a like Sisyphean task. Whereas building, uh, tools that are better for GenAI than some of the stuff that's about now is not very difficult. Putting the actual models themselves to one side.Alessio [00:07:40]: And then at the same time, then you released Pydantic AI recently, which is, uh, um, you know, agent framework and early on, I would say everybody like, you know, Langchain and like, uh, Pydantic kind of like a first class support, a lot of these frameworks, we're trying to use you to be better. What was the decision behind we should do our own framework? Were there any design decisions that you disagree with any workloads that you think people didn't support? Well,Samuel [00:08:05]: it wasn't so much like design and workflow, although I think there were some, some things we've done differently. Yeah. 
I think looking in general at the ecosystem of agent frameworks, the engineering quality is far below that of the rest of the Python ecosystem. There's a bunch of stuff that we have learned how to do over the last 20 years of building Python libraries and writing Python code that seems to be abandoned by people when they build agent frameworks. Now I can kind of respect that, particularly in the very first agent frameworks, like Langchain, where they were literally figuring out how to go and do this stuff. It's completely understandable that you would like basically skip some stuff.Samuel [00:08:42]: I'm shocked by the like quality of some of the agent frameworks that have come out recently from like well-respected names, which it just seems to be opportunism and I have little time for that, but like the early ones, like I think they were just figuring out how to do stuff and just as lots of people have learned from Pydantic, we were able to learn a bit from them. I think from like the gap we saw and the thing we were frustrated by was the production readiness. And that means things like type checking, even if type checking makes it hard. Like Pydantic AI, I will put my hand up now and say it has a lot of generics and you need to, it's probably easier to use it if you've written a bit of Rust and you really understand generics, but like, and that is, we're not claiming that that makes it the easiest thing to use in all cases, we think it makes it good for production applications in big systems where type checking is a no-brainer in Python. But there are also a bunch of stuff we've learned from maintaining Pydantic over the years that we've gone and done. So every single example in Pydantic AI's documentation is run on Python. As part of tests and every single print output within an example is checked during tests. So it will always be up to date. And then a bunch of things that, like I say, are standard best practice within the rest of the Python ecosystem, but I'm not followed surprisingly by some AI libraries like coverage, linting, type checking, et cetera, et cetera, where I think these are no-brainers, but like weirdly they're not followed by some of the other libraries.Alessio [00:10:04]: And can you just give an overview of the framework itself? I think there's kind of like the. LLM calling frameworks, there are the multi-agent frameworks, there's the workflow frameworks, like what does Pydantic AI do?Samuel [00:10:17]: I glaze over a bit when I hear all of the different sorts of frameworks, but I like, and I will tell you when I built Pydantic, when I built Logfire and when I built Pydantic AI, my methodology is not to go and like research and review all of the other things. I kind of work out what I want and I go and build it and then feedback comes and we adjust. So the fundamental building block of Pydantic AI is agents. The exact definition of agents and how you want to define them. is obviously ambiguous and our things are probably sort of agent-lit, not that we would want to go and rename them to agent-lit, but like the point is you probably build them together to build something and most people will call an agent. So an agent in our case has, you know, things like a prompt, like system prompt and some tools and a structured return type if you want it, that covers the vast majority of cases. There are situations where you want to go further and the most complex workflows where you want graphs and I resisted graphs for quite a while. 
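(Editor's note: a rough sketch of an agent along the lines Samuel describes, with a system prompt, one tool, and a structured return type. It is loosely based on early Pydantic AI releases; names such as result_type, tool_plain, and run_sync are assumptions that may differ between versions, so treat this as an illustration rather than canonical usage.)

```python
# Rough sketch of a Pydantic AI-style agent. Requires an OPENAI_API_KEY;
# parameter and method names reflect early releases and may have changed.
from pydantic import BaseModel
from pydantic_ai import Agent

class Weather(BaseModel):
    city: str
    temperature_c: float

agent = Agent(
    "openai:gpt-4o",                      # model identifier
    system_prompt="You are a concise weather assistant.",
    result_type=Weather,                  # structured return type
)

@agent.tool_plain
def lookup_temperature(city: str) -> float:
    """Hypothetical tool; a real app would call a weather API here."""
    return 21.5

result = agent.run_sync("What's the weather in London?")
print(result.data)                        # -> Weather(city='London', ...)
```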
I was sort of of the opinion you didn't need them and you could use standard like Python flow control to do all of that stuff. I had a few arguments with people, but I basically came around to, yeah, I can totally see why graphs are useful. But then we have the problem that by default, they're not type safe because if you have a like add edge method where you give the names of two different edges, there's no type checking, right? Even if you go and do some, I'm not, not all the graph libraries are AI specific. So there's a, there's a graph library called, but it allows, it does like a basic runtime type checking. Ironically using Pydantic to try and make up for the fact that like fundamentally that graphs are not typed type safe. Well, I like Pydantic, but it did, that's not a real solution to have to go and run the code to see if it's safe. There's a reason that starting type checking is so powerful. And so we kind of, from a lot of iteration eventually came up with a system of using normally data classes to define nodes where you return the next node you want to call and where we're able to go and introspect the return type of a node to basically build the graph. And so the graph is. Yeah. Inherently type safe. And once we got that right, I, I wasn't, I'm incredibly excited about graphs. I think there's like masses of use cases for them, both in gen AI and other development, but also software's all going to have interact with gen AI, right? It's going to be like web. There's no longer be like a web department in a company is that there's just like all the developers are building for web building with databases. The same is going to be true for gen AI.Alessio [00:12:33]: Yeah. I see on your docs, you call an agent, a container that contains a system prompt function. Tools, structure, result, dependency type model, and then model settings. Are the graphs in your mind, different agents? Are they different prompts for the same agent? What are like the structures in your mind?Samuel [00:12:52]: So we were compelled enough by graphs once we got them right, that we actually merged the PR this morning. That means our agent implementation without changing its API at all is now actually a graph under the hood as it is built using our graph library. So graphs are basically a lower level tool that allow you to build these complex workflows. Our agents are technically one of the many graphs you could go and build. And we just happened to build that one for you because it's a very common, commonplace one. But obviously there are cases where you need more complex workflows where the current agent assumptions don't work. And that's where you can then go and use graphs to build more complex things.Swyx [00:13:29]: You said you were cynical about graphs. What changed your mind specifically?Samuel [00:13:33]: I guess people kept giving me examples of things that they wanted to use graphs for. And my like, yeah, but you could do that in standard flow control in Python became a like less and less compelling argument to me because I've maintained those systems that end up with like spaghetti code. And I could see the appeal of this like structured way of defining the workflow of my code. And it's really neat that like just from your code, just from your type hints, you can get out a mermaid diagram that defines exactly what can go and happen.Swyx [00:14:00]: Right. Yeah. You do have very neat implementation of sort of inferring the graph from type hints, I guess. Yeah. Is what I would call it. Yeah. 
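(Editor's note: a simplified plain-Python illustration of the node-returns-next-node pattern discussed here. It is not the actual pydantic-graph API, just the underlying idea of reading the graph's edges from return type hints.)

```python
# Each node is a dataclass whose run() returns the next node (or End).
# A framework could introspect the return annotations to build and
# type-check the graph, e.g. to emit a mermaid diagram.
from __future__ import annotations
from dataclasses import dataclass
from typing import get_type_hints

@dataclass
class End:
    total: int

@dataclass
class AddTax:
    subtotal: int
    def run(self) -> End:
        return End(total=int(self.subtotal * 1.2))

@dataclass
class SumItems:
    items: list[int]
    def run(self) -> AddTax:
        return AddTax(subtotal=sum(self.items))

# "Call a node, get a node back, call that node; stop when you get an End."
node = SumItems(items=[10, 20, 30])
while not isinstance(node, End):
    node = node.run()
print(node.total)                                 # -> 72

# One edge recovered from the annotations:
print(get_type_hints(SumItems.run)["return"])     # -> AddTax
```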
I think the question always is I have gone back and forth. I used to work at Temporal where we would actually spend a lot of time complaining about graph based workflow solutions like AWS step functions. And we would actually say that we were better because you could use normal control flow that you already knew and worked with. Yours, I guess, is like a little bit of a nice compromise. Like it looks like normal Pythonic code. But you just have to keep in mind what the type hints actually mean. And that's what we do with the quote unquote magic that the graph construction does.Samuel [00:14:42]: Yeah, exactly. And if you look at the internal logic of actually running a graph, it's incredibly simple. It's basically call a node, get a node back, call that node, get a node back, call that node. If you get an end, you're done. We will add in soon support for, well, basically storage so that you can store the state between each node that's run. And then the idea is you can then distribute the graph and run it across computers. And also, I mean, the other weird, the other bit that's really valuable is across time. Because it's all very well if you look at like lots of the graph examples that like Claude will give you. If it gives you an example, it gives you this lovely enormous mermaid chart of like the workflow, for example, managing returns if you're an e-commerce company. But what you realize is some of those lines are literally one function calls another function. And some of those lines are wait six days for the customer to print their like piece of paper and put it in the post. And if you're writing like your demo. Project or your like proof of concept, that's fine because you can just say, and now we call this function. But when you're building when you're in real in real life, that doesn't work. And now how do we manage that concept to basically be able to start somewhere else in the in our code? Well, this graph implementation makes it incredibly easy because you just pass the node that is the start point for carrying on the graph and it continues to run. So it's things like that where I was like, yeah, I can just imagine how things I've done in the past would be fundamentally easier to understand if we had done them with graphs.Swyx [00:16:07]: You say imagine, but like right now, this pedantic AI actually resume, you know, six days later, like you said, or is this just like a theoretical thing we can go someday?Samuel [00:16:16]: I think it's basically Q&A. So there's an AI that's asking the user a question and effectively you then call the CLI again to continue the conversation. And it basically instantiates the node and calls the graph with that node again. Now, we don't have the logic yet for effectively storing state in the database between individual nodes that we're going to add soon. But like the rest of it is basically there.Swyx [00:16:37]: It does make me think that not only are you competing with Langchain now and obviously Instructor, and now you're going into sort of the more like orchestrated things like Airflow, Prefect, Daxter, those guys.Samuel [00:16:52]: Yeah, I mean, we're good friends with the Prefect guys and Temporal have the same investors as us. And I'm sure that my investor Bogomol would not be too happy if I was like, oh, yeah, by the way, as well as trying to take on Datadog. We're also going off and trying to take on Temporal and everyone else doing that. Obviously, we're not doing all of the infrastructure of deploying that right yet, at least. 
We're, you know, we're just building a Python library. And like what's crazy about our graph implementation is, sure, there's a bit of magic in like introspecting the return type, you know, extracting things from unions, stuff like that. But like the actual calls, as I say, is literally call a function and get back a thing and call that. It's like incredibly simple and therefore easy to maintain. The question is, how useful is it? Well, I don't know yet. I think we have to go and find out. We have a whole. We've had a slew of people joining our Slack over the last few days and saying, tell me how good Pydantic AI is. How good is Pydantic AI versus Langchain? And I refuse to answer. That's your job to go and find that out. Not mine. We built a thing. I'm compelled by it, but I'm obviously biased. The ecosystem will work out what the useful tools are.Swyx [00:17:52]: Bogomol was my board member when I was at Temporal. And I think I think just generally also having been a workflow engine investor and participant in this space, it's a big space. Like everyone needs different functions. I think the one thing that I would say like yours, you know, as a library, you don't have that much control of it over the infrastructure. I do like the idea that each new agents or whatever or unit of work, whatever you call that should spin up in this sort of isolated boundaries. Whereas yours, I think around everything runs in the same process. But you ideally want to sort of spin out its own little container of things.Samuel [00:18:30]: I agree with you a hundred percent. And we will. It would work now. Right. As in theory, you're just like as long as you can serialize the calls to the next node, you just have to all of the different containers basically have to have the same the same code. I mean, I'm super excited about Cloudflare workers running Python and being able to install dependencies. And if Cloudflare could only give me my invitation to the private beta of that, we would be exploring that right now because I'm super excited about that as a like compute level for some of this stuff where exactly what you're saying, basically. You can run everything as an individual. Like worker function and distribute it. And it's resilient to failure, et cetera, et cetera.Swyx [00:19:08]: And it spins up like a thousand instances simultaneously. You know, you want it to be sort of truly serverless at once. Actually, I know we have some Cloudflare friends who are listening, so hopefully they'll get in front of the line. Especially.Samuel [00:19:19]: I was in Cloudflare's office last week shouting at them about other things that frustrate me. I have a love-hate relationship with Cloudflare. Their tech is awesome. But because I use it the whole time, I then get frustrated. So, yeah, I'm sure I will. I will. I will get there soon.Swyx [00:19:32]: There's a side tangent on Cloudflare. Is Python supported at full? I actually wasn't fully aware of what the status of that thing is.Samuel [00:19:39]: Yeah. So Pyodide, which is Python running inside the browser in scripting, is supported now by Cloudflare. They basically, they're having some struggles working out how to manage, ironically, dependencies that have binaries, in particular, Pydantic. Because these workers where you can have thousands of them on a given metal machine, you don't want to have a difference. You basically want to be able to have a share. Shared memory for all the different Pydantic installations, effectively. That's the thing they work out. 
They're working out. But Hood, who's my friend, who is the primary maintainer of Pyodide, works for Cloudflare. And that's basically what he's doing, is working out how to get Python running on Cloudflare's network.Swyx [00:20:19]: I mean, the nice thing is that your binary is really written in Rust, right? Yeah. Which also compiles the WebAssembly. Yeah. So maybe there's a way that you'd build... You have just a different build of Pydantic and that ships with whatever your distro for Cloudflare workers is.Samuel [00:20:36]: Yes, that's exactly what... So Pyodide has builds for Pydantic Core and for things like NumPy and basically all of the popular binary libraries. Yeah. It's just basic. And you're doing exactly that, right? You're using Rust to compile the WebAssembly and then you're calling that shared library from Python. And it's unbelievably complicated, but it works. Okay.Swyx [00:20:57]: Staying on graphs a little bit more, and then I wanted to go to some of the other features that you have in Pydantic AI. I see in your docs, there are sort of four levels of agents. There's single agents, there's agent delegation, programmatic agent handoff. That seems to be what OpenAI swarms would be like. And then the last one, graph-based control flow. Would you say that those are sort of the mental hierarchy of how these things go?Samuel [00:21:21]: Yeah, roughly. Okay.Swyx [00:21:22]: You had some expression around OpenAI swarms. Well.Samuel [00:21:25]: And indeed, OpenAI have got in touch with me and basically, maybe I'm not supposed to say this, but basically said that Pydantic AI looks like what swarms would become if it was production ready. So, yeah. I mean, like, yeah, which makes sense. Awesome. Yeah. I mean, in fact, it was specifically saying, how can we give people the same feeling that they were getting from swarms that led us to go and implement graphs? Because my, like, just call the next agent with Python code was not a satisfactory answer to people. So it was like, okay, we've got to go and have a better answer for that. It's not like, let us to get to graphs. Yeah.Swyx [00:21:56]: I mean, it's a minimal viable graph in some sense. What are the shapes of graphs that people should know? So the way that I would phrase this is I think Anthropic did a very good public service and also kind of surprisingly influential blog post, I would say, when they wrote Building Effective Agents. We actually have the authors coming to speak at my conference in New York, which I think you're giving a workshop at. Yeah.Samuel [00:22:24]: I'm trying to work it out. But yes, I think so.Swyx [00:22:26]: Tell me if you're not. yeah, I mean, like, that was the first, I think, authoritative view of, like, what kinds of graphs exist in agents and let's give each of them a name so that everyone is on the same page. So I'm just kind of curious if you have community names or top five patterns of graphs.Samuel [00:22:44]: I don't have top five patterns of graphs. I would love to see what people are building with them. But like, it's been it's only been a couple of weeks. And of course, there's a point is that. Because they're relatively unopinionated about what you can go and do with them. They don't suit them. Like, you can go and do lots of lots of things with them, but they don't have the structure to go and have like specific names as much as perhaps like some other systems do. 
I think what our agents are, which have a name and I can't remember what it is, but this basically system of like, decide what tool to call, go back to the center, decide what tool to call, go back to the center and then exit. One form of graph, which, as I say, like our agents are effectively one implementation of a graph, which is why under the hood they are now using graphs. And it'll be interesting to see over the next few years whether we end up with these like predefined graph names or graph structures or whether it's just like, yep, I built a graph or whether graphs just turn out not to match people's mental image of what they want and die away. We'll see.Swyx [00:23:38]: I think there is always appeal. Every developer eventually gets graph religion and goes, oh, yeah, everything's a graph. And then they probably over rotate and go go too far into graphs. And then they have to learn a whole bunch of DSLs. And then they're like, actually, I didn't need that. I need this. And they scale back a little bit.Samuel [00:23:55]: I'm at the beginning of that process. I'm currently a graph maximalist, although I haven't actually put any into production yet. But yeah.Swyx [00:24:02]: This has a lot of philosophical connections with other work coming out of UC Berkeley on compounding AI systems. I don't know if you know of or care. This is the Gartner world of things where they need some kind of industry terminology to sell it to enterprises. I don't know if you know about any of that.Samuel [00:24:24]: I haven't. I probably should. I should probably do it because I should probably get better at selling to enterprises. But no, no, I don't. Not right now.Swyx [00:24:29]: This is really the argument is that instead of putting everything in one model, you have more control and more maybe observability to if you break everything out into composing little models and changing them together. And obviously, then you need an orchestration framework to do that. Yeah.Samuel [00:24:47]: And it makes complete sense. And one of the things we've seen with agents is they work well when they work well. But when they. Even if you have the observability through log five that you can see what was going on, if you don't have a nice hook point to say, hang on, this is all gone wrong. You have a relatively blunt instrument of basically erroring when you exceed some kind of limit. But like what you need to be able to do is effectively iterate through these runs so that you can have your own control flow where you're like, OK, we've gone too far. And that's where one of the neat things about our graph implementation is you can basically call next in a loop rather than just running the full graph. And therefore, you have this opportunity to to break out of it. But yeah, basically, it's the same point, which is like if you have two bigger unit of work to some extent, whether or not it involves gen AI. But obviously, it's particularly problematic in gen AI. You only find out afterwards when you've spent quite a lot of time and or money when it's gone off and done done the wrong thing.Swyx [00:25:39]: Oh, drop on this. We're not going to resolve this here, but I'll drop this and then we can move on to the next thing. This is the common way that we we developers talk about this. And then the machine learning researchers look at us. And laugh and say, that's cute. And then they just train a bigger model and they wipe us out in the next training run. 
So I think there's a certain amount of we are fighting the bitter lesson here. We're fighting AGI. And, you know, when AGI arrives, this will all go away. Obviously, on Latent Space, we don't really discuss that because I think AGI is kind of this hand wavy concept that isn't super relevant. But I think we have to respect that. For example, you could do a chain of thoughts with graphs and you could manually orchestrate a nice little graph that does like. Reflect, think about if you need more, more inference time, compute, you know, that's the hot term now. And then think again and, you know, scale that up. Or you could train Strawberry and DeepSeq R1. Right.Samuel [00:26:32]: I saw someone saying recently, oh, they were really optimistic about agents because models are getting faster exponentially. And I like took a certain amount of self-control not to describe that it wasn't exponential. But my main point was. If models are getting faster as quickly as you say they are, then we don't need agents and we don't really need any of these abstraction layers. We can just give our model and, you know, access to the Internet, cross our fingers and hope for the best. Agents, agent frameworks, graphs, all of this stuff is basically making up for the fact that right now the models are not that clever. In the same way that if you're running a customer service business and you have loads of people sitting answering telephones, the less well trained they are, the less that you trust them, the more that you need to give them a script to go through. Whereas, you know, so if you're running a bank and you have lots of customer service people who you don't trust that much, then you tell them exactly what to say. If you're doing high net worth banking, you just employ people who you think are going to be charming to other rich people and set them off to go and have coffee with people. Right. And the same is true of models. The more intelligent they are, the less we need to tell them, like structure what they go and do and constrain the routes in which they take.Swyx [00:27:42]: Yeah. Yeah. Agree with that. So I'm happy to move on. So the other parts of Pydantic AI that are worth commenting on, and this is like my last rant, I promise. So obviously, every framework needs to do its sort of model adapter layer, which is, oh, you can easily swap from OpenAI to Cloud to Grok. You also have, which I didn't know about, Google GLA, which I didn't really know about until I saw this in your docs, which is generative language API. I assume that's AI Studio? Yes.Samuel [00:28:13]: Google don't have good names for it. So Vertex is very clear. That seems to be the API that like some of the things use, although it returns 503 about 20% of the time. So... Vertex? No. Vertex, fine. But the... Oh, oh. GLA. Yeah. Yeah.Swyx [00:28:28]: I agree with that.Samuel [00:28:29]: So we have, again, another example of like, well, I think we go the extra mile in terms of engineering is we run on every commit, at least commit to main, we run tests against the live models. Not lots of tests, but like a handful of them. Oh, okay. And we had a point last week where, yeah, GLA is a little bit better. GLA1 was failing every single run. One of their tests would fail. And we, I think we might even have commented out that one at the moment. So like all of the models fail more often than you might expect, but like that one seems to be particularly likely to fail. 
But Vertex is the same API, but much more reliable.Swyx [00:29:01]: My rant here is that, you know, versions of this appear in Langchain and every single framework has to have its own little thing, a version of that. I would put to you, and then, you know, this is, this can be agree to disagree. This is not needed in Pydantic AI. I would much rather you adopt a layer like Lite LLM or what's the other one in JavaScript port key. And that's their job. They focus on that one thing and they, they normalize APIs for you. All new models are automatically added and you don't have to duplicate this inside of your framework. So for example, if I wanted to use deep seek, I'm out of luck because Pydantic AI doesn't have deep seek yet.Samuel [00:29:38]: Yeah, it does.Swyx [00:29:39]: Oh, it does. Okay. I'm sorry. But you know what I mean? Should this live in your code or should it live in a layer that's kind of your API gateway that's a defined piece of infrastructure that people have?Samuel [00:29:49]: And I think if a company who are well known, who are respected by everyone had come along and done this at the right time, maybe we should have done it a year and a half ago and said, we're going to be the universal AI layer. That would have been a credible thing to do. I've heard varying reports of Lite LLM is the truth. And it didn't seem to have exactly the type safety that we needed. Also, as I understand it, and again, I haven't looked into it in great detail. Part of their business model is proxying the request through their, through their own system to do the generalization. That would be an enormous put off to an awful lot of people. Honestly, the truth is I don't think it is that much work unifying the model. I get where you're coming from. I kind of see your point. I think the truth is that everyone is centralizing around open AIs. Open AI's API is the one to do. So DeepSeq support that. Grok with OK support that. Ollama also does it. I mean, if there is that library right now, it's more or less the open AI SDK. And it's very high quality. It's well type checked. It uses Pydantic. So I'm biased. But I mean, I think it's pretty well respected anyway.Swyx [00:30:57]: There's different ways to do this. Because also, it's not just about normalizing the APIs. You have to do secret management and all that stuff.Samuel [00:31:05]: Yeah. And there's also. There's Vertex and Bedrock, which to one extent or another, effectively, they host multiple models, but they don't unify the API. But they do unify the auth, as I understand it. Although we're halfway through doing Bedrock. So I don't know about it that well. But they're kind of weird hybrids because they support multiple models. But like I say, the auth is centralized.Swyx [00:31:28]: Yeah, I'm surprised they don't unify the API. That seems like something that I would do. You know, we can discuss all this all day. There's a lot of APIs. I agree.Samuel [00:31:36]: It would be nice if there was a universal one that we didn't have to go and build.Alessio [00:31:39]: And I guess the other side of, you know, routing model and picking models like evals. How do you actually figure out which one you should be using? I know you have one. First of all, you have very good support for mocking in unit tests, which is something that a lot of other frameworks don't do. So, you know, my favorite Ruby library is VCR because it just, you know, it just lets me store the HTTP requests and replay them. That part I'll kind of skip. 
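(Editor's note: the point above about providers centralizing on OpenAI's API is commonly exploited by pointing the official OpenAI Python SDK at a compatible server via base_url; the endpoint and model names below are illustrative assumptions, not recommendations from the episode.)

```python
# Sketch: reusing the OpenAI Python SDK against an OpenAI-compatible server.
# The base_url and model name are illustrative; check your provider's docs
# (DeepSeek, Ollama, etc. document their own endpoints and model names).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # e.g. a local Ollama server
    api_key="not-needed-for-local",        # many local servers ignore this
)

resp = client.chat.completions.create(
    model="llama3.1",                      # whatever the server exposes
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```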
I think you are busy like this test model. We're like just through Python. You try and figure out what the model might respond without actually calling the model. And then you have the function model where people can kind of customize outputs. Any other fun stories maybe from there? Or is it just what you see is what you get, so to speak?Samuel [00:32:18]: On those two, I think what you see is what you get. On the evals, I think watch this space. I think it's something that like, again, I was somewhat cynical about for some time. Still have my cynicism about some of the well, it's unfortunate that so many different things are called evals. It would be nice if we could agree. What they are and what they're not. But look, I think it's a really important space. I think it's something that we're going to be working on soon, both in Pydantic AI and in LogFire to try and support better because it's like it's an unsolved problem.Alessio [00:32:45]: Yeah, you do say in your doc that anyone who claims to know for sure exactly how your eval should be defined can safely be ignored.Samuel [00:32:52]: We'll delete that sentence when we tell people how to do their evals.Alessio [00:32:56]: Exactly. I was like, we need we need a snapshot of this today. And so let's talk about eval. So there's kind of like the vibe. Yeah. So you have evals, which is what you do when you're building. Right. Because you cannot really like test it that many times to get statistical significance. And then there's the production eval. So you also have LogFire, which is kind of like your observability product, which I tried before. It's very nice. What are some of the learnings you've had from building an observability tool for LEMPs? And yeah, as people think about evals, even like what are the right things to measure? What are like the right number of samples that you need to actually start making decisions?Samuel [00:33:33]: I'm not the best person to answer that is the truth. So I'm not going to come in here and tell you that I think I know the answer on the exact number. I mean, we can do some back of the envelope statistics calculations to work out that like having 30 probably gets you most of the statistical value of having 200 for, you know, by definition, 15% of the work. But the exact like how many examples do you need? For example, that's a much harder question to answer because it's, you know, it's deep within the how models operate in terms of LogFire. One of the reasons we built LogFire the way we have and we allow you to write SQL directly against your data and we're trying to build the like powerful fundamentals of observability is precisely because we know we don't know the answers. And so allowing people to go and innovate on how they're going to consume that stuff and how they're going to process it is we think that's valuable. Because even if we come along and offer you an evals framework on top of LogFire, it won't be right in all regards. And we want people to be able to go and innovate and being able to write their own SQL connected to the API. And effectively query the data like it's a database with SQL allows people to innovate on that stuff. And that's what allows us to do it as well. I mean, we do a bunch of like testing what's possible by basically writing SQL directly against LogFire as any user could. I think the other the other really interesting bit that's going on in observability is OpenTelemetry is centralizing around semantic attributes for GenAI. So it's a relatively new project. 
A lot of it's still being added at the moment. But basically the idea is that, like, they unify how both SDKs and/or agent frameworks send observability data to any OpenTelemetry endpoint. And so, again, having that unification allows us to go and, like, basically compare different libraries, compare different models much better. That stuff's in a very, like, early stage of development. One of the things we're going to be working on pretty soon is basically, I suspect, Pydantic AI will be the first agent framework that implements those semantic attributes properly. Because, again, we control it and we can say this is important for observability, whereas most of the other agent frameworks are not maintained by people who are trying to do observability. With the exception of LangChain, where they have the observability platform, but they chose not to go down the OpenTelemetry route. So they're, like, plowing their own furrow. And, you know, they're even further away from standardization.Alessio [00:35:51]: Can you maybe just give a quick overview of how OTel ties into the AI workflows? There's kind of like the question of, is, you know, a trace, and a span, like an LLM call? Is it the agent? Is it kind of like the broader thing you're tracking? How should people think about it?Samuel [00:36:06]: Yeah, so they have a PR that I think may have now been merged from someone at IBM talking about remote agents and trying to support this concept of remote agents within GenAI. I'm not particularly compelled by that because I don't think that like that's actually by any means the common use case. But like, I suppose it's fine for it to be there. The majority of the stuff in OTel is basically defining how you would instrument a given call to an LLM. So basically the actual LLM call, what data you would send to your telemetry provider, how you would structure that. Apart from this slightly odd stuff on remote agents, most of the, like, agent-level consideration is not yet implemented, is not yet decided, effectively. And so there's a bit of ambiguity. Obviously, what's good about OTel is you can in the end send whatever attributes you like. But yeah, there's quite a lot of churn in that space and exactly how we store the data. I think that one of the most interesting things, though, is that if you think about observability, traditionally, sure, everyone would say our observability data is very important, we must keep it safe. But actually, companies work very hard to basically not have anything that sensitive in their observability data. So if you're a doctor in a hospital and you search for a drug for an STI, the SQL might be sent to the observability provider, but none of the parameters would. It wouldn't have the patient number or their name or the drug. With GenAI, that distinction doesn't exist because it's all just mixed up in the text. If you have that same patient asking an LLM what drug they should take or how to stop smoking, you can't extract the PII and not send it to the observability platform. So the sensitivity of the data that's going to end up in observability platforms is going to be, like, basically a different order of magnitude to what you would normally send to Datadog. Of course, you can make a mistake and send someone's password or their card number to Datadog. But that would be seen as a, like, mistake. Whereas in GenAI, a lot of that data is going to be sent.
And I think that's why companies like LangSmith are trying hard to offer observability on-prem, because there's a bunch of companies who are happy for Datadog to be cloud hosted, but want self-hosting for this observability stuff with GenAI.Alessio [00:38:09]: And are you doing any of that today? Because I know in each of the spans you have like the number of tokens, you have the context, you're just storing everything. And then you're going to offer kind of like a self-hosting for the platform, basically. Yeah. Yeah.Samuel [00:38:23]: So we have scrubbing roughly equivalent to what the other observability platforms have. So if we, you know, if we see password as the key, we won't send the value. But like, like I said, that doesn't really work in GenAI. So we're accepting we're going to have to store a lot of data and then we'll offer self-hosting for those people who can afford it and who need it.
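To illustrate why that kind of key-based scrubbing breaks down for GenAI payloads, here is a minimal sketch; the key list and behaviour are invented for illustration, not LogFire's actual implementation.

```python
# Sketch: classic observability scrubbing redacts values whose *keys* look
# sensitive, but it cannot see PII embedded inside free-form prompt text.
SENSITIVE_KEYS = {"password", "api_key", "card_number", "patient_id"}

def scrub(attributes: dict) -> dict:
    scrubbed = {}
    for key, value in attributes.items():
        scrubbed[key] = "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
    return scrubbed

span_attrs = {
    "password": "hunter2",  # caught: the key matches
    "prompt": "I'm patient 12345, what dose of drug X should I take?",  # not caught
}
print(scrub(span_attrs))
```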
Alessio [00:38:42]: And then this is, I think, the first time that most of the workload's performance is depending on a third party. You know, like if you're looking at Datadog data, usually it's your app that is driving the latency and like the memory usage and all of that. Here you're going to have spans that maybe take a long time to perform because the LLM API is not working or because OpenAI is kind of like overwhelmed. Do you do anything there, since like the provider is almost like the same across customers? You know, like, are you trying to surface these things for people and say, hey, this was like a very slow span, but actually all customers using OpenAI right now are seeing the same thing, so maybe don't worry about it, or?Samuel [00:39:20]: Not yet. We do a few things that people don't generally do in OTel. So we send information at the beginning of a trace as well as, sorry, at the beginning of a span, as well as when it finishes. By default, OTel only sends you data when the span finishes. So if you think about a request which might take like 20 seconds, even if some of the intermediate spans finished earlier, you can't basically place them on the page until you get the top level span. And so if you're using standard OTel, you can't show anything until those requests are finished. When those requests are taking a few hundred milliseconds, it doesn't really matter. But when you're doing GenAI calls or when you're, like, running a batch job that might take 30 minutes, that latency of not being able to see the span is, like, crippling to understanding your application. And so we do a bunch of slightly complex stuff to basically send data about a span as it starts, which is closely related. Yeah.Alessio [00:40:09]: Any thoughts on all the other people trying to build on top of OpenTelemetry in different languages, too? There's like the OpenLLMetry project, which doesn't really roll off the tongue. But how do you see the future of these kind of tools? Is everybody going to have to build? Why does everybody want to build? They want to build their own open source observability thing to then sell?Samuel [00:40:29]: I mean, we are not going off and trying to instrument the likes of the OpenAI SDK with the new semantic attributes, because at some point that's going to happen and it's going to live inside OTel and we might help with it. But we're a tiny team. We don't have time to go and do all of that work. So OpenLLMetry, like, interesting project. But I suspect eventually most of that semantic instrumentation of the big SDKs will live, like I say, inside the main OpenTelemetry repo, I suppose. What happens to the agent frameworks, what data you basically need at the framework level to get the context, is kind of unclear. I don't think we know the answer yet. But I mean, I was on the, I guess this is kind of semi-public, because I was on the OpenTelemetry call last week talking about GenAI. And there was someone from Arize talking about the challenges they have trying to get OpenTelemetry data out of LangChain, where it's not, like, natively implemented. And obviously they're having quite a tough time. And I was realizing, I hadn't really realized this before, but how lucky we are to primarily be talking about our own agent framework, where we have the control, rather than trying to go and instrument other people's.Swyx [00:41:36]: Sorry, I actually didn't know about this semantic conventions thing. It looks like, yeah, it's merged into main OTel. What should people know about this? I had never heard of it before.Samuel [00:41:45]: Yeah, I think it looks like a great start. I think there's some unknowns around how you send the messages that go back and forth, which is kind of the most important part. It's the most important thing of all. And that is moved out of attributes and into OTel events. OTel events in turn are moving from being on a span to being their own top-level API where you send data. So there's a bunch of churn still going on. I'm impressed by how fast the OTel community is moving on this project. I guess they, like everyone else, get that this is important, and it's something that people are crying out to get instrumentation of. So I'm kind of pleasantly surprised at how fast they're moving, but it makes sense.Swyx [00:42:25]: I'm just kind of browsing through the specification. I can already see that this basically bakes in whatever the previous paradigm was. So now they have gen_ai.usage.prompt_tokens and gen_ai.usage.completion_tokens. And obviously now we have reasoning tokens as well. And then only one form of sampling, which is top-p. You're basically baking in or sort of reifying things that you think are important today, but it's not a super foolproof way of doing this for the future. Yeah.Samuel [00:42:54]: I mean, that's what's neat about OTel is you can always go and send another attribute and that's fine. It's just there are a bunch that are agreed on. But I would say, you know, to come back to your previous point about whether or not we should be relying on one centralized abstraction layer, this stuff is moving so fast that if you start relying on someone else's standard, you risk basically falling behind because you're relying on someone else to keep things up to date.Swyx [00:43:14]: Or you fall behind because you've got other things going on.Samuel [00:43:17]: Yeah, yeah. That's fair. That's fair.
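To ground what emitting those gen_ai.* attributes looks like, here is a minimal sketch using the OpenTelemetry Python API; the attribute names follow the draft GenAI conventions discussed above and have been churning (the prompt/completion token names have since been revised), and call_model is a hypothetical stand-in for a real client call.

```python
# Sketch: manually instrumenting one LLM call with (draft) GenAI semantic attributes.
from opentelemetry import trace

tracer = trace.get_tracer("my-app")

def call_model(prompt: str) -> dict:
    # Hypothetical stand-in for your real provider call; returns text plus token counts.
    return {"text": "stub answer", "prompt_tokens": 12, "completion_tokens": 3}

def chat(prompt: str) -> str:
    with tracer.start_as_current_span("chat gpt-4o") as span:
        # Attribute names track the in-progress gen_ai.* conventions; they may
        # differ in the spec version you target.
        span.set_attribute("gen_ai.system", "openai")
        span.set_attribute("gen_ai.request.model", "gpt-4o")
        completion = call_model(prompt)
        span.set_attribute("gen_ai.usage.prompt_tokens", completion["prompt_tokens"])
        span.set_attribute("gen_ai.usage.completion_tokens", completion["completion_tokens"])
        return completion["text"]

print(chat("Hello"))
```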
Swyx [00:43:19]: Any other observations just about building LogFire, actually? Let's just talk about this. So you announced LogFire. I was kind of only familiar with LogFire because of your Series A announcement. I actually thought you were making a separate company. I remember some amount of confusion with you when that came out. So to be clear, it's Pydantic LogFire and the company is one company that has kind of two products, an open source thing and an observability thing, correct? Yeah. I was just kind of curious, like, any learnings building LogFire? So classic question is, do you use ClickHouse? Is this like the standard persistence layer? Any learnings doing that?Samuel [00:43:54]: We don't use ClickHouse. We started building our database with ClickHouse, moved off ClickHouse onto Timescale, which is a Postgres extension to do analytical databases. Wow. And then moved off Timescale onto DataFusion. And we're basically now building, it's DataFusion, but it's kind of our own database. Bogomil is not entirely happy that we went through three databases before we chose one. I'll say that. But like, we've got to the right one in the end. I think we could have realized sooner that Timescale wasn't right. They both taught us a lot and we're in a great place now. But like, yeah, it's been a real journey on the database in particular.Swyx [00:44:28]: Okay. So, you know, as a database nerd, I have to like double click on this, right? So ClickHouse is supposed to be the ideal backend for anything like this. And then moving from ClickHouse to Timescale is another counterintuitive move that I didn't expect because, you know, Timescale is like an extension on top of Postgres, not super meant for like high volume logging. But like, yeah, tell us those decisions.Samuel [00:44:50]: So at the time, ClickHouse did not have good support for JSON. I was speaking to someone yesterday and said ClickHouse doesn't have good support for JSON and got roundly stepped on because apparently it does now. So they've obviously gone and built their proper JSON support. But like back when we were trying to use it, I guess a year ago or a bit more than a year ago, everything happened to be a map and maps are a pain to try and do like looking up JSON type data. And obviously all these attributes, everything you're talking about there in terms of the GenAI stuff, you can choose to make them top level columns if you want. But the simplest thing is just to put them all into a big JSON pile. And that was a problem with ClickHouse. Also, ClickHouse had some really ugly edge cases: like by default, or at least until I complained about it a lot, ClickHouse thought that two nanoseconds was longer than one second because they compared intervals just by the number, not the unit. And I complained about that a lot. And then they changed it to raise an error and just say you have to have the same unit. Then I complained a bit more. And I think, as I understand it, now they convert between units. But stuff like that, when a lot of what you're doing is comparing the duration of spans, was really painful. Also things like you can't subtract two date times to get an interval. You have to use the date sub function. But like the fundamental thing is because we want our end users to write SQL, the like quality of the SQL, how easy it is to write, matters way more to us than if you're building like a platform on top where your developers are going to write the SQL. And once it's written and it's working, you don't mind too much. So I think that's like one of the fundamental differences. The other problem that I have with both ClickHouse and, in fact, Timescale is that the ultimate architecture, the, like, Snowflake architecture of binary data in object store queried with some kind of cache from nearby, they both have it, but it's closed source and you only get it if you go and use their hosted versions.
And so even if we had got through all the problems with Timescale or ClickHouse, we would end up, like, you know, they would want to be taking their 80% margin, and then we would be wanting to take ours, and that would basically leave us less space for margin. Whereas DataFusion is properly open source; all of that same tooling is open source. And for us as a team of people with a lot of Rust expertise, DataFusion, which is implemented in Rust, we can literally dive into it and go and change it. So, for example, I found that there were some slowdowns in DataFusion's string comparison kernel for doing, like, string contains. And it's just Rust code. And I could go and rewrite the string comparison kernel to be faster. Or, for example, DataFusion, when we started using it, didn't have JSON support. Obviously, as I've said, it's something we can do. It's something we needed. I was able to go and implement that in a weekend using our JSON parser that we built for Pydantic Core. So it's the fact that, like, DataFusion is for us the perfect mixture of a toolbox to build a database with, not a database. And we can go and implement stuff on top of it in a way that, like, if you were trying to do that in Postgres or in ClickHouse. I mean, ClickHouse would be easier because it's C++, relatively modern C++. But like as a team of people who are not C++ experts, that's much scarier than DataFusion for us.Swyx [00:47:47]: Yeah, that's a beautiful rant.Alessio [00:47:49]: That's funny. Most people don't think they have agency on these projects. They're kind of like, oh, I should use this or I should use that. They're not really like, what should I pick so that I contribute the most back to it? You know, so but I think you obviously have an open source first mindset. So that makes a lot of sense.Samuel [00:48:05]: I think if we were probably better as a startup, a better startup and faster moving and just like headlong determined to get in front of customers as fast as possible, we should have just started with ClickHouse. I hope that long term we're in a better place for having worked with DataFusion. We're quite engaged now with the DataFusion community. Andrew Lamb, who maintains DataFusion, is an advisor to us. We're in a really good place now. But yeah, it's definitely slowed us down relative to just like building on ClickHouse and moving as fast as we can.
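For readers who want to poke at DataFusion the way Samuel describes, here is a minimal sketch using the DataFusion Python bindings; the exact function names (SessionContext, register_csv, sql) are my assumptions about the current API, so treat this as illustrative rather than authoritative.

```python
# Sketch: DataFusion as a "toolbox to build a database with" -- register a file
# and run SQL over it in-process. API names are assumptions; check the docs.
from datafusion import SessionContext

ctx = SessionContext()
ctx.register_csv("spans", "spans.csv")  # e.g. columns: name, duration_ms, model

df = ctx.sql(
    "SELECT name, avg(duration_ms) AS avg_ms "
    "FROM spans GROUP BY name ORDER BY avg_ms DESC LIMIT 10"
)
df.show()
```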
And as far as I know, we're the only people really trying to do that. I mean, I think Datadog is starting in that direction. And to be honest, I think Datadog is a much like scarier company to compete with than the AI specific observability platforms. Because in my opinion, and I've also heard this from lots of customers, AI specific observability where you don't see everything else going on in your app is not actually that useful. Our hope is that we can build the first general purpose observability platform with first class support for AI. And that we have this open source heritage of putting developer experience first that other companies haven't done. For all I'm a fan of Datadog and what they've done. If you search Datadog logging Python. And you just try as a like a non-observability expert to get something up and running with Datadog and Python. It's not trivial, right? That's something Sentry have done amazingly well. But like there's enormous space in most of observability to do DX better.Alessio [00:50:27]: Since you mentioned Sentry, I'm curious how you thought about licensing and all of that. Obviously, your MIT license, you don't have any rolling license like Sentry has where you can only use an open source, like the one year old version of it. Was that a hard decision?Samuel [00:50:41]: So to be clear, LogFire is co-sourced. So Pydantic and Pydantic AI are MIT licensed and like properly open source. And then LogFire for now is completely closed source. And in fact, the struggles that Sentry have had with licensing and the like weird pushback the community gives when they take something that's closed source and make it source available just meant that we just avoided that whole subject matter. I think the other way to look at it is like in terms of either headcount or revenue or dollars in the bank. The amount of open source we do as a company is we've got to be open source. We're up there with the most prolific open source companies, like I say, per head. And so we didn't feel like we were morally obligated to make LogFire open source. We have Pydantic. Pydantic is a foundational library in Python. That and now Pydantic AI are our contribution to open source. And then LogFire is like openly for profit, right? As in we're not claiming otherwise. We're not sort of trying to walk a line if it's open source. But really, we want to make it hard to deploy. So you probably want to pay us. We're trying to be straight. That it's to pay for. We could change that at some point in the future, but it's not an immediate plan.Alessio [00:51:48]: All right. So the first one I saw this new I don't know if it's like a product you're building the Pydantic that run, which is a Python browser sandbox. What was the inspiration behind that? We talk a lot about code interpreter for lamps. I'm an investor in a company called E2B, which is a code sandbox as a service for remote execution. Yeah. What's the Pydantic that run story?Samuel [00:52:09]: So Pydantic that run is again completely open source. I have no interest in making it into a product. We just needed a sandbox to be able to demo LogFire in particular, but also Pydantic AI. So it doesn't have it yet, but I'm going to add basically a proxy to OpenAI and the other models so that you can run Pydantic AI in the browser. See how it works. Tweak the prompt, et cetera, et cetera. And we'll have some kind of limit per day of what you can spend on it or like what the spend is. The other thing we wanted to b

Thinking Elixir Podcast
239: Scaling to Unicorn Status

Thinking Elixir Podcast

Play Episode Listen Later Feb 4, 2025 29:09


News includes an impressive case study from Remote showing how they scaled Elixir to support nearly 300 engineers and reach unicorn status, Tailwind CSS 4.0's major release with Phoenix integration in progress, Chris McCord teasing an exciting AI code generator project on Fly.io, the release of Elixir LangChain v0.3.0 with expanded support for multiple AI providers, ElixirConfEU 2025 tickets going on sale in Kraków, and more! Show Notes online - http://podcast.thinkingelixir.com/239 (http://podcast.thinkingelixir.com/239) Elixir Community News https://elixir-lang.org/blog/2025/01/21/remote-elixir-case/ (https://elixir-lang.org/blog/2025/01/21/remote-elixir-case/?utm_source=thinkingelixir&utm_medium=shownotes) – New case study about Remote, a unicorn company using Elixir as their primary technology with nearly 300 engineers. https://github.com/sasa1977/boundary (https://github.com/sasa1977/boundary?utm_source=thinkingelixir&utm_medium=shownotes) – Remote uses Saša Jurić's Boundary library to help enforce boundaries in their monolithic codebase. https://www.reddit.com/r/elixir/comments/1i77ia9/comment/m8il2ho/ (https://www.reddit.com/r/elixir/comments/1i77ia9/comment/m8il2ho/?utm_source=thinkingelixir&utm_medium=shownotes) – Discussion about the type spec future in Elixir, with plans to replace Dialyzer typespecs in versions 1.19 and 1.20. https://bsky.app/profile/zachdaniel.dev/post/3lgqdugp7zs2b (https://bsky.app/profile/zachdaniel.dev/post/3lgqdugp7zs2b?utm_source=thinkingelixir&utm_medium=shownotes) – Ash installer now supports Oban integration via a flag option. https://tailwindcss.com/blog/tailwindcss-v4 (https://tailwindcss.com/blog/tailwindcss-v4?utm_source=thinkingelixir&utm_medium=shownotes) – Tailwind CSS 4.0 released with major changes including moving theme configuration to CSS variables. https://tailwindcss.com/docs/upgrade-guide (https://tailwindcss.com/docs/upgrade-guide?utm_source=thinkingelixir&utm_medium=shownotes) – Comprehensive upgrade guide for Tailwind CSS v4. https://github.com/phoenixframework/phoenix/pull/5990 (https://github.com/phoenixframework/phoenix/pull/5990?utm_source=thinkingelixir&utm_medium=shownotes) – WIP PR to support Tailwind v4 in Phoenix. https://bsky.app/profile/zachdaniel.dev/post/3lggmuk4dis2x (https://bsky.app/profile/zachdaniel.dev/post/3lggmuk4dis2x?utm_source=thinkingelixir&utm_medium=shownotes) – Zach Daniel shares how Tailwind v4 changes will improve igniter's utility configuration capabilities. https://github.com/brainlid/langchain (https://github.com/brainlid/langchain?utm_source=thinkingelixir&utm_medium=shownotes) – Elixir LangChain v0.3.0 released with expanded support for OpenAI, Anthropic, Gemini, Llama, and more. https://x.com/chris_mccord/status/1880377175200669770 (https://x.com/chris_mccord/status/1880377175200669770?utm_source=thinkingelixir&utm_medium=shownotes) – Chris McCord teases new Fly.io AI code generator project with IDE/terminal integration. https://x.com/chris_mccord/status/1880392153924530376 (https://x.com/chris_mccord/status/1880392153924530376?utm_source=thinkingelixir&utm_medium=shownotes) – Demo video of Chris McCord's AI-integrated editor creating a multiplayer Phoenix LiveView app. https://www.elixirconf.eu/ (https://www.elixirconf.eu/?utm_source=thinkingelixir&utm_medium=shownotes) – ElixirConfEU 2025 tickets on sale, happening May 15-16 in Kraków Poland & Virtual. Do you have some Elixir news to share? 
Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Find us online - Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com) - Message the show - X (https://x.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen on X - @brainlid (https://x.com/brainlid) - Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel on Bluesky - @david.bernheisel.com (https://bsky.app/profile/david.bernheisel.com) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)

En.Digital Podcast
La Tertul-IA #43: La IA amenaza las matemáticas, agentes ambientales y AI washing

En.Digital Podcast

Play Episode Listen Later Jan 31, 2025 72:13


This week we review OpenAI's news with Tasks and Operator, the chemistry Nobel laureate's words saying that AI will put an end to technical professions, the ambient agents trend, the correlation between Google search results and appearing as LLM answers, and much more. Subscribe to the Tertul-IA newsletter and our podcast at https://tertulia.mumbler.io/ (00:00) Intro and introductions. (02:25) OpenAI Tasks and Operator (19:08) The chemistry Nobel laureate says AI will put an end to technical careers (35:44) Ambient Agents (46:11) A study correlates mentions in LLM answers with ranking in Google Search (53:30) The ALIA family of Spanish-language AI models (1:03:30) AI Washing. Sources: OpenAI Tasks and Operator https://help.openai.com/en/articles/10291617-scheduled-tasks-in-chatgpt https://openai.com/index/introducing-operator/ What's next for agentic AI? LangChain founder looks to ambient agents https://venturebeat.com/ai/whats-next-for-agentic-ai-langchain-founder-looks-to-ambient-agents/ A study shows a correlation for brands between their Google ranking and mentions in ChatGPT answers https://www.linkedin.com/posts/oscarlopezcuesta_google-search-rankings-correlate-with-chatgpt-activity-7284433988855705600-7FZb/ ALIA: the family of Spanish models https://www.linkedin.com/posts/mintradigital_alia-ugcPost-7287164111388315648-TOgC/ India needs regulatory guidelines to combat AI washing https://www.lexology.com/library/detail.aspx?g=3cf1b95b-3094-479e-a4c6-2c7352164d88

IFTTD - If This Then Dev
#307.src - Langchain: Faire de l'IA comme des Lego avec Maxime Thoonsen

IFTTD - If This Then Dev

Play Episode Listen Later Jan 29, 2025 67:49


"c'est hyper facile en fait" Le D.E.V. de la semaine est Maxime Thoonsen, fondateur @ AGO. Maxime nous parle des avantages de LangChain, un outil d'intelligence artificielle générative qu'il juge crucial pour l'exécution de tâches complexes. Comparant les agents IA modernes aux chatbots traditionnels, Maxime met en lumière la capacité des IA à interpréter des requêtes textuelles. Il souligne l'utilité de LangChain face à d'autres frameworks, mettant en avant son aide dans l'automatisation du codage. Maxime insiste aussi sur l'émergence des outils no-code qui, selon lui, rendent l'IA accessible aux non-développeurs. Face à l'avenir technologique, il encourage l'apprentissage continu et l'exploitation de ces nouveaux outils.Chapitrages00:00:54 : Introduction à LongChain00:02:59 : Présentation de Maxime00:20:06 : Chatbots et IA Agentiques00:37:50 : Évolution des Agents et Développement00:52:45 : Outils Graphiques pour Non-Développeurs01:02:43 : Création d'Agents Sans Code01:06:04 : Conclusion et Recommandations Liens évoqués pendant l'émission "lean tech manifesto" de TheodoNot the end of the world **Recrutez les meilleurs développeurs grâce à Indeed !** "Trouver des développeurs compétents et passionnés, comme les auditeurs d'If This Then Dev, peut être un vrai défi. Avec Indeed, connectez-vous rapidement avec des candidats qualifiés qui sauront s'épanouir dans votre entreprise. Profitez dès maintenant d'un crédit de 100 euros pour sponsoriser votre offre d'emploi : go.indeed.com/IFTTD."🎙️ Soutenez le podcast If This Then Dev ! 🎙️ Chaque contribution aide à maintenir et améliorer nos épisodes. Cliquez ici pour nous soutenir sur Tipeee 🙏Archives | Site | Boutique | TikTok | Discord | Twitter | LinkedIn | Instagram | Youtube | Twitch | Job Board |

Les Cast Codeurs Podcast
LCC 321 - Les évènements écran large

Les Cast Codeurs Podcast

Play Episode Listen Later Jan 21, 2025 73:53


Arnaud and Emmanuel discuss Java versions, give a summary of the WebAssembly ecosystem, discuss the new Model Context Protocol, talk about observability, notably Wide Events, and plenty of other things. Recorded on January 17, 2025. Download the episode LesCastCodeurs-Episode–321.mp3 or watch it as a video on YouTube. News Languages Java trends by InfoQ https://www.infoq.com/articles/java-trends-report–2024/ Java 17 finally overtakes 11 and 8, at ~30/33%; Java 21 is at 1.4%; Commonhaus appears; GraalVM in early majority; Spring AI and LangChain4j among the innovators; SB 3 sees its adoption grow. A good summary of WebAssembly and the various specs such as WASM GC, WASI, WIT, etc. https://2ality.com/2025/01/webassembly-language-ecosystem.html WebAssembly (Wasm) is a binary instruction format for a stack-based virtual machine, enabling code portability and efficiency. Wasm evolved from asm.js, a subset of JavaScript that could run at near-native speeds. WASI (WebAssembly System Interface) lets Wasm run outside web browsers, providing APIs for the file system, CLI, HTTP, etc. The WebAssembly component model enables interoperability between Wasm languages using WIT (Wasm Interface Type) and a canonical ABI. Wasm components consist of a core module and WIT interfaces for imports/exports, enabling language-independent interaction. WIT interfaces describe types and functions, while WIT worlds define a component's capabilities and needs (imports/exports). Wasm package management is handled by Warg, a protocol for Wasm package registries. A survey showed that Rust is the most used Wasm language, followed by Kotlin and C++; many other languages are also emerging. A memory-bounded counting algorithm has been invented https://www.quantamagazine.org/computer-scientists-invent-an-efficient-new-way-to-count–20240516/ It removes a word at random, but with a known probability, when space needs to be reclaimed; this happens in rounds and the removal probability is increased at each round, so in the end the number of words divided by the probability of having been kept gives an approximate but fairly precise measure.
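As a rough sketch of that counting idea (keep a bounded sample, halve the keep-probability each time it fills up, then divide by the final probability), in Python:

```python
import random

def estimate_distinct(stream, max_size=1000):
    """Estimate the number of distinct items in a stream using at most
    max_size items of memory (CVM-style rounds of random eviction)."""
    keep_prob = 1.0          # probability that a given distinct item survived so far
    sample = set()
    for item in stream:
        sample.discard(item)                 # forget any previous decision for this item
        if random.random() < keep_prob:
            sample.add(item)
        if len(sample) >= max_size:          # out of space: start a new round
            sample = {x for x in sample if random.random() < 0.5}
            keep_prob /= 2                   # survivors are kept with half the probability
    return len(sample) / keep_prob

# Example: a stream with ~50,000 distinct words, counted with 1,000 slots.
words = [f"word{random.randint(0, 49_999)}" for _ in range(500_000)]
print(round(estimate_distinct(words)))
```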
Libraries Spring contributions move from the CLA to the DCO https://spring.io/blog/2025/01/06/hello-dco-goodbye-cla-simplifying-contributions-to-spring At first manual, but now even automated; the CLA is a complex legal document that can limit contributions. The DCO comes from Linux, I believe, and is super simple: you agree that the license of the contribution is that of the project, and that the code is public and distributed in perpetuity; it relies on git's -s flag for the sign-off. Writing an MCP server in Quarkus https://quarkus.io/blog/mcp-server/ MCP is a protocol proposed by Anthropic to integrate tools orchestrated by LLMs. MCP is fresh and goes further than tools: it offers the notion of resources (files), functions (tools), and pre-built prompts to call the tool in the best way. We'll come back to it with agents in a following article. There is a Quarkus extension to simplify the coding; a more detailed article on the Quarkus integration https://quarkus.io/blog/quarkus-langchain4j-mcp/ GreenMail, a mini mail server in Java https://greenmail-mail-test.github.io/greenmail/#features-api Useful for integration tests. Supports SMTP, POP3 and IMAP with TLS/SSL. Offers JUnit and Spring integrations. A mini UI and REST APIs let you interact with the server if, for example, you share it in a container (there is no existing Testcontainers integration, but it would not be complicated to write). Infrastructure Docker Bake in a visual way https://dev.to/aurelievache/understanding-docker-part–47-docker-bake–4p05 Docker Bake proposes using configuration files (HCL format) to run your image builds and docker compose; roughly, see this DSL as a very simplified Makefile for docker commands, which can often have a few too many parameters. Datadog continues to expand with the acquisition of Quickwit https://www.datadoghq.com/blog/datadog-acquires-quickwit/ An open-source log search solution that can be deployed on-premise and in the cloud https://quickwit.io/ Logs no longer leave your environment, which helps meet security, privacy and regulatory requirements. Web 33 JavaScript concepts https://github.com/leonardomso/33-js-concepts Call Stack, Primitive Types, Value Types and Reference Types, Implicit, Explicit, Nominal, Structuring and Duck Typing, == vs === vs typeof, Function Scope, Block Scope and Lexical Scope, Expression vs Statement, IIFE, Modules and Namespaces, Message Queue and Event Loop, setTimeout, setInterval and requestAnimationFrame, JavaScript Engines, Bitwise Operators, Type Arrays and Array Buffers, DOM and Layout Trees, Factories and Classes, this, call, apply and bind, new, Constructor, instanceof and Instances, Prototype Inheritance and Prototype Chain, Object.create and Object.assign, map, reduce, filter, Pure Functions, Side Effects, State Mutation and Event Propagation, Closures, High Order Functions, Recursion, Collections and Generators, Promises, async/await, Data Structures, Expensive Operation and Big O
Notation, Algorithms, Inheritance, Polymorphism and Code Reuse, Design Patterns, Partial Applications, Currying, Compose and Pipe, Clean Code Data and Artificial Intelligence Phi 4 and small language models https://techcommunity.microsoft.com/blog/aiplatformblog/introducing-phi–4-microsoft%e2%80%99s-newest-small-language-model-specializing-in-comple/4357090 Phi 4 is an SLM aimed notably at local use; 14B parameters; a nice progression of ~20 points on an aggregated score, which brings it close to Llama 3.3 and its 70B parameters; good at math (synthetic data set). How to use Gemini 2.0 Flash Thinking (Google's model that does chain-of-thought style reasoning) in Java with LangChain4j https://glaforge.dev/posts/2024/12/20/lets-think-with-gemini–2-thinking-mode-and-langchain4j/ Google released Gemini 2.0 Flash, a small model in the Gemini family; the “thinking mode” simulates the chains of reasoning (Chain of Thought, etc.) and decomposes complex tasks much more into several sub-tasks; an example is shown of the model wrestling with the problem. Anthropic's recommendations on agent systems https://www.anthropic.com/research/building-effective-agents It defines agents and workflows. It does not recommend starting with frameworks (LangChain, Amazon Bedrock AI Agent, etc.) - the famous debate about abstraction; many patterns can be implemented in a few lines without a framework. Several building blocks of increasing complexity. Augmented LLM (RAG, memory, etc.): Anthropic says LLMs can coordinate this via MCP, for example. Second: the prompt chaining workflow, with gates between successive LLM calls; it favors precision over latency since tasks are decomposed into several LLM calls. Routing workflow: classify an input and choose the best route: separation of responsibilities. Parallelization workflow: LLMs work in parallel on a task and an aggregator does the synthesis; parallelization by slicing up the task, or by voting on the best answer. Orchestrator-workers workflow: when the tasks are not bounded or known in advance (like the number of code files to change) - the sub-tasks are not predefined. Evaluator-optimizer workflow: one LLM proposes an answer, another LLM evaluates it and asks for a better answer if needed. Agents: a command or an interaction with the human, then autonomous, even if it can come back to ask the human for clarification. Agents are often LLMs using tools to modify the environment and reacting to feedback in a loop; ideal for open-ended problems where the number of steps is not known. It recommends going in with progressively increasing complexity.
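To make the prompt-chaining workflow just described concrete, here is a minimal sketch; call_llm is a hypothetical stand-in for whatever client you actually use, and the gate condition is purely illustrative.

```python
# Sketch of the "prompt chaining" workflow pattern: decompose a task into
# sequential LLM calls with a simple gate between steps.
def write_marketing_copy(product_brief: str, call_llm) -> str:
    outline = call_llm(f"Write a five-bullet outline for marketing copy about: {product_brief}")
    # Gate: sanity-check the intermediate result before spending more tokens.
    if outline.count("\n") < 4:
        raise ValueError("outline did not come back in the expected shape")
    draft = call_llm(f"Expand this outline into two short paragraphs:\n{outline}")
    return call_llm(f"Translate the following copy into French, keeping the tone:\n{draft}")

# Example usage with a fake model so the sketch runs end to end:
if __name__ == "__main__":
    fake = lambda prompt: "- a\n- b\n- c\n- d\n- e" if "outline" in prompt else "some copy"
    print(write_marketing_copy("a terminal emulator", fake))
```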
AI isn't cheap https://techcrunch.com/2025/01/05/openai-is-losing-money-on-its-pricey-chatgpt-pro-plan-ceo-sam-altman-says/ OpenAI announces that even with subscriptions at $200/month they do not cover the associated costs… When will the AI bubble burst? Tooling Ghostty, a new terminal for Linux and macOS: https://ghostty.org/ Initiated by Mitchell Hashimoto (HashiCorp). Ghostty is a native terminal emulator for macOS and Linux. It is written in Swift and uses AppKit and SwiftUI on macOS, and in Zig using the GTK4 C API on Linux. It uses native UI components and standard keyboard and mouse shortcuts. It supports Quick Look, Force Touch and other macOS-specific features. Ghostty tries to provide a rich set of features useful for everyday use. How Pinterest uses Honeycomb to improve its CI https://medium.com/pinterest-engineering/how-pinterest-leverages-honeycomb-to-enhance-ci-observability-and-improve-ci-build-stability–15eede563d75 Pinterest uses Honeycomb to improve the observability of its continuous integration (CI). Honeycomb lets Pinterest visualize build metrics, analyze trends and make data-driven decisions. Honeycomb also helps Pinterest identify the potential causes of build failures and streamline on-call tasks. Honeycomb can also be used to track local iOS build metrics alongside machine details, which helps Pinterest prioritize laptop upgrades for developers. Methodologies Following our episode on the different types of documentation, this article covers good practices for tutorials https://refactoringenglish.com/chapters/rules-for-software-tutorials/ Write tutorials for beginners, avoiding jargon and complex terminology. Promise a clear outcome in the title and explain the goal in the introduction. Show the end result early to reduce ambiguity. Make code snippets copy-pastable, avoiding shell prompts and interactive commands. Use the long versions of command-line flags for clarity. Separate user-defined values from reusable logic using environment variables or named constants. Spare the reader pointless tasks by using scripts. Let computers evaluate conditional logic, not the reader. Keep the code in a working state throughout the tutorial. Teach one thing per tutorial and minimize dependencies. Wide events, a “new” concept in observability https://jeremymorrell.dev/blog/a-practitioners-guide-to-wide-events/ another article https://isburmistrov.substack.com/p/all-you-need-is-wide-events-not-metrics The idea is to log events (think JSON logs) with as much information as possible: the machine, the RAM, the app version, the user, the build number that produced the app, the latest PR, etc. It lets you filter, group by, spot visual correlations very quickly and zoom in: hey, sales are down 20%; hey, it actually comes from the Android app; hey, it doesn't correlate with the app version, but it does with the OS version! The second article is easy to read; the first is an exhaustive usage guide for the concept.
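As a small illustration of what a wide event can look like in practice, here is a minimal sketch; the field names and values are invented for illustration, not a prescribed schema.

```python
# Sketch of a "wide event": one JSON log line per unit of work, packed with
# context so you can filter and group-by later (version, build, platform, user...).
import json, logging, time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("wide-events")

def emit_checkout_event(order: dict, user: dict, request: dict) -> None:
    event = {
        "timestamp": time.time(),
        "event": "checkout_completed",
        "order_id": order["id"],
        "amount_eur": order["total"],
        "user_id": user["id"],
        "user_plan": user["plan"],
        "app_version": "2025.01.17",   # illustrative values
        "build_sha": "abc1234",
        "platform": request["platform"],
        "os_version": request["os_version"],
        "duration_ms": request["duration_ms"],
    }
    log.info(json.dumps(event))

emit_checkout_event(
    {"id": 42, "total": 19.9},
    {"id": "u1", "plan": "pro"},
    {"platform": "android", "os_version": "15", "duration_ms": 230},
)
```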
Between arguing and giving it five minutes https://signalvnoise.com/posts/3124-give-it-five-minutes We often want to argue, that is, ask questions while already emotionally holding the answer, but that brings a lot of verbiage; give the idea five minutes, time to think about it, before arguing. Law, society and organization Federal judges halt the principle of net neutrality https://www.lemonde.fr/pixels/article/2025/01/03/les-etats-unis-reviennent-en-arriere-sur-le-principe-de-la-neutralite-du-net_6479575_4408996.html?lmd_medium=al&lmd_campaign=envoye-par-appli&lmd_creation=ios&lmd_source=default Net neutrality is the prohibition on treating a packet differently depending on its sender, for example a Netflix packet being slowed down versus an Amazon packet. Donald Trump is against this neutrality. We'll see the concrete impacts in a less regulated market. Beginner corner A short article on float vs double in Java https://www.baeldung.com/java-float-vs-double 4 vs 8 bytes; max precision of 7 vs 15 digits; scale of 10^38 vs 10^308 (order of magnitude); performance roughly similar, except maybe for AI models, which sometimes favor a smaller size; beware of overflow and the accumulation of approximation errors; BigDecimal. Conferences The list of conferences comes from Developers Conferences Agenda/List by Aurélie Vache and contributors: 20 January 2025: Elastic{ON} - Paris (France) 22–25 January 2025: SnowCamp 2025 - Grenoble (France) 24–25 January 2025: Agile Games Île-de-France 2025 - Paris (France) 6–7 February 2025: Touraine Tech - Tours (France) 21 February 2025: LyonJS 100 - Lyon (France) 28 February 2025: Paris TS La Conf - Paris (France) 6 March 2025: DevCon #24 : 100% IA - Paris (France) 13 March 2025: Oracle CloudWorld Tour Paris - Paris (France) 14 March 2025: Rust In Paris 2025 - Paris (France) 19–21 March 2025: React Paris - Paris (France) 20 March 2025: PGDay Paris - Paris (France) 20–21 March 2025: Agile Niort - Niort (France) 25 March 2025: ParisTestConf - Paris (France) 26–29 March 2025: JChateau Unconference 2025 - Cour-Cheverny (France) 27–28 March 2025: SymfonyLive Paris 2025 - Paris (France) 28 March 2025: DataDays - Lille (France) 28–29 March 2025: Agile Games France 2025 - Lille (France) 3 April 2025: DotJS - Paris (France) 3 April 2025: SoCraTes Rennes 2025 - Rennes (France) 4 April 2025: Flutter Connection 2025 - Paris (France) 10–11 April 2025: Android Makers - Montrouge (France) 10–12 April 2025: Devoxx Greece - Athens (Greece) 16–18 April 2025: Devoxx France - Paris (France) 23–25 April 2025: MODERN ENDPOINT MANAGEMENT EMEA SUMMIT 2025 - Paris (France) 24 April 2025: IA Data Day 2025 - Strasbourg (France) 29–30 April 2025: MixIT - Lyon (France) 7–9 May 2025: Devoxx UK - London (UK) 15 May 2025: Cloud Toulouse - Toulouse (France) 16 May 2025: AFUP Day 2025 Lille - Lille (France) 16 May 2025: AFUP Day 2025 Lyon - Lyon (France) 16 May 2025: AFUP Day 2025 Poitiers - Poitiers (France) 24 May 2025: Polycloud - Montpellier (France) 5–6 June 2025: AlpesCraft - Grenoble (France) 5–6 June 2025: Devquest 2025 - Niort (France) 11–13 June 2025: Devoxx Poland - Krakow (Poland) 12–13 June 2025: Agile Tour Toulouse - Toulouse (France) 12–13 June 2025: DevLille - Lille (France) 17 June 2025: Mobilis In Mobile - Nantes (France) 24 June 2025: WAX 2025 - Aix-en-Provence (France) 25–27 June 2025: BreizhCamp 2025 - Rennes (France) 26–27 June 2025: Sunny Tech - Montpellier (France) 1–4 July 2025: Open edX Conference - 2025 - Palaiseau (France) 7–9 July 2025: Riviera DEV 2025 - Sophia Antipolis (France) 18–19 September 2025: API Platform Conference - Lille (France) & Online 2–3 October 2025: Volcamp - Clermont-Ferrand (France) 6–10 October 2025: Devoxx Belgium - Antwerp (Belgium) 9–10 October 2025: Forum PHP 2025 - Marne-la-Vallée (France) 16–17 October 2025: DevFest Nantes - Nantes (France) 4–7 November 2025: NewCrafts 2025 - Paris (France) 6 November 2025: dotAI 2025 - Paris (France) 7 November 2025: BDX I/O - Bordeaux (France) 12–14 November 2025: Devoxx Morocco - Marrakech (Morocco) 23–25 April 2026: Devoxx Greece - Athens (Greece) 17 June 2026: Devoxx Poland - Krakow (Poland) Contact us To react to this episode, come discuss in the Google group https://groups.google.com/group/lescastcodeurs Contact us via X/twitter https://twitter.com/lescastcodeurs or
Bluesky https://bsky.app/profile/lescastcodeurs.com Do a crowdcast or a crowdquestion Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs All the episodes and all the info at https://lescastcodeurs.com/

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Applications for the 2025 AI Engineer Summit are up, and you can save the date for AIE Singapore in April and AIE World's Fair 2025 in June.Happy new year, and thanks for 100 great episodes! Please let us know what you want to see/hear for the next 100!Full YouTube Episode with Slides/ChartsLike and subscribe and hit that bell to get notifs!Timestamps* 00:00 Welcome to the 100th Episode!* 00:19 Reflecting on the Journey* 00:47 AI Engineering: The Rise and Impact* 03:15 Latent Space Live and AI Conferences* 09:44 The Competitive AI Landscape* 21:45 Synthetic Data and Future Trends* 35:53 Creative Writing with AI* 36:12 Legal and Ethical Issues in AI* 38:18 The Data War: GPU Poor vs. GPU Rich* 39:12 The Rise of GPU Ultra Rich* 40:47 Emerging Trends in AI Models* 45:31 The Multi-Modality War* 01:05:31 The Future of AI Benchmarks* 01:13:17 Pionote and Frontier Models* 01:13:47 Niche Models and Base Models* 01:14:30 State Space Models and RWKV* 01:15:48 Inference Race and Price Wars* 01:22:16 Major AI Themes of the Year* 01:22:48 AI Rewind: January to March* 01:26:42 AI Rewind: April to June* 01:33:12 AI Rewind: July to September* 01:34:59 AI Rewind: October to December* 01:39:53 Year-End Reflections and PredictionsTranscript[00:00:00] Welcome to the 100th Episode![00:00:00] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx for the 100th time today.[00:00:12] swyx: Yay, um, and we're so glad that, yeah, you know, everyone has, uh, followed us in this journey. How do you feel about it? 100 episodes.[00:00:19] Alessio: Yeah, I know.[00:00:19] Reflecting on the Journey[00:00:19] Alessio: Almost two years that we've been doing this. We've had four different studios. Uh, we've had a lot of changes. You know, we used to do this lightning round. When we first started that we didn't like, and we tried to change the question. The answer[00:00:32] swyx: was cursor and perplexity.[00:00:34] Alessio: Yeah, I love Midjourney. It's like, do you really not like anything else?[00:00:38] Alessio: Like what's, what's the unique thing? And I think, yeah, we, we've also had a lot more research driven content. You know, we had like Tri Dao, we had, you know, Jeremy Howard, we had more folks like that.[00:00:47] AI Engineering: The Rise and Impact[00:00:47] Alessio: I think we want to do more of that too in the new year, like having, uh, some of the Gemini folks, both on the research and the applied side.[00:00:54] Alessio: Yeah, but it's been a ton of fun. I think we both started, I wouldn't say as a joke, we were kind of like, Oh, we [00:01:00] should do a podcast. And I think we kind of caught the right wave, obviously. And I think your rise of the AI engineer posts just kind of gave people somewhere to congregate, and then the AI engineer summit.[00:01:11] Alessio: And that's why when I look at our growth chart, it's kind of like a proxy for like the AI engineering industry as a whole, which is almost like, like, even if we don't do that much, we keep growing just because there's so many more AI engineers. So did you expect that growth or did you expect that would take longer for like the AI engineer thing to kind of like become, you know, everybody talks about it today.[00:01:32] swyx: So, the sign of that, that we have won is that Gartner puts it at the top of the hype curve right now. So Gartner has called the peak in AI engineering. I did not expect, um, to what level.
I knew that I was correct when I called it because I did like two months of work going into that. But I didn't know, you know, how quickly it could happen, and obviously there's a chance that I could be wrong.[00:01:52] swyx: But I think, like, most people have come around to that concept. Hacker News hates it, which is a good sign. But there's enough people that have defined it, you know, GitHub, when [00:02:00] they launched GitHub Models, which is the Hugging Face clone, they put AI engineers in the banner, like, above the fold, like, in big. So I think it's like kind of arrived as a meaningful and useful definition.[00:02:12] swyx: I think people are trying to figure out where the boundaries are. I think that was a lot of the quote unquote drama that happens behind the scenes at the World's Fair in June. Because I think there's a lot of doubt or questions about where ML engineering stops and AI engineering starts. That's a useful debate to be had.[00:02:29] swyx: In some sense, I actually anticipated that as well. So I intentionally did not put a firm definition there because most of the successful definitions are necessarily underspecified and it's actually useful to have different perspectives and you don't have to specify everything from the outset.[00:02:45] Alessio: Yeah, I was at um, AWS reInvent and the line to get into like the AI engineering talk, so to speak, which is, you know, applied AI and whatnot was like, there are like hundreds of people just in line to go in.[00:02:56] Alessio: I think that's kind of what enabled people, right? Which is what [00:03:00] you kind of talked about. It's like, Hey, look, you don't actually need a PhD, just, yeah, just use the model. And then maybe we'll talk about some of the blind spots that you get as an engineer with the earlier posts that we also had on the Substack.[00:03:11] Alessio: But yeah, it's been a heck of a heck of a two years.[00:03:14] swyx: Yeah.[00:03:15] Latent Space Live and AI Conferences[00:03:15] swyx: You know, I was, I was trying to view the conference as like, so NeurIPS is I think like 16, 17,000 people. And the Latent Space Live event that we held there was 950 signups, I think. The AI world, the ML world is still very much research heavy. And that's as it should be because ML is very much in a research phase.[00:03:34] swyx: But as we move this entire field into production, I think that ratio inverts into becoming more engineering heavy. So at least I think engineering should be on the same level, even if it's never as prestigious, like it'll always be low status because at the end of the day, you're manipulating APIs or whatever.[00:03:51] swyx: But yeah, wrapping GPTs, but there's going to be an increasing stack and an art to doing these, these things well. And I, you know, I [00:04:00] think that's what we're focusing on for the podcast, the conference and basically everything I do seems to make sense. And I think we'll, we'll talk about the trends here that apply.[00:04:09] swyx: It's, it's just very strange. So, like, there's a mix of, like, keeping on top of research while not being a researcher and then putting that research into production. So, like, people always ask me, like, why are you covering NeurIPS?
Like, this is a ML research conference and I'm like, well, yeah, I mean, we're not going to, to like, understand everything or reproduce every single paper, but the stuff that is being found here is going to make it through into production at some point, you hope.[00:04:32] swyx: And then actually like when I talk to the researchers, they actually get very excited because they're like, oh, you guys are actually caring about how this goes into production and that's what they really really want. The measure of success is previously just peer review, right? Getting 7s and 8s on their, um, academic review conferences and stuff like citations is one metric, but money is a better metric.[00:04:51] Alessio: Money is a better metric. Yeah, and there were about 2200 people on the live stream or something like that. Yeah, yeah. Twenty-two hundred on the live stream. So [00:05:00] I try my best to moderate, but it was a lot spicier in person with Jonathan and, and Dylan. Yeah, that it was in the chat on YouTube.[00:05:06] swyx: I would say that I actually also created[00:05:09] swyx: Latent Space Live in order to address flaws that are perceived in academic conferences. This is not NeurIPS specific, it's ICML, NeurIPS. Basically, it's very sort of oriented towards the PhD student, uh, market, job market, right? Like literally all, basically everyone's there to advertise their research and skills and get jobs.[00:05:28] swyx: And then obviously all the, the companies go there to hire them. And I think that's great for the individual researchers, but for people going there to get info is not great because you have to read between the lines, bring a ton of context in order to understand every single paper. So what is missing is effectively what I ended up doing, which is domain by domain, go through and recap the best of the year.[00:05:48] swyx: Survey the field. And there are, like NeurIPS had a, uh, I think ICML had a like a position paper track, NeurIPS added a benchmarks, uh, datasets track. These are ways in which to address that [00:06:00] issue. Uh, there's always workshops as well. Every, every conference has, you know, a last day of workshops and stuff that provide more of an overview.[00:06:06] swyx: But they're not specifically prompted to do so. And I think really, uh, organizing a conference is just about getting good speakers and giving them the correct prompts. And then they will just go and do that thing and they do a very good job of it. So I think Sarah did a fantastic job with the startups prompt.[00:06:21] swyx: I can't list everybody, but we did best of 2024 in startups, vision, open models, post transformers, synthetic data, small models, and agents. And then the last one was the, uh, and then we also did a quick one on reasoning with Nathan Lambert. And then the last one, obviously, was the debate that people were very hyped about.[00:06:39] swyx: It was very awkward. And I'm really, really thankful for Jonathan Frankle, basically, who stepped up to challenge Dylan. Because Dylan was like, yeah, I'll do it. But he was pro scaling. And I think everyone who is like in AI is pro scaling, right? So you need somebody who's ready to publicly say, no, we've hit a wall.[00:06:57] swyx: So that means you're saying Sam Altman's wrong. [00:07:00] You're saying, um, you know, everyone else is wrong. It helps that this was the day before Ilya went on, went up on stage and then said pre training has hit a wall. And data has hit a wall.
So actually Jonathan ended up winning, and then Ilya supported that statement, and then Noam Brown on the last day further supported that statement as well.[00:07:17] swyx: So it's kind of interesting that I think the consensus kind of going in was that we're not done scaling, like you should believe in the bitter lesson. And then, four straight days in a row, you had Sepp Hochreiter, who is the creator of the LSTM, along with everyone's favorite OG in AI, which is Juergen Schmidhuber.[00:07:34] swyx: He said that, um, we're pre-training into a wall, or like, we've run into a different kind of wall. And then we have, you know, Jonathan Frankle, Ilya, and then Noam Brown all saying variations of the same thing, that we have hit some kind of wall in the status quo of what pre-trained, scaling large pre-trained models has looked like, and we need a new thing.[00:07:54] swyx: And obviously the new thing for people is, some people are calling it inference time compute or test time [00:08:00] compute. I think the collective terminology has been inference time, and I think that makes sense because test time, calling it "test", has a very pre-trained bias, meaning that the only reason for running inference at all is to test your model.[00:08:11] swyx: That is not true. Right. Yeah. So, so, I quite agree that OpenAI seems to have adopted, or the community seems to have adopted this terminology of ITC instead of TTC. And that, that makes a lot of sense because like now we care about inference, even right down to compute optimality. Like I actually interviewed this author who recovered or reviewed the Chinchilla paper.[00:08:31] swyx: The Chinchilla paper is compute optimal training, but what is not stated in there is it's pre-trained compute optimal training. And once you start caring about inference, compute optimal training, you have a different scaling law. And in a way that we did not know last year.[00:08:45] Alessio: I wonder, because John is, he's also on the side of attention is all you need.[00:08:49] Alessio: Like he had the bet with Sasha. So I'm curious, like he doesn't believe in scaling, but he thinks the transformer, I wonder if he's still. So, so,[00:08:56] swyx: so he, obviously everything is nuanced and you know, I told him to play a character [00:09:00] for this debate, right? So he actually does. Yeah. He still, he still believes that we can scale more.[00:09:04] swyx: Uh, he just assumed the character to be very game for, for playing this debate. So even more kudos to him that he assumed a position that he didn't believe in and still won the debate.[00:09:16] Alessio: Get rekt, Dylan. Um, do you just want to quickly run through some of these things? Like, uh, Sarah's presentation, just the highlights.[00:09:24] swyx: Yeah, we can't go through everyone's slides, but I pulled out some things as a factor of, like, stuff that we were going to talk about. And we'll[00:09:30] Alessio: publish[00:09:31] swyx: the rest. Yeah, we'll publish on this feed the best of 2024 in those domains. And hopefully people can benefit from the work that our speakers have done.[00:09:39] swyx: But I think it's, uh, these are just good slides. And I've been, I've been looking for a sort of end of year recaps from, from people.[00:09:44] The Competitive AI Landscape[00:09:44] swyx: The field has progressed a lot. You know, I think the max ELO in 2023 on LMSys used to be 1200 for LMSys ELOs.
And now everyone is at least at, uh, 1275 in their ELOs, and this is across Gemini, ChatGPT, [00:10:00] Grok, 01.[00:10:01] swyx: AI, with their Yi-Large model, and Anthropic, of course. It's a very, very competitive race. There are multiple frontier labs all racing, but there is a clear tier zero frontier. And then there's like a tier one. It's like, I wish I had everything else. Tier zero is extremely competitive. It's effectively now a three-horse race between Gemini, uh, Anthropic and OpenAI.[00:10:21] swyx: I would say that people are still holding out a candle for xAI. xAI, I think, for some reason, because their API was very slow to roll out, is not included in these metrics. So it's actually quite hard to put on there. As someone who also does charts, xAI is continually snubbed because they don't work well with the benchmarking people.[00:10:42] swyx: Yeah, yeah, yeah. It's a little trivia for why xAI always gets ignored. The other thing is market share. So these are slides from Sarah. We have it up on the screen. It has gone from very heavily OpenAI. So we have some numbers and estimates. These are from Ramp. Estimates of OpenAI market share in [00:11:00] December 2023.[00:11:01] swyx: And this is basically, what is it, GPT being 95 percent of production traffic. And I think if you correlate that with stuff that we asked Harrison Chase on the LangChain episode, it was true. And then Claude 3 launched middle of this year. I think Claude 3 launched in March, Claude 3.5 Sonnet was in June-ish.[00:11:23] swyx: And you can start seeing the market share shift towards, uh, towards Anthropic, uh, very, very aggressively. The more recent one is Gemini. So if I scroll down a little bit, this is an even more recent dataset. So Ramp's dataset ends in September 2024. Gemini has basically launched a price war at the low end, uh, with Gemini Flash, uh, being basically free for personal use.[00:11:44] swyx: Like, I think people don't understand the free tier. It's something like a billion tokens per day. Unless you're trying to abuse it, you cannot really exhaust your free tier on Gemini. They're really trying to get you to use it. They know they're in like third place, um, fourth place, depending how you, how you count.[00:11:58] swyx: And so they're going after [00:12:00] the lower tier first, and then, you know, maybe the upper tier later, but yeah, Gemini Flash, according to OpenRouter, is now 50 percent of their OpenRouter requests. Obviously, these are the small requests. These are small, cheap requests that are mathematically going to be more.[00:12:15] swyx: The smart ones obviously are still going to OpenAI. But, you know, it's a very, very big shift in the market. Like basically 2022, 2023, going into 2024, OpenAI has gone from 95 percent market share to, yeah, reasonably somewhere between 50 to 75 percent market share.[00:12:29] Alessio: Yeah. I'm really curious how Ramp does the attribution to the model.[00:12:32] Alessio: If it's API, because I think it's all credit card spend. Well, but it's all, the credit card doesn't say. Maybe the, maybe when they do expenses, they upload the PDF, but yeah, the, the Gemini one I think makes sense. I think that was one of my main 2024 takeaways, that like the best small model companies are the large labs, which is not something I would have thought, that the open source kind of like long tail would be like the small model.[00:12:53] swyx: Yeah, different sizes of small models we're talking about here, right?
Like so the small model here for Gemini is 8B, [00:13:00] right? Uh, mini. We don't know what the small model size is, but yeah, it's probably in the double digits or maybe single digits, but probably double digits. The open source community has kind of focused on the one to three B size.[00:13:11] swyx: Mm-hmm. Yeah. Maybe[00:13:12] swyx: zero, maybe 0.5B, uh, that's Moondream, and if that is small for you, then, then that's great. It makes sense that we, we have a range for small now, which is like, may, maybe one to five B. Yeah. I'll even put that at, at, at the high end. And so this includes Gemma from Gemini as well. But it also includes the Apple Foundation models, which I think Apple Foundation is 3B.[00:13:32] Alessio: Yeah. No, that's great. I mean, I think at the start small just meant cheap. I think today small is actually a more nuanced discussion, you know, that people weren't really having before.[00:13:43] swyx: Yeah, we can keep going. This is a slide where I slightly disagree with Sarah. She's pointing to the Scale SEAL leaderboard. I think the researchers that I talked with at NeurIPS were kind of positive on this because basically you need private test [00:14:00] sets to prevent contamination.[00:14:02] swyx: And Scale is one of maybe three or four people this year that has really made an effort in doing a credible private test set leaderboard. Llama 405B does well compared to Gemini and GPT-4o. And I think that's good. I would say that, you know, it's good to have an open model that is that big, that does well on those metrics.[00:14:23] swyx: But anyone putting 405B in production will tell you, if you scroll down a little bit to the Artificial Analysis numbers, that it is very slow and very expensive to infer. Um, it doesn't even fit on, like, one node of, uh, of H100s. Cerebras will be happy to tell you they can serve 405B on their super large chips.[00:14:42] swyx: But, um, you know, if you need to do anything custom to it, you're still kind of constrained. So, is 405B really that relevant? Like, I think most people are basically saying that they only use 405B as a teacher model to distill down to something. Even Meta is doing it. So with Llama 3.3 [00:15:00] launched, they only launched the 70B because they used 405B to distill the 70B.[00:15:03] swyx: So I don't know if, like, open source is keeping up. I think the, the open source industrial complex is very invested in telling you that the gap is narrowing. I kind of disagree. I think that the gap is widening with O1. I think there are very, very smart people trying to narrow that gap and they should.[00:15:22] swyx: I really wish them success, but you cannot use a chart that is nearing 100 as your saturation chart. And look, the distance between open source and closed source is narrowing. Of course it's going to narrow because you're near 100. This is stupid. But in metrics that matter, is open source narrowing?[00:15:38] swyx: Probably not for O1 for a while. And it's really up to the open source guys to figure out if they can match O1 or not.[00:15:46] Alessio: I think inference time compute is bad for open source just because, you know, Zuck can donate the flops at training time, but he cannot donate the flops at inference time. So it's really hard to like actually keep up on that axis.[00:15:59] Alessio: Big, big business [00:16:00] model shift. So I don't know what that means for the GPU clouds. I don't know what that means for the hyperscalers, but obviously the big labs have a lot of advantage.
Because, like, it's not a static artifact that you're putting the compute in. You're kind of doing that still, but then you're putting a lot of computed inference too.[00:16:17] swyx: Yeah, yeah, yeah. Um, I mean, Llama4 will be reasoning oriented. We talked with Thomas Shalom. Um, kudos for getting that episode together. That was really nice. Good, well timed. Actually, I connected with the AI meta guy, uh, at NeurIPS, and, um, yeah, we're going to coordinate something for Llama4. Yeah, yeah,[00:16:32] Alessio: and our friend, yeah.[00:16:33] Alessio: Clara Shi just joined to lead the business agent side. So I'm sure we'll have her on in the new year.[00:16:39] swyx: Yeah. So, um, my comment on, on the business model shift, this is super interesting. Apparently it is wide knowledge that OpenAI wanted more than 6. 6 billion dollars for their fundraise. They wanted to raise, you know, higher, and they did not.[00:16:51] swyx: And what that means is basically like, it's very convenient that we're not getting GPT 5, which would have been a larger pre train. We should have a lot of upfront money. And [00:17:00] instead we're, we're converting fixed costs into variable costs, right. And passing it on effectively to the customer. And it's so much easier to take margin there because you can directly attribute it to like, Oh, you're using this more.[00:17:12] swyx: Therefore you, you pay more of the cost and I'll just slap a margin in there. So like that lets you control your growth margin and like tie your. Your spend, or your sort of inference spend, accordingly. And it's just really interesting to, that this change in the sort of inference paradigm has arrived exactly at the same time that the funding environment for pre training is effectively drying up, kind of.[00:17:36] swyx: I feel like maybe the VCs are very in tune with research anyway, so like, they would have noticed this, but, um, it's just interesting.[00:17:43] Alessio: Yeah, and I was looking back at our yearly recap of last year. Yeah. And the big thing was like the mixed trial price fights, you know, and I think now it's almost like there's nowhere to go, like, you know, Gemini Flash is like basically giving it away for free.[00:17:55] Alessio: So I think this is a good way for the labs to generate more revenue and pass down [00:18:00] some of the compute to the customer. I think they're going to[00:18:02] swyx: keep going. I think that 2, will come.[00:18:05] Alessio: Yeah, I know. Totally. I mean, next year, the first thing I'm doing is signing up for Devin. Signing up for the pro chat GBT.[00:18:12] Alessio: Just to try. I just want to see what does it look like to spend a thousand dollars a month on AI?[00:18:17] swyx: Yes. Yes. I think if your, if your, your job is a, at least AI content creator or VC or, you know, someone who, whose job it is to stay on, stay on top of things, you should already be spending like a thousand dollars a month on, on stuff.[00:18:28] swyx: And then obviously easy to spend, hard to use. You have to actually use. The good thing is that actually Google lets you do a lot of stuff for free now. So like deep research. That they just launched. Uses a ton of inference and it's, it's free while it's in preview.[00:18:45] Alessio: Yeah. They need to put that in Lindy.[00:18:47] Alessio: I've been using Lindy lately. I've been a built a bunch of things once we had flow because I liked the new thing. It's pretty good. I even did a phone call assistant. Um, yeah, they just launched Lindy voice. 
Yeah, I think once [00:19:00] they get advanced voice mode like capability. Today it's still like speech to text, you can kind of tell.[00:19:06] Alessio: Um, but it's good for like reservations and things like that. So I have a meeting prepper thing. And so[00:19:13] swyx: it's good. Okay. I feel like we've, we've covered a lot of stuff. Uh, I, yeah, I, you know, I think we will go over the individual, uh, talks in a separate episode. Uh, I don't want to take too much time with, uh, this stuff, but suffice to say that there is a lot of progress in each field.[00:19:28] swyx: Uh, we covered vision. Basically this is all like the audience voting for what they wanted. And then I just invited the best people I could find in each audience, especially agents. Um, Graham, who I talked to at ICML in Vienna, he is currently still number one. It's very hard to stay on top of SWE-Bench.[00:19:45] swyx: OpenHands is currently still number one on SWE-Bench Full, which is the hardest one. He had very good thoughts on agents, which I, which I'll highlight for people. Everyone is saying 2025 is the year of agents, just like they said last year. And, uh, but he had [00:20:00] thoughts on like eight parts of what are the frontier problems to solve in agents.[00:20:03] swyx: And so I'll highlight that talk as well.[00:20:05] Alessio: Yeah. The number six, which is having agents learn more about the environment, has been super interesting to us as well, just to think through, because, yeah, how do you put an agent in an enterprise where most things in an enterprise have never been public, you know, a lot of the tooling, like the code bases and things like that.[00:20:23] Alessio: So, yeah, there's no indexing and RAG. Well, yeah, but it's more like, you can't really RAG things that are not documented. But people know them based on how they've been doing it. You know, so I think there's almost this like, you know, Oh, institutional knowledge. Yeah, the boring word for it is kind of like business process extraction.[00:20:38] Alessio: Yeah yeah, I see. It's like, how do you actually understand how these things are done? I see. Um, and I think today the, the problem is that, yeah, the agents that most people are building are good at following instructions, but are not as good at, like, extracting them from you. Um, so I think that will be a big unlock. Just to touch quickly on the Jeff Dean thing,[00:20:55] Alessio: I thought it was pretty, I mean, we'll link it in the, in the things, but I think the main [00:21:00] focus was like, how do you use ML to optimize the systems instead of just focusing on ML to do something else? Yeah, I think speculative decoding, we had, you know, Eugene from RWKV on the podcast before, like he's doing a lot of that with Featherless AI.[00:21:12] swyx: Everyone is. I would say it's the norm. I'm a little bit uncomfortable with how much it costs, because it does use more of the GPU per call. But because everyone is so keen on fast inference, then yeah, makes sense.[00:21:24] Alessio: Exactly. Um, yeah, but we'll link that. Obviously Jeff is great.[00:21:30] swyx: Jeff is, Jeff's talk was more, it wasn't focused on Gemini.[00:21:33] swyx: I think people got the wrong impression from my tweet. It's more about how Google approaches ML and uses ML to design systems, and then systems feed back into ML.
And I think this ties in with Lubna's talk.[00:21:45] Synthetic Data and Future Trends[00:21:45] swyx: on synthetic data where it's basically the story of bootstrapping of humans and AI in AI research or AI in production.[00:21:53] swyx: So her talk was on synthetic data, where like how much synthetic data has grown in 2024 in the pre training side, the post training side, [00:22:00] and the eval side. And I think Jeff then also extended it basically to chips, uh, to chip design. So he'd spend a lot of time talking about alpha chip. And most of us in the audience are like, we're not working on hardware, man.[00:22:11] swyx: Like you guys are great. TPU is great. Okay. We'll buy TPUs.[00:22:14] Alessio: And then there was the earlier talk. Yeah. But, and then we have, uh, I don't know if we're calling them essays. What are we calling these? But[00:22:23] swyx: for me, it's just like bonus for late in space supporters, because I feel like they haven't been getting anything.[00:22:29] swyx: And then I wanted a more high frequency way to write stuff. Like that one I wrote in an afternoon. I think basically we now have an answer to what Ilya saw. It's one year since. The blip. And we know what he saw in 2014. We know what he saw in 2024. We think we know what he sees in 2024. He gave some hints and then we have vague indications of what he saw in 2023.[00:22:54] swyx: So that was the Oh, and then 2016 as well, because of this lawsuit with Elon, OpenAI [00:23:00] is publishing emails from Sam's, like, his personal text messages to Siobhan, Zelis, or whatever. So, like, we have emails from Ilya saying, this is what we're seeing in OpenAI, and this is why we need to scale up GPUs. And I think it's very prescient in 2016 to write that.[00:23:16] swyx: And so, like, it is exactly, like, basically his insights. It's him and Greg, basically just kind of driving the scaling up of OpenAI, while they're still playing Dota. They're like, no, like, we see the path here.[00:23:30] Alessio: Yeah, and it's funny, yeah, they even mention, you know, we can only train on 1v1 Dota. We need to train on 5v5, and that takes too many GPUs.[00:23:37] Alessio: Yeah,[00:23:37] swyx: and at least for me, I can speak for myself, like, I didn't see the path from Dota to where we are today. I think even, maybe if you ask them, like, they wouldn't necessarily draw a straight line. Yeah,[00:23:47] Alessio: no, definitely. But I think like that was like the whole idea of almost like the RL and we talked about this with Nathan on his podcast.[00:23:55] Alessio: It's like with RL, you can get very good at specific things, but then you can't really like generalize as much. And I [00:24:00] think the language models are like the opposite, which is like, you're going to throw all this data at them and scale them up, but then you really need to drive them home on a specific task later on.[00:24:08] Alessio: And we'll talk about the open AI reinforcement, fine tuning, um, announcement too, and all of that. But yeah, I think like scale is all you need. That's kind of what Elia will be remembered for. And I think just maybe to clarify on like the pre training is over thing that people love to tweet. I think the point of the talk was like everybody, we're scaling these chips, we're scaling the compute, but like the second ingredient which is data is not scaling at the same rate.[00:24:35] Alessio: So it's not necessarily pre training is over. It's kind of like What got us here won't get us there. 
In his email, he predicted like 10x growth every two years or something like that. And I think maybe now it's like, you know, you can 10x the chips again, but[00:24:49] swyx: I think it's 10x per year. Was it? I don't know.[00:24:52] Alessio: Exactly. And Moore's law is like 2x. So it's like, you know, much faster than that. And yeah, I like the fossil fuel of AI [00:25:00] analogy. It's kind of like, you know, the little background tokens thing. So the OpenAI reinforcement fine tuning is basically like, instead of fine tuning on data, you fine tune on a reward model.[00:25:09] Alessio: So it's basically like, instead of being data driven, it's like task driven. And I think people have tasks to do, they don't really have a lot of data. So I'm curious to see how that changes, how many people fine tune, because I think this is what people run into. It's like, Oh, you can fine tune llama. And it's like, okay, where do I get the data?[00:25:27] Alessio: To fine tune it on, you know, so it's great that we're moving the thing. And then I really like he had this chart where like, you know, the brain mass and the body mass thing is basically like mammals that scaled linearly by brain and body size, and then humans kind of like broke off the slope. So it's almost like maybe the mammal slope is like the pre training slope.[00:25:46] Alessio: And then the post training slope is like the, the human one.[00:25:49] swyx: Yeah. I wonder what the. I mean, we'll know in 10 years, but I wonder what the y axis is for, for Ilya's SSI. We'll try to get them on.[00:25:57] Alessio: Ilya, if you're listening, you're [00:26:00] welcome here. Yeah, and then he had, you know, what comes next, like agent, synthetic data, inference, compute, I thought all of that was like that.[00:26:05] Alessio: I don't[00:26:05] swyx: think he was dropping any alpha there. Yeah, yeah, yeah.[00:26:07] Alessio: Yeah. Any other new reps? Highlights?[00:26:10] swyx: I think that there was comparatively a lot more work. Oh, by the way, I need to plug that, uh, my friend Yi made this, like, little nice paper. Yeah, that was really[00:26:20] swyx: nice.[00:26:20] swyx: Uh, of, uh, of, like, all the, he's, she called it must read papers of 2024.[00:26:26] swyx: So I laid out some of these at NeurIPS, and it was just gone. Like, everyone just picked it up. Because people are dying for, like, little guidance and visualizations And so, uh, I thought it was really super nice that we got there.[00:26:38] Alessio: Should we do a late in space book for each year? Uh, I thought about it. For each year we should.[00:26:42] Alessio: Coffee table book. Yeah. Yeah. Okay. Put it in the will. Hi, Will. By the way, we haven't introduced you. He's our new, you know, general organist, Jamie. You need to[00:26:52] swyx: pull up more things. One thing I saw that, uh, Okay, one fun one, and then one [00:27:00] more general one. So the fun one is this paper on agent collusion. This is a paper on steganography.[00:27:06] swyx: This is secret collusion among AI agents, multi agent deception via steganography. I tried to go to NeurIPS in order to find these kinds of papers because the real reason Like NeurIPS this year has a lottery system. A lot of people actually even go and don't buy tickets because they just go and attend the side events.[00:27:22] swyx: And then also the people who go and end up crowding around the most popular papers, which you already know and already read them before you showed up to NeurIPS. 
So the only reason you go there is to talk to the paper authors, but there's like something like 10, 000 other. All these papers out there that, you know, are just people's work that they, that they did on the air and they failed to get attention for one reason or another.[00:27:42] swyx: And this was one of them. Uh, it was like all the way at the back. And this is a deep mind paper that actually focuses on collusion between AI agents, uh, by hiding messages in the text that they generate. Uh, so that's what steganography is. So a very simple example would be the first letter of every word.[00:27:57] swyx: If you Pick that out, you know, and the code sends a [00:28:00] different message than that. But something I've always emphasized is to LLMs, we read left to right. LLMs can read up, down, sideways, you know, in random character order. And it's the same to them as it is to us. So if we were ever to get You know, self motivated, underlined LLMs that we're trying to collaborate to take over the planet.[00:28:19] swyx: This would be how they do it. They spread messages among us in the messages that we generate. And he developed a scaling law for that. So he marked, I'm showing it on screen right now, the emergence of this phenomenon. Basically, for example, for Cypher encoding, GPT 2, Lama 2, mixed trial, GPT 3. 5, zero capabilities, and sudden 4.[00:28:40] swyx: And this is the kind of Jason Wei type emergence properties that people kind of look for. I think what made this paper stand out as well, so he developed the benchmark for steganography collusion, and he also focused on shelling point collusion, which is very low coordination. For agreeing on a decoding encoding format, you kind of need to have some [00:29:00] agreement on that.[00:29:00] swyx: But, but shelling point means like very, very low or almost no coordination. So for example, if I, if I ask someone, if the only message I give you is meet me in New York and you're not aware. Or when you would probably meet me at Grand Central Station. That is the Grand Central Station is a shelling point.[00:29:16] swyx: And it's probably somewhere, somewhere during the day. That is the shelling point of New York is Grand Central. To that extent, shelling points for steganography are things like the, the, the common decoding methods that we talked about. It will be interesting at some point in the future when we are worried about alignment.[00:29:30] swyx: It is not interesting today, but it's interesting that DeepMind is already thinking about this.[00:29:36] Alessio: I think that's like one of the hardest things about NeurIPS. It's like the long tail. I[00:29:41] swyx: found a pricing guy. I'm going to feature him on the podcast. Basically, this guy from NVIDIA worked out the optimal pricing for language models.[00:29:51] swyx: It's basically an econometrics paper at NeurIPS, where everyone else is talking about GPUs. And the guy with the GPUs is[00:29:57] Alessio: talking[00:29:57] swyx: about economics instead. [00:30:00] That was the sort of fun one. So the focus I saw is that model papers at NeurIPS are kind of dead. No one really presents models anymore. It's just data sets.[00:30:12] swyx: This is all the grad students are working on. So like there was a data sets track and then I was looking around like, I was like, you don't need a data sets track because every paper is a data sets paper. And so data sets and benchmarks, they're kind of flip sides of the same thing. So Yeah. Cool. 
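To make the steganography example above concrete, here is a minimal Python sketch of the first-letter acrostic idea. It is purely illustrative and is not the encoding scheme or benchmark from the DeepMind paper; the word bank and function names are invented for this sketch. The point is that the decoding rule is a pre-agreed convention, a Schelling point, rather than something the agents have to negotiate explicitly.

```python
# Minimal sketch of "first letter of every word" steganography (illustrative only).
# A real covert channel would pick natural-sounding words from a model's own
# output distribution; here a toy word bank keyed by first letter stands in.

WORDBANK = {c: w for c, w in zip(
    "abcdefghijklmnopqrstuvwxyz",
    ["apple", "bridge", "cloud", "delta", "echo", "forest", "garden",
     "harbor", "island", "jungle", "kernel", "lantern", "meadow",
     "north", "ocean", "prairie", "quarry", "river", "summit",
     "tunnel", "umbra", "valley", "willow", "xenon", "yonder", "zephyr"])}

def encode(secret: str) -> str:
    """Emit cover text whose word initials spell out the secret."""
    return " ".join(WORDBANK[c] for c in secret.lower() if c in WORDBANK)

def decode(cover: str) -> str:
    """Shared Schelling-point rule: read the first letter of each word."""
    return "".join(word[0] for word in cover.split())

if __name__ == "__main__":
    cover = encode("meetatgrandcentral")
    assert decode(cover) == "meetatgrandcentral"
    print(cover)
```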
Yeah, if you're a grad student, you're a GPU boy, you kind of work on that.[00:30:30] swyx: And then the, the sort of big model that people walk around and pick the ones that they like, and then they use it in their models. And that's, that's kind of how it develops. I, I feel like, um, like, like you didn't last year, you had people like Hao Tian who worked on Lava, which is take Lama and add Vision.[00:30:47] swyx: And then obviously actually I hired him and he added Vision to Grok. Now he's the Vision Grok guy. This year, I don't think there was any of those.[00:30:55] Alessio: What were the most popular, like, orals? Last year it was like the [00:31:00] Mixed Monarch, I think, was like the most attended. Yeah, uh, I need to look it up. Yeah, I mean, if nothing comes to mind, that's also kind of like an answer in a way.[00:31:10] Alessio: But I think last year there was a lot of interest in, like, furthering models and, like, different architectures and all of that.[00:31:16] swyx: I will say that I felt the orals, oral picks this year were not very good. Either that or maybe it's just a So that's the highlight of how I have changed in terms of how I view papers.[00:31:29] swyx: So like, in my estimation, two of the best papers in this year for datasets or data comp and refined web or fine web. These are two actually industrially used papers, not highlighted for a while. I think DCLM got the spotlight, FineWeb didn't even get the spotlight. So like, it's just that the picks were different.[00:31:48] swyx: But one thing that does get a lot of play that a lot of people are debating is the role that's scheduled. This is the schedule free optimizer paper from Meta from Aaron DeFazio. And this [00:32:00] year in the ML community, there's been a lot of chat about shampoo, soap, all the bathroom amenities for optimizing your learning rates.[00:32:08] swyx: And, uh, most people at the big labs are. Who I asked about this, um, say that it's cute, but it's not something that matters. I don't know, but it's something that was discussed and very, very popular. 4Wars[00:32:19] Alessio: of AI recap maybe, just quickly. Um, where do you want to start? Data?[00:32:26] swyx: So to remind people, this is the 4Wars piece that we did as one of our earlier recaps of this year.[00:32:31] swyx: And the belligerents are on the left, journalists, writers, artists, anyone who owns IP basically, New York Times, Stack Overflow, Reddit, Getty, Sarah Silverman, George RR Martin. Yeah, and I think this year we can add Scarlett Johansson to that side of the fence. So anyone suing, open the eye, basically. I actually wanted to get a snapshot of all the lawsuits.[00:32:52] swyx: I'm sure some lawyer can do it. That's the data quality war. On the right hand side, we have the synthetic data people, and I think we talked about Lumna's talk, you know, [00:33:00] really showing how much synthetic data has come along this year. I think there was a bit of a fight between scale. ai and the synthetic data community, because scale.[00:33:09] swyx: ai published a paper saying that synthetic data doesn't work. Surprise, surprise, scale. ai is the leading vendor of non synthetic data. 
Only[00:33:17] Alessio: cage free annotated data is useful.[00:33:21] swyx: So I think there's some debate going on there, but I don't think it's much debate anymore that at least synthetic data, for the reasons that are blessed in Luna's talk, Makes sense.[00:33:32] swyx: I don't know if you have any perspectives there.[00:33:34] Alessio: I think, again, going back to the reinforcement fine tuning, I think that will change a little bit how people think about it. I think today people mostly use synthetic data, yeah, for distillation and kind of like fine tuning a smaller model from like a larger model.[00:33:46] Alessio: I'm not super aware of how the frontier labs use it outside of like the rephrase, the web thing that Apple also did. But yeah, I think it'll be. Useful. I think like whether or not that gets us the big [00:34:00] next step, I think that's maybe like TBD, you know, I think people love talking about data because it's like a GPU poor, you know, I think, uh, synthetic data is like something that people can do, you know, so they feel more opinionated about it compared to, yeah, the optimizers stuff, which is like,[00:34:17] swyx: they don't[00:34:17] Alessio: really work[00:34:18] swyx: on.[00:34:18] swyx: I think that there is an angle to the reasoning synthetic data. So this year, we covered in the paper club, the star series of papers. So that's star, Q star, V star. It basically helps you to synthesize reasoning steps, or at least distill reasoning steps from a verifier. And if you look at the OpenAI RFT, API that they released, or that they announced, basically they're asking you to submit graders, or they choose from a preset list of graders.[00:34:49] swyx: Basically It feels like a way to create valid synthetic data for them to fine tune their reasoning paths on. Um, so I think that is another angle where it starts to make sense. And [00:35:00] so like, it's very funny that basically all the data quality wars between Let's say the music industry or like the newspaper publishing industry or the textbooks industry on the big labs.[00:35:11] swyx: It's all of the pre training era. And then like the new era, like the reasoning era, like nobody has any problem with all the reasoning, especially because it's all like sort of math and science oriented with, with very reasonable graders. I think the more interesting next step is how does it generalize beyond STEM?[00:35:27] swyx: We've been using O1 for And I would say like for summarization and creative writing and instruction following, I think it's underrated. I started using O1 in our intro songs before we killed the intro songs, but it's very good at writing lyrics. You know, I can actually say like, I think one of the O1 pro demos.[00:35:46] swyx: All of these things that Noam was showing was that, you know, you can write an entire paragraph or three paragraphs without using the letter A, right?[00:35:53] Creative Writing with AI[00:35:53] swyx: So like, like literally just anything instead of token, like not even token level, character level manipulation and [00:36:00] counting and instruction following. It's, uh, it's very, very strong.[00:36:02] swyx: And so no surprises when I ask it to rhyme, uh, and to, to create song lyrics, it's going to do that very much better than in previous models. 
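To make the verifier-driven synthetic data loop described a few exchanges above a bit more concrete, here is a hedged Python sketch of STaR-style rejection sampling: generate candidate reasoning traces, keep only the ones a grader accepts, and use the survivors as fine-tuning data. The model call is stubbed with a toy arithmetic generator so the sketch runs on its own; the actual STaR papers and OpenAI's reinforcement fine-tuning product differ in many details.

```python
import random

# Toy "model": proposes a reasoning trace and a final answer for a*b.
# In practice this would be an LLM call; here it is stubbed so the sketch runs.
def propose(question: tuple[int, int]) -> dict:
    a, b = question
    answer = a * b + random.choice([0, 0, 0, 1, -1])  # occasionally wrong on purpose
    return {"trace": f"{a} times {b}: compute {a}*{b} = {answer}.", "answer": answer}

# Grader / verifier: the piece a reinforcement-fine-tuning-style API asks you to
# supply. Here it is simply exact match against the known result.
def grade(question: tuple[int, int], answer: int) -> bool:
    a, b = question
    return answer == a * b

def build_synthetic_set(questions, samples_per_q: int = 4) -> list[dict]:
    """STaR-style rejection sampling: keep only traces the grader accepts."""
    kept = []
    for q in questions:
        for _ in range(samples_per_q):
            cand = propose(q)
            if grade(q, cand["answer"]):
                kept.append({"question": q, "trace": cand["trace"]})
                break  # one verified trace per question is enough for the sketch
    return kept

if __name__ == "__main__":
    data = build_synthetic_set([(i, i + 3) for i in range(2, 10)])
    print(f"kept {len(data)} verified reasoning traces for fine-tuning")
```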
So I think it's underrated for creative writing.[00:36:11] Alessio: Yeah.[00:36:12] Legal and Ethical Issues in AI[00:36:12] Alessio: What do you think is the rationale that they're going to have in court when they don't show you the thinking traces of O1, but then they want us to, like, they're getting sued for using other publishers data, you know, but then on their end, they're like, well, you shouldn't be using my data to then train your model.[00:36:29] Alessio: So I'm curious to see how that kind of comes. Yeah, I mean, OPA has[00:36:32] swyx: many ways to publish, to punish people without bringing, taking them to court. Already banned ByteDance for distilling their, their info. And so anyone caught distilling the chain of thought will be just disallowed to continue on, on, on the API.[00:36:44] swyx: And it's fine. It's no big deal. Like, I don't even think that's an issue at all, just because the chain of thoughts are pretty well hidden. Like you have to work very, very hard to, to get it to leak. And then even when it leaks the chain of thought, you don't know if it's, if it's [00:37:00] The bigger concern is actually that there's not that much IP hiding behind it, that Cosign, which we talked about, we talked to him on Dev Day, can just fine tune 4.[00:37:13] swyx: 0 to beat 0. 1 Cloud SONET so far is beating O1 on coding tasks without, at least O1 preview, without being a reasoning model, same for Gemini Pro or Gemini 2. 0. So like, how much is reasoning important? How much of a moat is there in this, like, All of these are proprietary sort of training data that they've presumably accomplished.[00:37:34] swyx: Because even DeepSeek was able to do it. And they had, you know, two months notice to do this, to do R1. So, it's actually unclear how much moat there is. Obviously, you know, if you talk to the Strawberry team, they'll be like, yeah, I mean, we spent the last two years doing this. So, we don't know. And it's going to be Interesting because there'll be a lot of noise from people who say they have inference time compute and actually don't because they just have fancy chain of thought.[00:38:00][00:38:00] swyx: And then there's other people who actually do have very good chain of thought. And you will not see them on the same level as OpenAI because OpenAI has invested a lot in building up the mythology of their team. Um, which makes sense. Like the real answer is somewhere in between.[00:38:13] Alessio: Yeah, I think that's kind of like the main data war story developing.[00:38:18] The Data War: GPU Poor vs. GPU Rich[00:38:18] Alessio: GPU poor versus GPU rich. Yeah. Where do you think we are? I think there was, again, going back to like the small model thing, there was like a time in which the GPU poor were kind of like the rebel faction working on like these models that were like open and small and cheap. And I think today people don't really care as much about GPUs anymore.[00:38:37] Alessio: You also see it in the price of the GPUs. Like, you know, that market is kind of like plummeted because there's people don't want to be, they want to be GPU free. They don't even want to be poor. They just want to be, you know, completely without them. Yeah. How do you think about this war? You[00:38:52] swyx: can tell me about this, but like, I feel like the, the appetite for GPU rich startups, like the, you know, the, the funding plan is we will raise 60 million and [00:39:00] we'll give 50 of that to NVIDIA.[00:39:01] swyx: That is gone, right? 
Like, no one's, no one's pitching that. This was literally the plan, the exact plan of, like, I can name like four or five startups, you know, this time last year. So yeah, GPU rich startups gone.[00:39:12] The Rise of GPU Ultra Rich[00:39:12] swyx: But I think like, the GPU ultra rich, the GPU ultra high net worth is still going. So, um, now we're, you know, we had Leopold's essay on the trillion dollar cluster.[00:39:23] swyx: We're not quite there yet. We have multiple labs, um, you know, xAI very famously, you know, Jensen Huang praising them for being best boy number one in spinning up a 100,000 GPU cluster in like 12 days or something. So likewise at Meta, likewise at OpenAI, likewise at the other labs as well. So like the GPU ultra rich are going to keep doing that because I think partially it's an article of faith now that you just need it.[00:39:46] swyx: Like you don't even know what it's going to, what you're going to use it for. You just, you just need it. And it makes sense, especially if we're going into more researchy territory than we are. So let's say 2020 to 2023 was [00:40:00] let's-scale-big-models territory, because we had GPT-3 in 2020 and we were like, okay, we'll go from 175B to 1.8T. And that was GPT-3 to GPT-4. Okay, that's done. As far as everyone is concerned, Opus 3.5 is not coming out, GPT-4.5 is not coming out, and Gemini 2, we don't have Pro, whatever. We've hit that wall. Maybe I'll call it the 2 trillion parameter wall. We're not going to 10 trillion. No one thinks it's a good idea, at least from training costs, from the amount of data, or at least the inference.[00:40:36] swyx: Would you pay 10x the price of GPT? Probably not. Like, like you want something else that, that is at least more useful. So it makes sense that people are pivoting in terms of their inference paradigm.[00:40:47] Emerging Trends in AI Models[00:40:47] swyx: And so when it's more researchy, then you actually need more just general purpose compute to mess around with, uh, at the exact same time that production deployments of the old, the previous paradigm are still ramping up,[00:40:58] swyx: um,[00:40:58] swyx: uh, pretty aggressively.[00:40:59] swyx: So [00:41:00] it makes sense that the GPU rich are growing. We have now interviewed both Together and Fireworks and Replicate. Uh, we haven't done any scale yet. But I think Amazon, maybe kind of a sleeper one, Amazon, in a sense of like they, at reInvent, I wasn't expecting them to do so well, but they are now a foundation model lab.[00:41:18] swyx: It's kind of interesting. Um, I think, uh, you know, David went over there and started just creating models.[00:41:25] Alessio: Yeah, I mean, that's the power of prepaid contracts. I think like a lot of AWS customers, you know, they do these big reserved instance contracts and now they've got to use their money. That's why so many startups[00:41:37] Alessio: get bought through the AWS marketplace, so they can kind of bundle them together and get preferred pricing.[00:41:42] swyx: Okay, so maybe GPU super rich doing very well, GPU middle class dead, and then GPU[00:41:48] Alessio: poor. I mean, my thing is like, everybody should just be GPU rich. There shouldn't really be, even the GPU poorest, it's like, does it really make sense to be GPU poor?[00:41:57] Alessio: Like, if you're GPU poor, you should just use the [00:42:00] cloud.
Yes, you know, and I think there might be a future, once we kind of like figure out what the size and shape of these models is, where like the tinybox and these things come to fruition, where like you can be GPU poor at home. But I think today it's like, why are you working so hard to like get these models to run on like very small clusters when it's like, it's so cheap to run them?[00:42:21] Alessio: Yeah, yeah,[00:42:22] swyx: yeah. I think mostly people think it's cool. People think it's a stepping stone to scaling up. So they aspire to be GPU rich one day and they're working on new methods. Like Nous Research, like probably the most deep tech thing they've done this year is DisTrO or whatever the new name is.[00:42:38] swyx: There's a lot of interest in heterogeneous computing, distributed computing. I tend generally to de-emphasize that historically, but it may be coming to a time where it is starting to be relevant. I don't know. You know, SF Compute launched their compute marketplace this year, and like, who's really using that?[00:42:53] swyx: Like, it's a bunch of small clusters, disparate types of compute, and if you can make that [00:43:00] useful, then that will be very beneficial to the broader community, but maybe still not the source of frontier models. It's just going to be a second tier of compute that is unlocked for people, and that's fine. But yeah, I mean, I think this year, I would say a lot more on device. We are, I now have Apple Intelligence on my phone.[00:43:19] swyx: Doesn't do anything apart from summarize my notifications. But still, not bad. Like, it's multi modal.[00:43:25] Alessio: Yeah, the notification summaries are so-so in my experience.[00:43:29] swyx: Yeah, but they add, they add juice to life. And then, um, Chrome Nano, uh, Gemini Nano is coming out in Chrome. Uh, they're still feature flagged, but you can, you can try it now if you, if you use the, uh, the alpha.[00:43:40] swyx: And so, like, I, I think, like, you know, we're getting the sort of GPU poor version of a lot of these things coming out, and I think it's like quite useful. Like Windows as well, rolling out RWKV in sort of every Windows deployment is super cool. And I think the last thing that I never put in this GPU poor war, that I think I should now, [00:44:00] is the number of startups that are GPU poor but still scaling very well, as sort of wrappers on top of either a foundation model lab, or a GPU cloud.[00:44:10] swyx: GPU cloud, it would be Suno. Suno, Ramp has rated it as one of the top ranked, fastest growing startups of the year. Um, I think the last public number is like zero to 20 million this year in ARR, and Suno runs on Modal. So Suno itself is not GPU rich, but they're just doing the training on, on Modal, uh, who we've also talked to on, on the podcast.[00:44:31] swyx: The other one would be Bolt, straight Claude wrapper. And, and, um, again, another, now they've announced 20 million ARR, which is another step up from the 8 million that we put in the title. So yeah, I mean, it's crazy that all these GPU poors are finding a way while the GPU riches are also finding a way. And then the only failures, I kind of call this the GPU smiling curve, where the edges do well, because you're either close to the machines, and you're like [00:45:00] number one on the machines, or you're like close to the customers, and you're number one on the customer side.[00:45:03] swyx: And the people who are in the middle, Inflection, um, Character, didn't do that great.
I think Character did the best of all of them. Like, you have a note in here that we apparently said that Character's price tag was[00:45:15] Alessio: 1B.[00:45:15] swyx: Did I say that?[00:45:16] Alessio: Yeah. You said Google should just buy them for 1B. I thought it was a crazy number. Then they paid 2.7 billion. I mean, for like,[00:45:22] swyx: yeah.[00:45:22] Alessio: What do you pay for Noam? Like, I don't know what the going rate was like. Maybe the starting price was 1B. I mean, whatever it was, it worked out for everybody involved.[00:45:31] The Multi-Modality War[00:45:31] Alessio: Multimodality war. And this one, we never had text to video in the first version, which now is the hottest.[00:45:37] swyx: Yeah, I would say it's a subset of image, but yes.[00:45:40] Alessio: Yeah, well, but I think at the time it wasn't really something people were doing, and now we have Veo 2, which just came out yesterday. Uh, Sora was released last month, last week. I've not tried Sora, because the day that I tried, it wasn't, yeah. I[00:45:54] swyx: think it's generally available now, you can go to Sora.com and try it. Yeah, they had[00:45:58] Alessio: the outage. Which I [00:46:00] think also played a part into it. Small things. Yeah. What's the other model that you posted today that was on Replicate? Video-01-Live?[00:46:08] swyx: Yeah. Very, very nondescript name, but it is from MiniMax, which I think is a Chinese lab. The Chinese labs do surprisingly well at the video models.[00:46:20] swyx: I'm not sure it's actually Chinese. I don't know. Hold me up to that. Yep. China. It's good. Yeah, the Chinese love video. What can I say? They have a lot of training data for video. Or a more relaxed regulatory environment.[00:46:37] Alessio: Uh, well, sure, in some way. Yeah, I don't think there's much else there. I think like, you know, on the image side, I think it's still open.[00:46:45] Alessio: Yeah, I mean,[00:46:46] swyx: ElevenLabs is now a unicorn. So basically, what is the multi modality war? The multi modality war is, do you specialize in a single modality, right? Or do you have a God Model that does all the modalities? So this is [00:47:00] definitely still going, in the sense of ElevenLabs, you know, now a unicorn, Pika Labs doing well, they launched Pika 2.0 recently, HeyGen, I think, has reached 100 million ARR, Assembly, I don't know, but they have billboards all over the place, so I assume they're doing very, very well. So these are all specialist models, specialist models and specialist startups. And then there's the big labs who are doing the sort of all in one play.[00:47:24] swyx: And then here I would highlight Gemini 2 for having native image output. Have you seen the demos? Um, yeah, it's, it's hard to keep up. Literally they launched this last week and a shout out to Paige Bailey, who came to the Latent Space event to demo on the day of launch. And she wasn't prepared. She was just like, I'm just going to show you.[00:47:43] swyx: So they have voice. They have, you know, obviously image input, and then they obviously can code gen and all that. But the new one that OpenAI and Meta both have but they haven't launched yet is image output. So you can literally, um, I think their demo video was that you put in an image of a [00:48:00] car, and you ask for minor modifications to that car.
So there's no need for the stable diffusion or comfy UI workflow of like mask here and then like infill there in paint there and all that, all that stuff. This is small model nonsense. Big model people are like, huh, we got you in as everything in the transformer.[00:48:21] swyx: This is the multimodality war, which is, do you, do you bet on the God model or do you string together a whole bunch of, uh, Small models like a, like a chump. Yeah,[00:48:29] Alessio: I don't know, man. Yeah, that would be interesting. I mean, obviously I use Midjourney for all of our thumbnails. Um, they've been doing a ton on the product, I would say.[00:48:38] Alessio: They launched a new Midjourney editor thing. They've been doing a ton. Because I think, yeah, the motto is kind of like, Maybe, you know, people say black forest, the black forest models are better than mid journey on a pixel by pixel basis. But I think when you put it, put it together, have you tried[00:48:53] swyx: the same problems on black forest?[00:48:55] Alessio: Yes. But the problem is just like, you know, on black forest, it generates one image. And then it's like, you got to [00:49:00] regenerate. You don't have all these like UI things. Like what I do, no, but it's like time issue, you know, it's like a mid[00:49:06] swyx: journey. Call the API four times.[00:49:08] Alessio: No, but then there's no like variate.[00:49:10] Alessio: Like the good thing about mid journey is like, you just go in there and you're cooking. There's a lot of stuff that just makes it really easy. And I think people underestimate that. Like, it's not really a skill issue, because I'm paying mid journey, so it's a Black Forest skill issue, because I'm not paying them, you know?[00:49:24] Alessio: Yeah,[00:49:25] swyx: so, okay, so, uh, this is a UX thing, right? Like, you, you, you understand that, at least, we think that Black Forest should be able to do all that stuff. I will also shout out, ReCraft has come out, uh, on top of the image arena that, uh, artificial analysis has done, has apparently, uh, Flux's place. Is this still true?[00:49:41] swyx: So, Artificial Analysis is now a company. I highlighted them I think in one of the early AI Newses of the year. And they have launched a whole bunch of arenas. So, they're trying to take on LM Arena, Anastasios and crew. And they have an image arena. Oh yeah, Recraft v3 is now beating Flux 1. 1. Which is very surprising [00:50:00] because Flux And Black Forest Labs are the old stable diffusion crew who left stability after, um, the management issues.[00:50:06] swyx: So Recurve has come from nowhere to be the top image model. Uh, very, very strange. I would also highlight that Grok has now launched Aurora, which is, it's very interesting dynamics between Grok and Black Forest Labs because Grok's images were originally launched, uh, in partnership with Black Forest Labs as a, as a thin wrapper.[00:50:24] swyx: And then Grok was like, no, we'll make our own. And so they've made their own. I don't know, there are no APIs or benchmarks about it. They just announced it. So yeah, that's the multi modality war. I would say that so far, the small model, the dedicated model people are winning, because they are just focused on their tasks.[00:50:42] swyx: But the big model, People are always catching up. And the moment I saw the Gemini 2 demo of image editing, where I can put in an image and just request it and it does, that's how AI should work. Not like a whole bunch of complicated steps. So it really is something. 
And I think one frontier that we haven't [00:51:00] seen this year, like obviously video has done very well, and it will continue to grow.[00:51:03] swyx: You know, we only have Sora Turbo today, but at some point we'll get full Sora. Or at least the Hollywood labs will get full Sora. We haven't seen video to audio, or video synced to audio. And so the researchers that I talked to are already starting to talk about that as the next frontier. But there's still maybe like five more years of video left to actually be solved.[00:51:23] swyx: I would say that Gemini's approach, compared to OpenAI, Gemini's, or DeepMind's approach to video seems a lot more fully fledged than OpenAI's. Because if you look at the ICML recap that I published, that so far nobody has listened to, um, that people have listened to it. It's just a different, definitely different audience.[00:51:43] swyx: It's only seven hours long. Why are people not listening? It's like everything in there. Uh, so, so DeepMind has, is working on Genie. They also launched Genie 2 and VideoPoet. So, like, they have maybe four years advantage on world modeling that OpenAI does not have. Because OpenAI basically only started [00:52:00] Diffusion Transformers last year, you know, when they hired, uh, Bill Peebles.[00:52:03] swyx: So, DeepMind has, has a bit of advantage here, I would say, in, in, in showing, like, the reason that Veo 2, while, one, they cherry pick their videos, so obviously it looks better than Sora, but the reason I would believe that Veo 2, uh, when it's fully launched will do very well is because they have all this background work in video that they've done for years.[00:52:22] swyx: Like, like last year's NeurIPS, I already was interviewing some of their video people. I forget their model name, but for, for people who are dedicated fans, they can go to NeurIPS 2023 and see, see that paper.[00:52:32] Alessio: And then last but not least, the LLM OS. We renamed it to RAG/Ops, formerly known as[00:52:39] swyx: the RAG/Ops War. I put the latest chart on the Braintrust episode.[00:52:43] swyx: I think I'm going to separate these essays from the episode notes. So the reason I used to do that, by the way, is because I wanted to show up on Hacker News. I wanted the podcast to show up on Hacker News. So I always put an essay inside of there because Hacker News people like to read and not listen.[00:52:58] Alessio: So episode essays,[00:52:59] swyx: I remember [00:53:00] publishing them separately. You say LangChain, LlamaIndex is still growing.[00:53:03] Alessio: Yeah, so I looked at the PyPI stats, you know. I don't care about stars. On PyPI you see, do you want to share your screen? Yes. I prefer to look at actual downloads, not at stars on GitHub. So if you look at, you know, LangChain still growing.[00:53:20] Alessio: These are the last six months. LlamaIndex still growing. What I've basically seen is like, things that, one, obviously these things have a commercial product. So there's like people buying this and sticking with it, versus kind of hopping in between things, versus, you know, for example, Crew AI, not really growing as much.[00:53:38] Alessio: The stars are growing. If you look on GitHub, like the stars are growing, but kind of like the usage is kind of like flat. In the last six months, have they done some[00:53:4
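For anyone who wants to reproduce the download comparison Alessio describes, here is a small sketch that pulls recent PyPI download counts instead of GitHub stars. It assumes the public pypistats.org JSON endpoint /api/packages/<name>/recent; the package list is just the ones discussed, and the endpoint shape should be verified against the service's documentation before relying on it.

```python
import json
import urllib.request

# Compare recent PyPI downloads for a few agent/orchestration packages.
# Assumes the public pypistats.org endpoint /api/packages/<name>/recent.
PACKAGES = ["langchain", "llama-index", "crewai"]

def recent_downloads(package: str) -> dict:
    url = f"https://pypistats.org/api/packages/{package}/recent"
    with urllib.request.urlopen(url, timeout=10) as resp:
        # Expected payload: {"data": {"last_day": ..., "last_week": ..., "last_month": ...}, ...}
        return json.load(resp)["data"]

if __name__ == "__main__":
    for name in PACKAGES:
        stats = recent_downloads(name)
        print(f"{name:12s} last_month={stats['last_month']:,}")
```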

Thinking Elixir Podcast
234: Source Drops, AI, and Holiday Cheer

Thinking Elixir Podcast

Play Episode Listen Later Dec 24, 2024 14:43


News includes Ellie Fairholm and José Giralt D'Lacoste releasing the source code for "Engineering Elixir Applications," Michael Russo introducing "hex2txt" to enhance AI coding assistants, Brian Cardarella showcasing LiveView Native's LiveUploads, Headway's guide on building AI-powered iOS apps with LiveView Native, and more! Wishing you a Merry Christmas and a Happy New Year from all of us! Show Notes online - http://podcast.thinkingelixir.com/234 (http://podcast.thinkingelixir.com/234) Elixir Community News https://github.com/gilacost/engineeringelixirapplications (https://github.com/gilacost/engineering_elixir_applications?utm_source=thinkingelixir&utm_medium=shownotes) – Source code for the book "Engineering Elixir Applications" is now publicly available on GitHub. https://podcast.thinkingelixir.com/206 (https://podcast.thinkingelixir.com/206?utm_source=thinkingelixir&utm_medium=shownotes) – Previous episode with José Giralt D'Lacoste and Ellie Fairholm about their BEAM-focused DevOps book. https://x.com/mjrusso/status/1868881707262439582 (https://x.com/mjrusso/status/1868881707262439582?utm_source=thinkingelixir&utm_medium=shownotes) – Michael Russo created a proof-of-concept package "hex2txt" that converts hex package docs into llms.txt files. https://llmstxt.org/ (https://llmstxt.org/?utm_source=thinkingelixir&utm_medium=shownotes) – Website describing the llms.txt file standard for providing information for coders and AI. https://hex2txt.fly.dev/ (https://hex2txt.fly.dev/?utm_source=thinkingelixir&utm_medium=shownotes) – Michael's website for browsing examples of generated text files using hex2txt. Sum up that a proposal aims for such standardization to help AI coding assistants. https://github.com/brainlid/langchain/discussions/218 (https://github.com/brainlid/langchain/discussions/218?utm_source=thinkingelixir&utm_medium=shownotes) – New release v0.3.0-rc.1 of the Elixir LangChain library. https://github.com/brainlid/langchain (https://github.com/brainlid/langchain?utm_source=thinkingelixir&utm_medium=shownotes) – Repository for the Elixir LangChain library. https://github.com/brainlid/langchain/blob/main/CHANGELOG.md (https://github.com/brainlid/langchain/blob/main/CHANGELOG.md?utm_source=thinkingelixir&utm_medium=shownotes) – CHANGELOG for the Elixir LangChain library detailing breaking changes and updates. New features in LangChain like SummarizeConversationChain and LLMChain.run with fallbacks enhance production resilience and usability. https://bsky.app/profile/liveviewnative.dev/post/3lcgrxqm5lk2g (https://bsky.app/profile/liveviewnative.dev/post/3lcgrxqm5lk2g?utm_source=thinkingelixir&utm_medium=shownotes) – Brian Cardarella showed LiveView Native's support for LiveUploads, unlocking photo and video features. https://bsky.app/profile/bcardarella.bsky.social/post/3ldhg433mxc2y (https://bsky.app/profile/bcardarella.bsky.social/post/3ldhg433mxc2y?utm_source=thinkingelixir&utm_medium=shownotes) – Shows direct usage of LiveUploads in LiveView Native. LiveView Native simplifies mobile app development by reducing project and team requirements. https://bsky.app/profile/liveviewnative.dev/post/3ldhosnmjjc2v (https://bsky.app/profile/liveviewnative.dev/post/3ldhosnmjjc2v?utm_source=thinkingelixir&utm_medium=shownotes) – Building an AI-powered iOS app with LiveView Native by Headway. 
https://www.youtube.com/watch?v=nx_7gLfk7vA (https://www.youtube.com/watch?v=nx_7gLfk7vA?utm_source=thinkingelixir&utm_medium=shownotes) – 40-minute video tutorial on getting started with LiveView Native, Nx, and Axon by Headway. Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Find us online - Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com) - Message the show - X (https://x.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen on X - @brainlid (https://x.com/brainlid) - Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel on Bluesky - @david.bernheisel.com (https://bsky.app/profile/david.bernheisel.com) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)
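The hex2txt project mentioned above targets Hex/Elixir package docs, but the llms.txt idea it implements (one plain Markdown-style text file that an AI coding assistant can ingest, as described at llmstxt.org) is language-agnostic. Below is a rough Python sketch of that idea only; it is not the actual hex2txt implementation, and the package name, doc URLs, and output path are made up for illustration.

```python
# Rough sketch of generating an llms.txt-style file (illustration only, not hex2txt).
# Format loosely follows llmstxt.org: H1 title, block-quoted summary, then sections
# of links with one-line descriptions. All URLs below are illustrative.
from pathlib import Path

def build_llms_txt(package: str, summary: str, pages: dict[str, tuple[str, str]]) -> str:
    """pages maps a doc page name to (url, one-line description)."""
    lines = [f"# {package}", "", f"> {summary}", "", "## Docs", ""]
    for name, (url, desc) in pages.items():
        lines.append(f"- [{name}]({url}): {desc}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    text = build_llms_txt(
        "langchain (Elixir)",
        "Elixir library for integrating large language models into Elixir applications.",
        {
            "Getting Started": ("https://hexdocs.pm/langchain/readme.html", "installation and first chain"),
            "LLMChain": ("https://hexdocs.pm/langchain/LangChain.Chains.LLMChain.html", "core chain module"),
        },
    )
    Path("llms.txt").write_text(text)
```

Pointing a coding assistant at a single file like this is what the proposed standard is meant to make routine.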

OnBoard!
EP 54. In-depth conversation with top AI open-source projects: the open-source LLM ecosystem, agents, and China's contribution

OnBoard!

Play Episode Listen Later Dec 16, 2024 199:06


When it comes to the development of generative AI, open source is absolutely one of the key topics. This episode's guests cover the most noteworthy companies in the open-source LLM world; it is truly an all-star lineup!

First, a quick update: the first OnBoard! in-person listener meetup, held in Beijing last Sunday, far exceeded expectations! More than 250 people signed up within four days of registration opening. From 9 a.m. to 3 p.m. on Sunday, across topics from robotics to AI, startup investing, and software going global, the 100-person venue stayed nearly full to the very end. Thank you all for the support!

Hello World, who is OnBoard!?

Back to this episode, where we take a deep dive into the open-source ecosystem around large models. In the year and more of breakneck progress in generative AI, open source has been impossible to ignore. The rapid advance of open models, from Meta's Llama 3 to Mistral's latest releases, and their pursuit of closed models like GPT-4, has not only been impressive but has also accelerated the real-world deployment of AI products. And the ecosystem around large models, from inference acceleration to developer tools to agents, is rich enough to have already produced leaders like LangChain, yet all of this still looks like just the tip of the iceberg.

Especially worth noting: as models led by Chinese teams, such as Alibaba's Qwen series, DeepSeek, and Yi, rise to prominence on the international stage, we cannot help but ask whether, beyond imitation and catch-up, China's progress in large models holds more achievements worth our attention and pride.

Today, Monica is fortunate to host several highly representative guests: an open-source veteran from Hugging Face, the open-source lead of Qwen (who is also a core member of OpenDevin, one of the most closely watched agent projects), and the lead of vLLM, one of the most internationally influential open-source projects. Together they bring first-hand perspectives from every corner of the open-source LLM ecosystem!

The guests are such treasure troves that the conversation stretched across every aspect of large models and ran nearly four hours! In the first half we talked a lot about infra innovation, and about the technology and ecosystem behind software-development agents such as the currently hot OpenDevin. In the second half we returned to the theme of open source and discussed:

How might the open-source vs. closed-source ecosystem of foundation models evolve?
How does commercializing open-source models compare with the open-source business models of the big-data era, such as Databricks?
How do you build an open-source project with international influence?

Guest introductions:

Tiezhen Wang, engineer at Hugging Face. He is a bridge between the Chinese and global open-source AI ecosystems; from the Google TensorFlow era to being an early Hugging Face employee, he has deep insight into both.

Junyang Lin, open-source lead for Qwen (Tongyi Qianwen). As Qwen's main voice in the global open-source community, he has not only witnessed the evolution of open source but is also a core team member of OpenDevin, the much-watched open-source agent project.

Zhuohan Li, PhD at UC Berkeley, leads the famous vLLM, the LLM inference framework that has become an industry standard. His lab, Sky Lab, is known as a cradle of open-source infrastructure, from Databricks (now valued in the tens of billions of dollars) to Anyscale (the company commercializing the open-source compute framework Ray). He has also been deeply involved in internationally known open-source projects such as Chatbot Arena and Vicuna, bringing both first-hand international experience and plenty of substance on LLM infra and the surrounding ecosystem.

OnBoard! host: Monica, USD-fund VC investor, formerly on AWS's Silicon Valley team and at an AI startup; runs the WeChat public account M小姐研习录 (ID: MissMStudy) | Jike: 莫妮卡同学

We also cover core topics such as data and evaluation, so the episode is very comprehensive while keeping the depth of front-line practitioners. Rather than splitting it into two parts, use the timestamps in the show notes to jump straight to the topics you care about (though I think every topic is great!). One more disclosure: the open-source community Hugging Face and the open-source projects discussed in this episode, including Alibaba's Qwen, OpenDevin, DeepSeek, 01.AI's Yi, and vLLM, did not pay for any advertising; this is entirely heartfelt sharing from the guests, with no ads throughout! Of course, if you or other AI companies would like to sponsor this labor-of-love podcast, you are more than welcome. The three-hour hardcore marathon starts now, enjoy!

What we covered

05:28 Guest self-introductions and interesting open-source AI projects
18:37 How vLLM started, how it became a top global project, and why we need an LLM inference framework
30:24 Agent frameworks: what inference challenges do agents like OpenDevin bring?
40:37 What new tools are still needed to build a good coding agent? What will multimodality change?
56:16 What kind of agent framework do we need? Why is the open-source community best placed to build it? Will frameworks converge?
67:46 What is Crew AI? How to think about multi-agent architectures?
73:11 Drawing on the history of front-end frameworks, how does a framework become an industry standard?
77:54 The state of open-source LLMs on Hugging Face: important progress over the past year, different ways of open-sourcing, and how to gauge an open model's popularity
94:27 How to understand the open-source LLM ecosystems of different architectures, and how Qwen builds a better open-source ecosystem through architectural evolution
104:59 What innovations are coming from China's open-source LLM projects? How are model architectures changing?
112:17 Why might new model architectures bring new commercialization opportunities? What can we learn from earlier open-source commercialization?
119:22 Have we hit the ceiling of current LLM architectures? What would a new architecture look like?
128:03 What did Zhuohan learn from working on Vicuna, one of the earliest open-source LLMs? How should academia and industry divide the work in the LLM ecosystem?
140:48 What noteworthy progress is happening in datasets for LLMs?
149:42 Why did Mistral take off so quickly? What lessons carry over to building a world-class open-source project? What has vLLM learned, both in principle and in practice?
166:13 How did Chatbot Arena start? Why is model evaluation so important? What challenges and advances lie ahead?
180:49 Zhuohan's thoughts on commercializing vLLM, and how much room inference costs still have to fall
188:17 Quick-fire: what exceeded or fell short of expectations in generative AI over the past year, and what is worth looking forward to?
Companies and key terms we mentioned:

Qwen, Qwen-2
OpenDevin: opendevin.github.io
vLLM: github.com
Yi (Github), 01.AI (零一万物)
Chatbot Arena: huggingface.co
AutoGPT: github.com
crew AI: www.crewai.com
autoAWQ: github.com
LLM.c: github.com
Flash attention: github.com
Continuous batching: a serving technique that dynamically groups incoming requests into batches as they arrive, improving GPU utilization and throughput.
KV cache: the key-value cache that stores attention keys and values for already-processed tokens so they are not recomputed at every decoding step (a toy code sketch of this idea follows after this entry).
Paged attention: vLLM's memory-management technique that stores the KV cache in fixed-size blocks ("pages"), similar to virtual memory in an operating system, reducing fragmentation so more requests can be served at once.
Quantization: reducing the numerical precision of model weights (and sometimes activations) to fewer bits, shrinking model size and speeding up inference.
Direct Preference Optimization (DPO): Your Language Model is Secretly a Reward Model
Google Gemini: deepmind.google
Adept: www.adept.ai
MetaGPT: github.com
Dolphin: an open-source, uncensored, and commercially licensed dataset and series of instruct-tuned language models based on Microsoft's Orca paper
Common crawl: commoncrawl.org
Tiezhen's talk: Booming Open Source Chinese-Speaking LLMs: A Closer Look, Slides
"Qwen's first anniversary: choices and reflections on the open-source sprint | ModelScope in-depth interview" (通义千问一周年,开源狂飙路上的抉择与思考|魔搭深度访谈)
"Alibaba's Junyang Lin: for many people today's large models are not enough; building multimodal agents is the key | China AIGC Industry Summit" (阿里林俊旸:大模型对很多人来说不够用,打造多模态Agent是关键 | 中国AIGC产业峰会)

Follow M小姐's WeChat public account for more on software, AI, and startup investing in China and the US: M小姐研习录 (ID: MissMStudy). If you enjoy OnBoard!, you can also send a tip and buy us a coffee! If you listen on Apple Podcasts, please leave us a five-star review; it matters a lot to us. Finally, come join the OnBoard! listener group to meet other great listeners; we also organize in-person themed meetups, open live sittings of podcast recordings, guest interactions, and other experiments. Add either assistant on WeChat, onboard666 or Nine_tunes, and they will add you to the group. Hope to see you there!
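Several of the inference terms in the glossary above (KV cache, continuous batching, paged attention) revolve around the same core idea: keys and values for tokens you have already processed are stored and reused rather than recomputed. Here is a toy, single-head NumPy sketch of the KV-cache idea during autoregressive decoding; it is an illustration only, not vLLM code, and it leaves out batching and paging entirely.

```python
# Toy illustration of a KV cache during autoregressive decoding (single head,
# no batching, no paging). Keys/values for past tokens are computed once,
# appended to the cache, and reused at every later decoding step.
import numpy as np

d = 8  # head dimension
rng = np.random.default_rng(0)
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

k_cache, v_cache = [], []  # grows by one entry per generated token

def decode_step(x):
    """x: embedding of the newest token, shape (d,). Returns the attention output."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    k_cache.append(k)
    v_cache.append(v)
    K = np.stack(k_cache)            # (t, d): all cached keys, no recomputation
    V = np.stack(v_cache)            # (t, d): all cached values
    scores = K @ q / np.sqrt(d)      # attend over every cached position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

for t in range(5):                   # pretend we generate 5 tokens
    out = decode_step(rng.normal(size=d))
    print(t, out.shape, "cache size:", len(k_cache))
```

Paged attention is then about where those cached keys and values live in GPU memory, so that many such sequences can be served concurrently without wasting space.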

Podcast Notes Playlist: Latest Episodes
$100B Founder Breaks Down The Biggest AI Business Opportunities For 2025

Podcast Notes Playlist: Latest Episodes

Play Episode Listen Later Nov 30, 2024 91:38


My First Million Key Takeaways. Check out the episode page. Read the full notes @ podcastnotes.org. Episode 653: Shaan Puri ( https://x.com/ShaanVP ) talks to Furqan Rydhan ( https://x.com/FurqanR ) about the biggest opportunities in AI right now. — Show Notes: (0:00) Intro (4:42) Define the Job-to-be-done (8:20) How to build an AI Agent workflow (11:16) AI Tools break down (27:05) How Polymarket won (31:48) Why VR is a sleeping giant? (44:43) Be a lifelong player in tech (58:52) The unbeatable combination (1:02:27) Adam Foroughi's A+ execution (1:18:35) Betting on -1 to 0 — Links: • Furqan's site - https://furqan.sh/ • Founders, Inc - https://f.inc/ • Applovin - https://www.applovin.com/ • Claude - https://claude.ai/ • OpenAI - https://platform.openai.com/ • Langchain - https://www.langchain.com/ • AutoGen - https://autogenai.com/ • Crew - https://www.crewai.com/ • CloudSDK - https://cloud.google.com/sdk/ • Perplexity - https://www.perplexity.ai/ • “Attention is all you need” - https://typeset.io/papers/attention-is-all-you-need-1hodz0wcqb • Anthropic - https://www.anthropic.com/ • Third Web - https://thirdweb.com/ • Luna's AI Brain - https://terminal.virtuals.io/ • Oasis - https://oasis.decart.ai/welcome • Polymarket - https://polymarket.com/ • Gorilla Tag - https://www.gorillatagvr.com/ • Yeeps - https://tinyurl.com/59z2yrdu — Check Out Shaan's Stuff: Need to hire? You should use the same service Shaan uses to hire developers, designers, & Virtual Assistants → it's called Shepherd (tell ‘em Shaan sent you): https://bit.ly/SupportShepherd — Check Out Sam's Stuff: • Hampton - https://www.joinhampton.com/ • Ideation Bootcamp - https://www.ideationbootcamp.co/ • Copy That - https://copythat.com • Hampton Wealth Survey - https://joinhampton.com/wealth • Sam's List - http://samslist.co/ My First Million is a HubSpot Original Podcast // Brought to you by The HubSpot Podcast Network // Production by Arie Desormeaux // Editing by Ezra Bakker Trupiano

My First Million
$100B Founder Breaks Down The Biggest AI Business Opportunities For 2025

My First Million

Play Episode Listen Later Nov 26, 2024 91:38


Episode 653: Shaan Puri ( https://x.com/ShaanVP ) talks to Furqan Rydhan ( https://x.com/FurqanR ) about the biggest opportunities in AI right now.  — Show Notes:  (0:00) Intro (4:42) Define the Job-to-be-done (8:20) How to build an AI Agent workflow (11:16) AI Tools break down (27:05) How Polymarket won (31:48) Why VR is a sleeping giant? (44:43) Be a lifelong player in tech (58:52) The unbeatable combination (1:02:27) Adam Foroughi's A+ execution (1:18:35) Betting on -1 to 0 — Links: • Furqan's site - https://furqan.sh/  • Founders, Inc - https://f.inc/  • Applovin - https://www.applovin.com/  • Claude - https://claude.ai/  • OpenAI - https://platform.openai.com/  • Langchain - https://www.langchain.com/  • AutoGen - https://autogenai.com/  • Crew - https://www.crewai.com/  • CloudSDK - https://cloud.google.com/sdk/  • Perplexity - https://www.perplexity.ai/  • “Attention is all you need” - https://typeset.io/papers/attention-is-all-you-need-1hodz0wcqb  • Anthropic - https://www.anthropic.com/  • Third Web - https://thirdweb.com/  • Luna's AI Brain - https://terminal.virtuals.io/  • Oasis - https://oasis.decart.ai/welcome  • Polymarket - https://polymarket.com/  • Gorilla Tag - https://www.gorillatagvr.com/  • Yeeps - https://tinyurl.com/59z2yrdu  — Check Out Shaan's Stuff: Need to hire? You should use the same service Shaan uses to hire developers, designers, & Virtual Assistants → it's called Shepherd (tell ‘em Shaan sent you): https://bit.ly/SupportShepherd — Check Out Sam's Stuff: • Hampton - https://www.joinhampton.com/ • Ideation Bootcamp - https://www.ideationbootcamp.co/ • Copy That - https://copythat.com • Hampton Wealth Survey - https://joinhampton.com/wealth • Sam's List - http://samslist.co/ My First Million is a HubSpot Original Podcast // Brought to you by The HubSpot Podcast Network // Production by Arie Desormeaux // Editing by Ezra Bakker Trupiano

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Alessio will be at AWS re:Invent next week and hosting a casual coffee meetup on Wednesday, RSVP here! And subscribe to our calendar for our Singapore, NeurIPS, and all upcoming meetups!

We are still taking questions for our next big recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show!

If you've been following the AI agents space, you have heard of Lindy AI; while founder Flo Crivello is hesitant to call it "blowing up," when folks like Andrew Wilkinson start obsessing over your product, you're definitely onto something.

In our latest episode, Flo walked us through Lindy's evolution from late 2022 to now, revealing some design choices about agent platform design that go against conventional wisdom in the space.

The Great Reset: From Text Fields to Rails

Remember late 2022? Everyone was "LLM-pilled," believing that if you just gave a language model enough context and tools, it could do anything. Lindy 1.0 followed this pattern:

* Big prompt field ✅
* Bunch of tools ✅
* Prayer to the LLM gods ✅

Fast forward to today, and Lindy 2.0 looks radically different. As Flo put it (~17:00 in the episode): "The more you can put your agent on rails, one, the more reliable it's going to be, obviously, but two, it's also going to be easier to use for the user."

Instead of a giant, intimidating text field, users now build workflows visually:

* Trigger (e.g., "Zendesk ticket received")
* Required actions (e.g., "Check knowledge base")
* Response generation

This isn't just a UI change - it's a fundamental rethinking of how to make AI agents reliable. As Swyx noted during our discussion: "Put Shoggoth in a box and make it a very small, minimal viable box. Everything else should be traditional if-this-then-that software." (A minimal code sketch of this pattern appears further down.)

The Surprising Truth About Model Limitations

Here's something that might shock folks building in the space: with Claude 3.5 Sonnet, the model is no longer the bottleneck. Flo's exact words (~31:00): "It is actually shocking the extent to which the model is no longer the limit. It was the limit a year ago. It was too expensive. The context window was too small."

Some context: Lindy started when context windows were 4K tokens. Today, their system prompt alone is larger than that. But what's really interesting is what this means for platform builders:

* Raw capabilities aren't the constraint anymore
* Integration quality matters more than model performance
* User experience and workflow design are the new bottlenecks

The Search Engine Parallel: Why Horizontal Platforms Might Win

One of the spiciest takes from our conversation was Flo's thesis on horizontal vs. vertical agent platforms. He draws a fascinating parallel to search engines (~56:00): "I find it surprising the extent to which a horizontal search engine has won... You go through Google to search Reddit. You go through Google to search Wikipedia... search in each vertical has more in common with search than it does with each vertical."

His argument: agent platforms might follow the same pattern because:

* Agents across verticals share more commonalities than differences
* There's value in having agents that can work together under one roof
* The R&D cost of getting agents right is better amortized across use cases

This might explain why we're seeing early vertical AI companies starting to expand horizontally. The core agent capabilities - reliability, context management, tool integration - are universal needs.
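Before the takeaways, here is a minimal sketch of what the "agent on rails" pattern described above might look like in code. This is an illustration, not Lindy's actual implementation: the trigger and the mandatory knowledge-base lookup are plain deterministic software, and the model is only asked to draft the final reply. `Ticket`, `search_knowledge_base`, and `call_llm` are hypothetical stand-ins.

```python
# Minimal sketch of the "agent on rails" pattern (illustration only, not Lindy's code).
# The trigger and the required knowledge-base step are ordinary deterministic code;
# only the final reply generation is delegated to the model.
from dataclasses import dataclass

@dataclass
class Ticket:
    subject: str
    body: str

def search_knowledge_base(query: str) -> list[str]:
    # Stand-in for a real retrieval step (vector search, keyword search, etc.).
    kb = {"reset password": "Passwords can be reset from Settings > Security."}
    return [doc for key, doc in kb.items() if key in query.lower()]

def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real model API. The model only ever sees this one
    # narrow task, not the whole workflow.
    return f"[drafted reply based on a prompt of {len(prompt)} chars]"

def handle_ticket(ticket: Ticket) -> str:
    # Step 1 (always runs): consult the knowledge base -- no prompting required.
    docs = search_knowledge_base(ticket.subject + " " + ticket.body)
    # Step 2: the model's only job is to draft a reply from fixed inputs.
    prompt = (
        "Draft a support reply.\n"
        f"Ticket: {ticket.subject}\n{ticket.body}\n"
        f"Relevant docs: {docs or 'none found'}"
    )
    return call_llm(prompt)

print(handle_ticket(Ticket("Reset password", "How do I reset my password?")))
```

The contrast with the Lindy 1.0 approach is that nothing here depends on the model choosing to consult the knowledge base; that step always runs, and only the narrow drafting step is left to the LLM.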
What This Means for Builders

If you're building in the AI agents space, here are the key takeaways:

* Constrain First: Rather than maximizing capabilities, focus on reliable execution within narrow bounds
* Integration Quality Matters: With model capabilities plateauing, your competitive advantage lies in how well you integrate with existing tools
* Memory Management is Key: Flo revealed they actively prune agent memories - even with larger context windows, not all memories are useful
* Design for Discovery: Lindy's visual workflow builder shows how important interface design is for adoption

The Meta Layer

There's a broader lesson here about AI product development. Just as Lindy evolved from "give the LLM everything" to "constrain intelligently," we might see similar evolution across the AI tooling space. The winners might not be those with the most powerful models, but those who best understand how to package AI capabilities in ways that solve real problems reliably.

Full Video Podcast

Flo's talk at AI Engineer Summit

Chapters

* 00:00:00 Introductions
* 00:04:05 AI engineering and deterministic software
* 00:08:36 Lindys demo
* 00:13:21 Memory management in AI agents
* 00:18:48 Hierarchy and collaboration between Lindys
* 00:21:19 Vertical vs. horizontal AI tools
* 00:24:03 Community and user engagement strategies
* 00:26:16 Rickrolling incident with Lindy
* 00:28:12 Evals and quality control in AI systems
* 00:31:52 Model capabilities and their impact on Lindy
* 00:39:27 Competition and market positioning
* 00:42:40 Relationship between Factorio and business strategy
* 00:44:05 Remote work vs. in-person collaboration
* 00:49:03 Europe vs US Tech
* 00:58:59 Testing the Overton window and free speech
* 01:04:20 Balancing AI safety concerns with business innovation

Show Notes

* Lindy.ai
* Rick Rolling
* Flo on X
* TeamFlow
* Andrew Wilkinson
* Dust
* Poolside.ai
* SB1047
* Gathertown
* Sid Sijbrandij
* Matt Mullenweg
* Factorio
* Seeing Like a State

Transcript

Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.Swyx [00:00:12]: Hey, and today we're joined in the studio by Florent Crivello. Welcome.Flo [00:00:15]: Hey, yeah, thanks for having me.Swyx [00:00:17]: Also known as Altimore. I always wanted to ask, what is Altimore?Flo [00:00:21]: It was the name of my character when I was playing Dungeons & Dragons. Always. I was like 11 years old.Swyx [00:00:26]: What was your classes?Flo [00:00:27]: I was an elf. I was a magician elf.Swyx [00:00:30]: Well, you're still spinning magic. Right now, you're a solo founder and CEO of Lindy.ai. What is Lindy?Flo [00:00:36]: Yeah, we are a no-code platform letting you build your own AI agents easily. So you can think of we are to LangChain as Airtable is to MySQL. Like you can just pin up AI agents super easily by clicking around and no code required. You don't have to be an engineer and you can automate business workflows that you simply could not automate before in a few minutes.Swyx [00:00:55]: You've been in our orbit a few times. I think you spoke at our Latent Space anniversary. You spoke at my summit, the first summit, which was a really good keynote. And most recently, like we actually already scheduled this podcast before this happened. But Andrew Wilkinson was like, I'm obsessed by Lindy. He's just created a whole bunch of agents.
So basically, why are you blowing up?Flo [00:01:16]: Well, thank you. I think we are having a little bit of a moment. I think it's a bit premature to say we're blowing up. But why are things going well? We revamped the product majorly. We called it Lindy 2.0. I would say we started working on that six months ago. We've actually not really announced it yet. It's just, I guess, I guess that's what we're doing now. And so we've basically been cooking for the last six months, like really rebuilding the product from scratch. I think I'll list you, actually, the last time you tried the product, it was still Lindy 1.0. Oh, yeah. If you log in now, the platform looks very different. There's like a ton more features. And I think one realization that we made, and I think a lot of folks in the agent space made the same realization, is that there is such a thing as too much of a good thing. I think many people, when they started working on agents, they were very LLM peeled and chat GPT peeled, right? They got ahead of themselves in a way, and us included, and they thought that agents were actually, and LLMs were actually more advanced than they actually were. And so the first version of Lindy was like just a giant prompt and a bunch of tools. And then the realization we had was like, hey, actually, the more you can put your agent on Rails, one, the more reliable it's going to be, obviously, but two, it's also going to be easier to use for the user, because you can really, as a user, you get, instead of just getting this big, giant, intimidating text field, and you type words in there, and you have no idea if you're typing the right word or not, here you can really click and select step by step, and tell your agent what to do, and really give as narrow or as wide a guardrail as you want for your agent. We started working on that. We called it Lindy on Rails about six months ago, and we started putting it into the hands of users over the last, I would say, two months or so, and I think things really started going pretty well at that point. The agent is way more reliable, way easier to set up, and we're already seeing a ton of new use cases pop up.Swyx [00:03:00]: Yeah, just a quick follow-up on that. You launched the first Lindy in November last year, and you were already talking about having a DSL, right? I remember having this discussion with you, and you were like, it's just much more reliable. Is this still the DSL under the hood? Is this a UI-level change, or is it a bigger rewrite?Flo [00:03:17]: No, it is a much bigger rewrite. I'll give you a concrete example. Suppose you want to have an agent that observes your Zendesk tickets, and it's like, hey, every time you receive a Zendesk ticket, I want you to check my knowledge base, so it's like a RAG module and whatnot, and then answer the ticket. The way it used to work with Lindy before was, you would type the prompt asking it to do that. You check my knowledge base, and so on and so forth. The problem with doing that is that it can always go wrong. You're praying the LLM gods that they will actually invoke your knowledge base, but I don't want to ask it. I want it to always, 100% of the time, consult the knowledge base after it receives a Zendesk ticket. And so with Lindy, you can actually have the trigger, which is Zendesk ticket received, have the knowledge base consult, which is always there, and then have the agent. 
So you can really set up your agent any way you want like that.Swyx [00:04:05]: This is something I think about for AI engineering as well, which is the big labs want you to hand over everything in the prompts, and only code of English, and then the smaller brains, the GPU pours, always want to write more code to make things more deterministic and reliable and controllable. One way I put it is put Shoggoth in a box and make it a very small, the minimal viable box. Everything else should be traditional, if this, then that software.Flo [00:04:29]: I love that characterization, put the Shoggoth in the box. Yeah, we talk about using as much AI as necessary and as little as possible.Alessio [00:04:37]: And what was the choosing between kind of like this drag and drop, low code, whatever, super code-driven, maybe like the Lang chains, auto-GPT of the world, and maybe the flip side of it, which you don't really do, it's like just text to agent, it's like build the workflow for me. Like what have you learned actually putting this in front of users and figuring out how much do they actually want to add it versus like how much, you know, kind of like Ruby on Rails instead of Lindy on Rails, it's kind of like, you know, defaults over configuration.Flo [00:05:06]: I actually used to dislike when people said, oh, text is not a great interface. I was like, ah, this is such a mid-take, I think text is awesome. And I've actually come around, I actually sort of agree now that text is really not great. I think for people like you and me, because we sort of have a mental model, okay, when I type a prompt into this text box, this is what it's going to do, it's going to map it to this kind of data structure under the hood and so forth. I guess it's a little bit blackmailing towards humans. You jump on these calls with humans and you're like, here's a text box, this is going to set up an agent for you, do it. And then they type words like, I want you to help me put order in my inbox. Oh, actually, this is a good one. This is actually a good one. What's a bad one? I would say 60 or 70% of the prompts that people type don't mean anything. Me as a human, as AGI, I don't understand what they mean. I don't know what they mean. It is actually, I think whenever you can have a GUI, it is better than to have just a pure text interface.Alessio [00:05:58]: And then how do you decide how much to expose? So even with the tools, you have Slack, you have Google Calendar, you have Gmail. Should people by default just turn over access to everything and then you help them figure out what to use? I think that's the question. When I tried to set up Slack, it was like, hey, give me access to all channels and everything, which for the average person probably makes sense because you don't want to re-prompt them every time you add new channels. But at the same time, for maybe the more sophisticated enterprise use cases, people are like, hey, I want to really limit what you have access to. How do you kind of thread that balance?Flo [00:06:35]: The general philosophy is we ask for the least amount of permissions needed at any given moment. I don't think Slack, I could be mistaken, but I don't think Slack lets you request permissions for just one channel. But for example, for Google, obviously there are hundreds of scopes that you could require for Google. There's a lot of scopes. And sometimes it's actually painful to set up your Lindy because you're going to have to ask Google and add scopes five or six times. 
We've had sessions like this. But that's what we do because, for example, the Lindy email drafter, she's going to ask you for your authorization once for, I need to be able to read your email so I can draft a reply, and then another time for I need to be able to write a draft for them. We just try to do it very incrementally like that.Alessio [00:07:15]: Do you think OAuth is just overall going to change? I think maybe before it was like, hey, we need to set up OAuth that humans only want to kind of do once. So we try to jam-pack things all at once versus what if you could on-demand get different permissions every time from different parts? Do you ever think about designing things knowing that maybe AI will use it instead of humans will use it? Yeah, for sure.Flo [00:07:37]: One pattern we've started to see is people provisioning accounts for their AI agents. And so, in particular, Google Workspace accounts. So, for example, Lindy can be used as a scheduling assistant. So you can just CC her to your emails when you're trying to find time with someone. And just like a human assistant, she's going to go back and forth and offer other abilities and so forth. Very often, people don't want the other party to know that it's an AI. So it's actually funny. They introduce delays. They ask the agent to wait before replying, so it's not too obvious that it's an AI. And they provision an account on Google Suite, which costs them like $10 a month or something like that. So we're seeing that pattern more and more. I think that does the job for now. I'm not optimistic on us actually patching OAuth. Because I agree with you, ultimately, we would want to patch OAuth because the new account thing is kind of a clutch. It's really a hack. You would want to patch OAuth to have more granular access control and really be able to put your sugar in the box. I'm not optimistic on us doing that before AGI, I think. That's a very close timeline.Swyx [00:08:36]: I'm mindful of talking about a thing without showing it. And we already have the setup to show it. Why don't we jump into a screen share? For listeners, you can jump on the YouTube and like and subscribe. But also, let's have a look at how you show off Lindy. Yeah, absolutely.Flo [00:08:51]: I'll give an example of a very simple Lindy and then I'll graduate to a much more complicated one. A super simple Lindy that I have is, I unfortunately bought some investment properties in the south of France. It was a really, really bad idea. And I put them on a Holydew, which is like the French Airbnb, if you will. And so I received these emails from time to time telling me like, oh, hey, you made 200 bucks. Someone booked your place. When I receive these emails, I want to log this reservation in a spreadsheet. Doing this without an AI agent or without AI in general is a pain in the butt because you must write an HTML parser for this email. And so it's just hard. You may not be able to do it and it's going to break the moment the email changes. By contrast, the way it works with Lindy, it's really simple. It's two steps. It's like, okay, I receive an email. If it is a reservation confirmation, I have this filter here. Then I append a row to this spreadsheet. And so this is where you can see the AI part where the way this action is configured here, you see these purple fields on the right. Each of these fields is a prompt. And so I can say, okay, you extract from the email the day the reservation begins on. You extract the amount of the reservation. 
You extract the number of travelers of the reservation. And now you can see when I look at the task history of this Lindy, it's really simple. It's like, okay, you do this and boom, appending this row to this spreadsheet. And this is the information extracted. So effectively, this node here, this append row node is a mini agent. It can see everything that just happened. It has context over the task and it's appending the row. And then it's going to send a reply to the thread. That's a very simple example of an agent.Swyx [00:10:34]: A quick follow-up question on this one while we're still on this page. Is that one call? Is that a structured output call? Yeah. Okay, nice. Yeah.Flo [00:10:41]: And you can see here for every node, you can configure which model you want to power the node. Here I use cloud. For this, I use GPT-4 Turbo. Much more complex example, my meeting recorder. It looks very complex because I've added to it over time, but at a high level, it's really simple. It's like when a meeting begins, you record the meeting. And after the meeting, you send me a summary and you send me coaching notes. So I receive, like my Lindy is constantly coaching me. And so you can see here in the prompt of the coaching notes, I've told it, hey, you know, was I unnecessarily confrontational at any point? I'm French, so I have to watch out for that. Or not confrontational enough. Should I have double-clicked on any issue, right? So I can really give it exactly the kind of coaching that I'm expecting. And then the interesting thing here is, like, you can see the agent here, after it sent me these coaching notes, moves on. And it does a bunch of other stuff. So it goes on Slack. It disseminates the notes on Slack. It does a bunch of other stuff. But it's actually able to backtrack and resume the automation at the coaching notes email if I responded to that email. So I'll give a super concrete example. This is an actual coaching feedback that I received from Lindy. She was like, hey, this was a sales call I had with a customer. And she was like, I found your explanation of Lindy too technical. And I was able to follow up and just ask a follow-up question in the thread here. And I was like, why did you find too technical about my explanation? And Lindy restored the context. And so she basically picked up the automation back up here in the tree. And she has all of the context of everything that happened, including the meeting in which I was. So she was like, oh, you used the words deterministic and context window and agent state. And that concept exists at every level for every channel and every action that Lindy takes. So another example here is, I mentioned she also disseminates the notes on Slack. So this was a meeting where I was not, right? So this was a teammate. He's an indie meeting recorder, posts the meeting notes in this customer discovery channel on Slack. So you can see, okay, this is the onboarding call we had. This was the use case. Look at the questions. How do I make Lindy slower? How do I add delays to make Lindy slower? And I was able, in the Slack thread, to ask follow-up questions like, oh, what did we answer to these questions? And it's really handy because I know I can have this sort of interactive Q&A with these meetings. It means that very often now, I don't go to meetings anymore. I just send my Lindy. And instead of going to like a 60-minute meeting, I have like a five-minute chat with my Lindy afterwards. And she just replied. 
She was like, well, this is what we replied to this customer. And I can just be like, okay, good job, Jack. Like, no notes about your answers. So that's the kind of use cases people have with Lindy. It's a lot of like, there's a lot of sales automations, customer support automations, and a lot of this, which is basically personal assistance automations, like meeting scheduling and so forth.Alessio [00:13:21]: Yeah, and I think the question that people might have is memory. So as you get coaching, how does it track whether or not you're improving? You know, if these are like mistakes you made in the past, like, how do you think about that?Flo [00:13:31]: Yeah, we have a memory module. So I'll show you my meeting scheduler, Lindy, which has a lot of memories because by now I've used her for so long. And so every time I talk to her, she saves a memory. If I tell her, you screwed up, please don't do this. So you can see here, oh, it's got a double memory here. This is the meeting link I have, or this is the address of the office. If I tell someone to meet me at home, this is the address of my place. This is the code. I guess we'll have to edit that out. This is not the code of my place. No dogs. Yeah, so Lindy can just manage her own memory and decide when she's remembering things between executions. Okay.Swyx [00:14:11]: I mean, I'm just going to take the opportunity to ask you, since you are the creator of this thing, how come there's so few memories, right? Like, if you've been using this for two years, there should be thousands of thousands of things. That is a good question.Flo [00:14:22]: Agents still get confused if they have too many memories, to my point earlier about that. So I just am out of a call with a member of the Lama team at Meta, and we were chatting about Lindy, and we were going into the system prompt that we sent to Lindy, and all of that stuff. And he was amazed, and he was like, it's a miracle that it's working, guys. He was like, this kind of system prompt, this does not exist, either pre-training or post-training. These models were never trained to do this kind of stuff. It's a miracle that they can be agents at all. And so what I do, I actually prune the memories. You know, it's actually something I've gotten into the habit of doing from back when we had GPT 3.5, being Lindy agents. I suspect it's probably not as necessary in the Cloud 3.5 Sunette days, but I prune the memories. Yeah, okay.Swyx [00:15:05]: The reason is because I have another assistant that also is recording and trying to come up with facts about me. It comes up with a lot of trivial, useless facts that I... So I spend most of my time pruning. Actually, it's not super useful. I'd much rather have high-quality facts that it accepts. Or maybe I was even thinking, were you ever tempted to add a wake word to only memorize this when I say memorize this? And otherwise, don't even bother.Flo [00:15:30]: I have a Lindy that does this. So this is my inbox processor, Lindy. It's kind of beefy because there's a lot of different emails. But somewhere in here,Swyx [00:15:38]: there is a rule where I'm like,Flo [00:15:39]: aha, I can email my inbox processor, Lindy. It's really handy. So she has her own email address. And so when I process my email inbox, I sometimes forward an email to her. And it's a newsletter, or it's like a cold outreach from a recruiter that I don't care about, or anything like that. And I can give her a rule. And I can be like, hey, this email I want you to archive, moving forward. 
Or I want you to alert me on Slack when I have this kind of email. It's really important. And so you can see here, the prompt is, if I give you a rule about a kind of email, like archive emails from X, save it as a new memory. And I give it to the memory saving skill. And yeah.Swyx [00:16:13]: One thing that just occurred to me, so I'm a big fan of virtual mailboxes. I recommend that everybody have a virtual mailbox. You could set up a physical mail receive thing for Lindy. And so then Lindy can process your physical mail.Flo [00:16:26]: That's actually a good idea. I actually already have something like that. I use like health class mail. Yeah. So yeah, most likely, I can process my physical mail. Yeah.Swyx [00:16:35]: And then the other product's idea I have, looking at this thing, is people want to brag about the complexity of their Lindys. So this would be like a 65 point Lindy, right?Flo [00:16:43]: What's a 65 point?Swyx [00:16:44]: Complexity counting. Like how many nodes, how many things, how many conditions, right? Yeah.Flo [00:16:49]: This is not the most complex one. I have another one. This designer recruiter here is kind of beefy as well. Right, right, right. So I'm just saying,Swyx [00:16:56]: let people brag. Let people be super users. Oh, right.Flo [00:16:59]: Give them a score. Give them a score.Swyx [00:17:01]: Then they'll just be like, okay, how high can you make this score?Flo [00:17:04]: Yeah, that's a good point. And I think that's, again, the beauty of this on-rails phenomenon. It's like, think of the equivalent, the prompt equivalent of this Lindy here, for example, that we're looking at. It'd be monstrous. And the odds that it gets it right are so low. But here, because we're really holding the agent's hand step by step by step, it's actually super reliable. Yeah.Swyx [00:17:22]: And is it all structured output-based? Yeah. As far as possible? Basically. Like, there's no non-structured output?Flo [00:17:27]: There is. So, for example, here, this AI agent step, right, or this send message step, sometimes it gets to... That's just plain text.Swyx [00:17:35]: That's right.Flo [00:17:36]: Yeah. So I'll give you an example. Maybe it's TMI. I'm having blood pressure issues these days. And so this Lindy here, I give it my blood pressure readings, and it updates a log that I have of my blood pressure that it sends to my doctor.Swyx [00:17:49]: Oh, so every Lindy comes with a to-do list?Flo [00:17:52]: Yeah. Every Lindy has its own task history. Huh. Yeah. And so you can see here, this is my main Lindy, my personal assistant, and I've told it, where is this? There is a point where I'm like, if I am giving you a health-related fact, right here, I'm giving you health information, so then you update this log that I have in this Google Doc, and then you send me a message. And you can see, I've actually not configured this send message node. I haven't told it what to send me a message for. Right? And you can see, it's actually lecturing me. It's like, I'm giving it my blood pressure ratings. It's like, hey, it's a bit high. Here are some lifestyle changes you may want to consider.Alessio [00:18:27]: I think maybe this is the most confusing or new thing for people. So even I use Lindy and I didn't even know you could have multiple workflows in one Lindy. I think the mental model is kind of like the Zapier workflows. It starts and it ends. It doesn't choose between. How do you think about what's a Lindy versus what's a sub-function of a Lindy? 
Like, what's the hierarchy?Flo [00:18:48]: Yeah. Frankly, I think the line is a little arbitrary. It's kind of like when you code, like when do you start to create a new class versus when do you overload your current class. I think of it in terms of like jobs to be done and I think of it in terms of who is the Lindy serving. This Lindy is serving me personally. It's really my day-to-day Lindy. I give it a bunch of stuff, like very easy tasks. And so this is just the Lindy I go to. Sometimes when a task is really more specialized, so for example, I have this like summarizer Lindy or this designer recruiter Lindy. These tasks are really beefy. I wouldn't want to add this to my main Lindy, so I just created a separate Lindy for it. Or when it's a Lindy that serves another constituency, like our customer support Lindy, I don't want to add that to my personal assistant Lindy. These are two very different Lindys.Alessio [00:19:31]: And you can call a Lindy from within another Lindy. That's right. You can kind of chain them together.Flo [00:19:36]: Lindys can work together, absolutely.Swyx [00:19:38]: A couple more things for the video portion. I noticed you have a podcast follower. We have to ask about that. What is that?Flo [00:19:46]: So this one wakes me up every... So wakes herself up every week. And she sends me... So she woke up yesterday, actually. And she searches for Lenny's podcast. And she looks for like the latest episode on YouTube. And once she finds it, she transcribes the video and then she sends me the summary by email. I don't listen to podcasts as much anymore. I just like read these summaries. Yeah.Alessio [00:20:09]: We should make a latent space Lindy. Marketplace.Swyx [00:20:12]: Yeah. And then you have a whole bunch of connectors. I saw the list briefly. Any interesting one? Complicated one that you're proud of? Anything that you want to just share? Connector stories.Flo [00:20:23]: So many of our workflows are about meeting scheduling. So we had to build some very open unity tools around meeting scheduling. So for example, one that is surprisingly hard is this find available times action. You would not believe... This is like a thousand lines of code or something. It's just a very beefy action. And you can pass it a bunch of parameters about how long is the meeting? When does it start? When does it end? What are the meetings? The weekdays in which I meet? How many time slots do you return? What's the buffer between my meetings? It's just a very, very, very complex action. I really like our GitHub action. So we have a Lindy PR reviewer. And it's really handy because anytime any bug happens... So the Lindy reads our guidelines on Google Docs. By now, the guidelines are like 40 pages long or something. And so every time any new kind of bug happens, we just go to the guideline and we add the lines. Like, hey, this has happened before. Please watch out for this category of bugs. And it's saving us so much time every day.Alessio [00:21:19]: There's companies doing PR reviews. Where does a Lindy start? When does a company start? Or maybe how do you think about the complexity of these tasks when it's going to be worth having kind of like a vertical standalone company versus just like, hey, a Lindy is going to do a good job 99% of the time?Flo [00:21:34]: That's a good question. We think about this one all the time. I can't say that we've really come up with a very crisp articulation of when do you want to use a vertical tool versus when do you want to use a horizontal tool. 
I think of it as very similar to the internet. I find it surprising the extent to which a horizontal search engine has won. But I think that Google, right? But I think the even more surprising fact is that the horizontal search engine has won in almost every vertical, right? You go through Google to search Reddit. You go through Google to search Wikipedia. I think maybe the biggest exception is e-commerce. Like you go to Amazon to search e-commerce, but otherwise you go through Google. And I think that the reason for that is because search in each vertical has more in common with search than it does with each vertical. And search is so expensive to get right. Like Google is a big company that it makes a lot of sense to aggregate all of these different use cases and to spread your R&D budget across all of these different use cases. I have a thesis, which is, it's a really cool thesis for Lindy, is that the same thing is true for agents. I think that by and large, in a lot of verticals, agents in each vertical have more in common with agents than they do with each vertical. I also think there are benefits in having a single agent platform because that way your agents can work together. They're all like under one roof. That way you only learn one platform and so you can create agents for everything that you want. And you don't have to like pay for like a bunch of different platforms and so forth. So I think ultimately, it is actually going to shake out in a way that is similar to search in that search is everywhere on the internet. Every website has a search box, right? So there's going to be a lot of vertical agents for everything. I think AI is going to completely penetrate every category of software. But then I also think there are going to be a few very, very, very big horizontal agents that serve a lot of functions for people.Swyx [00:23:14]: That is actually one of the questions that we had about the agent stuff. So I guess we can transition away from the screen and I'll just ask the follow-up, which is, that is a hot topic. You're basically saying that the current VC obsession of the day, which is vertical AI enabled SaaS, is mostly not going to work out. And then there are going to be some super giant horizontal SaaS.Flo [00:23:34]: Oh, no, I'm not saying it's either or. Like SaaS today, vertical SaaS is huge and there's also a lot of horizontal platforms. If you look at like Airtable or Notion, basically the entire no-code space is very horizontal. I mean, Loom and Zoom and Slack, there's a lot of very horizontal tools out there. Okay.Swyx [00:23:49]: I was just trying to get a reaction out of you for hot takes. Trying to get a hot take.Flo [00:23:54]: No, I also think it is natural for the vertical solutions to emerge first because it's just easier to build. It's just much, much, much harder to build something horizontal. Cool.Swyx [00:24:03]: Some more Lindy-specific questions. So we covered most of the top use cases and you have an academy. That was nice to see. I also see some other people doing it for you for free. So like Ben Spites is doing it and then there's some other guy who's also doing like lessons. Yeah. Which is kind of nice, right? Yeah, absolutely. You don't have to do any of that.Flo [00:24:20]: Oh, we've been seeing it more and more on like LinkedIn and Twitter, like people posting their Lindys and so forth.Swyx [00:24:24]: I think that's the flywheel that you built the platform where creators see value in allying themselves to you. 
And so then, you know, your incentive is to make them successful so that they can make other people successful and then it just drives more and more engagement. Like it's earned media. Like you don't have to do anything.Flo [00:24:39]: Yeah, yeah. I mean, community is everything.Swyx [00:24:41]: Are you doing anything special there? Any big wins?Flo [00:24:44]: We have a Slack community that's pretty active. I can't say we've invested much more than that so far.Swyx [00:24:49]: I would say from having, so I have some involvement in the no-code community. I would say that Webflow going very hard after no-code as a category got them a lot more allies than just the people using Webflow. So it helps you to grow the community beyond just Lindy. And I don't know what this is called. Maybe it's just no-code again. Maybe you want to call it something different. But there's definitely an appetite for this and you are one of a broad category, right? Like just before you, we had Dust and, you know, they're also kind of going after a similar market. Zapier obviously is not going to try to also compete with you. Yeah. There's no question there. It's just like a reaction about community. Like I think a lot about community. Lanespace is growing the community of AI engineers. And I think you have a slightly different audience of, I don't know what.Flo [00:25:33]: Yeah. I think the no-code tinkerers is the community. Yeah. It is going to be the same sort of community as what Webflow, Zapier, Airtable, Notion to some extent.Swyx [00:25:43]: Yeah. The framing can be different if you were, so I think tinkerers has this connotation of not serious or like small. And if you framed it to like no-code EA, we're exclusively only for CEOs with a certain budget, then you just have, you tap into a different budget.Flo [00:25:58]: That's true. The problem with EA is like, the CEO has no willingness to actually tinker and play with the platform.Swyx [00:26:05]: Maybe Andrew's doing that. Like a lot of your biggest advocates are CEOs, right?Flo [00:26:09]: A solopreneur, you know, small business owners, I think Andrew is an exception. Yeah. Yeah, yeah, he is.Swyx [00:26:14]: He's an exception in many ways. Yep.Alessio [00:26:16]: Just before we wrap on the use cases, is Rick rolling your customers? Like a officially supported use case or maybe tell that story?Flo [00:26:24]: It's one of the main jobs to be done, really. Yeah, we woke up recently, so we have a Lindy obviously doing our customer support and we do check after the Lindy. And so we caught this email exchange where someone was asking Lindy for video tutorials. And at the time, actually, we did not have video tutorials. We do now on the Lindy Academy. And Lindy responded to the email. It's like, oh, absolutely, here's a link. And we were like, what? Like, what kind of link did you send? And so we clicked on the link and it was a recall. We actually reacted fast enough that the customer had not yet opened the email. And so we reacted immediately. Like, oh, hey, actually, sorry, this is the right link. And so the customer never reacted to the first link. And so, yeah, I tweeted about that. It went surprisingly viral. And I checked afterwards in the logs. We did like a database query and we found, I think, like three or four other instances of it having happened before.Swyx [00:27:12]: That's surprisingly low.Flo [00:27:13]: It is low. 
And we fixed it across the board by just adding a line to the system prompt that's like, hey, don't recall people, please don't recall.Swyx [00:27:21]: Yeah, yeah, yeah. I mean, so, you know, you can explain it retroactively, right? Like, that YouTube slug has been pasted in so many different corpuses that obviously it learned to hallucinate that.Alessio [00:27:31]: And it pretended to be so many things. That's the thing.Swyx [00:27:34]: I wouldn't be surprised if that takes one token. Like, there's this one slug in the tokenizer and it's just one token.Flo [00:27:41]: That's the idea of a YouTube video.Swyx [00:27:43]: Because it's used so much, right? And you have to basically get it exactly correct. It's probably not. That's a long speech.Flo [00:27:52]: It would have been so good.Alessio [00:27:55]: So this is just a jump maybe into evals from here. How could you possibly come up for an eval that says, make sure my AI does not recall my customer? I feel like when people are writing evals, that's not something that they come up with. So how do you think about evals when it's such like an open-ended problem space?Flo [00:28:12]: Yeah, it is tough. We built quite a bit of infrastructure for us to create evals in one click from any conversation history. So we can point to a conversation and we can be like, in one click we can turn it into effectively a unit test. It's like, this is a good conversation. This is how you're supposed to handle things like this. Or if it's a negative example, then we modify a little bit the conversation after generating the eval. So it's very easy for us to spin up this kind of eval.Alessio [00:28:36]: Do you use an off-the-shelf tool which is like Brain Trust on the podcast? Or did you just build your own?Flo [00:28:41]: We unfortunately built our own. We're most likely going to switch to Brain Trust. Well, when we built it, there was nothing. Like there was no eval tool, frankly. I mean, we started this project at the end of 2022. It was like, it was very, very, very early. I wouldn't recommend it to build your own eval tool. There's better solutions out there and our eval tool breaks all the time and it's a nightmare to maintain. And that's not something we want to be spending our time on.Swyx [00:29:04]: I was going to ask that basically because I think my first conversations with you about Lindy was that you had a strong opinion that everyone should build their own tools. And you were very proud of your evals. You're kind of showing off to me like how many evals you were running, right?Flo [00:29:16]: Yeah, I think that was before all of these tools came around. I think the ecosystem has matured a fair bit.Swyx [00:29:21]: What is one thing that Brain Trust has nailed that you always struggled to do?Flo [00:29:25]: We're not using them yet, so I couldn't tell. But from what I've gathered from the conversations I've had, like they're doing what we do with our eval tool, but better.Swyx [00:29:33]: And like they do it, but also like 60 other companies do it, right? So I don't know how to shop apart from brand. Word of mouth.Flo [00:29:41]: Same here.Swyx [00:29:42]: Yeah, like evals or Lindys, there's two kinds of evals, right? Like in some way, you don't have to eval your system as much because you've constrained the language model so much. And you can rely on open AI to guarantee that the structured outputs are going to be good, right? 
We had Michelle sit where you sit and she explained exactly how they do constraint grammar sampling and all that good stuff. So actually, I think it's more important for your customers to eval their Lindys than you evaling your Lindy platform because you just built the platform. You don't actually need to eval that much.Flo [00:30:14]: Yeah. In an ideal world, our customers don't need to care about this. And I think the bar is not like, look, it needs to be at 100%. I think the bar is it needs to be better than a human. And for most use cases we serve today, it is better than a human, especially if you put it on Rails.Swyx [00:30:30]: Is there a limiting factor of Lindy at the business? Like, is it adding new connectors? Is it adding new node types? Like how do you prioritize what is the most impactful to your company?Flo [00:30:41]: Yeah. The raw capabilities for sure are a big limit. It is actually shocking the extent to which the model is no longer the limit. It was the limit a year ago. It was too expensive. The context window was too small. It's kind of insane that we started building this when the context windows were like 4,000 tokens. Like today, our system prompt is more than 4,000 tokens. So yeah, the model is actually very much not a limit anymore. It almost gives me pause because I'm like, I want the model to be a limit. And so no, the integrations are ones, the core capabilities are ones. So for example, we are investing in a system that's basically, I call it like the, it's a J hack. Give me these names, like the poor man's RLHF. So you can turn on a toggle on any step of your Lindy workflow to be like, ask me for confirmation before you actually execute this step. So it's like, hey, I receive an email, you send a reply, ask me for confirmation before actually sending it. And so today you see the email that's about to get sent and you can either approve, deny, or change it and then approve. And we are making it so that when you make a change, we are then saving this change that you're making or embedding it in the vector database. And then we are retrieving these examples for future tasks and injecting them into the context window. So that's the kind of capability that makes a huge difference for users. That's the bottleneck today. It's really like good old engineering and product work.Swyx [00:31:52]: I assume you're hiring. We'll do a call for hiring at the end.Alessio [00:31:54]: Any other comments on the model side? When did you start feeling like the model was not a bottleneck anymore? Was it 4.0? Was it 3.5? 3.5.Flo [00:32:04]: 3.5 Sonnet, definitely. I think 4.0 is overhyped, frankly. We don't use 4.0. I don't think it's good for agentic behavior. Yeah, 3.5 Sonnet is when I started feeling that. And then with prompt caching with 3.5 Sonnet, like that fills the cost, cut the cost again. Just cut it in half. Yeah.Swyx [00:32:21]: Your prompts are... Some of the problems with agentic uses is that your prompts are kind of dynamic, right? Like from caching to work, you need the front prefix portion to be stable.Flo [00:32:32]: Yes, but we have this append-only ledger paradigm. So every node keeps appending to that ledger and every filled node inherits all the context built up by all the previous nodes. And so we can just decide, like, hey, every X thousand nodes, we trigger prompt caching again.Swyx [00:32:47]: Oh, so you do it like programmatically, not all the time.Flo [00:32:50]: No, sorry. Anthropic manages that for us. 
But basically, it's like, because we keep appending to the prompt, the prompt caching works pretty well.Alessio [00:32:57]: We have this small podcaster tool that I built for the podcast and I rewrote all of our prompts because I noticed, you know, I was inputting stuff early on. I wonder how much more money OpenAI and Anthropic are making just because people don't rewrite their prompts to be like static at the top and like dynamic at the bottom.Flo [00:33:13]: I think that's the remarkable thing about the moment we're in right now. It's insane that these companies are routinely cutting their costs by two, four, five. Like, they basically just apply constraints. They want people to take advantage of these innovations. Very good.Swyx [00:33:25]: Do you have any other competitive commentary? Commentary? Dust, WordWare, Gumloop, Zapier? If not, we can move on.Flo [00:33:31]: No comment.Alessio [00:33:32]: I think the market is,Flo [00:33:33]: look, I mean, AGI is coming. All right, that's what I'm talking about.Swyx [00:33:38]: I think you're helping. Like, you're paving the road to AGI.Flo [00:33:41]: I'm playing my small role. I'm adding my small brick to this giant, giant, giant castle. Yeah, look, when it's here, we are going to, this entire category of software is going to create, it's going to sound like an exaggeration, but it is a fact it is going to create trillions of dollars of value in a few years, right? It's going to, for the first time, we're actually having software directly replace human labor. I see it every day in sales calls. It's like, Lindy is today replacing, like, we talk to even small teams. It's like, oh, like, stop, this is a 12-people team here. I guess we'll set up this Lindy for one or two days, and then we'll have to decide what to do with this 12-people team. And so, yeah. To me, there's this immense uncapped market opportunity. It's just such a huge ocean, and there's like three sharks in the ocean. I'm focused on the ocean more than on the sharks.Swyx [00:34:25]: So we're moving on to hot topics, like, kind of broadening out from Lindy, but obviously informed by Lindy. What are the high-order bits of good agent design?Flo [00:34:31]: The model, the model, the model, the model. I think people fail to truly, and me included, they fail to truly internalize the bitter lesson. So for the listeners out there who don't know about it, it's basically like, you just scale the model. Like, GPUs go brr, it's all that matters. I think it also holds for the cognitive architecture. I used to be very cognitive architecture-pilled, and I was like, ah, you need a critic, and you need a generator, and all this, and then it's just like, GPUs go brr, like, just like let the model do its job. I think we're seeing it a little bit right now with O1. I'm seeing some tweets that say that the new 3.5 Sonnet is as good as O1, but with none of all the crazy...Swyx [00:35:09]: It beats O1 on some measures. On some reasoning tasks. On AIME, it's still a lot lower. Like, it's like 14 on AIME versus O1, it's like 83.Flo [00:35:17]: Got it. Right. But even O1 is still the model. Yeah.Swyx [00:35:22]: Like, there's no cognitive architecture on top of it.Flo [00:35:23]: You can just wait for O1 to get better.Alessio [00:35:25]: And so, as a founder, how do you think about that, right? Because now, knowing this, wouldn't you just wait to start Lindy? You know, you start Lindy, it's like 4K context, the models are not that good.
It's like, but you're still kind of like going along and building and just like waiting for the models to get better. How do you today decide, again, what to build next, knowing that, hey, the models are going to get better, so maybe we just shouldn't focus on improving our prompt design and all that stuff and just build the connectors instead or whatever? Yeah.Flo [00:35:51]: I mean, that's exactly what we do. Like, all day, we always ask ourselves, oh, when we have a feature idea or a feature request, we ask ourselves, like, is this the kind of thing that just gets better while we sleep because models get better? I'm reminded, again, when we started this in 2022, we spent a lot of time, because we had to, on context pruning because 4,000 tokens is really nothing. You really can't do anything with 4,000 tokens. All that work was throwaway work. Like, now it's like it was for nothing, right? Now we just assume that infinite context windows are going to be here in a year or something, a year and a half, and infinitely cheap as well, and dynamic compute is going to be here. Like, we just assume all of these things are going to happen, and so we really focus, our job to be done in the industry is to provide the input and output to the model. I really compare it all the time to the PC and the CPU, right? Apple is busy all day. They're not like a CPU wrapper. They have a lot to build, but they don't, well, now actually they do build the CPU as well, but leaving that aside, they're busy building a laptop. It's just a lot of work to build these things. It's interesting because, like,Swyx [00:36:45]: for example, another person that we're close to, Mihaly from Repl.it, he often says that the biggest jump for him was having a multi-agent approach, like the critique thing that you just said that you don't need, and I wonder when, in what situations you do need that and what situations you don't. Obviously, the simple answer is for coding, it helps, and you're not coding, except for, are you still generating code? In Lindy? Yeah.Flo [00:37:09]: No, we do. Oh, right. No, no, no, the cognitive architecture changed. We don't, yeah.Swyx [00:37:13]: Yeah, okay. For you, you're one shot, and you chain tools together, and that's it. And if the user really wantsFlo [00:37:18]: to have this kind of critique thing, you can also edit the prompt, you're welcome to. I have some of my Lindys, I've told them, like, hey, be careful, think step by step about what you're about to do, but that gives you a little bump for some use cases, but, yeah.Alessio [00:37:30]: What about unexpected model releases? So, Anthropic released computer use today. Yeah. I don't know if many people were expecting computer use to come out today. Do these things make you rethink how to design, like, your roadmap and things like that, or are you just like, hey, look, whatever, that's just, like, a small thing in their, like, AGI pursuit, that, like, maybe they're not even going to support, and, like, it's still better for us to build our own integrations into systems and things like that? Because maybe people will say, hey, look, why am I building all these API integrationsFlo [00:38:02]: when I can just do computer use and never go to the product? Yeah. No, I mean, we did take into account computer use. We were talking about this a year ago or something, like, we've been talking about it as part of our roadmap.
It's been clear to us that it was coming. My philosophy about it is anything that can be done with an API must be done by an API or should be done by an API for a very long time. I think it is dangerous to be overly cavalier about improvements of model capabilities. I'm reminded of iOS versus Android. Android was built on the JVM. There was a garbage collector, and I can only assume that the conversation that went down in the engineering meeting room was, oh, who cares about the garbage collector? Anyway, Moore's law is here, and so that's all going to go to zero eventually. Sure, but in the meantime, you are operating on a 400 MHz CPU. It was like the first CPU on the iPhone 1, and it's really slow, and the garbage collector is introducing a tremendous overhead on top of that, especially a memory overhead. For the longest time, and it's really only been recently that Android caught up to iOS in terms of how smooth the interactions were, but for the longest time, Android phones were significantly slowerSwyx [00:39:07]: and laggierFlo [00:39:08]: and just not feeling as good as iOS devices. Look, when you're talking about orders of magnitude of differences in terms of performance and reliability, which is what we are talking about when we're talking about API use versus computer use, then you can't ignore that, right? And so I think we're going to be in an API use world for a while.Swyx [00:39:27]: O1 doesn't have API use today. It will have it at some point, and it's on the roadmap. There is a future in which OpenAI goes much harder after your business, your market, than it is today. Like, ChatGPT, it's its own business. All they need to do is add tools to ChatGPT, and now they're suddenly competing with you. And by the way, they have a GPT store where a bunch of people have already configured their tools to fit with them. Is that a concern?Flo [00:39:56]: I think even the GPT store, in a way, like the way they architect it, for example, their plug-in system, is actually great for us because we can also use the plug-ins. It's very open. Now, again, I think it's going to be such a huge market. I think there's going to be a lot of different jobs to be done. I know they have a huge enterprise offering and stuff, but today, ChatGPT is a consumer app. And so, the sort of flow detail I showed you, this sort of workflow, this sort of use cases that we're going after, which is like, we're doing a lot of lead generation and lead outreach and all of that stuff. That's not something like meeting recording, like, Lindy today right now joins your Zoom meetings and takes notes, all of that stuff.Swyx [00:40:34]: I don't see that so farFlo [00:40:35]: on the OpenAI roadmap.Swyx [00:40:36]: Yeah, but they do have an enterprise team that we talk to. You're hiring GMs?Flo [00:40:42]: We did.Swyx [00:40:43]: It's a fascinating way to build a business, right? Like, what should you, as CEO, be in charge of? And what should you basically hireFlo [00:40:52]: a mini CEO to do? Yeah, that's a good question. I think that's also something we're figuring out. The GM thing was inspired from my days at Uber, where we hired one GM per city or per major geo area. We had like all GMs, regional GMs and so forth. And yeah, Lindy is so horizontal that we thought it made sense to hire GMs to own each vertical and the go-to-market of the vertical and the customization of the Lindy templates for these verticals and so forth. What should I own as a CEO?
I mean, the canonical reply here is always going to be, you know, you own the fundraising, you own the culture, you own the... What's the rest of the canonical reply? The culture, the fundraising.Swyx [00:41:29]: I don't know,Flo [00:41:30]: products. Even that, eventually, you do have to hand out. Yes, the vision, the culture, and the foundation. Well, then you've done your job as a CEO. In practice, obviously, yeah, I mean, all day, I do a lot of product work still and I want to keep doing product work for as long as possible.Swyx [00:41:48]: Obviously, like, you're recruiting and managing the team. Yeah.Flo [00:41:52]: That one feels like the most automatable part of the job, the recruiting stuff.Swyx [00:41:56]: Well, yeah. You saw myFlo [00:41:59]: design your recruiter here. The relationship between Factorio and building Lindy? We actually very often talk about how the business of the future is like a game of Factorio. Yeah. So, in the end, it's like Slack and you've got like 5,000 Lindys in the sidebar and your job is to somehow manage your 5,000 Lindys. And it's going to be very similar to company building because you're going to look for like the highest leverage way to understand what's going on in your AI company and understand what levers do you have to make impact in that company. So, I think it's going to be very similar to like a human company except it's going to go infinitely faster. Today, in a human company, you could have a meeting with your team and you're like, oh, I'm going to build a facility and, you know, now it's like, okay,Swyx [00:42:40]: boom, I'm going to spin up 50 designers. Yeah. Like, actually, it's more important that you can clone an existing designer that you know works because the hiring process, you cannot clone someone because every new person you bring in is going to have their own tweaksFlo [00:42:54]: and you don't want that. Yeah.Swyx [00:42:56]: That's true. You want an army of mindless dronesFlo [00:42:59]: that all work the same way.Swyx [00:43:00]: The reason I bring this, bring Factorio up as well is one, Factorio: Space Age just came out. Apparently, a whole bunch of people stopped working. I tried out Factorio. I never really got that much into it. But the other thing was, you had a tweet recently about how the sort of intentional top-down design was not as effective as just build. Yeah. Just ship.Flo [00:43:21]: I think people read a little bit too much into that tweet. It went weirdly viral. I was like, I did not intend it as a giant statement online.Swyx [00:43:28]: I mean, you notice you have a pattern with this, right? Like, you've done this for eight years now.Flo [00:43:33]: You should know. I legit was just sharing an interesting story about the Factorio game I had. And everybody was like, oh my God, so deep. I guess this explains everything about life and companies. There is something to be said, certainly, about focusing on the constraint. And I think it is Patrick Collison who said, people underestimate the extent to which moonshots are just one pragmatic step taken after the other. And I think as long as you have some inductive bias about, like, some loose idea about where you want to go, I think it makes sense to follow a sort of greedy search along that path. I think planning and organizing is important. And having order is important.Swyx [00:44:05]: I'm wrestling with that. There's two ways I encountered it recently. One with Lindy.
When I tried out one of your automation templates and one of them was quite big and I just didn't understand it, right? So, like, it was not as useful to me as a small one that I can just plug in and see all of. And then the other one was me using Cursor. I was very excited about O1 and I just up frontFlo [00:44:27]: stuffed everythingSwyx [00:44:28]: I wanted to do into my prompt and expected O1 to do everything. And it got itself into a huge jumbled mess and it was stuck. It was really... There was no amount... I wasted, like, two hours on just, like, trying to get out of that hole. So I threw away the code base, started small, switched to Claude Sonnet, and built up something working and just added to it over time and it just worked. And to me, that was the Factorio sentiment, right? Maybe I'm one of those fanboys that's just, like, obsessing over the depth of something that you just randomly tweeted out. But I think it's true for company building, for Lindy building, for coding.Flo [00:45:02]: I don't know. I think it's fair and I think, like, you and I talked about there's the Tuft & Metal principle and there's this other... Yes, I love that. There's the... I forgot the name of this other blog post but it's basically about this book Seeing Like a State that talks about the need for legibility and people who optimize the system for its legibility and anytime you make a system... So legible is basically more understandable. Anytime you make a system more understandable from the top down, it performs less well from the bottom up. And it's fine but you should at least make this trade-off with your eyes wide open. You should know, I am sacrificing performance for understandability, for legibility. And in this case, for you, it makes sense. It's like you are actually optimizing for legibility. You do want to understand your code base but in some other cases it may not make sense. Sometimes it's better to leave the system alone and let it be its glorious, chaotic, organic self and just trust that it's going to perform well even though you don't understand it completely.Swyx [00:45:55]: It does remind me of a common managerial issue or dilemma which you experienced in the small scale of Lindy where, you know, do you want to organize your company by functional sections or by products or, you know, whatever the opposite of functional is. And you tried it one way and it was more legible to you as CEO but actually it stopped working at the small level. Yeah.Flo [00:46:17]: I mean, one very small example, again, at a small scale is we used to have everything on Notion. And for me, as founder, it was awesome because everything was there. The roadmap was there. The tasks were there. The postmortems were there. And so, the postmortem was linkedSwyx [00:46:31]: to its task.Flo [00:46:32]: It was optimized for you. Exactly. And so, I had this, like, one pane of glass and everything was on Notion. And then the team, one day,Swyx [00:46:39]: came to me with pitchforksFlo [00:46:40]: and they really wanted to implement Linear. And I had to bite my fist so hard. I was like, fine, do it. Implement Linear. Because I was like, at the end of the day, the team needs to be able to self-organize and pick their own tools.Alessio [00:46:51]: Yeah. But it did make the company slightly less legible for me. Another big change you had was going away from remote work. Every other month, the discussion comes up again. What was that discussion like? How did your feelings change?
Was there kind of like a threshold of employees and team size where you felt like, okay, maybe that worked. Now it doesn't work anymore. And how are you thinking about the futureFlo [00:47:12]: as you scale the team? Yeah. So, for context, I used to have a business called TeamFlow. The business was about building a virtual office for remote teams. And so, being remote was not merely something we did. It was, I was banging the remote drum super hard and helping companies to go remote. And so, frankly, in a way, it's a bit embarrassing for me to do a 180 like that. But I guess, when the facts changed, I changed my mind. What happened? Well, I think at first, like everyone else, we went remote by necessity. It was like COVID and you've got to go remote. And on paper, the gains of remote are enormous. In particular, from a founder's standpoint, being able to hire from anywhere is huge. Saving on rent is huge. Saving on commute is huge for everyone and so forth. But then, look, we're all here. It's like, it is really making it much harder to work together. And I spent three years of my youth trying to build a solution for this. And my conclusion is, at least we couldn't figure it out and no one else could. Zoom didn't figure it out. We had like a bunch of competitors. Like, Gather Town was one of the bigger ones. We had dozens and dozens of competitors. No one figured it out. I don't know that software can actually solve this problem. The reality of it is, everyone just wants to get off the darn Zoom call. And it's not a good feeling to be in your home office if you're even going to have a home office all day. It's harder to build culture. It's harder to get in sync. I think software is peculiar because it's like an iceberg. It's like the vast majority of it is submerged underwater. And so, the quality of the software that you ship is a function of the alignment of your mental models about what is below that waterline. Can you actually get in sync about what it is exactly fundamentally that we're building? What is the soul of our product? And it is so much harder to get in sync about that when you're remote. And then you waste time in a thousand ways because people are offline and you can't get a hold of them or you can't share your screen. It's just like you feel like you're walking in molasses all day. And eventually, I was like, okay, this is it. We're not going to do this anymore.Swyx [00:49:03]: Yeah. I think that is the current builder San Francisco consensus here. Yeah. But I still have a big... One of my big heroes as a CEO is Sid Sijbrandij from GitLab.Flo [00:49:14]: Mm-hmm.Swyx [00:49:15]: Matt MullenwegFlo [00:49:16]: used to be a hero.Swyx [00:49:17]: But these people run thousand-person remote businesses. The main idea is that at some company

Thinking Elixir Podcast
228: From Surveys to Cheat Sheets

Thinking Elixir Podcast

Play Episode Listen Later Nov 12, 2024 21:07


News includes Theo releasing his ElixirConf presentation video on his channel, the launch of the Gleam and Curiosum Elixir developer surveys, a great tip of the Phoenix phx- cheatsheet documentation, and the Elixir LangChain project's improvements over the OG Python counterpart. Plus, details on upcoming events like CodeBeam Lite in New York and AlchemyConf in Portugal, a handy communication tip for teams—and more! Show Notes online - http://podcast.thinkingelixir.com/228 (http://podcast.thinkingelixir.com/228) Elixir Community News https://www.youtube.com/watch?v=7j2SaHAWTDM (https://www.youtube.com/watch?v=7j2SaHAWTDM?utm_source=thinkingelixir&utm_medium=shownotes) – Theo, known for his TypeScript and JS content, released his ElixirConf presentation video to his own channel. https://x.com/theo/status/1852521365880156323 (https://x.com/theo/status/1852521365880156323?utm_source=thinkingelixir&utm_medium=shownotes) – Theo announced the release of his ElixirConf presentation video on his social media. https://developer-survey.gleam.run/ (https://developer-survey.gleam.run/?utm_source=thinkingelixir&utm_medium=shownotes) – The Gleam developer survey is available for participation. https://curiosum.com/sl/b87vc3qn (https://curiosum.com/sl/b87vc3qn?utm_source=thinkingelixir&utm_medium=shownotes) – Participate in Curiosum's Elixir survey to contribute your feedback. https://curiosum.com/surveys/elixir-2023 (https://curiosum.com/surveys/elixir-2023?utm_source=thinkingelixir&utm_medium=shownotes) – View the results from Curiosum's 2023 Elixir survey with ~400 respondents. https://x.com/davydog187/status/1852408187816947982 (https://x.com/davydog187/status/1852408187816947982?utm_source=thinkingelixir&utm_medium=shownotes) – The value of the potentially little known phx- cheatsheet in the Phoenix docs. https://hexdocs.pm/phoenixliveview/html-attrs.html (https://hexdocs.pm/phoenix_live_view/html-attrs.html?utm_source=thinkingelixir&utm_medium=shownotes) – Detailed documentation on special HTML attributes used in Phoenix LiveView templates. https://x.com/davydog187/status/1853762123647582697 (https://x.com/davydog187/status/1853762123647582697?utm_source=thinkingelixir&utm_medium=shownotes) – Watch the recording of Dave Lucia's live-coding session on building an Igniter installer for OpenTelemetry. https://www.youtube.com/live/SOO4RcolYSI (https://www.youtube.com/live/SOO4RcolYSI?utm_source=thinkingelixir&utm_medium=shownotes) – Part 1 of Dave Lucia's live-coding session on YouTube; part 2 is upcoming. https://github.com/brainlid/langchain (https://github.com/brainlid/langchain?utm_source=thinkingelixir&utm_medium=shownotes) – Introduction to the Elixir LangChain project for integrating Elixir projects with LLMs. https://www.reddit.com/r/elixir/comments/1ggfcbt/iswitchedfromthepythonversionof_langchain/ (https://www.reddit.com/r/elixir/comments/1ggfcbt/i_switched_from_the_python_version_of_langchain/?utm_source=thinkingelixir&utm_medium=shownotes) – A Reddit user shares their positive experience with the Elixir LangChain library compared to the Python version. Mark Ericksen shares his availability and interest in helping companies leveraging Elixir and LLMs. https://codebeamnyc.com/ (https://codebeamnyc.com/?utm_source=thinkingelixir&utm_medium=shownotes) – CodeBeam Lite conference in NY on Nov 15th, featuring a single-track agenda and notable speakers. 
https://podcast.thinkingelixir.com/106 (https://podcast.thinkingelixir.com/106?utm_source=thinkingelixir&utm_medium=shownotes) – Previous podcast episode discussing Sam Aaron's work with Elixir and the BEAM. https://alchemyconf.com/ (https://alchemyconf.com/?utm_source=thinkingelixir&utm_medium=shownotes) – AlchemyConf website is live, with details for the event from March 31 - April 3 in Braga, Portugal. https://dontasktoask.com/ (https://dontasktoask.com/?utm_source=thinkingelixir&utm_medium=shownotes) – A communication tip for teams: don't ask to ask, just ask. Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Find us online - Message the show - @ThinkingElixir (https://x.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen - @brainlid (https://x.com/brainlid) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel - @bernheisel (https://x.com/bernheisel) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

We are recording our next big recap episode and taking questions! Submit questions and messages on Speakpipe here for a chance to appear on the show!Also subscribe to our calendar for our Singapore, NeurIPS, and all upcoming meetups!In our first ever episode with Logan Kilpatrick we called out the two hottest LLM frameworks at the time: LangChain and Dust. We've had Harrison from LangChain on twice (as a guest and as a co-host), and we've now finally come full circle as Stanislas from Dust joined us in the studio.After stints at Oracle and Stripe, Stan had joined OpenAI to work on mathematical reasoning capabilities. He describes his time at OpenAI as "the PhD I always wanted to do" while acknowledging the challenges of research work: "You're digging into a field all day long for weeks and weeks, and you find something, you get super excited for 12 seconds. And at the 13 seconds, you're like, 'oh, yeah, that was obvious.' And you go back to digging." This experience, combined with early access to GPT-4's capabilities, shaped his decision to start Dust: "If we believe in AGI and if we believe the timelines might not be too long, it's actually the last train leaving the station to start a company. After that, it's going to be computers all the way down."The History of DustDust's journey can be broken down into three phases:* Developer Framework (2022): Initially positioned as a competitor to LangChain, Dust started as a developer tooling platform. While both were open source, their approaches differed – LangChain focused on broad community adoption and integration as a pure developer experience, while Dust emphasized UI-driven development and better observability that wasn't just `print` statements.* Browser Extension (Early 2023): The company pivoted to building XP1, a browser extension that could interact with web content. This experiment helped validate user interaction patterns with AI, even while using less capable models than GPT-4.* Enterprise Platform (Current): Today, Dust has evolved into an infrastructure platform for deploying AI agents within companies, with impressive metrics like 88% daily active users in some deployments.The Case for Being HorizontalThe big discussion for early stage companies today is whether or not to be horizontal or vertical. Since models are so good at general tasks, a lot of companies are building vertical products that take care of a workflow end-to-end in order to offer more value and becoming more of “Services as Software”. Dust on the other hand is a platform for the users to build their own experiences, which has had a few advantages:* Maximum Penetration: Dust reports 60-70% weekly active users across entire companies, demonstrating the potential reach of horizontal solutions rather than selling into a single team.* Emergent Use Cases: By allowing non-technical users to create agents, Dust enables use cases to emerge organically from actual business needs rather than prescribed solutions.* Infrastructure Value: The platform approach creates lasting value through maintained integrations and connections, similar to how Stripe's value lies in maintaining payment infrastructure. Rather than relying on third-party integration providers, Dust maintains its own connections to ensure proper handling of different data types and structures.The Vertical ChallengeHowever, this approach comes with trade-offs:* Harder Go-to-Market: As Stan talked about: "We spike at penetration... but it makes our go-to-market much harder. 
Vertical solutions have a go-to-market that is much easier because they're like, 'oh, I'm going to solve the lawyer stuff.'"* Complex Infrastructure: Building a horizontal platform requires maintaining numerous integrations and handling diverse data types appropriately – from structured Salesforce data to unstructured Notion pages. As you scale integrations, the cost of maintaining them also scales. * Product Surface Complexity: Creating an interface that's both powerful and accessible to non-technical users requires careful design decisions, down to avoiding technical terms like "system prompt" in favor of "instructions." The Future of AI PlatformsStan initially predicted we'd see the first billion-dollar single-person company in 2023 (a prediction later echoed by Sam Altman), but he's now more focused on a different milestone: billion-dollar companies with engineering teams of just 20 people, enabled by AI assistance.This vision aligns with Dust's horizontal platform approach – building the infrastructure that allows small teams to achieve outsized impact through AI augmentation. Rather than replacing entire job functions (the vertical approach), they're betting on augmenting existing workflows across organizations.Full YouTube EpisodeChapters* 00:00:00 Introductions* 00:04:33 Joining OpenAI from Paris* 00:09:54 Research evolution and compute allocation at OpenAI* 00:13:12 Working with Ilya Sutskever and OpenAI's vision* 00:15:51 Leaving OpenAI to start Dust* 00:18:15 Early focus on browser extension and WebGPT-like functionality* 00:20:20 Dust as the infrastructure for agents* 00:24:03 Challenges of building with early AI models* 00:28:17 LLMs and Workflow Automation* 00:35:28 Building dependency graphs of agents* 00:37:34 Simulating API endpoints* 00:40:41 State of AI models* 00:43:19 Running evals* 00:46:36 Challenges in building AI agents infra* 00:49:21 Buy vs. build decisions for infrastructure components* 00:51:02 Future of SaaS and AI's Impact on Software* 00:53:07 The single employee $1B company race* 00:56:32 Horizontal vs. vertical approaches to AI agentsTranscriptAlessio [00:00:00]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.Swyx [00:00:11]: Hey, and today we're in a studio with Stanislas, welcome.Stan [00:00:14]: Thank you very much for having me.Swyx [00:00:16]: Visiting from Paris.Stan [00:00:17]: Paris.Swyx [00:00:18]: And you have had a very distinguished career. It's very hard to summarize, but you went to college in both Ecopolytechnique and Stanford, and then you worked in a number of places, Oracle, Totems, Stripe, and then OpenAI pre-ChatGPT. We'll talk, we'll spend a little bit of time about that. About two years ago, you left OpenAI to start Dust. I think you were one of the first OpenAI alum founders.Stan [00:00:40]: Yeah, I think it was about at the same time as the Adept guys, so that first wave.Swyx [00:00:46]: Yeah, and people really loved our David episode. We love a few sort of OpenAI stories, you know, for back in the day, like we're talking about pre-recording. Probably the statute of limitations on some of those stories has expired, so you can talk a little bit more freely without them coming after you. But maybe we'll just talk about, like, what was your journey into AI? You know, you were at Stripe for almost five years, there are a lot of Stripe alums going into OpenAI. 
I think the Stripe culture has come into OpenAI quite a bit.Stan [00:01:11]: Yeah, so I think the buses of Stripe people really started flowing in, I guess, after ChatGPT. But, yeah, my journey into AI is a... I mean, Greg Brockman. Yeah, yeah. From Greg, of course. And Daniela, actually, back in the days, Daniela Amodei.Swyx [00:01:27]: Yes, she was COO, I mean, she is COO, yeah. She had a pretty high job at OpenAI at the time, yeah, for sure.Stan [00:01:34]: My journey started like anybody else's: you're fascinated with computer science and you want to make computers think, it's awesome, but it doesn't work. I mean, it was a long time ago, I was like maybe 16, so it was 25 years ago. Then the first big exposure to AI would be at Stanford, and I'm going to, like, disclose how old I am, because at the time it was a class taught by Andrew Ng, and there was no deep learning. It was hand-crafted features for vision and the A* algorithm. So it was fun. But it was the early days of deep learning. A few years after, I think, there was that first project at Google, you know, the cat face or the human face trained from many images. I hesitated doing a PhD, more in systems, and eventually decided to get a job. Went to Oracle, started a company, made a gazillion mistakes, got acquired by Stripe, worked with Greg Brockman there. And toward the end of Stripe, I started getting interested in AI again, felt like it was the time, you had the Atari games, you had the self-driving craziness at the time. And I started exploring projects, it felt like the Atari games were incredible, but they were still games. And I was looking into exploring projects that would have an impact on the world. And so I decided to explore three things, self-driving cars, cybersecurity and AI, and math and AI. That's me listing them in decreasing order of impact on the world, I guess.Swyx [00:03:01]: Discovering new math would be very foundational.Stan [00:03:03]: It is extremely foundational, but it's not as direct as driving people around.Swyx [00:03:07]: Sorry, you're doing this at Stripe, you're like thinking about your next move.Stan [00:03:09]: No, it was at Stripe, kind of a bit of time where I started exploring. I did a bunch of work with friends on trying to get RC cars to drive autonomously. Almost started a company in France or Europe about self-driving trucks. We decided to not go for it because it was probably very operational. And I think the idea of the company, of the team wasn't there. And also I realized that if I wake up a day and because of a bug I wrote, I killed a family, it would be a bad experience. And so I just decided like, no, that's just too crazy. And then I explored cybersecurity with a friend. We were trying to apply transformers to fuzzing. So in fuzzing, you have kind of an algorithm that goes really fast and tries to mutate the inputs of a library to find bugs. And we tried to apply a transformer to that and do reinforcement learning with the signal of how much you propagate within the binary. Didn't work at all because the transformers are so slow compared to evolutionary algorithms that it kind of didn't work. Then I got interested in math and AI and started working on SAT solving with AI. And at the same time, OpenAI was kind of starting the reasoning team that were tackling that project as well. I was in touch with Greg and eventually got in touch with Ilya and finally found my way to OpenAI. I don't know how much you want to dig into that.
The way to find your way to OpenAI when you're in Paris was kind of an interesting adventure as well.Swyx [00:04:33]: Please. And I want to note, this was a two-month journey. You did all this in two months.Stan [00:04:38]: The search.Swyx [00:04:40]: Your search for your next thing, because you left in July 2019 and then you joined OpenAI in September.Stan [00:04:45]: I'm going to be ashamed to say that.Swyx [00:04:47]: You were searching before. I was searching before.Stan [00:04:49]: I mean, it's normal. No, the truth is that I moved back to Paris through Stripe and I just felt the hardship of being remote from your team nine hours away. And so it kind of freed a bit of time for me to start the exploration before. Sorry, Patrick. Sorry, John.Swyx [00:05:05]: Hopefully they're listening. So you joined OpenAI from Paris and from like, obviously you had worked with Greg, but notStan [00:05:13]: anyone else. No. Yeah. So I had worked with Greg, but not Ilya, but I had started chatting with Ilya and Ilya was kind of excited because he knew that I was a good engineer through Greg, I presume, but I was not a trained researcher, didn't do a PhD, never did research. And I started chatting and he was excited all the way to the point where he was like, hey, come pass interviews, it's going to be fun. I think he didn't care where I was, he just wanted to try working together. So I go to SF, go through the interview process, get an offer. And so I get Bob McGrew on the phone for the first time, he's like, hey, Stan, it's awesome. You've got an offer. When are you coming to SF? I'm like, hey, it's awesome. I'm not coming to SF. I'm based in Paris and we just moved. He was like, hey, it's awesome. Well, you don't have an offer anymore. Oh, my God. No, it wasn't as hard as that. But that's basically the idea. And it took like maybe a couple more rounds of chatting and they eventually decided to try a contractor setup. And that's how I kind of started working at OpenAI, officially as a contractor, but in practice really felt like being an employee.Swyx [00:06:14]: What did you work on?Stan [00:06:15]: So it was solely focused on math and AI. And in particular in the application, so the study of the large language models' mathematical reasoning capabilities, and in particular in the context of formal mathematics. The motivation was simple, transformers are very creative, but yet they make mistakes. Formal math systems have the ability to verify a proof, but the tactics they can use to solve problems are very mechanical, so you miss the creativity. And so the idea was to try to explore both together. You would get the creativity of the LLMs and the kind of verification capabilities of the formal system. A formal system, just to give a little bit of context, is a system in which a proof is a program and the formal system is a type system, a type system that is so evolved that you can verify the program. If it type checks, it means that the program is correct.Swyx [00:07:06]: Is the verification much faster than actually executing the program?Stan [00:07:12]: Verification is instantaneous, basically. So the truth is that what you write involves tactics that may involve computation to search for solutions. So it's not instantaneous. You do have to do the computation to expand the tactics into the actual proof.
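To make the "a proof is a program" idea concrete, here is a tiny illustrative example in Lean 4 (ours, not from the episode): the statement is the type, the proof is the term, and acceptance by the type checker is the verification.

```lean
-- A proof is a program: the type of each term below is the statement it proves.
-- If this file type-checks, the kernel has verified both proofs.
theorem two_plus_two : 2 + 2 = 4 := rfl

theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The tactics Stan mentions are just programs that help construct such terms; the kernel re-checks whatever they produce.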
The verification of the proof at the very low level is instantaneous.Swyx [00:07:32]: How quickly do you run into like, you know, halting problem, P versus NP type things, like impossibilities where you're just stuck?Stan [00:07:39]: I mean, you don't run into them. At the time, it was really trying to solve very easy problems. So I think the... Can you give an example of easy? Yeah, so that's the MATH benchmark that everybody knows today. The Dan Hendrycks one. The Dan Hendrycks one, yeah. And I think it was the low end part of the MATH benchmark at the time, because that MATH benchmark includes AMC problems, AMC 8, AMC 10, 12. So these are the easy ones. Then AIME problems, somewhat harder, and some IMO problems, which are crazy hard.Swyx [00:08:07]: For our listeners, we covered this in our Benchmarks 101 episode. AMC is literally the grade of like high school, grade 8, grade 10, grade 12. So you can solve this. Just briefly to mention this, because I don't think we'll touch on this again. There's a bit of work with like Lean, and then with, you know, more recently with DeepMind doing like scoring like silver on the IMO. Any commentary on like how math has evolved from your early work to today?Stan [00:08:34]: I mean, that result is mind blowing. I mean, from my perspective, I spent three years on that. At the same time, Guillaume Lample in Paris, we were both in Paris, actually. He was at FAIR, working on some problems. We were pushing the boundaries, and the goal was the IMO. And we cracked a few problems here and there. But the idea of getting a medal at an IMO was like just remote. So this is an impressive result. And we can, I think the DeepMind team just did a good job of scaling. I think there's nothing too magical in their approach, even if it hasn't been published. There's a David Silver talk from seven days ago that goes a little bit more into the details. It feels like there's nothing magical there. It's really applying reinforcement learning and scaling up the amount of data they can generate through autoformalization. So we can dig into what autoformalization means if you want.Alessio [00:09:26]: Let's talk about the tail end, maybe, of the OpenAI period. So you joined, and you're like, I'm going to work on math and do all of these things. I saw on one of your blog posts, you mentioned you fine-tuned over 10,000 models at OpenAI using 10 million A100 hours. How did the research evolve from GPT-2, and then getting closer to DaVinci 003? And then you left just before ChatGPT was released, but tell people a bit more about the research path that took you there.Stan [00:09:54]: I can give you my perspective of it. I think at OpenAI, there's always been a large chunk of the compute that was reserved to train the GPTs, which makes sense. So it was pre-Anthropic split. Most of the compute was going to a product called Nest, which was basically GPT-3. And then you had a bunch of, let's say, remote, not core research teams that were trying to explore maybe more specific problems or maybe the algorithm part of it. The interesting part, I don't know if it was where your question was going, is that in those labs, you're managing researchers. So by definition, you shouldn't be managing them. But in that space, there's a managing tool that is great, which is compute allocation. Basically by managing the compute allocation, you can message to the teams where you think the priority should go. And so it was really a question of, you were free as a researcher to work on whatever you wanted.
But if it was not aligned with the OpenAI mission, and that's fair, you wouldn't get the compute allocation. As it happens, solving math was very much aligned with the direction of OpenAI. And so I was lucky to generally get the compute I needed to make good progress.Swyx [00:11:06]: What do you need to show as incremental results to get funded for further results?Stan [00:11:12]: It's an imperfect process because there's a bit of a... If you're working on math and AI, obviously there's kind of a prior that it's going to be aligned with the company. So it's much easier than going into something much riskier, I guess. You have to show incremental progress, I guess. It's like you ask for a certain amount of compute and you deliver a few weeks after and you demonstrate that you have progress. Progress might be a positive result. Progress might be a strong negative result. And a strong negative result is actually often much harder to get or much more interesting than a positive result. And then it generally goes into, as any organization, you would have people finding your project or any other project cool and fancy. And so you would have that kind of phase of growing compute allocation for it all the way to a point. And then maybe you reach an apex and then maybe you go back mostly to zero and restart the process because you're going in a different direction or something else. That's how I felt. Explore, exploit. Yeah, exactly. Exactly. Exactly. It's a reinforcement learning approach.Swyx [00:12:14]: Classic PhD student search process.Alessio [00:12:17]: And you were reporting to Ilya, like the results you were kind of bringing back to him or like what's the structure? It's almost like when you're doing such cutting edge research, you need to report to somebody who is actually really smart to understand that the direction is right.Stan [00:12:29]: So we had a reasoning team, which was working on reasoning, obviously, and so math in general. And that team had a manager, but Ilya was extremely involved in the team as an advisor, I guess. Since he brought me into OpenAI, I was lucky, mostly during the first years, to have kind of direct access to him. He would really coach me as a trainee researcher, I guess, with good engineering skills. And Ilya, I think at OpenAI, he was the one showing the North Star, right? It was his job and I think he really enjoyed it and he did it super well, going through the teams and saying, this is where we should be going and trying to, you know, flock the different teams together towards an objective.Swyx [00:13:12]: I would say like the public perception of him is that he was the strongest believer in scaling. Oh, yeah. Obviously, he has always pursued the compression thesis. You have worked with him personally, what does the public not know about how he works?Stan [00:13:26]: I think he's really focused on building the vision and communicating the vision within the company, which was extremely useful. I was personally surprised that he spent so much time, you know, working on communicating that vision and getting the teams to work together versus...Swyx [00:13:40]: To be specific, vision is AGI? Oh, yeah.Stan [00:13:42]: Vision is like, yeah, it's the belief in compression and scaling compute. I remember when I started working on the Reasoning team, the excitement was really about scaling the compute around Reasoning and that was really the belief we wanted to ingrain in the team.
And that's what has been useful to the team. And the DeepMind results show that it was the right approach, and the success of GPT-4 and stuff shows that it was the right approach.Swyx [00:14:06]: Was it according to the neural scaling laws, the Kaplan paper that was published?Stan [00:14:12]: I think it was before that, because those ones came with GPT-3, basically at the time of GPT-3 being released or being ready internally. But before that, there really was a strong belief in scale. I think it was just the belief that the transformer was a generic enough architecture that you could learn anything. And that was just a question of scaling.Alessio [00:14:33]: Any other fun stories you want to tell? Sam Altman, Greg, you know, anything.Stan [00:14:37]: Weirdly, I didn't work that much with Greg when I was at OpenAI. He had always been mostly focused on training the GPTs and rightfully so. One thing about Sam Altman, he really impressed me because when I joined, he had joined not that long ago and it felt like he was kind of a very high level CEO. And I was mind blown by how deep he was able to go into the subjects within a year or something, all the way to a situation where, by year two at OpenAI, when I was having lunch with him, he would just know quite deeply what I was doing. With no ML background. Yeah, with no ML background, but I didn't have any either, so I guess that explains why. But I think it's a question about, you don't necessarily need to understand the very technicalities of how things are done, but you need to understand what's the goal and what's being done and what are the recent results and all of that, you know. And we could have kind of a very productive discussion. And that really impressed me, given the size at the time of OpenAI, which was not negligible.Swyx [00:15:44]: Yeah. I mean, you've been a, you were a founder before, you're a founder now, and you've seen Sam as a founder. How has he affected you as a founder?Stan [00:15:51]: I think having that capability of changing the scale of your attention in the company, because most of the time you operate at a very high level, but being able to go deep down and being in the know of what's happening on the ground is something that I feel is really enlightening. That's not a place in which I ever was as a founder, because first company, we went all the way to 10 people. Current company, there's 25 of us. So the high level, the sky and the ground are pretty much at the same place. No, you're being too humble.Swyx [00:16:21]: I mean, Stripe was also like a huge rocket ship.Stan [00:16:23]: Stripe, I was not a founder. So I was, like at OpenAI, I was really happy being on the ground, pushing the machine, making it work. Yeah.Swyx [00:16:31]: Last OpenAI question. The Anthropic split you mentioned, you were around for that. Very dramatic. David also left around that time, you left. This year, we've also had a similar management shakeup, let's just call it. Can you compare what it was like going through that split during that time? And then like, does that have any similarities now? Like, are we going to see a new Anthropic emerge from these folks that just left?Stan [00:16:54]: That I really, really don't know. At the time, the split was pretty surprising because they had been training GPT-3, it was a success. And to be completely transparent, I wasn't in the weeds of the splits. What I understood of it is that there was a disagreement of the commercialization of that technology.
I think the focal point of that disagreement was the fact that we started working on the API and wanted to make those models available through an API. Is that really the core disagreement? I don't know.Swyx [00:17:25]: Was it safety?Stan [00:17:26]: Was it commercialization?Swyx [00:17:27]: Or did they just want to start a company?Stan [00:17:28]: Exactly. Exactly. That I don't know. But I think what I was surprised by is how quickly OpenAI recovered at the time. And I think it's just because we were mostly a research org and the mission was so clear that some divergence in some teams, some people leave, the mission is still there. We have the compute. We have a site. So it just keeps going.Swyx [00:17:50]: Very deep bench. Like just a lot of talent. Yeah.Alessio [00:17:53]: So that was the OpenAI part of the history. Exactly. So then you leave OpenAI in September 2022. And I would say in Silicon Valley, the two hottest companies at the time were you and LangChain. What was that start like and why did you decide to start with a more developer focused kind of like an AI engineer tool rather than going back into some more research and something else?Stan [00:18:15]: Yeah. First, I'm not a trained researcher. So going through OpenAI was really kind of the PhD I always wanted to do. But research is hard. You're digging into a field all day long for weeks and weeks and weeks, and you find something, you get super excited for 12 seconds. And at the 13 seconds, you're like, oh, yeah, that was obvious. And you go back to digging. I'm not a trained, like formally trained researcher, and it wasn't necessarily an ambition of mine to have a research career. And I felt the hardness of it. I enjoyed it a ton. But at the time, I decided that I wanted to go back to something more productive. And the other fun motivation was like, I mean, if we believe in AGI and if we believe the timelines might not be too long, it's actually the last train leaving the station to start a company. After that, it's going to be computers all the way down. And so that was kind of the true motivation for like trying to go there. So that's kind of the core motivation at the beginning, personally. And the motivation for starting a company was pretty simple. I had seen GPT-4 internally at the time, it was September 2022. So it was pre-ChatGPT, but GPT-4 was ready, I mean, it had been ready for a few months internally. I was like, okay, that's obvious, the capabilities are there to create an insane amount of value to the world. And yet the deployment is not there yet. The revenue of OpenAI at the time was ridiculously small compared to what it is today. So the thesis was, there's probably a lot to be done at the product level to unlock the usage.Alessio [00:19:49]: Yeah. Let's talk a bit more about the form factor, maybe. I think one of the first successes you had was kind of like the WebGPT-like thing, like using the models to traverse the web and like summarize things. And the browser was really the interface. Why did you start with the browser? Like why was it important? And then you built XP1, which was kind of like the browser extension.Stan [00:20:09]: So the starting point at the time was, if you wanted to talk about LLMs, it was still a rather small community, a community of mostly researchers and to some extent, very early adopters, very early engineers.
It was almost inconceivable to just build a product and go sell it to the enterprise, though at the time there were a few companies doing that. The one on marketing, I don't remember its name, Jasper. But so the natural first intention, the first, first, first intention was to go to the developers and try to create tooling for them to create products on top of those models. And so that's what Dust was originally. It was quite different than LangChain, and LangChain just beat the s**t out of us, which is great. It's a choice.Swyx [00:20:53]: You were cloud, and closed source. They were open source.Stan [00:20:56]: Yeah. So technically we were open source and we still are open source, but I think that doesn't really matter. I had the strong belief from my research time that you cannot create an LLM-based workflow on just one example. Basically, if you just have one example, you overfit. So as you develop your interaction, your orchestration around the LLM, you need a dozen examples. Obviously, if you're running a dozen examples on a multi-step workflow, you start parallelizing stuff. And if you do that in the console, you just have like a messy stream of tokens going out and it's very hard to observe what's going on there. And so the idea was to go with a UI so that you could kind of introspect easily the output of each interaction with the model and dig into there through a UI, which is-Swyx [00:21:42]: Was that open source? I actually didn't come across it.Stan [00:21:44]: Oh yeah, it was. I mean, Dust is entirely open source even today. We're not going for an open source-Swyx [00:21:48]: If it matters, I didn't know that.Stan [00:21:49]: No, no, no, no, no. The reason why is because we're not open source because we're not doing an open source strategy. It's not an open source go-to-market at all. We're open source because we can and it's fun.Swyx [00:21:59]: Open source is marketing. You have all the downsides of open source, which is like people can clone you.Stan [00:22:03]: But I think that downside is a big fallacy. Okay. Yes, anybody can clone Dust today, but the value of Dust is not the current state. The value of Dust is the number of eyeballs and hands of developers that are building on it in the future. And so yes, anybody can clone it today, but that wouldn't change anything. There is some value in being open source. In a discussion with the security team, you can be extremely transparent and just show the code. When you have discussions with users and there's a bug or a feature missing, you can just point to the issue, show the pull request, show the, exactly, oh, PR welcome. That doesn't happen that much, but you can show the progress if the person that you're chatting with is a little bit technical, they really enjoy seeing the pull request advancing and seeing it all the way to deploy. And then the downsides are mostly around security. You never want to do security by obfuscation. But the truth is that your vector of attack is facilitated by you being open source. But at the same time, it's a good thing because if you're doing anything like a bug bounty or stuff like that, you just give much more tools to the bug bounty hunters so that their output is much better. So there's many, many, many trade-offs. I don't believe in the value of the code base per se. I think it's really the people that are on the code base that have the value and go to market and the product and all of those things that are around the code base. Obviously, that's not true for every code base.
If you're working on a very secret kernel to accelerate the inference of LLMs, I would buy that you don't want to be open source. But for product stuff, I really think there's very little risk. Yeah.Alessio [00:23:39]: I signed up for XP1, I was looking, in January 2023. I think at the time you were on DaVinci 003. Given that you had seen GPT-4, how did you feel having to push a product out that was using this model that was so inferior? And you're like, please, just use it today. I promise it's going to get better. Just overall, as a founder, how do you build something that maybe doesn't quite work with the model today, but you're just expecting the new model to be better?Stan [00:24:03]: Yeah, so actually, XP1 was even on a smaller one that was the post-GPT release, small version, so it was... Ada, Babbage... No, no, no, not that far away. But it was the small version of GPT, basically. I don't remember its name. Yes, you have a frustration there. But at the same time, I think XP1 was designed, was an experiment, but was designed as a way to be useful at the current capability of the model. If you just want to extract data from a LinkedIn page, that model was just fine. If you want to summarize an article in a newspaper, that model was just fine. And so it was really a question of trying to find a product that works with the current capability, knowing that you will always have tailwinds as models get better and faster and cheaper. So that was kind of a... There's a bit of a frustration because you know what's out there and you know that you don't have access to it yet. It's also interesting to try to find a product that works with the current capability.Alessio [00:24:55]: And we highlighted XP1 in our Anatomy of Autonomy post in April of last year, which was, you know, where are all the agents, right? So now we spent 30 minutes getting to what you're building now. So you basically had a developer framework, then you had a browser extension, then you had all these things, and then you kind of got to where Dust is today. So maybe just give people an overview of what Dust is today and the core thesis behind it. Yeah, of course.Stan [00:25:20]: So Dust, we really want to build the infrastructure so that companies can deploy agents within their teams. We are horizontal by nature because we strongly believe in the emergence of use cases from people having access to creating agents, people who don't need to be developers. They have to be thinkers. They have to be curious. But anybody can create an agent that will solve an operational thing that they're doing in their day-to-day job. And to make those agents useful, there are two focuses, which is interesting. The first one is an infrastructure focus. You have to build the pipes so that the agent has access to the data. You have to build the pipes such that the agents can take action, can access the web, et cetera. So that's really an infrastructure play. Maintaining connections to Notion, Slack, GitHub, all of them is a lot of work. It is boring work, boring infrastructure work, but that's something that we know is extremely valuable in the same way that Stripe is extremely valuable because it maintains the pipes. And we have that dual focus because we're also building the product for people to use it. And there it's fascinating because everything started from the conversational interface, obviously, which is a great starting point. But we're only scratching the surface, right? I think we are at the Pong level of LLM productization.
And we haven't invented the C3. We haven't invented Counter-Strike. We haven't invented Cyberpunk 2077. So this is really our mission is to really create the product that lets people equip themselves to just get away all the work that can be automated or assisted by LLMs.Alessio [00:26:57]: And can you just comment on different takes that people had? So maybe the most open is like auto-GPT. It's just kind of like just trying to do anything. It's like it's all magic. There's no way for you to do anything. Then you had the ADAPT, you know, we had David on the podcast. They're very like super hands-on with each individual customer to build super tailored. How do you decide where to draw the line between this is magic? This is exposed to you, especially in a market where most people don't know how to build with AI at all. So if you expect them to do the thing, they're probably not going to do it. Yeah, exactly.Stan [00:27:29]: So the auto-GPT approach obviously is extremely exciting, but we know that the agentic capability of models are not quite there yet. It just gets lost. So we're starting, we're starting where it works. Same with the XP one. And where it works is pretty simple. It's like simple workflows that involve a couple tools where you don't even need to have the model decide which tools it's used in the sense of you just want people to put it in the instructions. It's like take that page, do that search, pick up that document, do the work that I want in the format I want, and give me the results. There's no smartness there, right? In terms of orchestrating the tools, it's mostly using English for people to program a workflow where you don't have the constraint of having compatible API between the two.Swyx [00:28:17]: That kind of personal automation, would you say it's kind of like an LLM Zapier type ofStan [00:28:22]: thing?Swyx [00:28:22]: Like if this, then that, and then, you know, do this, then this. You're programming with English?Stan [00:28:28]: So you're programming with English. So you're just saying, oh, do this and then that. You can even create some form of APIs. You say, when I give you the command X, do this. When I give you the command Y, do this. And you describe the workflow. But you don't have to create boxes and create the workflow explicitly. It just needs to describe what are the tasks supposed to be and make the tool available to the agent. The tool can be a semantic search. The tool can be querying into a structured database. The tool can be searching on the web. And obviously, the interesting tools that we're only starting to scratch are actually creating external actions like reimbursing something on Stripe, sending an email, clicking on a button in the admin or something like that.Swyx [00:29:11]: Do you maintain all these integrations?Stan [00:29:13]: Today, we maintain most of the integrations. We do always have an escape hatch for people to kind of custom integrate. But the reality is that the reality of the market today is that people just want it to work, right? And so it's mostly us maintaining the integration. As an example, a very good source of information that is tricky to productize is Salesforce. Because Salesforce is basically a database and a UI. And they do the f**k they want with it. And so every company has different models and stuff like that. So right now, we don't support it natively. And the type of support or real native support will be slightly more complex than just osing into it, like is the case with Slack as an example. 
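As a rough illustration of the "programming with English" pattern Stan describes above (the workflow lives in plain instructions, the model is handed a couple of tools, and there is no explicit workflow graph), here is a minimal sketch using a generic tool-calling API. It assumes the openai Python SDK (v1+); the two tools, their stub bodies, and the instruction text are made up for illustration and are not Dust's actual tooling.

```python
# Sketch of an "instructions are the program" agent: the workflow is written in plain
# English and the model is given a small set of tools it may call.
# Assumes the openai Python SDK (v1+); the tools and their stub bodies are illustrative.
import json
from openai import OpenAI

client = OpenAI()

def semantic_search(query: str) -> str:
    return "...top matching passages from the synced company docs..."   # stub connector

def fetch_page(url: str) -> str:
    return "...plain text of the page..."                               # stub connector

IMPL = {"semantic_search": semantic_search, "fetch_page": fetch_page}
TOOLS = [
    {"type": "function", "function": {
        "name": "semantic_search",
        "description": "Search the company's synced documents.",
        "parameters": {"type": "object",
                       "properties": {"query": {"type": "string"}},
                       "required": ["query"]}}},
    {"type": "function", "function": {
        "name": "fetch_page",
        "description": "Fetch the text of a web page.",
        "parameters": {"type": "object",
                       "properties": {"url": {"type": "string"}},
                       "required": ["url"]}}},
]

INSTRUCTIONS = (
    "When I paste a customer question: search our docs for related material, "
    "fetch any page I link to, then answer in the format: Summary / Sources."
)

def run(user_message: str) -> str:
    messages = [{"role": "system", "content": INSTRUCTIONS},
                {"role": "user", "content": user_message}]
    for _ in range(8):  # a handful of tool round-trips is plenty for this kind of workflow
        resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=TOOLS)
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content
        messages.append(msg)
        for call in msg.tool_calls:
            result = IMPL[call.function.name](**json.loads(call.function.arguments))
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    return "Stopped after too many tool calls."
```

Nothing here is smart orchestration: the instructions read like English, and swapping tools in or out does not require compatible APIs between them.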
Because it's probably going to be, oh, you want to connect your Salesforce to us? Give us the SQL. That's the Salesforce QL language. Give us the queries you want us to run on it and inject in the context of dust. So that's interesting how not only integrations are cool, and some of them require a bit of work on the user. And for some of them that are really valuable to our users, but we don't support yet, they can just build them internally and push the data to us.Swyx [00:30:18]: I think I understand the Salesforce thing. But let me just clarify, are you using browser automation because there's no API for something?Stan [00:30:24]: No, no, no, no. In that case, so we do have browser automation for all the use cases and apply the public web. But for most of the integration with the internal system of the company, it really runs through API.Swyx [00:30:35]: Haven't you felt the pull to RPA, browser automation, that kind of stuff?Stan [00:30:39]: I mean, what I've been saying for a long time, maybe I'm wrong, is that if the future is that you're going to stand in front of a computer and looking at an agent clicking on stuff, then I'll hit my computer. And my computer is a big Lenovo. It's black. Doesn't sound good at all compared to a Mac. And if the APIs are there, we should use them. There is going to be a long tail of stuff that don't have APIs, but as the world is moving forward, that's disappearing. So the core API value in the past has really been, oh, this old 90s product doesn't have an API. So I need to use the UI to automate. I think for most of the ICP companies, the companies that ICP for us, the scale ups that are between 500 and 5,000 people, tech companies, most of the SaaS they use have APIs. Now there's an interesting question for the open web, because there are stuff that you want to do that involve websites that don't necessarily have APIs. And the current state of web integration from, which is us and OpenAI and Anthropic, I don't even know if they have web navigation, but I don't think so. The current state of affair is really, really broken because you have what? You have basically search and headless browsing. But headless browsing, I think everybody's doing basically body.innertext and fill that into the model, right?Swyx [00:31:56]: MARK MIRCHANDANI There's parsers into Markdown and stuff.Stan [00:31:58]: FRANCESC CAMPOY I'm super excited by the companies that are exploring the capability of rendering a web page into a way that is compatible for a model, being able to maintain the selector. So that's basically the place where to click in the page through that process, expose the actions to the model, have the model select an action in a way that is compatible with model, which is not a big page of a full DOM that is very noisy, and then being able to decompress that back to the original page and take the action. And that's something that is really exciting and that will kind of change the level of things that agents can do on the web. That I feel exciting, but I also feel that the bulk of the useful stuff that you can do within the company can be done through API. The data can be retrieved by API. The actions can be taken through API.Swyx [00:32:44]: For listeners, I'll note that you're basically completely disagreeing with David Wan. FRANCESC CAMPOY Exactly, exactly. I've seen it since it's summer. ADEPT is where it is, and Dust is where it is. So Dust is still standing.Alessio [00:32:55]: Can we just quickly comment on function calling? 
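Before the conversation turns to function calling, here is a hedged sketch of the page-rendering idea Stan is excited about: compress a page into a compact list of actionable elements the model can reason over, while keeping a map from short ids back to real selectors so the chosen action can be replayed on the live page. It assumes beautifulsoup4, and the heuristics are illustrative only.

```python
# Compress a web page into a model-friendly action list while keeping a mapping from
# short element ids back to CSS selectors on the original page.
# Assumes beautifulsoup4; real systems need far better selector and labeling heuristics.
from bs4 import BeautifulSoup

ACTIONABLE = ["a", "button", "input", "select", "textarea"]

def css_path(el) -> str:
    # Very rough selector: prefer an id, otherwise walk up with tag:nth-of-type steps.
    if el.get("id"):
        return f"#{el['id']}"
    parts = []
    while el is not None and el.name not in (None, "[document]"):
        siblings = el.parent.find_all(el.name, recursive=False) if el.parent else [el]
        idx = siblings.index(el) + 1 if el in siblings else 1
        parts.append(f"{el.name}:nth-of-type({idx})")
        el = el.parent
    return " > ".join(reversed(parts))

def compress_page(html: str):
    soup = BeautifulSoup(html, "html.parser")
    lines, selector_map = [], {}
    for i, el in enumerate(soup.find_all(ACTIONABLE)):
        eid = f"e{i}"
        label = (el.get_text(" ", strip=True) or el.get("placeholder") or el.get("name") or "")[:60]
        selector_map[eid] = css_path(el)
        lines.append(f"[{eid}] <{el.name}> {label}")
    return "\n".join(lines), selector_map

# The compressed text goes into the model's context; when the model answers "click e3",
# selector_map["e3"] is what the browser automation actually clicks on the full page.
```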
You mentioned you don't need the models to be that smart to actually pick the tools. Have you seen the models not be good enough? Or is it just like, you just don't want to put the complexity in there? Like, is there any room for improvement left in function calling? Or do you feel you usually consistently get always the right response, the right parametersStan [00:33:13]: and all of that?Alessio [00:33:13]: FRANCESC CAMPOY So that's a tricky product question.Stan [00:33:15]: Because if the instructions are good and precise, then you don't have any issue, because it's scripted for you. And the model will just look at the scripts and just follow and say, oh, he's probably talking about that action, and I'm going to use it. And the parameters are kind of abused from the state of the conversation. I'll just go with it. If you provide a very high level, kind of an auto-GPT-esque level in the instructions and provide 16 different tools to your model, yes, we're seeing the models in that state making mistakes. And there is obviously some progress can be made on the capabilities. But the interesting part is that there is already so much work that can assist, augment, accelerate by just going with pretty simply scripted for actions agents. What I'm excited about by pushing our users to create rather simple agents is that once you have those working really well, you can create meta agents that use the agents as actions. And all of a sudden, you can kind of have a hierarchy of responsibility that will probably get you almost to the point of the auto-GPT value. It requires the construction of intermediary artifacts, but you're probably going to be able to achieve something great. I'll give you some example. We have our incidents are shared in Slack in a specific channel, or shipped are shared in Slack. We have a weekly meeting where we have a table about incidents and shipped stuff. We're not writing that weekly meeting table anymore. We have an assistant that just go find the right data on Slack and create the table for us. And that assistant works perfectly. It's trivially simple, right? Take one week of data from that channel and just create the table. And then we have in that weekly meeting, obviously some graphs and reporting about our financials and our progress and our ARR. And we've created assistants to generate those graphs directly. And those assistants works great. By creating those assistants that cover those small parts of that weekly meeting, slowly we're getting to in a world where we'll have a weekly meeting assistance. We'll just call it. You don't need to prompt it. You don't need to say anything. It's going to run those different assistants and get that notion page just ready. And by doing that, if you get there, and that's an objective for us to us using Dust, get there, you're saving an hour of company time every time you run it. Yeah.Alessio [00:35:28]: That's my pet topic of NPM for agents. How do you build dependency graphs of agents? And how do you share them? Because why do I have to rebuild some of the smaller levels of what you built already?Swyx [00:35:40]: I have a quick follow-up question on agents managing other agents. It's a topic of a lot of research, both from Microsoft and even in startups. What you've discovered best practice for, let's say like a manager agent controlling a bunch of small agents. It's two-way communication. 
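A structural sketch of the "agents as actions" idea Stan lays out above, with the weekly-meeting assistant as the running example. The call_llm stub and the two connector helpers are placeholders rather than Dust's API; the point is only the shape of the composition.

```python
# Sub-agents that each do one scripted thing, composed by a meta-agent that treats them
# as actions. All helpers below are stand-ins for a real model client and real connectors.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model provider here")

def fetch_slack_channel(channel: str, days: int) -> str:
    return "...messages from the last week..."           # stub connector

def fetch_metrics(name: str, weeks: int) -> str:
    return "...weekly ARR numbers..."                    # stub connector

def incident_table_agent() -> str:
    data = fetch_slack_channel("#incidents", days=7)
    return call_llm(f"Build a table of this week's incidents and ships from:\n{data}")

def arr_graph_agent() -> str:
    data = fetch_metrics("arr", weeks=12)
    return call_llm(f"Summarize the ARR trend and describe the chart to plot:\n{data}")

SUB_AGENTS = {"incidents": incident_table_agent, "arr": arr_graph_agent}

def weekly_meeting_agent() -> str:
    # Once each small agent is reliable on its own, the meta-agent is mostly orchestration.
    sections = {name: agent() for name, agent in SUB_AGENTS.items()}
    return call_llm("Assemble the weekly meeting page from these sections:\n\n"
                    + "\n\n".join(f"## {name}\n{body}" for name, body in sections.items()))
```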
I don't know if there should be a protocol format.Stan [00:35:59]: To be completely honest, the state we are at right now is creating the simple agents. So we haven't even explored yet the meta agents. We know it's there. We know it's going to be valuable. We know it's going to be awesome. But we're starting there because it's the simplest place to start. And it's also what the market understands. If you go to a company, random SaaS B2B company, not necessarily specialized in AI, and you take an operational team and you tell them, build some tooling for yourself, they'll understand the small agents. If you tell them, build AutoGP, they'll be like, Auto what?Swyx [00:36:31]: And I noticed that in your language, you're very much focused on non-technical users. You don't really mention API here. You mention instruction instead of system prompt, right? That's very conscious.Stan [00:36:41]: Yeah, it's very conscious. It's a mark of our designer, Ed, who kind of pushed us to create a friendly product. I was knee-deep into AI when I started, obviously. And my co-founder, Gabriel, was a Stripe as well. We started a company together that got acquired by Stripe 15 years ago. It was at Alain, a healthcare company in Paris. After that, it was a little bit less so knee-deep in AI, but really focused on product. And I didn't realize how important it is to make that technology not scary to end users. It didn't feel scary to me, but it was really seen by Ed, our designer, that it was feeling scary to the users. And so we were very proactive and very deliberate about creating a brand that feels not too scary and creating a wording and a language, as you say, that really tried to communicate the fact that it's going to be fine. It's going to be easy. You're going to make it.Alessio [00:37:34]: And another big point that David had about ADAPT is we need to build an environment for the agents to act. And then if you have the environment, you can simulate what they do. How's that different when you're interacting with APIs and you're kind of touching systems that you cannot really simulate? If you call it the Salesforce API, you're just calling it.Stan [00:37:52]: So I think that goes back to the DNA of the companies that are very different. ADAPT, I think, was a product company with a very strong research DNA, and they were still doing research. One of their goals was building a model. And that's why they raised a large amount of money, et cetera. We are 100% deliberately a product company. We don't do research. We don't train models. We don't even run GPUs. We're using the models that exist, and we try to push the product boundary as far as possible with the existing models. So that creates an issue. Indeed, so to answer your question, when you're interacting in the real world, well, you cannot simulate, so you cannot improve the models. Even improving your instructions is complicated for a builder. The hope is that you can use models to evaluate the conversations so that you can get at least feedback and you could get contradictive information about the performance of the assistance. But if you take actual trace of interaction of humans with those agents, it is even for us humans extremely hard to decide whether it was a productive interaction or a really bad interaction. You don't know why the person left. You don't know if they left happy or not. So being extremely, extremely, extremely pragmatic here, it becomes a product issue. 
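One concrete slice of the evaluation problem Stan describes, using a model to grade interaction traces so the builder gets at least some feedback, might look like the sketch below. It assumes the openai Python SDK; the rubric is invented for illustration and is not Dust's pipeline.

```python
# Rough sketch of model-graded feedback on an agent conversation trace.
# Assumes the openai Python SDK (v1+) and an OPENAI_API_KEY in the environment;
# the rubric is made up for illustration.
import json
from openai import OpenAI

client = OpenAI()

RUBRIC = (
    "Score the assistant conversation below from 1-5 on: task_completion, "
    "groundedness (did it stick to retrieved data?), and user_frustration "
    "(5 = no visible frustration). Return JSON with those three keys and a one-line rationale."
)

def grade_conversation(transcript: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": transcript},
        ],
    )
    return json.loads(resp.choices[0].message.content)

# A judge like this only gives a weak signal (as noted above, even humans struggle to
# label these traces), so it complements rather than replaces end-user feedback.
```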
We have to build a product that identifies the end users to provide feedback so that as a first step, the person that is building the agent can iterate on it. As a second step, maybe later when we start training model and post-training, et cetera, we can optimize around that for each of those companies. Yeah.Alessio [00:39:17]: Do you see in the future products offering kind of like a simulation environment, the same way all SaaS now kind of offers APIs to build programmatically? Like in cybersecurity, there are a lot of companies working on building simulative environments so that then you can use agents like Red Team, but I haven't really seen that.Stan [00:39:34]: Yeah, no, me neither. That's a super interesting question. I think it's really going to depend on how much, because you need to simulate to generate data, you need to train data to train models. And the question at the end is, are we going to be training models or are we just going to be using frontier models as they are? On that question, I don't have a strong opinion. It might be the case that we'll be training models because in all of those AI first products, the model is so close to the product surface that as you get big and you want to really own your product, you're going to have to own the model as well. Owning the model doesn't mean doing the pre-training, that would be crazy. But at least having an internal post-training realignment loop, it makes a lot of sense. And so if we see many companies going towards that all the time, then there might be incentives for the SaaS's of the world to provide assistance in getting there. But at the same time, there's a tension because those SaaS, they don't want to be interacted by agents, they want the human to click on the button. Yeah, they got to sell seats. Exactly.Swyx [00:40:41]: Just a quick question on models. I'm sure you've used many, probably not just OpenAI. Would you characterize some models as better than others? Do you use any open source models? What have been the trends in models over the last two years?Stan [00:40:53]: We've seen over the past two years kind of a bit of a race in between models. And at times, it's the OpenAI model that is the best. At times, it's the Anthropic models that is the best. Our take on that is that we are agnostic and we let our users pick their model. Oh, they choose? Yeah, so when you create an assistant or an agent, you can just say, oh, I'm going to run it on GP4, GP4 Turbo, or...Swyx [00:41:16]: Don't you think for the non-technical user, that is actually an abstraction that you should take away from them?Stan [00:41:20]: We have a sane default. So we move the default to the latest model that is cool. And we have a sane default, and it's actually not very visible. In our flow to create an agent, you would have to go in advance and go pick your model. So this is something that the technical person will care about. But that's something that obviously is a bit too complicated for the...Swyx [00:41:40]: And do you care most about function calling or instruction following or something else?Stan [00:41:44]: I think we care most for function calling because you want to... There's nothing worse than a function call, including incorrect parameters or being a bit off because it just drives the whole interaction off.Swyx [00:41:56]: Yeah, so got the Berkeley function calling.Stan [00:42:00]: These days, it's funny how the comparison between GP4O and GP4 Turbo is still up in the air on function calling. 
I personally don't have proof, but I know many people, and I'm probably part of them, to think that GPT-4 Turbo is still better than GPT-4o on function calling. Wow. We'll see what comes out of the O1 class if it ever gets function calling. And Claude 3.5 Sonnet is great as well. They kind of innovated in an interesting way, which was never quite publicized. But it's that they have that kind of chain of thought step whenever you use a Claude model or Sonnet model with function calling. That chain of thought step doesn't exist when you just interact with it just for answering questions. But when you use function calling, you get that step, and it really helps getting better function calling.Swyx [00:42:43]: Yeah, we actually just recorded a podcast with the Berkeley team that runs that leaderboard this week. So they just released V3.Stan [00:42:49]: Yeah.Swyx [00:42:49]: It was V1 like two months ago, and then they V2, V3. Turbo is on top.Stan [00:42:53]: Turbo is on top. Turbo is over 4o.Swyx [00:42:54]: And then the third place is xLAM from Salesforce, which is a large action model they've been trying to popularize.Stan [00:43:01]: Yep.Swyx [00:43:01]: O1 Mini is actually on here, I think. O1 Mini is number 11.Stan [00:43:05]: But arguably, O1 Mini has been in a line for that. Yeah.Alessio [00:43:09]: Do you use leaderboards? Do you have your own evals? I mean, this is kind of intuitive, right? Like using the older model is better. I think most people just upgrade. Yeah. What's the eval process like?Stan [00:43:19]: It's funny because I've been doing research for three years, and we have bigger stuff to cook. When you're deploying in a company, one thing where we really spike is that when we manage to activate the company, we have a crazy penetration. The highest penetration we have is 88% daily active users across the entire employee base of the company. The kind of average penetration and activation we have in our current enterprise customers is more like 60% to 70% weekly active. So we basically have the entire company interacting with us. And when you're there, there is so much stuff that matters more than getting evals or getting the best model. Because there are so many places where you can create products or do stuff that will give you the 80% with the work you do. Whereas deciding if it's GPT-4 or GPT-4 Turbo or et cetera, you know, it'll just give you the 5% improvement. But the reality is that you want to focus on the places where you can really change the direction or change the interaction more drastically. But that's something that we'll have to do eventually because we still want to be serious people.Swyx [00:44:24]: It's funny because in some ways, the model labs are competing for you, right? You don't have to do any effort. You just switch model and then it'll grow. What are you really limited by? Is it additional sources?Stan [00:44:36]: It's not models, right?Swyx [00:44:37]: You're not really limited by quality of model.Stan [00:44:40]: Right now, we are limited by the infrastructure part, which is the ability to connect easily for users to all the data they need to do the job they want to do.Swyx [00:44:51]: Because you maintain all your own stuff.Stan [00:44:53]: You know, there are companies out thereSwyx [00:44:54]: that are starting to provide integrations as a service, right? I used to work in an integrations company.
Yeah, I know.Stan [00:44:59]: It's just that there is some intricacies about how you chunk stuff and how you process information from one platform to the other. If you look at the end of the spectrum, you could think of, you could say, oh, I'm going to support AirByte and AirByte has- I used to work at AirByte.Swyx [00:45:12]: Oh, really?Stan [00:45:13]: That makes sense.Swyx [00:45:14]: They're the French founders as well.Stan [00:45:15]: I know Jean very well. I'm seeing him today. And the reality is that if you look at Notion, AirByte does the job of taking Notion and putting it in a structured way. But that's the way it is not really usable to actually make it available to models in a useful way. Because you get all the blocks, details, et cetera, which is useful for many use cases.Swyx [00:45:35]: It's also for data scientists and not for AI.Stan [00:45:38]: The reality of Notion is that sometimes you have a- so when you have a page, there's a lot of structure in it and you want to capture the structure and chunk the information in a way that respects that structure. In Notion, you have databases. Sometimes those databases are real tabular data. Sometimes those databases are full of text. You want to get the distinction and understand that this database should be considered like text information, whereas this other one is actually quantitative information. And to really get a very high quality interaction with that piece of information, I haven't found a solution that will work without us owning the connection end-to-end.Swyx [00:46:15]: That's why I don't invest in, there's Composio, there's All Hands from Graham Newbig. There's all these other companies that are like, we will do the integrations for you. You just, we have the open source community. We'll do off the shelf. But then you are so specific in your needs that you want to own it.Swyx [00:46:28]: Yeah, exactly.Stan [00:46:29]: You can talk to Michel about that.Swyx [00:46:30]: You know, he wants to put the AI in there, but you know. Yeah, I will. I will.Stan [00:46:35]: Cool. What are we missing?Alessio [00:46:36]: You know, what are like the things that are like sneakily hard that you're tackling that maybe people don't even realize they're like really hard?Stan [00:46:43]: The real parts as we kind of touch base throughout the conversation is really building the infra that works for those agents because it's a tenuous walk. It's an evergreen piece of work because you always have an extra integration that will be useful to a non-negligible set of your users. I'm super excited about is that there's so many interactions that shouldn't be conversational interactions and that could be very useful. Basically, know that we have the firehose of information of those companies and there's not going to be that many companies that capture the firehose of information. When you have the firehose of information, you can do a ton of stuff with models that are just not accelerating people, but giving them superhuman capability, even with the current model capability because you can just sift through much more information. An example is documentation repair. If I have the firehose of Slack messages and new Notion pages, if somebody says, I own that page, I want to be updated when there is a piece of information that should update that page, this is not possible. You get an email saying, oh, look at that Slack message. It says the opposite of what you have in that paragraph. Maybe you want to update or just ping that person. 
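A hedged sketch of the "documentation repair" idea above: for every new Slack message in the firehose, ask a model whether it contradicts a page someone has claimed ownership of, and ping the owner. The ask_model stub and the page and owner values are hypothetical placeholders for whatever model client and connectors you already run.

```python
# Sketch of the "documentation repair" idea: for each new Slack message, ask a model
# whether it contradicts a watched page, and ping that page's owner if so.
WATCHED_PAGES = {
    "notion/runbook-oncall": "@stan",   # page id -> owner to ping (illustrative values)
}

def ask_model(prompt: str) -> str:
    raise NotImplementedError("call your LLM provider here; return 'yes' or 'no'")

def check_message_against_docs(slack_message: str, get_page_text) -> list[str]:
    alerts = []
    for page_id, owner in WATCHED_PAGES.items():
        page = get_page_text(page_id)   # inject your own Notion/Confluence fetcher
        verdict = ask_model(
            "Does the following Slack message contradict or outdate anything in the page? "
            "Answer yes or no.\n\n"
            f"SLACK MESSAGE:\n{slack_message}\n\nPAGE:\n{page}"
        )
        if verdict.strip().lower().startswith("yes"):
            alerts.append(f"Ping {owner}: '{page_id}' may be outdated by: {slack_message[:120]}")
    return alerts

# Running this over the firehose is cheap relative to the cost of stale documentation,
# which is the kind of "superhuman sifting" Stan is pointing at.
```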
I think there is a lot to be explored on the product layer in terms of what it means to interact productively with those models. And that's a problem that's extremely hard and extremely exciting.Swyx [00:48:00]: One thing you keep mentioning about infra work, obviously, Dust is building that infra and serving that in a very consumer-friendly way. You always talk about infra being additional sources, additional connectors. That is very important. But I'm also interested in the vertical infra. There is an orchestrator underlying all these things where you're doing asynchronous work. For example, the simplest one is a cron job. You just schedule things. But also, for if this and that, you have to wait for something to be executed and proceed to the next task. I used to work on an orchestrator as well, Temporal.Stan [00:48:31]: We used Temporal. Oh, you used Temporal? Yeah. Oh, how was the experience?Swyx [00:48:34]: I need the NPS.Stan [00:48:36]: We're doing a self-discovery call now.Swyx [00:48:39]: But you can also complain to me because I don't work there anymore.Stan [00:48:42]: No, we love Temporal. There's some edges that are a bit rough, surprisingly rough. And you would say, why is it so complicated?Swyx [00:48:49]: It's always versioning.Stan [00:48:50]: Yeah, stuff like that. But we really love it. And we use it for exactly what you said, like managing the entire set of stuff that needs to happen so that in semi-real time, we get all the updates from Slack or Notion or GitHub into the system. And whenever we see that piece of information goes through, maybe trigger workflows to run agents because they need to provide alerts to users and stuff like that. And Temporal is great. Love it.Swyx [00:49:17]: You haven't evaluated others. You don't want to build your own. You're happy with...Stan [00:49:21]: Oh, no, we're not in the business of replacing Temporal. And Temporal is so... I mean, it is or any other competitive product. They're very general. If it's there, there's an interesting theory about buy versus build. I think in that case, when you're a high-growth company, your buy-build trade-off is very much on the side of buy. Because if you have the capability, you're just going to be saving time, you can focus on your core competency, etc. And it's funny because we're seeing, we're starting to see the post-high-growth company, post-SKF company, going back on that trade-off, interestingly. So that's the cloud news about removing Zendesk and Salesforce. Do you believe that, by the way?Alessio [00:49:56]: Yeah, I did a podcast with them.Stan [00:49:58]: Oh, yeah?Alessio [00:49:58]: It's true.Swyx [00:49:59]: No, no, I know.Stan [00:50:00]: Of course they say it's true,Swyx [00:50:00]: but also how well is it going to go?Stan [00:50:02]: So I'm not talking about deflecting the customer traffic. I'm talking about building AI on top of Salesforce and Zendesk, basically, if I understand correctly. And all of a sudden, your product surface becomes much smaller because you're interacting with an AI system that will take some actions. And so all of a sudden, you don't need the product layer anymore. And you realize that, oh, those things are just databases that I pay a hundred times the price, right? Because you're a post-SKF company and you have tech capabilities, you are incentivized to reduce your costs and you have the capability to do so. And then it makes sense to just scratch the SaaS away. 
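For the orchestration layer discussed above, a minimal Temporal sketch of the sync-then-trigger loop: one activity pulls fresh updates from a connector, and if anything changed, a second one kicks off the agents watching that source. It assumes the temporalio Python SDK; connector and agent internals are stubbed.

```python
# Sketch of the kind of Temporal workflow described above: pull fresh updates from a
# connector, and if anything changed, kick off the agents that watch that source.
# Assumes the temporalio Python SDK; connector and agent internals are stubbed.
from datetime import timedelta
from temporalio import activity, workflow

@activity.defn
async def sync_connector(source: str) -> int:
    # Stub: call the Slack/Notion/GitHub API, upsert documents, return a change count.
    return 0

@activity.defn
async def trigger_agents(source: str) -> None:
    # Stub: enqueue the agents (alerts, doc repair, etc.) that subscribe to this source.
    ...

@workflow.defn
class ConnectorSyncWorkflow:
    @workflow.run
    async def run(self, source: str) -> None:
        changed = await workflow.execute_activity(
            sync_connector, source, start_to_close_timeout=timedelta(minutes=10)
        )
        if changed:
            await workflow.execute_activity(
                trigger_agents, source, start_to_close_timeout=timedelta(minutes=2)
            )

# In practice you would run this on a Temporal Schedule (or cron) per connector, with
# retries on the activities; that is the "boring but valuable" pipe-maintenance work.
```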
So it's interesting that we might see kind of a bad time for SaaS in post-hyper-growth tech companies. So it's still a big market, but it's not that big because if you're not a tech company, you don't have the capabilities to reduce that cost. If you're a high-growth company, always going to be buying because you go faster with that. But that's an interesting new space, new category of companies that might remove some SaaS. Yeah, Alessio's firmSwyx [00:51:02]: has an interesting thesis on the future of SaaS in AI.Alessio [00:51:05]: Service as a software, we call it. It's basically like, well, the most extreme is like, why is there any software at all? You know, ideally, it's all a labor interface where you're asking somebody to do something for you, whether that's a person, an AI agent or whatnot.Stan [00:51:17]: Yeah, yeah, that's interesting. I have to ask.Swyx [00:51:19]: Are you paying for Temporal Cloud or are you self-hosting?Stan [00:51:22]: Oh, no, no, we're paying, we're paying. Oh, okay, interesting.Swyx [00:51:24]: We're paying way too much.Stan [00:51:26]: It's crazy expensive, but it makes us-Swyx [00:51:28]: That's why as a shareholder, I like to hear that. It makes us go faster,Stan [00:51:31]: so we're happy to pay.Swyx [00:51:33]: Other things in the infrastack, I just want a list for other founders to think about. Ops, API gateway, evals, you know, anything interesting there that you build or buy?Stan [00:51:41]: I mean, there's always an interesting question. We've been building a lot around the interface between models and because Dust, the original version, was an orchestration platform and we basically provide a unified interface to every model providers.Swyx [00:51:56]: That's what I call gateway.Stan [00:51:57]: That we add because Dust was that and so we continued building upon and we own it. But that's an interesting question was in you, you want to build that or buy it?Swyx [00:52:06]: Yeah, I always say light LLM is the current open source consensus.Stan [00:52:09]: Exactly, yeah. There's an interesting question there.Swyx [00:52:12]: Ops, Datadog, just tracking.Stan [00:52:14]: Oh yeah, so Datadog is an obvious... What are the mistakes that I regret? I started as pure JavaScript, not TypeScript, and I think you want to, if you're wondering, oh, I want to go fast, I'll do a little bit of JavaScript. No, don't, just start with TypeScript. I see, okay.Swyx [00:52:30]: So interesting, you are a research engineer that came out of OpenAI that bet on TypeScript.Stan [00:52:36]: Well, the reality is that if you're building a product, you're going to be doing a lot of JavaScript, right? And Next, we're using Next as an example. It's
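The "unified interface to every model provider" Stan mentions is essentially a thin gateway with a sane default. A minimal sketch, assuming the openai and anthropic Python SDKs; the model ids and the default choice are illustrative, not Dust's configuration.

```python
# Minimal model gateway with a sane default: builders can ignore the model choice
# entirely, and a technical user can override it per agent.
# Assumes the openai and anthropic Python SDKs with API keys in the environment.
from openai import OpenAI
from anthropic import Anthropic

DEFAULT_MODEL = "gpt-4o"
_openai = OpenAI()
_anthropic = Anthropic()

def complete(prompt: str, model: str = DEFAULT_MODEL) -> str:
    if model.startswith("claude"):
        resp = _anthropic.messages.create(
            model=model, max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.content[0].text
    resp = _openai.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

# complete("Summarize this incident thread...")                                        # sane default
# complete("Summarize this incident thread...", model="claude-3-5-sonnet-20240620")    # explicit override
```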

Project Geospatial
FOSS4G NA 2024 - AI Wrangling in the Early 21st Century - Mark Mathis

Project Geospatial

Play Episode Listen Later Oct 28, 2024 20:01


Mark Mathis discusses the integration of artificial intelligence with geospatial data, showcasing tools and techniques to enhance data accessibility and usability. Highlights

The Analytics Engineering Podcast
The current state of the AI ecosystem (w/ Julia Schottenstein)

The Analytics Engineering Podcast

Play Episode Listen Later Oct 6, 2024 45:44


Former co-host Julia Schottenstein returns to the show to go deep into the world of LLMs. Julia joined LangChain as an early employee, in Tristan's words, to “Basically solve all of the problems that aren't specifically in product and engineering.” LangChain has become one of the primary frameworks, if not the primary framework, for developing applications with large language models. There are over a million developers using LangChain today, building everything from prototypes to production AI applications.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

OpenAI DevDay is almost here! Per tradition, we are hosting a DevDay pregame event for everyone coming to town! Join us with demos and gossip!Also sign up for related events across San Francisco: the AI DevTools Night, the xAI open house, the Replicate art show, the DevDay Watch Party (for non-attendees), Hack Night with OpenAI at Cloudflare. For everyone else, join the Latent Space Discord for our online watch party and find fellow AI Engineers in your city.OpenAI's recent o1 release (and Reflection 70b debacle) has reignited broad interest in agentic general reasoning and tree search methods.While we have covered some of the self-taught reasoning literature on the Latent Space Paper Club, it is notable that the Eric Zelikman ended up at xAI, whereas OpenAI's hiring of Noam Brown and now Shunyu suggests more interest in tool-using chain of thought/tree of thought/generator-verifier architectures for Level 3 Agents.We were more than delighted to learn that Shunyu is a fellow Latent Space enjoyer, and invited him back (after his first appearance on our NeurIPS 2023 pod) for a look through his academic career with Harrison Chase (one year after his first LS show).ReAct: Synergizing Reasoning and Acting in Language Modelspaper linkFollowing seminal Chain of Thought papers from Wei et al and Kojima et al, and reflecting on lessons from building the WebShop human ecommerce trajectory benchmark, Shunyu's first big hit, the ReAct paper showed that using LLMs to “generate both reasoning traces and task-specific actions in an interleaved manner” achieved remarkably greater performance (less hallucination/error propagation, higher ALFWorld/WebShop benchmark success) than CoT alone. In even better news, ReAct scales fabulously with finetuning:As a member of the elite Princeton NLP group, Shunyu was also a coauthor of the Reflexion paper, which we discuss in this pod.Tree of Thoughtspaper link hereShunyu's next major improvement on the CoT literature was Tree of Thoughts:Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role…ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices.The beauty of ToT is it doesnt require pretraining with exotic methods like backspace tokens or other MCTS architectures. You can listen to Shunyu explain ToT in his own words on our NeurIPS pod, but also the ineffable Yannic Kilcher:Other WorkWe don't have the space to summarize the rest of Shunyu's work, you can listen to our pod with him now, and recommend the CoALA paper and his initial hit webinar with Harrison, today's guest cohost:as well as Shunyu's PhD Defense Lecture:as well as Shunyu's latest lecture covering a Brief History of LLM Agents:As usual, we are live on YouTube! 
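For readers who want the mechanics behind the ReAct summary above, the loop reduces to interleaving free-text thoughts with tool actions and feeding each observation back into the context. A toy sketch follows; the prompt format and the single search tool are illustrative, not the paper's exact few-shot setup.

```python
# Toy ReAct loop: the model interleaves free-text thoughts with actions, and each
# action's observation is appended back into the prompt.
import re

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat/completions model here")

def search(query: str) -> str:
    return "...first paragraph of the best-matching Wikipedia page..."   # stub tool

REACT_HEADER = (
    "Answer the question. At each step output either\n"
    "Thought: <reasoning>\nAction: search[<query>]\n"
    "or, when done,\nThought: <reasoning>\nAction: finish[<answer>]\n"
)

def react(question: str, max_steps: int = 6) -> str:
    trace = f"{REACT_HEADER}\nQuestion: {question}\n"
    for _ in range(max_steps):
        step = llm(trace)                        # expected to end with an Action line
        trace += step + "\n"
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
        if not match:
            continue
        action, arg = match.group(1), match.group(2)
        if action == "finish":
            return arg
        observation = search(arg) if action == "search" else f"Unknown action {action}"
        trace += f"Observation: {observation}\n"  # the new "insight" goes back into context
    return "No answer within the step budget."
```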
Show Notes* Harrison Chase* LangChain, LangSmith, LangGraph* Shunyu Yao* Alec Radford* ReAct Paper* Hotpot QA* Tau Bench* WebShop* SWE-Agent* SWE-Bench* Tree of Thoughts* CoALA Paper* Related Episodes* Our Thomas Scialom (Meta) episode* Shunyu on our NeurIPS 2023 Best Papers episode* Harrison on our LangChain episode* Mentions* Sierra* Voyager* Jason Wei* Tavily* SERP API* Exa
Timestamps* [00:00:00] Opening Song by Suno* [00:03:00] Introductions* [00:06:16] The ReAct paper* [00:12:09] Early applications of ReAct in LangChain* [00:17:15] Discussion of the Reflection paper* [00:22:35] Tree of Thoughts paper and search algorithms in language models* [00:27:21] SWE-Agent and SWE-Bench for coding benchmarks* [00:39:21] CoALA: Cognitive Architectures for Language Agents* [00:45:24] Agent-Computer Interfaces (ACI) and tool design for agents* [00:49:24] Designing frameworks for agents vs humans* [00:53:52] UX design for AI applications and agents* [00:59:53] Data and model improvements for agent capabilities* [01:19:10] TauBench* [01:23:09] Promising areas for AI
Transcript
Alessio [00:00:01]: Hey, everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO-in-Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.Swyx [00:00:12]: Hey, and today we have a super special episode. I actually always wanted to take like a selfie and go like, you know, POV, you're about to revolutionize the world of agents because we have two of the most awesome hiring agents in the house. So first, we're going to welcome back Harrison Chase. Welcome. Excited to be here. What's new with you recently in sort of like the 10, 20 second recap?Harrison [00:00:34]: LangChain, LangSmith, LangGraph, pushing on all of them. Lots of cool stuff related to a lot of the stuff that we're going to talk about today, probably.Swyx [00:00:42]: Yeah.Alessio [00:00:43]: We'll mention it in there. And the Celtics won the title.Swyx [00:00:45]: And the Celtics won the title. You got that going on for you. I don't know. Is that like floorball? Handball? Baseball? Basketball.Alessio [00:00:52]: Basketball, basketball.Harrison [00:00:53]: Patriots aren't looking good though, so that's...Swyx [00:00:56]: And then Shunyu, you've also been on the pod, but only in like a sort of oral paper presentation capacity. But welcome officially to the Latent Space pod.Shunyu [00:01:03]: Yeah, I've been a huge fan. So thanks for the invitation. Thanks.Swyx [00:01:07]: Well, it's an honor to have you on. You're one of like, you're maybe the first PhD thesis defense I've ever watched in like this AI world, because most people just publish single papers, but every paper of yours is a banger. So congrats.Shunyu [00:01:22]: Thanks.Swyx [00:01:24]: Yeah, maybe we'll just kick it off with, you know, what was your journey into using language models for agents? I like that your thesis advisor, I didn't catch his name, but he was like, you know... Karthik. Yeah. It's like, this guy just wanted to use language models and it was such a controversial pick at the time. Right.Shunyu [00:01:39]: The full story is that in undergrad, I did some computer vision research and that's how I got into AI. But at the time, I feel like, you know, you're just composing all the GAN or 3D perception or whatever together and it's not exciting anymore. And one day I just see this transformer paper and that's really cool. But I really got into language models only when I entered my PhD and met my advisor Karthik.
So he was actually the second author of GPT-1 when he was like a visiting scientist at OpenAI. With Alec Redford?Swyx [00:02:10]: Yes.Shunyu [00:02:11]: Wow. That's what he told me. It's like back in OpenAI, they did this GPT-1 together and Ilya just said, Karthik, you should stay because we just solved the language. But apparently Karthik is not fully convinced. So he went to Princeton, started his professorship and I'm really grateful. So he accepted me as a student, even though I have no prior knowledge in NLP. And you know, we just met for the first time and he's like, you know, what do you want to do? And I'm like, you know, you have done those test game scenes. That's really cool. I wonder if we can just redo them with language models. And that's how the whole journey began. Awesome.Alessio [00:02:46]: So GPT-2 was out at the time? Yes, that was 2019.Shunyu [00:02:48]: Yeah.Alessio [00:02:49]: Way too dangerous to release. And then I guess the first work of yours that I came across was React, which was a big part of your defense. But also Harrison, when you came on The Pockets last year, you said that was one of the first papers that you saw when you were getting inspired for BlankChain. So maybe give a recap of why you thought it was cool, because you were already working in AI and machine learning. And then, yeah, you can kind of like intro the paper formally. What was that interesting to you specifically?Harrison [00:03:16]: Yeah, I mean, I think the interesting part was using these language models to interact with the outside world in some form. And I think in the paper, you mostly deal with Wikipedia. And I think there's some other data sets as well. But the outside world is the outside world. And so interacting with things that weren't present in the LLM and APIs and calling into them and thinking about the React reasoning and acting and kind of like combining those together and getting better results. I'd been playing around with LLMs, been talking with people who were playing around with LLMs. People were trying to get LLMs to call into APIs, do things, and it was always, how can they do it more reliably and better? And so this paper was basically a step in that direction. And I think really interesting and also really general as well. Like I think that's part of the appeal is just how general and simple in a good way, I think the idea was. So that it was really appealing for all those reasons.Shunyu [00:04:07]: Simple is always good. Yeah.Alessio [00:04:09]: Do you have a favorite part? Because I have one favorite part from your PhD defense, which I didn't understand when I read the paper, but you said something along the lines, React doesn't change the outside or the environment, but it does change the insight through the context, putting more things in the context. You're not actually changing any of the tools around you to work for you, but you're changing how the model thinks. And I think that was like a very profound thing when I, not that I've been using these tools for like 18 months. I'm like, I understand what you meant, but like to say that at the time you did the PhD defense was not trivial. Yeah.Shunyu [00:04:41]: Another way to put it is like thinking can be an extra tool that's useful.Alessio [00:04:47]: Makes sense. Checks out.Swyx [00:04:49]: Who would have thought? I think it's also more controversial within his world because everyone was trying to use RL for agents. And this is like the first kind of zero gradient type approach. 
Yeah.Shunyu [00:05:01]: I think the bigger kind of historical context is that we have this two big branches of AI. So if you think about RL, right, that's pretty much the equivalent of agent at a time. And it's like agent is equivalent to reinforcement learning and reinforcement learning is equivalent to whatever game environment they're using, right? Atari game or go or whatever. So you have like a pretty much, you know, you have a biased kind of like set of methodologies in terms of reinforcement learning and represents agents. On the other hand, I think NLP is like a historical kind of subject. It's not really into agents, right? It's more about reasoning. It's more about solving those concrete tasks. And if you look at SEL, right, like each task has its own track, right? Summarization has a track, question answering has a track. So I think really it's about rethinking agents in terms of what could be the new environments that we came to have is not just Atari games or whatever video games, but also those text games or language games. And also thinking about, could there be like a more general kind of methodology beyond just designing specific pipelines for each NLP task? That's like the bigger kind of context, I would say.Alessio [00:06:14]: Is there an inspiration spark moment that you remember or how did you come to this? We had Trida on the podcast and he mentioned he was really inspired working with like systems people to think about Flash Attention. What was your inspiration journey?Shunyu [00:06:27]: So actually before React, I spent the first two years of my PhD focusing on text-based games, or in other words, text adventure games. It's a very kind of small kind of research area and quite ad hoc, I would say. And there are like, I don't know, like 10 people working on that at the time. And have you guys heard of Zork 1, for example? So basically the idea is you have this game and you have text observations, like you see a monster, you see a dragon.Swyx [00:06:57]: You're eaten by a grue.Shunyu [00:06:58]: Yeah, you're eaten by a grue. And you have actions like kill the grue with a sword or whatever. And that's like a very typical setup of a text game. So I think one day after I've seen all the GPT-3 stuff, I just think about, you know, how can I solve the game? Like why those AI, you know, machine learning methods are pretty stupid, but we are pretty good at solving the game relatively, right? So for the context, the predominant method to solve this text game is obviously reinforcement learning. And the idea is you just try out an arrow in those games for like millions of steps and you kind of just overfit to the game. But there's no language understanding at all. And I'm like, why can't I solve the game better? And it's kind of like, because we think about the game, right? Like when we see this very complex text observation, like you see a grue and you might see a sword, you know, in the right of the room and you have to go through the wooden door to go to that room. You will think, you know, oh, I have to kill the monster and to kill that monster, I have to get the sword, I have to get the sword, I have to go, right? And this kind of thinking actually helps us kind of throw shots off the game. And it's like, why don't we also enable the text agents to think? And that's kind of the prototype of React. And I think that's actually very interesting because the prototype, I think, was around November of 2021. So that's even before like chain of thought or whatever came up. 
So we did a bunch of experiments in the text game, but it was not really working that well. Like those text games are just too hard. I think today it's still very hard. Like if you use GPD 4 to solve it, it's still very hard. So the change came when I started the internship in Google. And apparently Google care less about text game, they care more about what's more practical. So pretty much I just reapplied the idea, but to more practical kind of environments like Wikipedia or simpler text games like Alphard, and it just worked. It's kind of like you first have the idea and then you try to find the domains and the problems to demonstrate the idea, which is, I would say, different from most of the AI research, but it kind of worked out for me in that case.Swyx [00:09:09]: For Harrison, when you were implementing React, what were people applying React to in the early days?Harrison [00:09:14]: I think the first demo we did probably had like a calculator tool and a search tool. So like general things, we tried to make it pretty easy to write your own tools and plug in your own things. And so this is one of the things that we've seen in LangChain is people who build their own applications generally write their own tools. Like there are a few common ones. I'd say like the three common ones might be like a browser, a search tool, and a code interpreter. But then other than that-Swyx [00:09:37]: The LMS. Yep.Harrison [00:09:39]: Yeah, exactly. It matches up very nice with that. And we actually just redid like our integrations docs page, and if you go to the tool section, they like highlight those three, and then there's a bunch of like other ones. And there's such a long tail of other ones. But in practice, like when people go to production, they generally have their own tools or maybe one of those three, maybe some other ones, but like very, very few other ones. So yeah, I think the first demos was a search and a calculator one. And there's- What's the data set?Shunyu [00:10:04]: Hotpot QA.Harrison [00:10:05]: Yeah. Oh, so there's that one. And then there's like the celebrity one by the same author, I think.Swyx [00:10:09]: Olivier Wilde's boyfriend squared. Yeah. 0.23. Yeah. Right, right, right.Harrison [00:10:16]: I'm forgetting the name of the author, but there's-Swyx [00:10:17]: I was like, we're going to over-optimize for Olivier Wilde's boyfriend, and it's going to change next year or something.Harrison [00:10:21]: There's a few data sets kind of like in that vein that require multi-step kind of like reasoning and thinking. So one of the questions I actually had for you in this vein, like the React paper, there's a few things in there, or at least when I think of that, there's a few things that I think of. There's kind of like the specific prompting strategy. Then there's like this general idea of kind of like thinking and then taking an action. And then there's just even more general idea of just like taking actions in a loop. Today, like obviously language models have changed a lot. We have tool calling. The specific prompting strategy probably isn't used super heavily anymore. Would you say that like the concept of React is still used though? Or like do you think that tool calling and running tool calling in a loop, is that ReactSwyx [00:11:02]: in your mind?Shunyu [00:11:03]: I would say like it's like more implicitly used than explicitly used. To be fair, I think the contribution of React is actually twofold. 
So first is this idea of, you know, we should be able to use calls in a very general way. Like there should be a single kind of general method to handle interaction with various environments. I think React is the first paper to demonstrate the idea. But then I think later there are two form or whatever, and this becomes like a trivial idea. But I think at the time, that's like a pretty non-trivial thing. And I think the second contribution is this idea of what people call like inner monologue or thinking or reasoning or whatever, to be paired with tool use. I think that's still non-trivial because if you look at the default function calling or whatever, like there's no inner monologue. And in practice, that actually is important, especially if the tool that you use is pretty different from the training distribution of the language model. I think those are the two main things that are kind of inherited.Harrison [00:12:10]: On that note, I think OpenAI even recommended when you're doing tool calling, it's sometimes helpful to put a thought field in the tool, along with all the actual acquired arguments,Swyx [00:12:19]: and then have that one first.Harrison [00:12:20]: So it fills out that first, and they've shown that that's yielded better results. The reason I ask is just like this same concept is still alive, and I don't know whether to call it a React agent or not. I don't know what to call it. I think of it as React, like it's the same ideas that were in the paper, but it's obviously a very different implementation at this point in time. And so I just don't know what to call it.Shunyu [00:12:40]: I feel like people will sometimes think more in terms of different tools, right? Because if you think about a web agent versus, you know, like a function calling agent, calling a Python API, you would think of them as very different. But in some sense, the methodology is the same. It depends on how you view them, right? I think people will tend to think more in terms of the environment and the tools rather than the methodology. Or, in other words, I think the methodology is kind of trivial and simple, so people will try to focus more on the different tools. But I think it's good to have a single underlying principle of those things.Alessio [00:13:17]: How do you see the surface of React getting molded into the model? So a function calling is a good example of like, now the model does it. What about the thinking? Now most models that you use kind of do chain of thought on their own, they kind of produce steps. Do you think that more and more of this logic will be in the model? Or do you think the context window will still be the main driver of reasoning and thinking?Shunyu [00:13:39]: I think it's already default, right? You do some chain of thought and you do some tool call, the cost of adding the chain of thought is kind of relatively low compared to other things. So it's not hurting to do that. And I think it's already kind of common practice, I would say.Swyx [00:13:56]: This is a good place to bring in either Tree of Thought or Reflection, your pick.Shunyu [00:14:01]: Maybe Reflection, to respect the time order, I would say.Swyx [00:14:05]: Any backstory as well, like the people involved with NOAA and the Princeton group. We talked about this offline, but people don't understand how these research pieces come together and this ideation.Shunyu [00:14:15]: I think Reflection is mostly NOAA's work, I'm more like advising kind of role. 
The story is, I don't remember the time, but one day we just see this pre-print that's like Reflection and Autonomous Agent with memory or whatever. And it's kind of like an extension to React, which uses this self-reflection. I'm like, oh, somehow you've become very popular. And NOAA reached out to me, it's like, do you want to collaborate on this and make this from an archive pre-print to something more solid, like a conference submission? I'm like, sure. We started collaborating and we remain good friends today. And I think another interesting backstory is NOAA was contacted by OpenAI at the time. It's like, this is pretty cool, do you want to just work at OpenAI? And I think Sierra also reached out at the same time. It's like, this is pretty cool, do you want to work at Sierra? And I think NOAA chose Sierra, but it's pretty cool because he was still like a second year undergrad and he's a very smart kid.Swyx [00:15:16]: Based on one paper. Oh my god.Shunyu [00:15:19]: He's done some other research based on programming language or chemistry or whatever, but I think that's the paper that got the attention of OpenAI and Sierra.Swyx [00:15:28]: For those who haven't gone too deep on it, the way that you present the inside of React, can you do that also for reflection? Yeah.Shunyu [00:15:35]: I think one way to think of reflection is that the traditional idea of reinforcement learning is you have a scalar reward and then you somehow back-propagate the signal of the scalar reward to the rest of your neural network through whatever algorithm, like policy grading or A2C or whatever. And if you think about the real life, most of the reward signal is not scalar. It's like your boss told you, you should have done a better job in this, but you could jump on that or whatever. It's not like a scalar reward, like 29 or something. I think in general, humans deal more with long scalar reward, or you can say language feedback. And the way that they deal with language feedback also has this back-propagation process, right? Because you start from this, you did a good job on job B, and then you reflect what could have been done differently to change to make it better. And you kind of change your prompt, right? Basically, you change your prompt on how to do job A and how to do job B, and then you do the whole thing again. So it's really like a pipeline of language where in self-graded descent, you have something like text reasoning to replace those gradient descent algorithms. I think that's one way to think of reflection.Harrison [00:16:47]: One question I have about reflection is how general do you think the algorithm there is? And so for context, I think at LangChain and at other places as well, we found it pretty easy to implement React in a standard way. You plug in any tools and it kind of works off the shelf, can get it up and running. I don't think we have an off-the-shelf kind of implementation of reflection and kind of the general sense. I think the concepts, absolutely, we see used in different kind of specific cognitive architectures, but I don't think we have one that comes off the shelf. I don't think any of the other frameworks have one that comes off the shelf. And I'm curious whether that's because it's not general enough or it's complex as well, because it also requires running it more times.Swyx [00:17:28]: Maybe that's not feasible.Harrison [00:17:30]: I'm curious how you think about the generality, complexity. 
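A compressed sketch of the reflection loop Shunyu describes: attempt, get language feedback from an evaluator, write a short self-reflection, and retry with the reflections prepended. The llm and evaluate stubs are placeholders; as he notes, this only works as well as the evaluator (unit tests, an error trace, a grader model).

```python
# Sketch of the Reflexion-style loop: language feedback in place of a scalar reward.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in any model client here")

def evaluate(task: str, attempt: str) -> tuple[bool, str]:
    # e.g. run generated code against tests and return (passed, error_text)
    raise NotImplementedError

def reflexion(task: str, max_trials: int = 3) -> str:
    reflections: list[str] = []
    attempt = ""
    for _ in range(max_trials):
        memory = "\n".join(f"- {r}" for r in reflections)
        attempt = llm(f"Task: {task}\nLessons from earlier attempts:\n{memory}\nAnswer:")
        ok, feedback = evaluate(task, attempt)
        if ok:
            return attempt
        reflections.append(
            llm(f"The attempt failed with feedback: {feedback}\n"
                f"In one or two sentences, what should be done differently next time?")
        )
    return attempt  # best effort after the trial budget
```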
Should we have one that comes off the shelf?Shunyu [00:17:36]: I think the algorithm is general in the sense that it's just as general as other algorithms, if you think about policy grading or whatever, but it's not applicable to all tasks, just like other algorithms. So you can argue PPO is also general, but it works better for those set of tasks, but not on those set of tasks. I think it's the same situation for reflection. And I think a key bottleneck is the evaluator, right? Basically, you need to have a good sense of the signal. So for example, if you are trying to do a very hard reasoning task, say mathematics, for example, and you don't have any tools, you're operating in this chain of thought setup, then reflection will be pretty hard because in order to reflect upon your thoughts, you have to have a very good evaluator to judge whether your thought is good or not. But that might be as hard as solving the problem itself or even harder. The principle of self-reflection is probably more applicable if you have a good evaluator, for example, in the case of coding. If you have those arrows, then you can just reflect on that and how to solve the bug andSwyx [00:18:37]: stuff.Shunyu [00:18:38]: So I think another criteria is that it depends on the application, right? If you have this latency or whatever need for an actual application with an end-user, the end-user wouldn't let you do two hours of tree-of-thought or reflection, right? You need something as soon as possible. So in that case, maybe this is better to be used as a training time technique, right? You do those reflection or tree-of-thought or whatever, you get a lot of data, and then you try to use the data to train your model better. And then in test time, you still use something as simple as React, but that's already improved.Alessio [00:19:11]: And if you think of the Voyager paper as a way to store skills and then reuse them, how would you compare this reflective memory and at what point it's just ragging on the memory versus you want to start to fine-tune some of them or what's the next step once you get a very long reflective corpus? Yeah.Shunyu [00:19:30]: So I think there are two questions here. The first question is, what type of information or memory are you considering, right? Is it like semantic memory that stores knowledge about the word, or is it the episodic memory that stores trajectories or behaviors, or is it more of a procedural memory like in Voyager's case, like skills or code snippets that you can use to do actions, right?Swyx [00:19:54]: That's one dimension.Shunyu [00:19:55]: And the second dimension is obviously how you use the memory, either retrieving from it, using it in the context, or fine-tuning it. I think the Cognitive Architecture for Language Agents paper has a good categorization of all the different combinations. And of course, which way you use depends on the concrete application and the concrete need and the concrete task. But I think in general, it's good to think of those systematic dimensions and all the possible options there.Swyx [00:20:25]: Harrison also has in LangMEM, I think you did a presentation in my meetup, and I think you've done it at a couple other venues as well. User state, semantic memory, and append-only state, I think kind of maps to what you just said.Shunyu [00:20:38]: What is LangMEM? Can I give it like a quick...Harrison [00:20:40]: One of the modules of LangChain for a long time has been something around memory. 
And I think we're still obviously figuring out what that means, as is everyone kind of in the space. But one of the experiments that we did, and one of the proofs of concept that we did was, technically what it was is you would basically create threads, you'd push messages to those threads in the background, and we process the data in a few ways. One, we put it into some semantic store, that's the semantic memory. And then two, we do some extraction and reasoning over the memories to extract key facts or anything that's of interest to the user, and we let the user define this. Those aren't exactly trajectories, they're maybe closer to the procedural memory. Is that how you'd think about it or classify it?Shunyu [00:21:22]: Is it like about knowledge about the world, or is it more like how to do something?Swyx [00:21:27]: It's reflections, basically.Harrison [00:21:28]: So in generative worlds.Shunyu [00:21:30]: Generative agents.Swyx [00:21:31]: The Smallville. Yeah, the Smallville one.Harrison [00:21:33]: So the way that they had their memory there was they had the sequence of events, and that's kind of like the raw events that happened. But then every N events, they'd run some synthesis over those events for the LLM to insert its own memory, basically. It's that type of memory.Swyx [00:21:49]: I don't know how that would be classified.Shunyu [00:21:50]: I think of that as more of the semantic memory, but to be fair, I think it's just one way to think of that. But whether it's semantic memory or procedural memory or whatever memory, that's like an abstraction layer. But in terms of implementation, you can choose whatever implementation for whatever memory. So they're totally kind of orthogonal. I think it's more of a good way to think of the things, because from the history of cognitive science and cognitive architecture and how people study even neuroscience, that's the way people think of how the human brain organizes memory. And I think it's more useful as a way to think of things. But it's not like for semantic memory, you have to do this kind of way to retrieve or fine-tune, and for procedural memory, you have to do that. I think those are totally orthogonal kind of dimensions.Harrison [00:22:34]: How much background do you have in cognitive sciences, and how much do you model some of your thoughts on it?Shunyu [00:22:40]: That's a great question, actually. I think one of the undergrad influences for my follow-up research is I was doing an internship at MIT's Computational Cognitive Science Lab with Josh Tenenbaum, and he's a very famous cognitive scientist. And I think a lot of his ideas still influence me today, like thinking of things in computational terms and getting interested in language and a lot of stuff, or even developmental psychology kind of stuff. So I think it still influences me today.Swyx [00:23:14]: As a developer that tried out LangMem, the way I view it is just that it's a materialized view of a stream of logs. And if anything, that's just useful for context compression. I don't have to use the full context to run it over everything. But also it's kind of debuggable. If it's wrong, I can show it to the user, the user can manually fix it, and I can carry on. That's a really good analogy. I like that. I'm going to steal that. Sure. Please, please. You know I'm bullish on memory databases. I guess, Tree of Thoughts? Yeah, Tree of Thoughts.Shunyu [00:23:39]: I feel like I'm reliving the defense in like a podcast format.
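A sketch of the background-memory pattern Harrison describes, combined with the generative-agents-style "every N events, synthesize" step. This is not the LangMem API; every name here is invented for illustration and `llm` is an assumed helper.

```python
# Sketch of the thread/extraction pattern described above: push messages to a
# thread, extract key facts into a semantic store, and every N events run a
# synthesis pass over the raw log (generative-agents style).

def llm(prompt: str) -> str:  # assumed helper
    raise NotImplementedError

class ThreadMemory:
    def __init__(self, synthesize_every: int = 10):
        self.events: list[str] = []        # raw, append-only event log
        self.facts: list[str] = []         # extracted "semantic" facts
        self.reflections: list[str] = []   # periodic syntheses over raw events
        self.synthesize_every = synthesize_every

    def push(self, message: str) -> None:
        self.events.append(message)
        # Extraction pass: pull out a user-defined "key fact" from the message.
        fact = llm(f"Extract one key fact worth remembering, or reply NONE:\n{message}")
        if fact.strip().upper() != "NONE":
            self.facts.append(fact.strip())
        # Every N events, synthesize a higher-level reflection over the recent log.
        if len(self.events) % self.synthesize_every == 0:
            window = "\n".join(self.events[-self.synthesize_every:])
            self.reflections.append(llm(f"Summarize what matters from these events:\n{window}"))
```

Read this way, the memory is exactly the "materialized view of a stream of logs" mentioned in the conversation: cheap to recompute, inspectable, and fixable by a human if the synthesis goes wrong.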
Yeah, no.Alessio [00:23:45]: I mean, you had a banger. Well, this is the one where you're already successful and we just highlight the glory. It was really good. You mentioned that since thinking is kind of like taking an action, you can use action-searching algorithms to search over thoughts. So just like you would use tree search to find the next thing. And the idea behind Tree of Thoughts is that you generate all these possible outcomes and then find the best path to get to the end. Maybe back to the latency question, you can't really do that if you have to respond in real time. So what are maybe some of the most helpful use cases for things like this? Where have you seen people adopt it where the high latency is actually worth the wait?Shunyu [00:24:21]: For things that you don't care about latency, obviously. For example, if you're trying to do math, if you're just trying to come up with a proof. But I feel like one type of task is more about searching for a solution. You can try a hundred times, but if you find one solution, that's good. For example, if you're finding a math proof or if you're finding good code to solve a problem or whatever. I think another type of task is more like reacting. For example, if you're doing customer service, or you're like a web agent booking a ticket for an end user. Those are more reactive kinds of tasks, or more real-time tasks. You have to do things fast. They might be easy, but you have to do them reliably. And you care more about, can you solve it 99% of the time out of a hundred. But for the search type of tasks, then you care more about, can I find one solution out of a hundred. So it's kind of symmetric and different.Alessio [00:25:11]: Do you have any data or intuition from your user base? What's the split of these types of use cases? How many people are doing more reactive things and how many people are experimenting with deep, long search?Harrison [00:25:23]: I would say ReAct is probably the most popular. I think there are aspects of Reflexion that get used. Tree of Thoughts, probably the least so. There's a great tweet from Jason Wei, I think he's now a colleague of yours, and he was talking about prompting strategies and how he thinks about them. And I think the four things that he had were, one, how easy is it to implement? How much compute does it take? How many tasks does it solve? And how much does it improve on those tasks? And I'd add a fifth, which is how likely is it to be relevant when the next generation of models comes out? And I think if you look at those axes and then you look at ReAct, Reflexion, Tree of Thoughts, it tracks that the ones that score better are used more. ReAct is pretty easy to implement. Tree of Thoughts is pretty hard to implement. The amount of compute, yeah, a lot more for Tree of Thoughts. The tasks and how much it improves, I don't have amazing visibility there. But I think if we're comparing ReAct versus Tree of Thoughts, ReAct just dominates the first two axes so much that my question around that was going to be like, how do you think about these prompting strategies, cognitive architectures, whatever you want to call them? When you're thinking of them, what are the axes that you're judging them on in your head when you're thinking whether it's a good one or a less good one?Swyx [00:26:38]: Right.Shunyu [00:26:39]: Right. I think there is a difference between a prompting method versus research, in the sense that for research, you don't really even care about does it actually work on practical tasks or does it help?
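For reference, the Tree of Thoughts idea Alessio summarizes can be compressed into a short breadth-first search over partial "thoughts"; `propose_thoughts` and `score_thought` stand in for LLM calls and are assumptions, not the paper's code.

```python
# Compressed sketch of the Tree-of-Thoughts idea: generate candidate partial
# "thoughts", score them with an evaluator, keep the best few, and expand again
# (a pruned breadth-first search over reasoning steps).

def propose_thoughts(problem: str, partial: str, k: int) -> list[str]:
    raise NotImplementedError   # assumed: LLM proposes k candidate next steps

def score_thought(problem: str, partial: str) -> float:
    raise NotImplementedError   # assumed: LLM (or heuristic) rates a partial solution

def tree_of_thoughts(problem: str, depth: int = 3, breadth: int = 5, keep: int = 2) -> str:
    frontier = [""]                                   # partial solutions so far
    for _ in range(depth):
        candidates = [
            partial + "\n" + thought
            for partial in frontier
            for thought in propose_thoughts(problem, partial, breadth)
        ]
        # Keep only the highest-scoring partial solutions (beam-style pruning).
        candidates.sort(key=lambda p: score_thought(problem, p), reverse=True)
        frontier = candidates[:keep]
    return frontier[0]
```

The latency point in the conversation falls straight out of this structure: the number of model calls grows with depth times breadth, which is fine for offline search problems and a poor fit for real-time, reactive tasks.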
Whatever. I think it's more about the idea or the principle, right? What is the direction that you're unblocking, and whatever. And I think for an actual prompting method to solve a concrete problem, I would say simplicity is very important because the simpler it is, the fewer decisions you have to make about it. And it's easier to design. It's easier to propagate. And it's easier to do stuff. So always try to be as simple as possible. And I think latency obviously is important. If you can do things fast, you don't want to do things slow. And I think in terms of the actual prompting method to use for a particular problem, I think we should all be in the minimalist kind of camp, right? You should try the minimum thing and see if it works. And if it doesn't work and there's an absolute reason to add something, then you add something, right? If there's an absolute reason that you need some tool, then you should add the tool thing. If there's an absolute reason to add reflection or whatever, you should add that. Otherwise, if a chain of thought can already solve something, then you don't even need to use any of that.Harrison [00:27:57]: Yeah. Or if just better prompting can solve it. Like, you know, you could add a reflection step or you could make your instructions a little bit clearer.Swyx [00:28:03]: And it's a lot easier to do that.Shunyu [00:28:04]: I think another interesting thing is, I personally have never done those kinds of weird tricks. I think all the prompts that I write are kind of like just talking to a human, right? It's like, I don't know. I never say something like, your grandma is dying and you have to solve it. I mean, those are cool, but I feel like we should all try to solve things in a very intuitive way. Just like talking to your co-worker. That should work 99% of the time. That's my personal take.Swyx [00:28:29]: The problem with language models, at least in the GPT-3 era, was that they over-optimized to some sets of tokens in sequence. So like reading the Kojima et al. paper, the "let's think step by step" one, he tried a bunch of them and they had wildly different results. It should not be the case, but it is the case. And hopefully we're getting better there.Shunyu [00:28:51]: Yeah. I think it's also like a timing thing in the sense that if you think about this whole line of language models, right? Like at the time it was just like a text generator. We didn't have any idea how it was going to be used, right? And obviously at the time you would find all kinds of weird issues because it's not trained to do any of that, right? But then I think we have this loop where once we realize chain of thought is important or agents are important or tool use is important, what we see is today's language models are heavily optimized towards those things. So I think in some sense they become more reliable and robust over those use cases. And you don't need as many prompt engineering tricks anymore to solve those things. I feel like in some sense, prompt engineering is even like a slightly negative word now because it refers to all those kinds of weird tricks that you have to apply. But I think we don't have to do that anymore. Like given today's progress, you should just be able to talk to it like a coworker. And if you're clear and concrete and being reasonable, then it should do reasonable things for you.Swyx [00:29:51]: Yeah.
The way I put this is you should not be a prompt engineer because it is the goal of the big labs to put you out of a job.Shunyu [00:29:58]: You should just be a good communicator. Like if you're a good communicator to humans, you should be a good communicator to languageSwyx [00:30:02]: models.Harrison [00:30:03]: That's the key though, because oftentimes people aren't good communicators to these language models and that is a very important skill and that's still messing around with the prompt. And so it depends what you're talking about when you're saying prompt engineer.Shunyu [00:30:14]: But do you think it's like very correlated with, are they a good communicator to humans? You know, it's like.Harrison [00:30:20]: It may be, but I also think I would say on average, people are probably worse at communicating with language models than with humans right now, at least, because I think we're still figuring out how to do it. You kind of expect it to be magical and there's probably some correlation, but I'd say there's also just, like, people are worse at it right now than talking to humans.Shunyu [00:30:36]: We should make it like a, you know, like an elementary school class or whatever, how toSwyx [00:30:41]: talk to language models. Yeah. I don't know. Very pro that. Yeah. Before we leave the topic of trees and searching, not specifically about Q*, but there are a lot of questions about MCTS and this combination of tree search and language models. And I just had to get in a question there about how seriously should people take this?Shunyu [00:30:59]: Again, I think it depends on the tasks, right? So MCTS was magical for Go, but it's probably not as magical for robotics, right? So I think right now the problem is not even that we don't have good methodologies, it's more that we don't have good tasks. It's also very interesting, right? Because if you look at my citations, obviously the most cited are ReAct, Reflexion and Tree of Thoughts. Those are methodologies. But I think an equally important, if not more important, line of my work is benchmarks and environments, right? Like WebShop or SWE-bench or whatever. And I think in general, what people do in academia that I think is not good is they choose a very simple task, like ALFWorld, and then they apply overly complex methods to show they improve 2%. I think you should probably match the level of complexity of your task and your method. I feel like tasks are kind of far behind the methods in some sense, right? Because we have some good test-time approaches, like whatever, ReAct or Reflexion or Tree of Thoughts, and there are many, many more complicated test-time methods afterwards. But on the benchmark side, we have made a lot of good progress this year, last year. But I think we still need more progress towards that, like better coding benchmarks, better web agent benchmarks, better agent benchmarks, not even for web or code. I think in general, we need to catch up with tasks.Harrison [00:32:27]: What are the biggest reasons in your mind why it lags behind?Shunyu [00:32:31]: I think incentive is one big reason. Like if you see, you know, all the method papers are cited like a hundred times more than the task papers. And also making a good benchmark is actually quite hard. It's almost like a different set of skills in some sense, right? I feel like if you want to build a good benchmark, you need to have like a good kind of product manager mindset, right?
You need to think about why people should use your benchmark, why it's challenging, why it's useful. If you think about like a PhD going into a school, right? The prior skills they're expected to have are more about, you know, can they code this method and can they just run experiments and solve that? I think building a benchmark is not the typical prior skill that we have, but I think things are getting better. I think more and more people are starting to build benchmarks and people are saying that it's like a way to get more impact in some sense, right? Because if you have a really good benchmark, a lot of people are going to use it. But if you have a super complicated test-time method, it's very hard for people to use it.Harrison [00:33:35]: Are evaluation metrics also part of the reason? Like for some of these tasks that we might want to ask these agents or language models to do, is it hard to evaluate them? And so it's hard to get an automated benchmark. Obviously with SWE-bench you can, and with coding, it's easier, but.Shunyu [00:33:50]: I think that's part of the skillset thing that I mentioned, because I feel like it's like a product manager thing, because there are many dimensions and you need to strike a balance and it's really hard, right? If you want to make something very easy to autograde, like automatically gradable, easy to grade or easy to evaluate, then you might lose some of the realness or practicality. Or it might be practical, but it might not be as scalable, right? For example, if you think about a text game, humans have pre-annotated all the rewards and all the language is real. So it's pretty good on the autogradable dimension and the practical dimension, you know, it's practical, like it's actual English. But it's not scalable, right? It takes like a year for experts to build that game. So it's not really that scalable. And I think part of the reason that SWE-bench is so popular now is it kind of hits the balance between these three dimensions, right? Easy to evaluate, and being actually practical, and being scalable. Like if I were to criticize some of my prior work, I think WebShop, which was my initial attempt to get into the benchmark world, I was trying to do a good job striking the balance. But obviously we made it autogradable and it's really scalable, but then I think the practicality is not as high as actually just using GitHub issues, right? Because you're just creating those synthetic tasks.Harrison [00:35:13]: Are there other areas besides coding that jump to mind as being really good for being autogradable?Shunyu [00:35:20]: Maybe mathematics.Swyx [00:35:21]: Classic. Yeah. Do you have thoughts on AlphaProof, the new DeepMind paper? I think it's pretty cool.Shunyu [00:35:29]: I think it's more of a, you know, it's more of like a confidence boost, or like sometimes, you know, the work is not even about, you know, the technical details or the methodology that it chooses or the concrete results. I think it's more about a signal, right?Swyx [00:35:47]: Yeah. Existence proof. Yeah.Shunyu [00:35:50]: Yeah. It can be done. This direction is exciting. It kind of encourages people to work more towards that direction. I think it's more like a boost of confidence, I would say.Swyx [00:35:59]: Yeah. So we're going to focus more on agents now and, you know, all of us have a special interest in coding agents. I would consider Devin to be the sort of biggest launch of the year as far as AI startups go.
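A toy example of what "autogradable" means in the coding setting: the project's own test suite is the grader. The repo path, patch file, and test command here are placeholders, not the actual SWE-bench harness.

```python
# Toy illustration of autograding a coding task: apply a candidate patch and
# let the existing test suite decide pass/fail. Paths and commands are
# hypothetical placeholders.
import subprocess

def grade_patch(repo_dir: str, patch_file: str, test_cmd: list[str]) -> bool:
    applied = subprocess.run(["git", "-C", repo_dir, "apply", patch_file])
    if applied.returncode != 0:
        return False                       # the patch didn't even apply
    result = subprocess.run(test_cmd, cwd=repo_dir)
    return result.returncode == 0          # grading comes for free from the tests

# Example (hypothetical paths):
# grade_patch("/tmp/some_repo", "/tmp/candidate.patch", ["pytest", "-x"])
```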
And you guys in the Princeton group worked on SWE-agent alongside SWE-bench. Tell us the story about SWE-agent. Sure.Shunyu [00:36:21]: I think it's kind of like a trilogy, it's actually a series of three works now. So actually the first work is called InterCode, but it's not as famous, I know. And the second work is called SWE-bench and the third work is called SWE-agent. And I was just really confused why nobody was working on coding. You know, this was like a year ago, but I mean, not everybody's working on coding, obviously, but a year ago, like literally nobody was working on coding. I was really confused. And the people that were working on coding were, you know, trying to solve HumanEval in like a seq-to-seq way. There's no agent, there's no chain of thought, there's no anything, they're just, you know, fine-tuning the model and improving some points and whatever. I was really confused because obviously coding is the best application for agents because it's autogradable, it's super important, you can make everything like an API or code action, right? So I was confused and I collaborated with some of the students in Princeton and we have this work called InterCode and the idea is, first, if you care about coding, then you should solve coding in an interactive way, meaning more like a Jupyter Notebook kind of way than just writing a program and seeing if it fails or succeeds and stopping, right? You should solve it in an interactive way because that's exactly how humans solve it, right? You don't have to, you know, write a program like next token, next token, next token and stop and never do any edits and you cannot really use any terminal or whatever tool. It doesn't make sense, right? And that's the way people were solving coding at the time, basically like sampling a program from a language model without chain of thought, without tool calls, without refactoring, without anything. So the first point is we should solve coding in a very interactive way and that's a very general principle that applies for various coding benchmarks. And also, I think you can make a lot of agent tasks kind of like interactive coding. If you have Python and you can call any package, then you can literally also browse the internet or do whatever you want, like control a robot or whatever. So that seems to be a very general paradigm. But obviously I think a bottleneck is that at the time we were still doing, you know, very simple tasks like HumanEval or whatever coding benchmark people proposed. They were super hard in 2021, like 20%, but they were like 95% already in 2023. So obviously the next step is we need a better benchmark. And Carlos and John, who are the first authors of SWE-bench, I think they came up with this great idea that we should just scrape GitHub and solve whatever human engineers are solving. And I think it's actually pretty easy to come up with the idea. And I think in the first week, they already made a lot of progress. They scraped GitHub and they put it all together, but then there's a lot of painful infra work and whatever, you know. I think the idea is super easy, but the engineering is super hard. And I feel like that's a very typical signal of good work in the AI era now.Swyx [00:39:17]: I think also, I think the filtering was challenging, because if you look at open source PRs, a lot of them are just like, you know, fixing typos. I think it's challenging.Shunyu [00:39:27]: And to be honest, we didn't do a perfect job at the time.
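A minimal sketch of the interactive coding loop described here: the model writes code, the harness runs it, and stdout/stderr flow back as the next observation. `llm` is an assumed helper and this is not the InterCode framework itself.

```python
# Minimal sketch of interactive coding: instead of sampling one program and
# stopping, the agent executes code, observes stdout/stderr, and revises,
# closer to how a human works in a notebook or terminal.
import subprocess

def llm(prompt: str) -> str:
    raise NotImplementedError   # assumed helper

def interactive_coding(task: str, max_turns: int = 5) -> str:
    history = f"Task: {task}\n"
    code = ""
    for _ in range(max_turns):
        code = llm(history + "Write or revise Python code for the task. Code only:")
        run = subprocess.run(
            ["python", "-c", code], capture_output=True, text=True, timeout=30
        )
        observation = (run.stdout + run.stderr).strip()
        history += f"\nCode:\n{code}\nOutput:\n{observation}\n"
        if run.returncode == 0:
            break                          # crude stopping rule for the sketch
    return code
```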
So if you look at the recent blog post with OpenAI, we improved the filtering so that it's more solvable.Swyx [00:39:36]: I think OpenAI was just like, look, this is a thing now. We have to fix this. These students just rushed it.Shunyu [00:39:45]: It's a good convergence of interests for me.Alessio [00:39:48]: Was that tied to you joining OpenAI? Or was that just unrelated?Shunyu [00:39:52]: It's a coincidence for me, but it's a good coincidence.Swyx [00:39:55]: There is a history of anytime a big lab adopts a benchmark, they fix it. Otherwise, it's a broken benchmark.Shunyu [00:40:03]: So naturally, once we proposed SWE-bench, the next step is to solve it. But I think the typical way you solve something now is you collect some training samples, or you design some complicated agent method, and then you try to solve it. Either a super complicated prompt, or you build a better model with more training data. But I think at the time, we realized that even before those things, there's a fundamental problem with the interface or the tool that you're supposed to use. Because that's like an ignored problem in some sense. What your tool is, or how that matters for your task. So what we found concretely is that if you just use the text terminal off the shelf as a tool for those agents, there are a lot of problems. For example, if you edit something, there's no feedback. So you don't know whether your edit is good or not. That makes the agent very confused and it makes a lot of mistakes. There are a lot of small problems, you would say. Well, you can try to do prompt engineering and improve that, but it turns out to be actually very hard. We realized that the interface design is actually a very overlooked part of agent design. So we did this SWE-agent work. And the key idea is just, even before you talk about what the agent is, you should talk about what the environment is. You should make sure that the environment is actually friendly to whatever agent you're trying to apply. That's the same idea for humans. A text terminal is good for some tasks, like git pull or whatever. But it's not good if you want to look at a browser and whatever. Also, a browser is a good tool for some tasks, but it's not a good tool for other tasks. We need to talk about how to design the interface, in some sense, where we should treat agents as our customers. It's like when we treat humans as customers, we design human-computer interfaces. We design those beautiful desktops or browsers or whatever, so that it's very intuitive and easy for humans to use. And this whole great subject of HCI is all about that. I think now the research idea of SWE-agent is just, we should treat agents as our customers. And we should do like, you know… ACI.Swyx [00:42:16]: ACI, exactly.Harrison [00:42:18]: So what are the tools that a SWE-agent should have, or a coding agent in general should have?Shunyu [00:42:24]: For SWE-agent, it's like a modified text terminal, which kind of adapts to a lot of the patterns of language models to make it easier for language models to use. For example, now for edit, instead of having no feedback, it will actually have feedback like, actually here you introduced a syntax error, and you should probably want to fix that, and there's an indentation error there. And that makes it super easy for the model to actually do that. And there are other small things, like how exactly you write arguments, right? Like, do you want to write a multi-line edit, or do you want to write a single-line edit?
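To make the ACI point concrete, here is a simplified sketch of a feedback-rich edit tool in the spirit Shunyu describes, where every edit is followed by a cheap syntax check and the error message is returned to the agent instead of failing silently. It is not SWE-agent's actual edit command.

```python
# Simplified sketch of a feedback-rich "edit" tool for a coding agent: the edit
# is only applied if the resulting file still parses, and either way the agent
# gets an explicit message it can act on.
import ast

def edit_file(path: str, start: int, end: int, replacement: str) -> str:
    with open(path) as f:
        lines = f.readlines()
    if not replacement.endswith("\n"):
        replacement += "\n"
    lines[start - 1:end] = [replacement]
    new_source = "".join(lines)
    try:
        ast.parse(new_source)              # cheap lint: catch syntax errors immediately
    except SyntaxError as e:
        return (f"Edit NOT applied. It would introduce a syntax error at "
                f"line {e.lineno}: {e.msg}. Fix the replacement text and retry.")
    with open(path, "w") as f:
        f.write(new_source)
    return f"Edit applied to {path}, lines {start}-{end}. File still parses cleanly."
```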
I think it's more interesting to think about the development process of an ACI rather than the actual ACI for a concrete application. Because I think the general paradigm is very similar to HCI and psychology, right? Basically, for how people develop HCIs, they do behavior experiments on humans, right? They do A/B tests, right? Like, which interface is actually better? And they do those behavior experiments, kind of like psychology experiments on humans, and they change things. And I think what's really interesting for me, for this SWE-agent paper, is we can probably do the same thing for agents, right? We can do A/B tests for those agents and do behavior tests. And through the process, we not only invent better interfaces for those agents, that's the practical value, but we also better understand agents. Just like when we do those A/B tests, when we do HCI, we better understand humans. Doing those ACI experiments, we actually better understand agents. And that's pretty cool.Harrison [00:43:51]: Besides that A/B testing, what are other processes that people can use to think about this in a good way?Swyx [00:43:57]: That's a great question.Shunyu [00:43:58]: And I think SWE-agent is an initial work. And what we do is kind of the naive approach, right? You just try some interface, and you see what's going wrong, and then you try to fix that. We do this kind of iterative fixing. But I think what's really interesting is there will be a lot of future directions that are very promising if we can apply some of the HCI principles more systematically to the interface design. I think that would be a very cool interdisciplinary research opportunity.Harrison [00:44:26]: You talked a lot about agent-computer interfaces and interactions. What about human-to-agent UX patterns? Curious for any thoughts there that you might have.Swyx [00:44:38]: That's a great question.Shunyu [00:44:39]: And in some sense, I feel like prompt engineering is about the human-to-agent interface. But I think there can be a lot of interesting research done about... So prompting is about how humans can better communicate with the agent. But I think there could be interesting research on how agents can better communicate with humans, right? When to ask questions, how to ask questions, what's the frequency of asking questions. And I think those kinds of things could be very cool research.Harrison [00:45:07]: Yeah, I think some of the most interesting stuff that I saw here was also related to coding with Devin from Cognition. And they had the three or four different panels where you had the chat, the browser, the terminal, and I guess the code editor as well.Swyx [00:45:19]: There's more now.Harrison [00:45:19]: There's more. Okay, I'm not up to date. Yeah, I think they also did a good job on ACI.Swyx [00:45:25]: I think that's the main learning I have from Devin. They cracked that. Actually, there was no foundational planning breakthrough. The planner is actually pretty simple, but it's the ACI that they broke through on.Shunyu [00:45:35]: I think making the tool good and reliable is probably like 90% of the whole agent. Once the tool is actually good, then the agent design can be much, much simpler. On the other hand, if the tool is bad, then no matter how much you put into the agent design, planning or search or whatever, it's still going to be trash.Harrison [00:45:53]: Yeah, I'd argue the same. Same with like context and instructions.
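The behavioral-experiment idea in miniature: run the same tasks through two interface variants and compare success rates, the way you would A/B test a UI on humans. `run_agent` and the variant names are placeholders, not any benchmark's real harness.

```python
# Miniature "A/B test for agents": same tasks, two interface variants, compare
# success rates. Everything here is a placeholder for illustration.

def run_agent(task: str, interface: str) -> bool:
    raise NotImplementedError   # assumed: returns True if the agent solved the task

def ab_test_interfaces(tasks: list[str], variant_a: str, variant_b: str) -> dict[str, float]:
    wins = {variant_a: 0, variant_b: 0}
    for task in tasks:
        for variant in (variant_a, variant_b):
            wins[variant] += run_agent(task, variant)
    return {variant: wins[variant] / len(tasks) for variant in wins}

# e.g. ab_test_interfaces(tasks, "terminal_no_feedback", "terminal_with_lint_feedback")
```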
Like, yeah, they go hand in hand.Alessio [00:46:00]: On the tool, how do you think about the tension of, for both of you, I mean, you're building a library, so even more for you, the tension between making a language or a library that is easy for the agent to grasp and write versus one that is easy for the human to grasp and write. Because, you know, the trend is that more and more code gets written by the agent. So why wouldn't you optimize the framework to be as easy as possible for the model versus for the person?Swyx [00:46:24]: I think it's possible to design an interfaceShunyu [00:46:25]: that's both friendly to humans and agents. But what do you think?Harrison [00:46:29]: We haven't thought about that from that perspective, like we're not trying to design LangChain or LangGraph to be friendly. But I mean, I think, to be friendly for agents to write.Swyx [00:46:42]: But I mean, I think we see this with like,Harrison [00:46:43]: I saw some paper that used TypeScript notation instead of JSON notation for tool calling and it got a lot better performance. So it's definitely a thing. I haven't really heard of anyone designing a syntax or a language explicitly for agents, but there are clearly syntaxes that are better.Shunyu [00:46:59]: I think function calling is a good example where it's a good interface for both human programmers and for agents, right? Like for developers, it's actually a very friendly interface because it's very concrete and you don't have to do prompt engineering anymore. You can be very systematic. And for models, it's also pretty good, right? Like it can use all the existing coding content. So I think we need more of those kinds of designs.Swyx [00:47:21]: I will mostly agree and I'll slightly disagree in terms of this, which is whether designing for humans also overlaps with designing for AI. So Malte Ubl, who's the CTO of Vercel, who is creating basically the JavaScript competitor to LangChain, they're observing that basically, if the API is easy to understand for humans, it's actually much easier to understand for LLMs, for example, because they're not overloaded functions. They don't behave differently under different contexts. They do one thing and they always work the same way. It's easy for humans, it's easy for LLMs. And like that makes a lot of sense. And obviously adding types is another one. Like type annotations only help give extra context, which is really great. So that's the agreement. And then a disagreement is that when I use structured output to do my chain of thought, I have found that I change my field names to hint to the LLM what the field is supposed to do. So instead of saying topics, I'll say candidate topics. And that gives me a better result because the LLM was like, ah, this is just a draft thing I can use for chain of thought. And instead of summaries, I'll say topic summaries to link the previous field to the current field. So little stuff like that, I find myself optimizing for the LLM where I, as a human, would never do that. Interesting.Shunyu [00:48:32]: It's kind of like the way you optimize the prompt, it might be different for humans and for machines. You can have a common ground that's both clear for humans and agents, but to improve the human performance versus improving the agent performance, they might move in different directions.Swyx [00:48:48]: Might move in different directions.
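One way to express the field-naming trick Swyx describes is to put the hint directly in a structured-output schema; Pydantic is used here only as a convenient way to write a JSON schema, and the specific field names come from the example in the conversation.

```python
# The schema itself hints the model that one field is scratch "chain of thought"
# and the next is the final answer: "candidate_topics" reads as a draft,
# "topic_summaries" explicitly links back to the previous field.
from pydantic import BaseModel, Field

class TopicExtraction(BaseModel):
    candidate_topics: list[str] = Field(
        description="Rough draft list of possible topics; used as scratch reasoning."
    )
    topic_summaries: list[str] = Field(
        description="One-sentence summary for each topic kept from candidate_topics."
    )

# TopicExtraction.model_json_schema() can then be passed to whatever
# structured-output or function-calling mechanism you are using.
```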
There's a lot more use of metadata as well, like descriptions, comments, code comments, annotations and stuff like that. Yeah.Harrison [00:48:56]: I would argue that's just you communicatingSwyx [00:48:58]: to the agent what it should do.Harrison [00:49:00]: And maybe you need to communicate a little bit more than to humans because models aren't quite good enough yet.Swyx [00:49:06]: But like, I don't think that's crazy.Harrison [00:49:07]: I don't think that's like- It's not crazy.Swyx [00:49:09]: I will bring this in because it just happened to me yesterday. I was at the Cursor office. They held their first user meetup and I was telling them about the LLM OS concept and why basically every interface, every tool was being redesigned for AIs to use rather than humans. And they're like, why? Like, can we just use Bing and Google for LLM search? Why must I use Exa? Or what's the other one that you guys work with?Harrison [00:49:32]: Tavily.Swyx [00:49:33]: Tavily. A web search API dedicated for LLMs. What's the difference?Shunyu [00:49:36]: Exactly. Compared to the Bing API.Swyx [00:49:38]: Exactly.Harrison [00:49:38]: There weren't great APIs for search. Like the best one, the one that we used initially in LangChain, was SerpAPI, which is like maybe illegal. I'm not sure.Swyx [00:49:49]: And like, you know,Harrison [00:49:52]: and now there are like venture-backed companies.Swyx [00:49:53]: Shout out to DuckDuckGo, which is free.Harrison [00:49:55]: Yes, yes.Swyx [00:49:56]: Yeah.Harrison [00:49:56]: I do think there are some differences though. I think you want, like, I think generally these APIs try to return small amounts of text information, clear legible fields. It's not a massive JSON blob. And I think that matters. I think when you talk about designing tools, it's the interface in its entirety, not only the inputs, but also the outputs that really matter. And so I think they try to make the outputs.Shunyu [00:50:18]: They're doing ACI.Swyx [00:50:19]: Yeah, yeah, absolutely.Harrison [00:50:20]: Really?Swyx [00:50:21]: Like there's a whole set of industries that are just being redone for ACI. It's weird. And so my simple answer to them was like the error messages. When you give error messages, they should be basically prompts for the LLM to take and then self-correct. Then your error messages get more verbose, actually, than you normally would with a human. Stuff like that. Like a little, honestly, it's not that big. Again, like, is this worth a venture-backed industry? Unless you can tell us. But like, I think Code Interpreter, I think is a new thing. I hope so.Alessio [00:50:52]: We invested in it to be so.Shunyu [00:50:53]: I think that's a very interesting point. If you're trying to optimize to the extreme, then obviously they're going to be different. For example, the error...Swyx [00:51:00]: Because we take it very seriously. Right.Shunyu [00:51:01]: The error for a language model, the longer the better. But for humans, that will make them very nervous and very tired, right? But I guess the point is more like, maybe we should try to find a co-optimized common ground as much as possible. And then if we have divergence, then we should try to diverge. But it's more philosophical now.Alessio [00:51:19]: But I think part of it is how you use it. So Google invented PageRank because ideally you only click on one link, you know, like the top three should have the answer. But with models, it's like, well, you can get 20.
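A sketch of the "error messages are prompts" point: wrap a tool so that a terse failure becomes a verbose, corrective message the model can act on, and return many results at once rather than paging the way a human UI would. `search_api` is a made-up placeholder.

```python
# Wrap a tool so its failures read like prompts for self-correction, and its
# successes hand back a wide set of candidates for the model to sift in context.
# `search_api` is hypothetical; no real search provider's API is shown here.

def search_api(query: str) -> list[str]:
    raise NotImplementedError

def llm_friendly_search(query: str) -> str:
    try:
        results = search_api(query)
    except Exception as e:
        return (
            f"The search call failed with: {e!r}. "
            "Likely causes: the query was empty, too long, or contained "
            "unsupported operators. Rewrite the query as a short plain-language "
            "phrase and call the tool again."
        )
    if not results:
        return ("No results. Try a broader query: drop quotes, remove filters, "
                "or use fewer keywords, then call the tool again.")
    # Return many results at once and let the model decide what is relevant.
    return "\n".join(f"[{i}] {r}" for i, r in enumerate(results[:20]))
```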
So those searches are more like semantic grouping in a way. It's like for this query, I'll return you like 20, 30 things that are kind of good, you know? So it's less about ranking and it's more about grouping.Shunyu [00:51:42]: Another fundamental thing about HCI is the difference between humans' and machines' kind of memory limits, right? So I think what's really interesting about this concept of HCI versus ACI is that from interfaces that are optimized for them, you can kind of understand some of the fundamental characteristics and differences of humans and machines, right? Why, you know, if you look at find or whatever terminal command, you can only look at one thing at a time, and that's because we have a very small working memory. You can only deal with one thing at a time. You can only look at one paragraph of text at the same time. So the interface for us is, by design, a small piece of information, but more temporal steps. But for machines, that should be the opposite, right? You should just give them a hundred different results and they should just decide in context what's the most relevant stuff and trade off the context for temporal steps. That's actually also better for language models because the cost is smaller or whatever. So it's interesting to connect those interfaces to the fundamental kind of differences of those.Harrison [00:52:43]: When you said earlier, you know, we should try to design these to maybe be as similar as possible and diverge if we need to.Swyx [00:52:49]: I actually don't have a problem with them diverging nowHarrison [00:52:51]: and seeing venture-backed startups emerging now, because we are different from machines and from AI. And it's just so early on, like they may still look kind of similar and there may still be small differences, but it's still just so early. And I think we'll only discover more ways that they differ. And so I'm totally fine with them kind of diverging earlySwyx [00:53:10]: and optimizing for the...Harrison [00:53:11]: I agree. I think it's more like, you know,Shunyu [00:53:14]: we should obviously try to optimize the human interface just for humans. We've already been doing that for 50 years. We should optimize the agent interface just for agents, but we might also try to co-optimize both and see how far we can get. There's enough people to try all three directions. Yeah.Swyx [00:53:31]: There's a thesis I sometimes push, which is the sour lesson as opposed to the bitter lesson, which is, we're always inspired by human development, but actually AI develops its own path.Shunyu [00:53:40]: Right. We need to understand better, you know, what are the fundamental differences between those creatures.Swyx [00:53:45]: It's funny, really early on this pod, you were like, how much grounding do you have in cognitive development and human brain stuff? And I'm like

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
From API to AGI: Structured Outputs, OpenAI API platform and O1 Q&A — with Michelle Pokrass & OpenAI Devrel + Strawberry team

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Sep 13, 2024 122:59


Congrats to Damien on successfully running AI Engineer London! See our community page and the Latent Space Discord for all upcoming events.

This podcast came together in a far more convoluted way than usual, but happens to result in a tight 2 hours covering the ENTIRE OpenAI product suite across ChatGPT-latest, GPT-4o and the new o1 models, and how they are delivered to AI Engineers in the API via the new Structured Output mode, Assistants API, client SDKs, upcoming Voice Mode API, Finetuning/Vision/Whisper/Batch/Admin/Audit APIs, and everything else you need to know to be up to speed in September 2024.

This podcast has two parts: the first hour is a regular, well-edited podcast on 4o, Structured Outputs, and the rest of the OpenAI API platform. The second was a rushed, noisy, hastily cobbled together recap of the top takeaways from the o1 model release from yesterday and today.

Building AGI with Structured Outputs — Michelle Pokrass of OpenAI API team

Michelle Pokrass built massively scalable platforms at Google, Stripe, Coinbase and Clubhouse, and now leads the API Platform at OpenAI. She joins us today to talk about why structured output is such an important modality for AI Engineers that OpenAI has now trained and engineered a Structured Output mode with 100% reliable JSON schema adherence. To understand why this is important, a bit of history helps:

* June 2023: OpenAI first added a "function calling" capability to GPT-4-0613 and GPT-3.5 Turbo 0613 (our podcast/writeup here)
* November 2023: OpenAI Dev Day (our podcast/writeup here), where the team shipped JSON Mode, a simpler schema-less JSON output mode that nevertheless became more popular because function calling often failed to match the JSON schema given by developers.
* Meanwhile, in open source, many solutions arose, including:
  * Instructor (our pod with Jason here)
  * LangChain (our pod with Harrison here, and he is returning next as a guest co-host)
  * Outlines (Remi Louf's talk at AI Engineer here)
  * Llama.cpp's constrained grammar sampling using GGML-BNF
* April 2024: OpenAI started implementing constrained sampling with a new `tool_choice: required` parameter in the API
* August 2024: the new Structured Output mode, co-led by Michelle
* Sept 2024: Gemini shipped Structured Outputs as well

We sat down with Michelle to talk through every part of the process, as well as quizzing her for updates on everything else the API team has shipped in the past year, from the Assistants API, to Prompt Caching, GPT-4 Vision, Whisper, the upcoming Advanced Voice Mode API, OpenAI Enterprise features, and why every Waterloo grad seems to be a cracked engineer.

Part 1 Timestamps and Transcript

Transcript here.

* [00:00:42] Episode Intro from Suno
* [00:03:34] Michelle's Path to OpenAI
* [00:12:20] Scaling ChatGPT
* [00:13:20] Releasing Structured Output
* [00:16:17] Structured Outputs vs Function Calling
* [00:19:42] JSON Schema and Constrained Grammar
* [00:20:45] OpenAI API team
* [00:21:32] Structured Output Refusal Field
* [00:24:23] ChatML issues
* [00:26:20] Function Calling Evals
* [00:28:34] Parallel Function Calling
* [00:29:30] Increased Latency
* [00:30:28] Prompt/Schema Caching
* [00:30:50] Building Agents with Structured Outputs: from API to AGI
* [00:31:52] Assistants API
* [00:34:00] Use cases for Structured Output
* [00:37:45] Prompting Structured Output
* [00:39:44] Benchmarking Prompting for Structured Outputs
* [00:41:50] Structured Outputs Roadmap
* [00:43:37] Model Selection vs GPT4 Finetuning
* [00:46:56] Is Prompt Engineering Dead?
* [00:47:29] 2 models: ChatGPT Latest vs GPT 4o August
* [00:50:24] Why API => AGI
* [00:52:40] Dev Day
* [00:54:20] Assistants API Roadmap
* [00:56:14] Model Reproducibility/Determinism issues
* [00:57:53] Tiering and Rate Limiting
* [00:59:26] OpenAI vs Ops Startups
* [01:01:06] Batch API
* [01:02:54] Vision
* [01:04:42] Whisper
* [01:07:21] Voice Mode API
* [01:08:10] Enterprise: Admin/Audit Log APIs
* [01:09:02] Waterloo grads
* [01:10:49] Books
* [01:11:57] Cognitive Biases
* [01:13:25] Are LLMs Econs?
* [01:13:49] Hiring at OpenAI

Emergency O1 Meetup — OpenAI DevRel + Strawberry team

The following is our writeup from AINews, which so far stands the test of time.

o1, aka Strawberry, aka Q*, is finally out! There are two models we can use today: o1-preview (the bigger one priced at $15 in / $60 out) and o1-mini (the STEM-reasoning focused distillation priced at $3 in / $12 out), and the main o1 model is still in training. This caused a little bit of confusion.

There are a raft of relevant links, so don't miss:

* the o1 Hub
* the o1-preview blogpost
* the o1-mini blogpost
* the technical research blogpost
* the o1 system card
* the platform docs
* the o1 team video and contributors list (twitter)

In line with the many, many leaks leading up to today, the core story is longer "test-time inference", aka longer step-by-step responses. In the ChatGPT app this shows up as a new "thinking" step that you can click to expand for reasoning traces, even though, controversially, they are hidden from you (interesting conflict of interest…).

Under the hood, o1 is trained for adding new reasoning tokens, which you pay for, and OpenAI has accordingly extended the output token limit to >30k tokens (incidentally this is also why a number of API parameters from the other models, like temperature and role and tool calling and streaming, but especially max_tokens, are no longer supported).

The evals are exceptional. OpenAI o1:

* ranks in the 89th percentile on competitive programming questions (Codeforces),
* places among the top 500 students in the US in a qualifier for the USA Math Olympiad (AIME),
* and exceeds human PhD-level accuracy on a benchmark of physics, biology, and chemistry problems (GPQA).

You are used to new models showing flattering charts, but there is one of note that you don't see in many model announcements, and it is probably the most important chart of all. Dr Jim Fan gets it right: we now have scaling laws for test-time compute, and it looks like they scale loglinearly.

We unfortunately may never know the drivers of the reasoning improvements, but Jason Wei shared some hints.

Usually the big model gets all the accolades, but notably many are calling out the performance of o1-mini for its size (smaller than GPT-4o), so do not miss that.

Part 2 Timestamps

* [01:15:01] O1 transition
* [01:16:07] O1 Meetup Recording
* [01:38:38] OpenAI Friday AMA recap
* [01:44:47] Q&A Part 2
* [01:50:28] O1 Demos

Demo Videos to be posted shortly

Get full access to Latent Space at www.latent.space/subscribe
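A hedged sketch of calling the Structured Outputs mode described above from the OpenAI Python SDK; the method and parameter names follow the docs from around the time of this episode, so check the current SDK before relying on them.

```python
# Structured Outputs sketch: a Pydantic model defines the JSON schema, and the
# SDK's parse helper asks the API to adhere to it. Model name and helper method
# are as documented around the time of this episode.
from pydantic import BaseModel
from openai import OpenAI

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

client = OpenAI()  # reads OPENAI_API_KEY from the environment

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,   # schema adherence is enforced, not just encouraged
)
event = completion.choices[0].message.parsed
print(event)
```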

This Week in Machine Learning & Artificial Intelligence (AI) Podcast
The Building Blocks of Agentic Systems with Harrison Chase - #698

This Week in Machine Learning & Artificial Intelligence (AI) Podcast

Play Episode Listen Later Aug 19, 2024 59:17


Today, we're joined by Harrison Chase, co-founder and CEO of LangChain to discuss LLM frameworks, agentic systems, RAG, evaluation, and more. We dig into the elements of a modern LLM framework, including the most productive developer experiences and appropriate levels of abstraction. We dive into agents and agentic systems as well, covering the “spectrum of agenticness,” cognitive architectures, and real-world applications. We explore key challenges in deploying agentic systems, and the importance of agentic architectures as a means of communication in system design and operation. Additionally, we review evolving use cases for RAG, and the role of observability, testing, and evaluation tools in moving LLM applications from prototype to production. Lastly, Harrison shares his hot takes on prompting, multi-modal models, and more! The complete show notes for this episode can be found at https://twimlai.com/go/698.

PodRocket - A web development podcast from LogRocket
AI for React developers and beyond with Alexandra Spalato

PodRocket - A web development podcast from LogRocket

Play Episode Listen Later Aug 7, 2024 29:37


Alexandra Spalato, AI integration specialist and frontend visionary, discusses how AI is revolutionizing the workflow for React developers, the indispensable tools making development faster and more efficient, and the exciting possibilities AI opens up for the entire software development process. Links https://www.alexandraspalato.com https://x.com/alexadark https://github.com/alexadark https://www.linkedin.com/in/alexandraspalato https://www.latent.space/p/ai-engineer https://cursor.sh https://pieces.app https://sourcegraph.com https://www.langchain.com https://www.llamaindex.ai https://langbase.com We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Emily, at emily.kochanekketner@logrocket.com (mailto:emily.kochanekketner@logrocket.com), or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod). Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers! What does LogRocket do? LogRocket provides AI-first session replay and analytics that surfaces the UX and technical issues impacting user experiences. Start understand where your users are struggling by trying it for free at [LogRocket.com]. Try LogRocket for free today.(https://logrocket.com/signup/?pdr)

Everyday AI Podcast – An AI and ChatGPT Podcast
Ep 306: Our 2024 AI Predictions Midterm Report – how did we do?

Everyday AI Podcast – An AI and ChatGPT Podcast

Play Episode Listen Later Jul 2, 2024 69:19


Send Everyday AI and Jordan a text messageIn December 2023, we laid out 24 spicy hot takes on the future of Generative AI. We're giving ourselves a midterm report as we hit the halfway point of 2024 to see if we crushed our predictions, or if the nonstop pace of GenAI crushed us. Newsletter: Sign up for our free daily newsletterMore on this Episode: Episode PageJoin the discussion: Ask Jordan questions on AIRelated Episodes: Ep 124: 5 Ways Generative AI Shows Up in 2024 Ep 171: GenAI in 2024 – What's coming and what it means for youUpcoming Episodes: Check out the upcoming Everyday AI Livestream lineupWebsite: YourEverydayAI.comEmail The Show: info@youreverydayai.comConnect with Jordan on LinkedInTopics Covered in This Episode:1. Review of 24 predictions for 20242. AI Robotics3. AI Misinformation and Legislation4. Large Language Models5. Future of Big Tech Companies6. Future of GenAI7. Competition against NVIDIA8. Predictions about OpenAI9. Job Market Changes due to AITimestamps:02:50 A look at our 2024 AI predictions.05:52 Remote workers struggle to access AI systems.11:11 Excited attendee sees NVIDIA's unveiling of robotics.20:57 Training large language models requires quality data, not bots and trash content from Twitter.26:02 GPT-4: Future desktop assistant with agent capabilities.28:39 Rag becoming a buzzword in AI industry.34:27 Copyright infringement lawsuits expected for large companies.37:47 Expect number 9, big changes in AI.43:09 NVIDIA's stock split made them top company.51:33 Many AI tools can create products quickly.57:29 AI investment leads to successful layoffs in 2024.59:04 Midway through 2024, what about AI?01:06:32 20 out of 24 predictions are true.Keywords:Jordan Wilson, generative AI, AI predictions, second computer AI, individualized learning, AI robotics, AI misinformation, AI legislation, AI watermarks, Grok, Apple AI, home assistants, AI agents, legacy media companies, large language models, Gen AI training, Microsoft, Amazon, NVIDIA, compute resources, Inflexion AI acquisition, AI future of work, agent capabilities, LangChain, retrieval augmentive generation (RAG), mini-RAG, traditional SEO decline, copyright battles in AI, OpenAI, video AI. Get more out of ChatGPT by learning our PPP method in this live, interactive and free training! Sign up now: https://youreverydayai.com/ppp-registration/

Syntax - Tasty Web Development Treats
784: Logging × Blogging × Testing × Freelancing

Syntax - Tasty Web Development Treats

Play Episode Listen Later Jun 19, 2024 55:36


In this Potluck episode, Scott and Wes tackle listener questions on modern blogging, website environmental impact, and using LangChain with LLMs. They also cover CSS hyphens, unit vs. integration testing, and balancing web development with new parenthood. Show Notes 00:00 Welcome to Syntax! 00:13 How to submit a question for future episodes. Potluck Questions. 02:46 Brought to you by Sentry.io. 03:21 Logging from a site. 08:39 Blogging in 2024. 11:49 Sharing website environmental data. Green Web Foundation. Website Carbon Calculator. Syntax Site Results. Scott's Site Results. 17:38 Using LangChain when working with LLMs. 21:03 CSS Hyphens and Overflow-Wrap. Hyphens Browser Compatibility. Overflow-Wrap. 25:52 Similarities between WASM, JVM and .NET. 27:25 Writing unit testing and integration testing. 32:00 How can new parents stay current on web development trends? 34:47 Working globally as a freelance developer. 37:26 Scott's audio setup. Why audio interfaces have DSP built in. ChaseBliss Pedal. 43:04 UI libraries for synth/audio plugins. 44:02 CSS module scripts. CSS Modules in CSS Module Scripts. 48:39 Sick Picks + Shameless Plugs. Sick Picks Scott: Deep Cover Podcast. Wes: Pressure Washer Surface Cleaner. Shameless Plugs Wes: Syntax.fm/videos. Hit us up on Socials! Syntax: X Instagram Tiktok LinkedIn Threads Wes: X Instagram Tiktok LinkedIn Threads Scott: X Instagram Tiktok LinkedIn Threads Randy: X Instagram YouTube Threads