Multimodality: the phenomenon of human communication combining different forms of expression.
Fei-Fei Li and Justin Johnson are cofounders of World Labs, which recently launched Marble (https://marble.worldlabs.ai/), a new kind of generative "world model" that can create editable 3D environments from text, images, and other spatial inputs. Marble lets creators generate persistent 3D worlds, precisely control cameras, and interactively edit scenes, making it a powerful tool for games, film, VR, robotics simulation, and more. In this episode, Fei-Fei and Justin share how their journey from ImageNet and Stanford research led to World Labs, why spatial intelligence is the next frontier after LLMs, and how world models could change how machines see, understand, and build in 3D.

We discuss:

- The massive compute scaling from AlexNet to today, and why world models and spatial data are the most compelling way to "soak up" modern GPU clusters compared to language alone.
- What Marble actually is: a generative model of 3D worlds that turns text and images into editable scenes using Gaussian splats (see the sketch after this list), supports precise camera control and recording, and runs interactively on phones, laptops, and VR headsets.
- Fei-Fei's essay (https://drfeifei.substack.com/p/from-words-to-worlds-spatial-intelligence) on spatial intelligence as a form of intelligence distinct from language: from picking up a mug to inferring the 3D structure of DNA, and why language is a lossy, low-bandwidth channel for describing the rich 3D/4D world we live in.
- Whether current models "understand" physics or just fit patterns: the gap between predicting orbits and discovering F=ma, and how attaching physical properties to splats and distilling physics engines into neural networks could lead to genuine causal reasoning.
- The changing role of academia in AI, why Fei-Fei worries more about under-resourced universities than "open vs. closed," and how initiatives like national AI compute clouds and open benchmarks can rebalance the ecosystem.
- Why transformers are fundamentally set models, not sequence models, and how that perspective opens up new architectures for world models, especially as hardware shifts from single GPUs to massive distributed clusters.
- Real use cases for Marble today: previsualization and VFX, game environments, virtual production, interior and architectural design (including kitchen remodels), and generating synthetic simulation worlds for training embodied agents and robots.
- How spatial intelligence and language intelligence will work together in multimodal systems, and why the goal isn't to throw away LLMs but to complement them with rich, embodied models of the world.
- Fei-Fei and Justin's long-term vision for spatial intelligence: from creative tools for artists and game devs to broader applications in science, medicine, and real-world decision-making.
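The Gaussian-splat representation behind Marble is concrete enough to sketch. The record below is a minimal, generic 3D Gaussian splat as used in the splatting literature, not World Labs' actual internal format; all field names are illustrative assumptions.

```typescript
// A minimal sketch of a 3D Gaussian splat record, as used generically in
// Gaussian-splatting renderers. Field names are illustrative assumptions,
// not World Labs' actual format.
interface GaussianSplat {
  position: [number, number, number];          // mean of the Gaussian in world space
  scale: [number, number, number];             // per-axis extent of the Gaussian
  rotation: [number, number, number, number];  // orientation as a quaternion (x, y, z, w)
  color: [number, number, number];             // RGB (real systems often store SH coefficients)
  opacity: number;                             // 0..1, alpha-blended during rasterization
}

// A "world" is then just a large set of splats; editing a scene amounts to
// adding, deleting, or transforming subsets of this array.
type SplatScene = GaussianSplat[];

// Example edit: translate a selected region of the scene by a fixed offset.
function translateRegion(scene: SplatScene, ids: Set<number>, dx: number, dy: number, dz: number): void {
  ids.forEach((i) => {
    const p = scene[i].position;
    scene[i].position = [p[0] + dx, p[1] + dy, p[2] + dz];
  });
}
```

One design consequence the episode touches on: because a splat is just a record, physical properties (mass, friction, rigidity) can in principle be attached as extra fields, which is the hook for the physics-distillation ideas discussed.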
— Fei-Fei Li
X: https://x.com/drfeifei
LinkedIn: https://www.linkedin.com/in/fei-fei-li-4541247
Justin Johnson
X: https://x.com/jcjohnss
LinkedIn: https://www.linkedin.com/in/justin-johnson-41b43664

Where to find Latent Space
X: https://x.com/latentspacepod
Substack: https://www.latent.space/

Chapters
00:00:00 Introduction and the Fei-Fei Li & Justin Johnson Partnership
00:02:00 From ImageNet to World Models: The Evolution of Computer Vision
00:12:42 Dense Captioning and Early Vision-Language Work
00:19:57 Spatial Intelligence: Beyond Language Models
00:22:10 Physics, Dynamics, and the Future of World Models
00:28:46 Introducing Marble: World Labs' First Spatial Intelligence Model
00:33:21 Gaussian Splats and the Technical Architecture of Marble
00:37:37 Use Cases: From Creative Industries to Robotics and Embodied AI
00:41:09 Multimodality and the Interplay of Language and Space
00:56:58 Hiring, Research Directions, and the Future of World Labs
In this episode of Crazy Wisdom, host Stewart Alsop talks with Kevin Smith, co-founder of Snipd, about how AI is reshaping the way we listen, learn, and interact with podcasts. They explore Snipd's vision of transforming podcasts into living knowledge systems, the evolution of machine learning from finance to large language models, and the broader connection between AI, robotics, and energy as the foundation for the next technological era. Kevin also touches on ideas like the bitter lesson, reinforcement learning, and the growing energy demands of AI. Listeners can try Snipd's premium version free for a month using this promo link. Check out this GPT we trained on the conversation.

Timestamps
00:00 – Stewart Alsop welcomes Kevin Smith, co-founder of Snipd, to discuss AI, podcasting, and curiosity-driven learning.
05:00 – Kevin explains Snipd's snipping feature, chatting with episodes, and future plans for voice interaction with podcasts.
10:00 – They discuss vector search, embeddings, and context windows, comparing full-episode context to chunked transcripts.
15:00 – Kevin shares his background in mathematics and economics, his shift from finance to machine learning, and early startup work in AI.
20:00 – They explore early quant models versus modern machine learning, statistical modeling, and data limitations in finance.
25:00 – Conversation turns to transformer models, pretraining, and the bitter lesson—how compute-based methods outperform human-crafted systems.
30:00 – Stewart connects this to RLHF, Scale AI, and data scarcity; Kevin reflects on reinforcement learning's future.
35:00 – They pivot to Snipd's podcast ecosystem, hidden gems like Founders Podcast, and how stories shape entrepreneurial insight.
40:00 – ETH Zurich, robotics, and startup culture come up, linking academia to real-world innovation.
45:00 – They close on AI, robotics, and energy as the pillars of the future, debating nuclear and solar power's role in sustaining progress.

Key Insights
- Podcasts as dynamic knowledge systems: Kevin Smith presents Snipd as an AI-powered tool that transforms podcasts into interactive learning environments. By allowing listeners to "snip" and summarize meaningful moments, Snipd turns passive listening into active knowledge management—bridging curiosity, memory, and technology in a way that reframes podcasts as living knowledge capsules rather than static media.
- AI transforming how we engage with information: The discussion highlights how AI enables entirely new modes of interaction—chatting directly with podcast episodes, asking follow-up questions, and contextualizing information across an author's full body of work. This evolution points toward a future where knowledge consumption becomes conversational and personalized rather than linear and one-size-fits-all.
- Vectorization and context windows matter: Kevin explains that Snipd currently avoids heavy use of vector databases, opting instead to feed entire episodes into large models. This choice enhances coherence and comprehension, reflecting how advances in context windows have reshaped how AI understands complex audio content (see the sketch after this list).
- Machine learning's roots in finance shaped early AI thinking: Kevin's journey from quantitative finance to AI reveals how statistical modeling laid the groundwork for modern learning systems. While finance once relied on rigid, theory-based models, the machine learning paradigm replaced those priors with flexible, data-driven discovery—an essential philosophical shift in how intelligence is approached.
- The Bitter Lesson and the rise of compute: Together they unpack Richard Sutton's "bitter lesson"—the idea that methods leveraging computation and data inevitably surpass those built from human intuition. This insight serves as a compass for understanding why transformers, pretraining, and scaling have driven recent AI breakthroughs.
- Reinforcement learning and data scarcity define AI's next phase: Stewart links RLHF and the work of companies like Scale AI and Surge AI to the broader question of data limits. Kevin agrees that the next wave of AI will depend on reinforcement learning and simulated environments that generate new, high-quality data beyond what humans can label.
- The future hinges on AI, robotics, and energy: Kevin closes with a framework for the next decade: AI provides intelligence, robotics applies it to the physical world, and energy sustains it all. He warns that society must shift from fearing energy use to innovating in production—especially through nuclear and solar power—to meet the demands of an increasingly intelligent, interconnected world.
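The vector-search versus full-context trade-off can be made concrete with a minimal sketch. The `embed` and `complete` functions below are hypothetical placeholders for an embedding API and an LLM call; this is not Snipd's actual pipeline, just an illustration of the two approaches discussed.

```typescript
// Hypothetical helpers, assumed for illustration -- not Snipd's real API.
declare function embed(text: string): Promise<number[]>;     // text -> vector
declare function complete(prompt: string): Promise<string>;  // LLM call

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function chunk(text: string, size = 2000): string[] {
  const out: string[] = [];
  for (let i = 0; i < text.length; i += size) out.push(text.slice(i, i + size));
  return out;
}

// Approach 1: chunk the transcript, embed chunks, retrieve top-k by cosine
// similarity, and answer from the retrieved chunks only.
async function answerWithChunks(transcript: string, question: string, k = 5): Promise<string> {
  const chunks = chunk(transcript);
  const chunkVecs = await Promise.all(chunks.map(embed));
  const qVec = await embed(question);
  const topK = chunks
    .map((c, i) => ({ c, score: cosine(chunkVecs[i], qVec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.c);
  return complete(`Context:\n${topK.join("\n---\n")}\n\nQuestion: ${question}`);
}

// Approach 2: with a long-context model, skip retrieval entirely and hand the
// model the whole episode, preserving coherence across the full conversation.
async function answerWithFullEpisode(transcript: string, question: string): Promise<string> {
  return complete(`Full episode transcript:\n${transcript}\n\nQuestion: ${question}`);
}
```

The trade-off mirrors the conversation: chunked retrieval is cheaper per query but loses context across chunk boundaries, while full-episode prompting leans on large context windows for coherence.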
In this episode of The Plastic Surgery Revolution, Dr. Steven Davis sits down with Taylor Farley, PA-C, to unpack key insights from the 2025 Global Aesthetics Conference in Miami — one of the industry's biggest gatherings for surgical and non-surgical innovation. From the rise of regenerative medicine and peptide therapy to the evolving "multimodality" approach that combines injectables, lasers, and biostimulatory treatments, Dr. Davis and Taylor explore how today's most natural, long-lasting results come from blending science, sequencing, and patient health.

They also dive into:
- How peptides and GLP-1 medications are changing the conversation around functional health and cosmetic outcomes
- The best timing and sequencing for injectables, radiofrequency, and laser treatments
- Why collagen regeneration starts with nutrition, protein, and overall wellness
- The importance of "low and slow" when using GLP-1s like Ozempic and Wegovy
- How at-home skin care and consistency enhance in-office treatments

The takeaway? There's no single magic bullet in modern aesthetics — real transformation comes from a balanced, regenerative approach that treats the whole patient, inside and out. Listen to The Plastic Surgery Revolution wherever you get your podcasts and stay ahead of what's next in beauty, health, and science.
How do we know through atmospheres? How can being affected by an atmosphere give rise to knowledge? What role does somatic, nonverbal knowledge play in how we belong to places? Atmospheric Knowledge takes up these questions through detailed analyses of practices that generate atmospheres and in which knowledge emerges through visceral intermingling with atmospheres. From combined musicological and anthropological perspectives, Birgit Abels and Patrick Eisenlohr investigate atmospheres as a compelling alternative to better-known analytics of affect by way of performative and sonic practices across a range of ethnographic settings. With particular focus on oceanic relations and sonic affectedness, Atmospheric Knowledge centers the rich affordances of sonic connections for knowing our environments. A free ebook version of this title is available through Luminos, University of California Press's Open Access publishing program. Visit www.luminosoa.org to learn more.
Welcome back to the Alt Goes Mainstream podcast.

Today's episode provides a fascinating window into the world of private equity value creation and how AI can help transform both portfolio company and investment firm processes and operations to create value.

We sat down in New York with Hg Partner and Head of Value Creation Chris Kindt to dive into how he spearheaded the growth of the firm's in-house value creation efforts and has built a high-performing team to meet the evolving needs of portfolio companies.

Chris brings 15 years of experience in architecting and driving value creation to bear, with 11 years of experience at Hg and another 8 years as a consultant at BCG and Parthenon. Chris and his team are responsible for driving value across a portfolio of 55 companies that represents over $180B in enterprise value and would represent the second largest software company in Europe after SAP if it were a conglomerate.

Chris and I had a thought-provoking discussion about value creation and the impact of AI on investing and operating companies. We discussed:

- How to build a high-performing value creation team.
- Where value creation has the biggest impact.
- How AI is transforming investment processes.
- How Hg has become an "AI-first" investment firm.
- Where AI will have the most impact in a company today and why agentic AI is transforming certain roles and processes within companies.
- How professionals can generate the most leverage from utilizing AI – without having cognitive decline.
- Why prompting and prompt engineering are so critical in the age of AI (and how you can use your weekends to perfect prompt engineering).
- How a management team can buy into AI and why the current time period represents an interesting opportunity for incumbents, particularly for those building mission-critical enterprise software.

Thanks Chris for sharing your expertise and wisdom on company building and AI. We hope you enjoy.

Note: this episode was filmed in August 2025.

A word from AGM podcast sponsor, Ultimus Fund Solutions

This episode of Alt Goes Mainstream is brought to you by Ultimus Fund Solutions, a leading full-service fund administrator for asset managers in private and public markets. As private markets continue to move into the mainstream, the industry requires infrastructure solutions that help funds and investors keep pace. In an increasingly sophisticated financial marketplace, investment managers must navigate a growing array of challenges: elaborate fund structures, specialized strategies, evolving compliance requirements, a growing need for sophisticated reporting, and intensifying demands for transparency.

To assist with these challenging opportunities, more and more fund sponsors and asset managers are turning to Ultimus, a leading service provider that blends high tech and high touch in unique and customized fund administration and middle office solutions for a diverse and growing universe of over 450 clients and 1,800 funds, representing $500 billion in assets under administration, all handled by a team of over 1,000 professionals. Ultimus offers a wide range of capabilities across registered funds, private funds and public plans, as well as outsourced middle office services. Delivering operational excellence, Ultimus helps firms manage the ever-changing regulatory environment while meeting the needs of their institutional and retail investors.
Ultimus provides comprehensive operational support and fund governance services to help managers successfully launch retail alternative products.

Visit www.ultimusfundsolutions.com to learn more about Ultimus' technology-enhanced services and solutions, or contact Ultimus Executive Vice President of Business Development Gary Harris by email at gharris@ultimusfundsolutions.com.

We thank Ultimus for their support of alts going mainstream.

Show Notes
00:00 Message from our Sponsor, Ultimus
01:18 Welcome to the Alt Goes Mainstream Podcast
02:10 Guest Introduction: Chris Kindt
03:58 Chris Kindt's Background and Journey
05:43 Value Creation at Hg
06:55 Pre-Investment and Diligence Process
07:44 Management Team Dynamics
08:40 Common Value Creation Interventions
11:14 Building a Cross-Functional Team
12:31 Scaling the Value Creation Team
15:40 Measuring Value Creation Impact
17:49 Investment Philosophy: Inch Wide, Mile Deep
19:28 Partnerships with Management Teams
20:03 Embedding AI in Value Creation
20:24 Internal vs External AI Applications
21:47 AI First Culture
22:18 Effective AI Utilization
24:04 Prompt Engineering and Whispering
26:25 Choosing the Right AI Models
26:50 AI Models: Strengths and Weaknesses
27:42 Transformative Impact of AI
28:21 Skills Needed in the AI Era
28:26 AI's Role in Investment Firms
28:55 Core Insights and Judgements
29:04 The Core Skillset and Efficiency
29:12 Philosophical Questions on AI and Talent Development
30:18 Building Grit in the Age of AI
31:12 Maintaining Discipline with AI
32:08 AI as a Value Creation Lever
32:35 Operational Efficiency and Copilots
33:43 Emergence of Reasoning Models and Agentic Frameworks
34:31 10x Efficiency in Engineering
36:24 Challenges in Implementing AI
37:16 AI Immersion Strategy Days
39:52 Organizational Agility and AI
42:15 AI's Impact on Investment Strategies
43:26 AI in Mergers and Acquisitions
45:29 The Importance of Proprietary Data
47:12 AI Fatigue and Disillusionment
48:07 Building AI Products and Agentic Products
48:58 Hg's Internal AI Incubator
49:59 The Next Wave of AI
51:19 Voice and Multimodality in AI
51:55 Globalization and Internationalization of AI
53:35 Overestimating and Underestimating AI's Impact
54:54 The Competitive Landscape of AI
55:42 The Future of Value Creation with AI
56:23 Conclusion and Final Thoughts

Editing and post-production work for this episode was provided by The Podcast Consultant.
CME credits: 0.75
Valid until: 22-08-2026
Claim your CME credit at https://reachmd.com/programs/cme/multidisciplinary-collaboration-facilitates-multimodality-therapy/36636/

This online CME activity examines advances in managing resectable locally advanced head and neck squamous cell carcinoma (HNSCC), focusing on the integration of perioperative immune checkpoint inhibitors (ICIs) and multimodal approaches. Faculty review current standards of care and highlight unmet needs that have driven investigation into combining radiation and immunotherapy. Emerging clinical trial data are discussed, including the impact of perioperative ICIs on event-free survival and pathologic response, with attention to patient selection informed by risk stratification and biomarkers. The program also addresses practical considerations for multidisciplinary care, including immune-related adverse event management and strategies to support patient access to these evolving treatment paradigms.
Speak (https://speak.com) may not be very well known to native English speakers, but they have come from a slow start in 2016 to emerge as one of the favorite partners of OpenAI, whose Startup Fund led and joined Speak's Series B and C as it became one of the new AI-native unicorns, noting that "Speak has the potential to revolutionize not just language learning, but education broadly". Today we speak with Speak's CTO, Andrew Hsu, on the journey of building the "3rd generation" of language learning software (with Rosetta Stone being Gen 1 and Duolingo being Gen 2). Speak's premise is that speech and language models can now do what was previously only possible with human tutors—provide fluent, responsive, and adaptive instruction—and this belief has shaped its product and company strategy since its early days.

https://www.linkedin.com/in/adhsu/
https://speak.com

One of the most interesting strategic decisions discussed in the episode is Speak's early focus on South Korea. While counterintuitive for a San Francisco-based startup, the decision was influenced by a combination of market opportunity and founder proximity via a Korean first employee. South Korea's intense demand for English fluency and a highly competitive education market made it a proving ground for a deeply AI-native product. By succeeding in a market saturated with human-based education solutions, Speak validated its model and built strong product-market fit before expanding to other Asian markets and, eventually, globally.

The arrival of Whisper and GPT-based LLMs in 2022 marked a turning point for Speak. Suddenly, capabilities that were once theoretical—real-time feedback, semantic understanding, conversational memory—became technically feasible. Speak didn't pivot, but rather evolved into its second phase: from a supplemental practice tool to a full-featured language tutor. This transition required significant engineering work, including building custom ASR models, managing latency, and integrating real-time APIs for interactive lessons. It also unlocked the possibility of developing voice-first, immersive roleplay experiences and a roadmap to real-time conversational fluency.

To scale globally and support many languages, Speak is investing heavily in AI-generated curriculum and content. Instead of manually scripting all lessons, they are building agents and pipelines that can scaffold curriculum, generate lesson content, and adapt pedagogically to the learner. This ties into one of Speak's most ambitious goals: creating a knowledge graph that captures what a learner knows and can do in a target language, and then adapting the course path accordingly (see the sketch below). This level-adjusting tutor model aims to personalize learning at scale and could eventually be applied beyond language learning to any educational domain.

Finally, the conversation touches on the broader implications of AI-powered education and the slow real-world adoption of transformative AI technologies. Despite the capabilities of GPT-4 and others, most people's daily lives haven't changed dramatically. Speak sees itself as part of the generation of startups that will translate AI's raw power into tangible consumer value. The company is also a testament to long-term conviction—founded in 2016, it weathered years of slow growth before AI caught up to its vision. Now, with over $50M ARR, a growing B2B arm, and plans to expand across languages and learning domains, Speak represents what AI-native education could look like in the next decade.
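The knowledge-graph idea lends itself to a small sketch. This is a hedged illustration of the general mechanism described, not Speak's actual data model; the type names, mastery scale, and threshold are all assumptions.

```typescript
// A toy learner-knowledge graph: concepts with prerequisite edges and a
// per-learner mastery score. Illustrative only -- not Speak's real model.
interface Concept {
  id: string;
  prerequisites: string[]; // concept ids that should be mastered first
}

type Mastery = Map<string, number>; // concept id -> estimated mastery in [0, 1]

// Pick the next lesson: the lowest-mastery concept whose prerequisites are
// already above threshold, so the course path adapts to the learner.
function nextConcept(graph: Concept[], mastery: Mastery, threshold = 0.8): Concept | undefined {
  const ready = graph.filter((c) =>
    (mastery.get(c.id) ?? 0) < threshold &&
    c.prerequisites.every((p) => (mastery.get(p) ?? 0) >= threshold)
  );
  ready.sort((a, b) => (mastery.get(a.id) ?? 0) - (mastery.get(b.id) ?? 0));
  return ready[0];
}

// Example: after a roleplay session, nudge mastery up for the concepts used.
function updateMastery(mastery: Mastery, usedConceptIds: string[], delta = 0.1): void {
  for (const id of usedConceptIds) {
    mastery.set(id, Math.min(1, (mastery.get(id) ?? 0) + delta));
  }
}
```

Nothing here is language-specific, which is why the same structure could plausibly generalize to other learning domains, as the episode suggests.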
Chapters
00:00:00 Introductions & Thiel Fellowship Origins
00:02:13 Genesis of Speak: Early Vision & Market Focus
00:03:44 Building the Product: Iterations and Lessons Learned
00:10:59 AI's Role in Language Learning
00:13:49 Scaling Globally & B2B Expansion
00:16:30 Why Korea? Localizing for Success
00:19:08 Content Creation, The Speak Method, and Engineering Culture
00:23:31 The Impact of Whisper and LLM Advances
00:29:08 AI-Generated Content & Measuring Fluency
00:35:30 Personalization, Dialects, and Pronunciation
00:39:38 Immersive Learning, Multimodality, and Real-Time Voice
00:50:02 Engineering Challenges & Company Culture
00:53:20 Beyond Languages: B2B, Knowledge Graphs, and Broader Learning
00:57:32 Fun Stories, Lessons, and Reflections
01:02:03 Final Thoughts: The Future of AI Learning & Slow Takeoff
In this episode, CardioNerds Dr. Gurleen Kaur, Dr. Richard Ferraro, and Dr. Jake Roberts are joined by Cardio-Rheumatology expert Dr. Monica Mukherjee to discuss the role of utilizing multimodal imaging for cardiovascular disease risk stratification, monitoring, and management in patients with chronic systemic inflammation. The team delves into the contexts for utilizing advanced imaging to assess systemic inflammation with cardiac involvement, as well as the role of imaging in monitoring various specific cardiovascular complications that may develop due to inflammatory diseases. Audio editing by CardioNerds academy intern, Christiana Dangas.

CardioNerds Prevention Page
CardioNerds Episode Page
CardioNerds Academy
CardioNerds Healy Honor Roll
CardioNerds Journal Club
Subscribe to The Heartbeat Newsletter!
Check out CardioNerds SWAG!
Become a CardioNerds Patron!

Pearls - Cardiovascular Multimodality Imaging & Systemic Inflammation

- Systemic inflammatory diseases are associated with an elevated CVD risk that has significant implications for early detection, risk stratification, and implementation of therapeutic strategies to address these risks and disease-specific complications. As an example, patients with SLE have a 48-fold increased risk for developing ASCVD compared to the general population. They may also develop disease-specific complications, such as pericarditis, that require focused imaging approaches to detect.
- In addition to increasing the risk for CAD, systemic inflammatory diseases can also result in cardiac complications, including myocardial, pericardial, and valvular involvement. Assessment of these complications requires the use of different imaging techniques, with the modality and serial studies selected based on the suspected disease process involved.
- In most contexts, echocardiography remains the starting point for evaluating cardiac involvement in systemic inflammatory diseases and can inform the next steps in terms of diagnostic study selection for the assessment of specific cardiac processes. For example, if echocardiography is completed in an SLE patient and demonstrates potential myocardial or pericardial inflammation, the next steps in evaluation may include completing a cardiac MRI for better characterization.
- While no current guidelines or standards of care directly guide our selection of advanced imaging studies for screening and management of CVD in patients with systemic inflammatory diseases, our understanding of cardiac involvement in these patients continues to improve and will likely lead to future guideline development.
- Due to the vast heterogeneity of cardiac involvement both across and within different systemic inflammatory diseases, a personalized approach to caring for each individual patient remains central to CVD evaluation and management in these patients. For example, patients with systemic sclerosis and symptoms of shortness of breath may experience these symptoms due to a range of causes. Echocardiography can be a central guiding tool in assessing these patients for potential concerns related to pulmonary hypertension or diastolic dysfunction. Based on the initial echocardiogram, the next steps in evaluation may involve further ischemic evaluation or right heart catheterization, depending on the pathology of concern.

Show notes - Cardiovascular Multimodality Imaging & Systemic Inflammation

Episode notes drafted by Dr. Jake Roberts.
What are the contexts in which we should consider pursuing multimodal cardiac imaging, and are there certain inflammatory disorders associated with systemic inflammation and higher associated CVD risk for which advanced imaging can help guide early intervention? Systemic inflammatory diseases are associated with elevated CVD risk, which has significant implications for early detection, risk stratification, prognostication, and implementation of therapeutic strategies to address CVD risk and complicat...
In the "Proefschriften" (dissertations) podcast series, internal medicine resident Dr. Tessa Steenbruggen speaks with PhD candidates. In this episode she talks with Timo Kalisvaart about his dissertation, titled "Multimodality imaging in bone and soft tissue tumours: Deciphering Quantitative Data". Timo discusses, among other things, developments in imaging techniques that can advance research and, ultimately, the treatment of bone and soft tissue tumours. Timo will defend his dissertation on June 12 at Leiden University, with thesis supervisors Prof. Dr. Hans Bloem and Prof. Dr. Lioe-Fee de Geus-Oei and co-supervisor Dr. Willem Grootjans.

References
Inspiration tips: book 1: De heelmeesters – Abraham Verghese; book 2: Oorlogsparadijs – Nico Dros
DCE-MRI in osteosarcoma
DCE-MRI in osteosarcoma 2
Phantom model
Phantom model validation
Oxford
This week, we discuss Google being found to be a monopoly, OpenAI's "offer" to buy Chrome, and some hot takes on JSON. Plus, is it better to wait on hold or ask for a callback?

Watch the YouTube Live Recording of Episode 516 (https://www.youtube.com/watch?v=EhUxUPJv5g4)

Runner-up Titles
- Just Fine
- The SDT "Fine" Scale
- Callback
- Asynchronous Friendship
- I would love to get to know you better…over text
- Send you Jams to the dry cleaners.
- JSON Take it xslt-easy!

Rundown

OpenAI
- OpenAI in talks to pay about $3 billion to acquire AI coding startup Windsurf (https://www.cnbc.com/2025/04/16/openai-in-talks-to-pay-about-3-billion-to-acquire-startup-windsurf.html)
- The Cursor Mirage (https://artificialintelligencemadesimple.substack.com/p/the-cursor-mirage)
- AI is for Tinkerers (https://redmonk.com/kholterhoff/2023/06/27/ai-is-for-tinkerers/)
- Vibe Coding is for PMs (https://redmonk.com/rstephens/2025/04/18/vibe-coding-is-for-pms/)
- OpenAI releases new simulated reasoning models with full tool access (https://arstechnica.com/ai/2025/04/openai-releases-new-simulated-reasoning-models-with-full-tool-access/)
- Clouded Judgement 4.18.25 - The Hidden Value in the AI Application Layer (https://cloudedjudgement.substack.com/p/clouded-judgement-41825-the-hidden?utm_source=post-email-title&publication_id=56878&post_id=161562220&utm_campaign=email-post-title&isFreemail=true&r=2l9&triedRedirect=true&utm_medium=email)
- OpenAI tells judge it would buy Chrome from Google (https://www.theverge.com/news/653882/openai-chrome-google-us-judge)
- The Creators of Model Context Protocol (https://www.latent.space/p/mcp?utm_source=substack&utm_medium=email)
- Judge finds Google holds illegal online ad tech monopolies (https://www.cnbc.com/2025/04/17/judge-finds-google-holds-illegal-online-ad-tech-monopolies.html)
- Intuit, Owner of TurboTax, Wins Battle Against America's Taxpayers (https://prospect.org/power/2025-04-17-intuit-turbotax-wins-battle-against-taxpayers-irs-direct-file/)

Relevant to your Interests
- Switch 2 Carts Still Taste Bad, Designed Purposefully To Be Spat Out (https://www.gamespot.com/articles/switch-2-carts-still-taste-bad-designed-purposefully-to-be-spat-out/1100-6530649/)
- CEO Andy Jassy's 2024 Letter to Shareholders (https://www.aboutamazon.com/news/company-news/amazon-ceo-andy-jassy-2024-letter-to-shareholders)
- Amazon CEO Andy Jassy says AI costs will come down (https://www.cnbc.com/2025/04/10/amazon-ceo-andy-jassys-2025-shareholder-letter.html)
- Happy 18th Birthday CUDA! (https://www.aboutamazon.com/news/company-news/amazon-ceo-andy-jassy-2024-letter-to-shareholders)
- Honeycomb Acquires Grit: A Strategic Investment in Pragmatic AI and Customer Value (https://www.honeycomb.io/blog/honeycomb-acquires-grit)
- Everything Announced at Google Cloud Next in 12 Minutes (https://www.youtube.com/watch?v=2OpHbyN4vEM)
- GitLab vs GitHub: Key Differences in 2025 (https://spacelift.io/blog/gitlab-vs-github)
- Old Fashioned Function Keys (https://economistwritingeveryday.com/2025/04/11/old-fashioned-function-keys/)
- Fake job seekers are flooding U.S. companies that are hiring for remote positions (https://www.cnbc.com/2025/04/08/fake-job-seekers-use-ai-to-interview-for-remote-jobs-tech-ceos-say.html)
- NetRise raises $10M to expand software supply chain security platform (https://siliconangle.com/2025/04/15/netrise-raises-10-million-expand-software-supply-chain-security-platform/)
- Mark Zuckerberg's antitrust testimony aired his wildest ideas from Meta's history (https://www.theverge.com/policy/649520/zuckerberg-meta-ftc-antitrust-testimony-facebook-history)
- How Much Should I Be Spending On Observability? (https://www.honeycomb.io/blog/how-much-should-i-spend-on-observability-pt1)
- Did we just make platform engineering much easier by shipping a cloud IDP? (https://seroter.com/2025/04/16/did-we-just-make-platform-engineering-much-easier-by-shipping-a-cloud-idp/)
- Google Cloud Next 2025: Agentic AI Stack, Multimodality, And Sovereignty (https://www.forrester.com/blogs/google-next-2025-agentic-ai-stack-multimodality-and-sovereignty/)
- iPhone Shipments Down 9% in China's Q1 Smartphone Boom (https://www.macrumors.com/2025/04/18/iphone-shipments-down-in-china-q1/)
- Exclusive: Anthropic warns fully AI employees are a year away (https://www.axios.com/2025/04/22/ai-anthropic-virtual-employees-security)
- Synology requires self-branded drives for some consumer NAS systems, drops full functionality and support for third-party HDDs (https://www.tomshardware.com/pc-components/nas/synology-requires-self-branded-drives-for-some-consumer-nas-systems-drops-full-functionality-and-support-for-third-party-hdds)
- Porting Tailscale to Plan 9 (https://tailscale.com/blog/plan9-port?ck_subscriber_id=512840665&utm_source=convertkit&utm_medium=email&utm_campaign=[Last%20Week%20in%20AWS]%20Issue%20#418:%20Another%20New%20Capacity%20Dingus%20-%2017270009)
- CVE Foundation (https://www.thecvefoundation.org/)
- The Cursor Mirage (https://artificialintelligencemadesimple.substack.com/p/the-cursor-mirage)
- There's a Lot of Bad Telemetry Out There (https://blog.olly.garden/theres-a-lot-of-bad-telemetry-out-there)
- Gee Wiz (https://redmonk.com/rstephens/2025/04/04/gee-wiz/?ck_subscriber_id=512840665&utm_source=convertkit&utm_medium=email&utm_campaign=[Last%20Week%20in%20AWS]%20Issue%20#418:%20Another%20New%20Capacity%20Dingus%20-%2017270009)

Nonsense
- Silicon Valley crosswalk buttons hacked to imitate Musk, Zuckerberg's voices (https://techcrunch.com/2025/04/14/silicon-valley-crosswalk-buttons-hacked-to-imitate-musk-zuckerberg-voices/)
- A Visit to Costco in France (https://davidlebovitz.substack.com/p/a-visit-to-costco-in-france)
- No sweat: Humanoid robots run a Chinese half-marathon (https://apnews.com/article/china-robot-half-marathon-153c6823bd628625106ed26267874d21)
- Metre, a consistent measurement of the world (https://mappingignorance.org/2025/04/23/150-years-ago-the-metre-convention-determined-how-we-measure-the-world/)

Conferences
- DevOps Days Atlanta (https://devopsdays.org/events/2025-atlanta/welcome/), April 29th-30th.
- KCD Texas Austin 2025 (https://community.cncf.io/events/details/cncf-kcd-texas-presents-kcd-texas-austin-2025/), May 15th, Whitney Lee speaking.
- Cloud Foundry Day US (https://events.linuxfoundation.org/cloud-foundry-day-north-america/), May 14th, Palo Alto, CA, Coté speaking.
- Free AI workshop (https://vmwarereg.fig-street.com/051325-tanzu-workshop/), May 13th, the day before Cloud Foundry Day (https://events.linuxfoundation.org/cloud-foundry-day-north-america/).
- NDC Oslo (https://ndcoslo.com/), May 21st-23rd, Coté speaking.

SDT News & Community
- Join our Slack community (https://softwaredefinedtalk.slack.com/join/shared_invite/zt-1hn55iv5d-UTfN7mVX1D9D5ExRt3ZJYQ#/shared-invite/email)
- Email the show: questions@softwaredefinedtalk.com
- Free stickers: Email your address to stickers@softwaredefinedtalk.com
- Follow us on social media: Twitter (https://twitter.com/softwaredeftalk), Threads (https://www.threads.net/@softwaredefinedtalk), Mastodon (https://hachyderm.io/@softwaredefinedtalk), LinkedIn (https://www.linkedin.com/company/software-defined-talk/), BlueSky (https://bsky.app/profile/softwaredefinedtalk.com)
- Watch us on: Twitch (https://www.twitch.tv/sdtpodcast), YouTube (https://www.youtube.com/channel/UCi3OJPV6h9tp-hbsGBLGsDQ/featured), Instagram (https://www.instagram.com/softwaredefinedtalk/), TikTok (https://www.tiktok.com/@softwaredefinedtalk)
- Book offer: Use code SDT for $20 off "Digital WTF" by Coté (https://leanpub.com/digitalwtf/c/sdt)
- Sponsor the show (https://www.softwaredefinedtalk.com/ads): ads@softwaredefinedtalk.com

Recommendations
- Brandon: Dope Thief (https://www.rottentomatoes.com/tv/dope_thief) on Apple TV
- Coté: Check out the recording of the Tanzu Annual update (https://www.youtube.com/watch?v=c1QZXzJcAfQ), all about Tanzu's private AI platform. Next, watch Coté's new MCP for D&D video (#4), which figures out something cool to do with MCP Prompts (https://www.youtube.com/watch?v=xEtYBznneFg); they make sense now. And, a regret-a-mmendation: Field Notes annual subscription (https://fieldnotesbrand.com/limited-editions).

Photo Credits
- Header (https://unsplash.com/photos/a-telephone-sitting-on-top-of-a-wooden-shelf-2XnGRN_caHc)
How can AI help us understand and master deeply complex systems—from the game Go, which has 10^170 possible board positions, to proteins, which, on average, can fold in 10^300 possible ways? This week, Reid and Aria are joined by Demis Hassabis. Demis is a British artificial intelligence researcher, co-founder, and CEO of the AI company DeepMind. Under his leadership, DeepMind developed AlphaGo, the first AI to defeat a human world champion in Go, and later created AlphaFold, which solved the 50-year-old protein folding problem. He's considered one of the most influential figures in AI. Demis, Reid, and Aria discuss game theory, medicine, multimodality, and the nature of innovation and creativity.

For more info on the podcast and transcripts of all the episodes, visit https://www.possible.fm/podcast/

Select mentions:
- Hitchhiker's Guide to the Galaxy by Douglas Adams
- AlphaGo documentary: https://www.youtube.com/watch?v=WXuK6gekU1Y
- Nash equilibrium & US mathematician John Forbes Nash
- Homo Ludens by Johan Huizinga
- Veo 2, an advanced, AI-powered video creation platform from Google DeepMind
- The Culture series by Iain Banks
- Hartmut Neven, German-American computer scientist

Topics:
3:11 - Hellos and intros
5:20 - Brute force vs. self-learning systems
8:24 - How a learning approach helped develop new AI systems
11:29 - AlphaGo's Move 37
16:16 - What will the next Move 37 be?
19:42 - What makes an AI that can play the video game StarCraft impressive
22:32 - The importance of the act of play
26:24 - Data and synthetic data
28:33 - Midroll ad
28:39 - Is it important to have AI embedded in the world?
33:44 - The trade-off between thinking time and output quality
36:03 - Computer languages designed for AI
40:22 - The future of multimodality
43:27 - AI and geographic diversity
48:24 - AlphaFold and the future of medicine
51:18 - Rapid-fire Questions

Possible is an award-winning podcast that sketches out the brightest version of the future—and what it will take to get there. Most of all, it asks: what if, in the future, everything breaks humanity's way? Tune in for grounded and speculative takes on how technology—and, in particular, AI—is inspiring change and transforming the future. Hosted by Reid Hoffman and Aria Finger, each episode features an interview with an ambitious builder or deep thinker on a topic, from art to geopolitics and from healthcare to education. These conversations also showcase another kind of guest: AI. Each episode seeks to enhance and advance our discussion about what humanity could possibly get right if we leverage technology—and our collective effort—effectively.
In this episode, Raúl Alberto Mora talks to us about education theory as a driver for innovative teaching, mentoring and supporting one another, and the journey of a career in Education. Raúl is known worldwide for his work in the areas of alternative literacy paradigms in second language education and research, the study of second language literacies in physical and virtual spaces, and the use of sociocritical frameworks in language education. In particular, he studies the applications of alternative literacy paradigms to analyze second-language literacy practices in urban and virtual spaces. He works to understand the use of language as a social and semiotic resource. His work has been published in the Journal of Adolescent and Adult Literacy, The ALAN Review, Bilingualism and Bilingual Education, International Journal of Cultural Studies, Social Semiotics, Key Concepts in Intercultural Dialogue, Pedagogies: An International Journal, and other journals. He co-edited The Handbook of Critical Literacies; Translanguaging and Multimodality as Flow, Agency, and a New Sense of Advocacy in and From the Global South; and, most recently, Reimagining Literacy in the Age of AI: Theory and Practice. Dr. Raúl Alberto Mora Velez is a researcher at the Educations, Languages, and Learning Environments research group and chairs the award-winning Literacies in Second Languages Project (LSLP) research lab. Raúl is a Research Professor at Universidad Pontificia Bolivariana in Colombia. For more information about our guest, stay tuned to the end of this episode.

Links mentioned in this episode:
- Literacies in Second Languages Project Micro-Papers
- American Educational Research Association
- Literacy Research Association

Connect with Classroom Caffeine at www.classroomcaffeine.com or on Instagram, Facebook, Twitter, and LinkedIn.
Today's episode is with Paul Klein, founder of Browserbase. We talked about building browser infrastructure for AI agents, the future of agent authentication, and their open source framework Stagehand.

* [00:00:00] Introductions
* [00:04:46] AI-specific challenges in browser infrastructure
* [00:07:05] Multimodality in AI-Powered Browsing
* [00:12:26] Running headless browsers at scale
* [00:18:46] Geolocation when proxying
* [00:21:25] CAPTCHAs and Agent Auth
* [00:28:21] Building "User take over" functionality
* [00:33:43] Stagehand: AI web browsing framework
* [00:38:58] OpenAI's Operator and computer use agents
* [00:44:44] Surprising use cases of Browserbase
* [00:47:18] Future of browser automation and market competition
* [00:53:11] Being a solo founder

Transcript

Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.

swyx [00:00:12]: Hey, and today we are very blessed to have our friend, Paul Klein IV, CEO of Browserbase. Welcome.

Paul [00:00:21]: Thanks guys. Yeah, I'm happy to be here. I've been lucky to know both of you for like a couple of years now, I think. So it's just like we're hanging out, you know, with three ginormous microphones in front of our face. It's a totally normal hangout.

swyx [00:00:34]: Yeah. We've actually mentioned you on the podcast, I think, more often than any other Solaris tenant. Just because like you're one of the, you know, best performing, I think, LLM tool companies that have started up in the last couple of years.

Paul [00:00:50]: Yeah, I mean, it's been a whirlwind of a year. Like, Browserbase is actually pretty close to our first birthday, so we are one year old. And going from, you know, starting a company as a solo founder to, you know, having a team of 20 people, you know, a Series A, but also being able to support hundreds of AI companies that are building AI applications that go out and automate the web — it's just been like, really cool. It's been happening a little too fast. I think like collectively as an AI industry, let's just take a week off together. I took my first vacation actually two weeks ago, and Operator came out on the first day, and then a week later, DeepSeek came out. And I'm like on vacation trying to chill. I'm like, we got to build with this stuff, right? So it's been a breakneck year. But I'm super happy to be here and like talk more about all the stuff we're seeing. And I'd love to hear kind of what you guys are excited about too, and share with it, you know?

swyx [00:01:39]: Where to start? So people, you've done a bunch of podcasts. I think I strongly recommend Jack Bridger's Scaling DevTools, as well as Turner Novak's The Peel. And, you know, I'm sure there's others. So you covered your Twilio story in the past, talked about StreamClub, you got acquired to Mux, and then you left to start Browserbase. So maybe we just start with what is Browserbase? Yeah.

Paul [00:02:02]: Browserbase is the web browser for your AI. We're building headless browser infrastructure, which are browsers that run in a server environment that's accessible to developers via APIs and SDKs. It's really hard to run a web browser in the cloud. You guys are probably running Chrome on your computers, and that's using a lot of resources, right? So if you want to run a web browser or thousands of web browsers, you can't just spin up a bunch of lambdas. You actually need to use a secure containerized environment.
You have to scale it up and down. It's a stateful system. And that infrastructure is, like, super painful. And I know that firsthand, because at my last company, StreamClub, I was CTO, and I was building our own internal headless browser infrastructure. That's actually why we sold the company: Mux really wanted to buy the headless browser infrastructure that we'd built. And it's just a super hard problem. And I actually told my co-founders, I would never start another company unless it was a browser infrastructure company. And it turns out that's really necessary in the age of AI, when AI can actually go out and interact with websites, click on buttons, fill in forms. You need AI to do all of that work in an actual browser running somewhere on a server. And Browserbase powers that.

swyx [00:03:08]: While you're talking about it, it occurred to me, not that you're going to be acquired or anything, but it occurred to me that it would be really funny if you became the Nikita Bier of headless browser companies. You just have one trick, and you make browser companies that get acquired.

Paul [00:03:23]: I truly do only have one trick. I'm screwed if it's not for headless browsers. I'm not a Go programmer. You know, I'm in AI Grant. You know, browsers is an AI Grant thing. But we were the only company in that AI Grant batch that used zero dollars on AI spend. You know, we're purely an infrastructure company. So as much as people want to ask me about reinforcement learning, I might not be the best guy to talk about that. But if you want to ask about headless browser infrastructure at scale, I can talk your ear off. So that's really my area of expertise. And it's a pretty niche thing. Like, nobody has done what we're doing at scale before. So we're happy to be the experts.

swyx [00:03:59]: You do have an AI thing, Stagehand. We can talk about the sort of core of Browserbase first, and then maybe Stagehand. Yeah, Stagehand is kind of the web browsing framework. Yeah.

What is Browserbase? Headless Browser Infrastructure Explained

Alessio [00:04:10]: Yeah. Yeah. And maybe how you got to Browserbase and what problems you saw. So one of the first things I worked on as a software engineer was integration testing. Sauce Labs was kind of like the main thing at the time. And then we had Selenium, we had Playwright, we had all these different browser things. But it's always been super hard to do. So obviously you've worked on this before. When you started Browserbase, what were the challenges? What were the AI-specific challenges that you saw versus, there's kind of like all the usual running browsers at scale in the cloud, which has been a problem for years. What are like the AI-unique things that you saw that like traditional approaches just didn't cover? Yeah.

AI-specific challenges in browser infrastructure

Paul [00:04:46]: First and foremost, I think back to like the first thing I did as a developer, like as a kid when I was writing code: I wanted to write code that did stuff for me. You know, I wanted to write code to automate my life. And I'd do that probably by using curl or Beautiful Soup to fetch data from a web browser. And I think I still do that now that I'm in the cloud. And the other thing that I think is a huge change for me is that you can't just curl a website and parse that data. And we all know that now, like, you know, taking HTML and plugging that into an LLM, you can extract insights, you can summarize.
So it was very clear that now, like, dynamic web scraping became very possible with the rise of large language models, or a lot easier. And that was like a clear reason why there's been more usage of headless browsers, which are necessary because a lot of modern websites don't expose all of their page content via a simple HTTP request. You know, they actually do require you to run JavaScript on the page to hydrate the content. Airbnb is a great example. You go to airbnb.com, and a lot of that content on the page isn't there until after they run the initial hydration. So you can't just scrape it with a curl. You need to have some JavaScript run. And a browser is that JavaScript engine that's going to actually run all those requests on the page. So web data retrieval was definitely one driver of starting Browserbase, along with the rise of being able to summarize that with an LLM.

Also, I was familiar with: if I wanted to automate a website, I could write one script and that would work for one website. It was very static and deterministic. But the web is non-deterministic. The web is always changing. And until we had LLMs, there was no way to write scripts that you could write once that would run on any website, that would change with the structure of the website. "Click the login button" could mean something different on many different websites. And LLMs allow us to generate code on the fly to actually control that. So I think that rise of writing the generic automation scripts that can work on many different websites, to me, made it clear that browsers are going to be a lot more useful, because now you can automate a lot more things without writing. If you wanted to write a script to book a demo call on 100 websites, previously, you had to write 100 scripts. Now you write one script that uses LLMs to generate that script. That's why we built our web browsing framework, Stagehand, which does a lot of that work for you. But those two things, web data collection and then enhanced automation of many different websites, just felt like big drivers for more browser infrastructure that would be required to power these kinds of features.

Alessio [00:07:05]: And was multimodality also a big thing?

Paul [00:07:08]: Now you can use the LLMs to look, even though the text in the DOM might not be as friendly. Maybe my hot take is, I was always kind of like, I didn't think vision would be as big of a driver. For UI automation, I felt like, you know, HTML is structured text and large language models are good with structured text. But it's clear that these computer use models are often vision driven, and they've been really pushing things forward. So definitely being multimodal, like rendering the page, is required to take a screenshot to give that to a computer use model to take actions on a website. And it's just another win for browsers. But I'll be honest, that wasn't what I was thinking early on. I didn't even think that we'd get here so fast with multimodality. I think we're going to have to get back to multimodal and vision models.
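Paul's hydration point is easy to demonstrate. Below is a hedged sketch using Playwright's Node API (one of the frameworks the episode mentions alongside Puppeteer and Selenium); the URL is taken from his Airbnb example, and the comparison of text lengths is an illustrative assumption rather than a production scraper.

```typescript
import { chromium } from "playwright";

// A plain HTTP fetch returns the pre-hydration HTML shell: for JS-heavy sites,
// much of the visible content simply isn't in this string yet.
async function fetchStaticHtml(url: string): Promise<string> {
  const res = await fetch(url);
  return res.text();
}

// A headless browser runs the page's JavaScript, so content that only appears
// after hydration becomes scrapable.
async function fetchHydratedText(url: string): Promise<string> {
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "networkidle" }); // wait for hydration requests to settle
    return await page.innerText("body");
  } finally {
    await browser.close();
  }
}

// Usage: compare how much visible content each approach recovers.
(async () => {
  const url = "https://www.airbnb.com/";
  const staticHtml = await fetchStaticHtml(url);
  const hydrated = await fetchHydratedText(url);
  console.log("static HTML length:", staticHtml.length);
  console.log("hydrated text length:", hydrated.length);
})();
```

From there, the "one generic script" idea is what Stagehand packages up: instead of hand-writing a selector per site, an LLM maps an instruction like "click the login button" onto whatever structure the current page happens to have.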
But I did have my original thesis, which is something that we've talked about on the podcast before, which is: take the GPT store, the custom GPT store; every single checkbox and plugin is effectively a startup. And this was the browser one. I think the main hesitation, I think I actually took a while to get back to you. The main hesitation was that there were others. Like, you're not the first headless browser startup. It's not even your first headless browser startup. There's always a question of, will you be the category winner in a place where there's a bunch of incumbents, to be honest, that are bigger than you? They're just not targeted at the AI space. They don't have the backing of Nat Friedman. And there's a bunch of like, you're here in Silicon Valley. They're not. I don't know.Paul [00:08:47]: I don't know if that was it, but, yeah, I mean, I think I tried all the other ones and I was really disappointed. My background is from working at great developer tools companies, and nothing had the Vercel-like experience. Our biggest competitor actually is partly owned by private equity, and they just jacked up their prices quite a bit. And the dashboard hasn't changed in five years. And I actually used them at my last company and tried them, and I was like, oh man, there really just needs to be something that's like the experience of these great infrastructure companies, like Stripe, like Clerk, like Vercel, that I use and love, but oriented towards this more specific category, which is browser infrastructure, which is really technically complex. A lot of stuff can go wrong on the internet when you're running a browser. The internet is very vast. There's a lot of different configurations. There are still websites that only work with Internet Explorer out there. How do you handle that when you're running your own browser infrastructure? These are the problems that we have to think about and solve at Browserbase. And it's certainly a labor of love, but I built this for me, first and foremost. I know it's super cheesy and everyone says that for their startups, but it really, truly was for me. If you look at the talks I've done even before Browserbase, I'm just really excited to try and build a category-defining infrastructure company. And it's rare to have a new category of infrastructure exist. We're here in the Chroma offices and, like, you know, vector databases is a new category of infrastructure. Is it, is it, I mean, we can, we're in their office, so, you know, we can debate that one later. That is one.Multimodality in AI-Powered Browsingswyx [00:10:16]: That's one of the industry debates.Paul [00:10:17]: I guess we go back to the LLM OS talk that Karpathy gave way long ago. And the browser box was very clearly there, and it seemed like the people who were building in this space also agreed that browsers are a core primitive of infrastructure for the LLM OS that's going to exist in the future. And nobody was building something there that I wanted to use. So I had to go build it myself.swyx [00:10:38]: Yeah. I mean, exactly that talk, that diagram: every box is a startup, and there's the code box and then there's the browser box. I think at some point they will start clashing there. There's always the question of, are you a point solution or are you the sort of all-in-one?
And I think the point solutions tend to win quickly, but then the all-in-ones have a very tight, cohesive experience. Yeah. Let's talk about just the hard problems of Browserbase. You have them on your website, which is beautiful. Thank you. Was there an agency that you used for that? Yeah. Herve.paris.Paul [00:11:11]: They're amazing. Herve.paris. Yeah. It's H-E-R-V-E. I highly recommend for developers, developer tools founders, to work with consumer agencies, because they end up building beautiful things, and the Parisians know how to build beautiful interfaces. So I've got to give props.swyx [00:11:24]: And chat apps, apparently. They are very fast. Oh yeah. The Mistral chat. Yeah. Mistral. Yeah.Paul [00:11:31]: Le Chat.swyx [00:11:31]: Le Chat. And then your videos as well, it was professionally shot, right? The series A video. Yeah.Alessio [00:11:36]: Nico did the videos. He's amazing. Not the initial video that you shot, the new one. The first one was Austin.Paul [00:11:41]: Another, another video, pretty surprised. But yeah, I mean, I think when you think about how you talk about your company, you have to think about the way you present yourself. As a developer, you think you evaluate a company based on the API reliability and the P95, but a lot of developers ask: is the website good? Is the message clear? Do I trust this founder I'm building my whole feature on? So I've tried to nail that, as well as the reliability of the infrastructure. You're right. It's very hard. And there's a lot of footguns that you run into when running headless browsers at scale. Right.Competing with Existing Headless Browser Solutionsswyx [00:12:10]: So let's pick one. You have eight features here. Seamless integration. Scalability. Fast, or speed. Secure. Observable. Stealth. That's interesting. Extensible and developer-first. What comes to your mind as the top two, three hardest ones? Yeah.Running headless browsers at scalePaul [00:12:26]: I think just running headless browsers at scale is the hardest one. And maybe can I nerd out for a second? Is that okay? I heard this is a technical audience, so I'll talk to the other nerds. Whoa. They were listening. Yeah. They're upset. They're ready. The AGI is angry. Okay. So how do you run a browser in the cloud? Let's start with that, right? So let's say you're using a popular browser automation framework like Puppeteer, Playwright, or Selenium. Maybe you've written some code locally on your computer that opens up Google, finds the search bar, then types in a search for Latent Space and hits the search button. That script works great locally. You can see the little browser open up. You want to take that to production. You want to run the script in a cloud environment, so that when your laptop is closed, your browser is still doing something. Well, we use Amazon, so the first thing I'd reach for is probably some sort of serverless infrastructure. I would probably try and deploy on a Lambda. But Chrome itself is too big to run on a Lambda. It's over 250 megabytes. So you can't easily start it on a Lambda. So you maybe have to use something like Lambda layers to squeeze it in there. Maybe use a different Chromium build that's lighter. And you get it on the Lambda. Great. It works. But it runs super slowly. That's because Lambdas are very resource-limited. They only run with, like, one vCPU.
You can run one process at a time. Remember, Chromium is super beefy. It's barely running on my MacBook Air. I'm still downloading it from the pre-run. Yeah, from the test earlier, right? I'm joking. But it's big, you know? So Lambda just won't work really well. Maybe it'll work, but you need something faster. Your users want something faster. Okay. Well, let's put it on a beefier instance. Let's get an EC2 server running. Let's throw Chromium on there. Great. Okay. That works well with one user. But what if I want to run, like, 10 Chromium instances, one for each of my users? Okay. Well, I might need two EC2 instances. Maybe 10. All of a sudden, you have multiple EC2 instances. This sounds like a problem for Kubernetes and Docker, right? Now, all of a sudden, you're using ECS or EKS, the container and Kubernetes solutions by Amazon. You're spinning up and down containers, and you're spending a whole engineer's time on maintaining this stateful distributed system. Those are some of the worst systems to run, because when it's a stateful distributed system, it means that you are bound by the connections to that thing. You have to keep the browser open while someone is working with it, right? That's just a painful architecture to run. And there's all these other little gotchas with Chromium (which is the open source version of Chrome, by the way). You have to install all these fonts. You want emojis working in your browsers because your vision model is looking for the emoji. You need to make sure you have the emoji fonts. You need to make sure you have all the right extensions configured, like, oh, do you want ad blocking? How do you configure that? How do you actually record all these browser sessions? It's a headless browser. You can't look at it. So you need to have some sort of observability. Maybe you're recording videos and storing those somewhere. It all kind of adds up to be this giant monster piece of your project, when all you wanted to do was run a lot of browsers in production for this little script to go to google.com and search. And when I see a complex distributed system, I see an opportunity to build a great infrastructure company. And we really abstract that away with Browserbase, where our customers can use these existing frameworks (Playwright, Puppeteer, Selenium, or our own Stagehand) and connect to our browsers in a serverless-like way, control them, and then just disconnect when they're done. And they don't have to think about the complex distributed system behind all of that. They just get a browser running anywhere, anytime. Really easy to connect to.swyx [00:15:55]: I'm sure you have questions. My standard question with anything: so essentially you're a serverless browser company, and there have been other serverless things that I'm familiar with in the past, serverless GPUs, serverless website hosting. That's where I come from, with Netlify. One question is just, you promise to spin up thousands of browsers in milliseconds. I feel like there's no real solution that does that yet. And I'm just kind of curious how. The only solution I know is to keep a kind of warm pool of servers around, which is expensive, but maybe not so expensive because it's just CPUs. So I'm just like, you know. Yeah.Browsers as a Core Primitive in AI InfrastructurePaul [00:16:36]: You nailed it, right?
I mean, how do you offer a serverless-like experience with something that is clearly not serverless, right? And the answer is, you need to be able to run many browsers on single nodes. We use Kubernetes at Browserbase. So we have many pods that are being scheduled. We have to predictably schedule them up or down. Yes, thousands of browsers in milliseconds is the best-case scenario. If you hit us with 10,000 requests, you may hit a slower cold start, right? So we've done a lot of work on predictive scaling and being able to route stuff to different regions, where we have multiple regions of Browserbase with different pools available. You can also pick the region you want to go to based on lower round-trip latency. That's very important with these types of things; there's a lot of requests going over the wire. So for us, having micro-VMs like Firecracker powering everything under the hood allows us to be super nimble and spin things up or down really quickly with strong multi-tenancy. But in the end, these are the complex infrastructural challenges that we have to deal with at Browserbase. And we have a lot more stuff on our roadmap to give customers more levers to pull, to trade off: do you want really fast browser startup times, or do you want really low costs? And if you're willing to be more flexible on that, we may be able to work better for your use cases.swyx [00:17:44]: Since you used Firecracker, shouldn't Fargate do that for you, or did you have to go lower level than that? We had to go lower level than that.Paul [00:17:51]: I find this a lot with Fargate customers, which is alarming for Fargate. We used to be a giant Fargate customer. Actually, the first version of Browserbase was ECS and Fargate. I think we were actually the largest Fargate customer in our region for a little while. No, what? Yeah, seriously. And unfortunately, it's a great product, but I think if you're an infrastructure company, you actually have to have a deeper level of control over these primitives. I think the same thing is true with databases. We've used other database providers and I think...swyx [00:18:21]: Yeah, serverless Postgres.Paul [00:18:23]: Shocker. When you're an infrastructure company, you're on the hook if any provider has an outage. And I can't tell my customers, hey, we went down because so-and-so went down. That's not acceptable. So for us, we've really moved to bringing things internally. It's kind of the opposite of what we preach. We tell our customers, don't build this in-house, but then we build a lot of stuff in-house. But I think it just really depends on what is in the critical path. We try and have deep ownership of that.Alessio [00:18:46]: On the distributed location side, how does that work for the web, where you might get different content in different locations? The customer is expecting, you know, if you're in the US, I'm expecting the US version. But if you're spinning up my browser in France, I might get the French version. Yeah.Paul [00:19:02]: Yeah. That's a good question. Well, generally, on the localization, there is a thing called locale in the browser. You can set what your locale is, whether you're an en-US browser or not. But some things do IP-based routing. And in that case, you may want to have a proxy.
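For readers who want to see what the locale lever and the serverless-style connect-then-disconnect pattern look like in code, here's a hedged sketch in plain Playwright. The WebSocket endpoint environment variable is hypothetical (Browserbase's actual connection string and options live in their docs); the Playwright APIs shown, connectOverCDP and the locale/timezone context options, are standard:

```ts
// Sketch: attach to a remotely running browser, do work, disconnect.
// REMOTE_BROWSER_WS_URL is a hypothetical placeholder endpoint.
import { chromium } from "playwright";

async function main() {
  // Connect over the Chrome DevTools Protocol to a browser that is
  // already running somewhere else; no local Chromium is launched.
  const browser = await chromium.connectOverCDP(
    process.env.REMOTE_BROWSER_WS_URL!
  );

  // Locale is a browser-level setting: sites honoring Accept-Language
  // and navigator.language will localize accordingly. Sites that do
  // IP-based routing instead need a proxy configured where the browser
  // launches (Playwright's launch() takes a { proxy: { server } } option).
  const context = await browser.newContext({
    locale: "en-US",
    timezoneId: "America/New_York",
  });

  const page = await context.newPage();
  await page.goto("https://example.com");

  await browser.close(); // disconnect; the infrastructure reclaims the browser
}

main();
```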
Like, let's say you're running something in Europe, but you want to make sure you're showing up from the US. You may want to use one of our proxy features, so you can turn on proxies to say, make sure these connections always come from the United States. That's necessary too, because when you're browsing the web, you're coming from a data center IP, and that can make it a lot harder to browse the web. So we do have kind of like this proxy super network. Yeah. We have a proxy for you based on where you're going, so you can reliably automate the web. But if you get scheduled in Europe, that doesn't happen as much. We try and schedule you as close to, you know, the origin that you're trying to go to. But generally you have control over the regions you can put your browsers in. So you can specify West 1 or East 1 or Europe. We only have one region in Europe right now, actually. Yeah.Alessio [00:19:55]: What's harder, the browser or the proxy? I feel like, to me, actually proxying reliably at scale is much harder than spinning up browsers at scale. I'm curious. It's all hard.Paul [00:20:06]: It's layers of hard, right? Yeah. I think it's different levels of hard. I think the thing with the proxy infrastructure is that we work with many different web proxy providers, and some are better than others. Some have good days, some have bad days. And our customers who've built browser infrastructure on their own, they have to go and deal with sketchy actors. First they figure out their own browser infrastructure, and then they've got to go buy a proxy. And it's like, you can pay in Bitcoin and it just kind of feels a little sus, right? It's like you're buying drugs when you're trying to get a proxy online. We have deep relationships with these counterparties. We're able to audit them and say, is this proxy being sourced ethically? Like, it's not running on someone's TV somewhere. Is it free range? Yeah. Free-range organic proxies, right? Right. We do a level of diligence. We're SOC 2, so we have to understand what is going on here. But then we're able to make sure that we route around proxy providers not working. There are proxy providers whose proxies will just stop working all of a sudden. And if you don't have redundant proxying on your own browsers, that means you're hard down, or you may get some serious impacts there. With us, we intelligently know, hey, this proxy is not working. Let's go to this one. And you can kind of build a network of multiple providers to really guarantee the best uptime for our customers. Yeah. So you don't own any proxies? We don't own any proxies. You're right. The team has been saying, who wants to take home a little proxy server? But not yet. We're not there yet. You know?swyx [00:21:25]: It's a very mature market. I don't think you should build that yourself. You should just be a super customer of them. Yeah. Scraping, I think, is the main use case for that. I guess. Well, that leads us into CAPTCHAs and also auth, but let's talk about CAPTCHAs. You had a little spiel that you wanted to give about CAPTCHA stuff.Challenges of Scaling Browser InfrastructurePaul [00:21:43]: Oh, yeah. I was just, I think a lot of people ask: if you're thinking about proxies, you're thinking about CAPTCHAs too. I think it's the same thing. You can go buy CAPTCHA solvers online, but it's the same buying experience. It's some sketchy website you have to integrate with.
It's not fun to buy these things: you can't really trust them, and the docs are bad. What Browserbase does is we integrate a bunch of different CAPTCHA solvers. We do some stuff in-house, but generally we just integrate with a bunch of known vendors and continually monitor and maintain these things and say, is this working or not? Can we route around it or not? These are CAPTCHA solvers. CAPTCHA solvers, yeah. Not CAPTCHA providers, CAPTCHA solvers. Yeah, sorry. CAPTCHA solvers. We really try and make sure all of that works for you. I think as a dev, if I'm buying infrastructure, I want it all to work all the time, and it's important for us to provide that experience by making sure everything does work and monitoring it on our own. Yeah. Right now, the world of CAPTCHAs is tricky. I think AI agents in particular are very much ahead of the internet infrastructure. CAPTCHAs are designed to block all types of bots, but there are now good bots and bad bots. I think in the future, CAPTCHAs will be able to identify who a good bot is, hopefully via some sort of KYC. For us, we've been very lucky. We have very little to no known abuse of Browserbase, because we really look into who we work with. And for certain types of CAPTCHA solving, we only allow them on certain types of plans, because we want to make sure that we know what people are doing, what their use cases are. And that's really allowed us to try and be an arbiter of good bots, which is our long-term goal. I want to build great relationships with people like Cloudflare so we can agree: hey, here are these acceptable bots. We'll identify them for you and make sure we flag when they come to your website. This is a good bot, you know?Alessio [00:23:23]: I see. And Cloudflare said they want to do more of this. So they're going to set, by default, if they think you're an AI bot, they're going to reject. I'm curious if you think this is something that is going to be at the browser level, or, I mean, the DNS level with Cloudflare seems more where it should belong. But I'm curious how you think about it.Paul [00:23:40]: I think the web's going to change. You know, I think that the Internet as we have it right now is going to change. And we all need to just accept that the cat is out of the bag. Instead of wishing the Internet was like it was in the 2000s, where we could have free content online that wouldn't be scraped: it's just not going to happen. And instead, we should think about, one, how can we change the models of information being published online so people can adequately commercialize it? But two, how do we rebuild applications that expect that AI agents are going to log in on their behalf? Those are the things that are going to allow us to identify good and bad bots. And I think the team at Clerk has been doing a really good job with this on the authentication side. I actually think that auth is the biggest thing that will prevent agents from accessing stuff, not CAPTCHAs. And I think there will be agent auth in the future. I don't know if it's going to happen from an individual company, but actually authentication providers that have a, you know, "log in as agent" feature, where you put in your email, you get a push notification that says, hey, your Browserbase agent wants to log into your Airbnb. You can approve that, and then the agent can proceed. That really circumvents the need for CAPTCHAs or logging in as you and sharing your password.
I think agent auth is going to be one way we identify good bots going forward. And I think a lot of this CAPTCHA-solving stuff is really a short-term problem as the internet reorients itself around how it's going to work with agents browsing the web, just like people do. Yeah.Managing Distributed Browser Locations and Proxiesswyx [00:24:59]: Stytch recently was on Hacker News for talking about agent experience, AX, which is a thing that Netlify is also trying to clone and coin and talk about. And we've talked about this on our previous episodes before, in the sense that I actually think that's maybe the only part of the tech stack that needs to be kind of reinvented for agents. Everything else can stay the same: CLIs, APIs, whatever. But auth, yeah, we need agent auth. And it's mostly short-lived. It should be a distinct identity from the human, but paired. I almost think, in the same way that every social network should have your main profile and then your alt accounts or your Finsta, every human token should be paired with an agent token, and the agent token can go and do stuff on behalf of the human token, but not be presumed to be the human. Yeah.Paul [00:25:48]: It's actually very similar to OAuth, is what I'm thinking. And, you know, Reed from Stytch is an investor, Colin from Clerk, Okta Ventures, all investors in Browserbase, because I hope they solve this. They'll make Browserbase's mission more possible, so we don't have to overcome all these hurdles. But I think it will be an OAuth-like flow where an agent will ask to log in as you, and you'll approve the scopes. Like, it can book an apartment on Airbnb, but it can't message anybody. And then, you know, the agent will have some sort of role-based access control within an application. Yeah. I'm excited for that.swyx [00:26:16]: The tricky part is just, there's one layer of delegation here, which is like, you're auth'ing as my user's user, or something like that. I don't know if that's tricky or not. Does that make sense? Yeah.Paul [00:26:25]: You know, actually at Twilio, I worked on the login, identity, and access management teams, right? So I built Twilio's login page.swyx [00:26:31]: You were an intern on that team and then you became the lead in two years? Yeah.Paul [00:26:34]: Yeah. I started as an intern in 2016 and then I was the tech lead of that team. How? That's not normal. I didn't have a life. He's not normal. Look at this guy. I didn't have a girlfriend. I just loved my job. I don't know. I applied to 500 internships for my first job and I got rejected from every single one of them except for Twilio, and then eventually Amazon. And they took a shot on me, and I was getting paid money to write code, which was my dream. Yeah. Yeah. I'm very lucky that this coding thing worked out, because I was going to be doing it regardless. And yeah, I was able to spend a lot of time on a team that was growing at a company that was growing. So it informed a lot of this stuff here. I think these are problems that have been solved with the SAML protocol, with SSO. I think there's really interesting stuff with WebAuthn, these different types of authentication schemes that you can use to authenticate people. The tooling is all there. It just needs to be tweaked a little bit to work for agents. And I think the fact that there are companies that are already
providing authentication as a service really sets it up well. The thing that's hard is reinventing the internet for agents. We don't want to rebuild the internet. That's an impossible task. And I think people often say, well, we'll have this second layer of APIs built for agents. I'm like, we will for the top use cases, but for the rest we can just tweak the internet as-is, starting with the authentication side. I think we're going to be the dumb ones going forward. Unfortunately, I think AI is going to be able to do a lot of the tasks that we do online, which means that it will be able to go to websites, click buttons on our behalf, and log in on our behalf too. So with this kind of web agent future happening, I think with some small structural changes, like you said, it feels like it could all slot in really nicely with the existing internet.Handling CAPTCHAs and Agent Authenticationswyx [00:28:08]: There's one more thing, which is your live view iframe, which lets you take control. Yeah. Obviously very key for Operator now, but was there anything interesting technically there? Or, well, people always want this.Paul [00:28:21]: It was really hard to build, you know? So, okay. Headless browsers, you don't see them, right? They're running in a cloud somewhere. You can't look at them. It's a weird name; I wish we came up with a better name for this thing. But you can't see them, right? And customers don't trust AI agents, right? At least on the first pass. So what we do with our live view is that, when you use Browserbase, you can actually embed a live view of the browser running in the cloud for your customer to see it working. The first reason is to build trust: okay, so I have this script that's going to go automate a website. I can embed it into my web application via an iframe, and my customer can watch. And then we added two-way communication. So now, not only can you watch the browser being operated by AI, if you want to pause and actually click around and type within this iframe that's controlling a browser, that's also possible. And this is all thanks to some of the lower-level protocol, which is called the Chrome DevTools Protocol. It has an API called startScreencast, and you can also send mouse clicks and button clicks to a remote browser. And this is all embeddable within iframes. You have a browser within a browser, yo. And then you simulate the screen, the click, on the other side. Exactly. And this is really nice often for, let's say, a CAPTCHA that can't be solved. You saw this with Operator. Operator actually uses a different approach. They use VNC. So you're able to see the whole window there. What we're doing is something a little lower level with the Chrome DevTools Protocol. It's just PNGs being streamed over the wire. But the same thing is true, right? Like, hey, I'm running a window. Pause. Can you do something in this window? Human. Okay, great. Resume. Sometimes, with 2FA tokens, if you get that text message, you might need a person to type that in. Web agents need human-in-the-loop type workflows still. You still need a person to interact with the browser. And building a UI to proxy that is kind of hard. You may as well just show them the whole browser and say, hey, can you finish this up for me? And then let the AI proceed afterwards.
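The mechanism Paul describes maps onto two documented Chrome DevTools Protocol methods: Page.startScreencast (frames out) and Input.dispatchMouseEvent (synthetic input in). Here's a minimal sketch using Playwright's raw CDP session; how Browserbase wires the frames into an embeddable iframe is their own layer on top, so treat the plumbing around the CDP calls as illustrative:

```ts
// Sketch: stream a headless page as PNG frames and replay a viewer's click.
// Page.startScreencast and Input.dispatchMouseEvent are documented CDP
// methods; the rest (logging, fixed coordinates) is illustrative.
import { chromium } from "playwright";

async function main() {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto("https://example.com");

  const cdp = await page.context().newCDPSession(page);

  // Frames out: each screencastFrame carries a base64 PNG. In a real live
  // view you would forward `data` to the embedding iframe over a WebSocket.
  cdp.on("Page.screencastFrame", async ({ data, sessionId }) => {
    console.log("frame bytes:", Buffer.from(data, "base64").length);
    await cdp.send("Page.screencastFrameAck", { sessionId }); // must ack
  });
  await cdp.send("Page.startScreencast", { format: "png", everyNthFrame: 2 });

  // Input in: replay a viewer's click as a synthetic press/release pair.
  await cdp.send("Input.dispatchMouseEvent", {
    type: "mousePressed", x: 100, y: 200, button: "left", clickCount: 1,
  });
  await cdp.send("Input.dispatchMouseEvent", {
    type: "mouseReleased", x: 100, y: 200, button: "left", clickCount: 1,
  });

  await page.waitForTimeout(2000); // let a few frames arrive
  await browser.close();
}

main();
```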
Is there a future where I stream my current desktop to Browserbase? I don't think so. I think we're very much cloud infrastructure. Yeah. You know, but I think a lot of the stuff we're doing, we do want to build tools. Like, we'll talk about Stagehand, the web agent framework, in a second. But there's a case where a lot of people are going desktop-first for consumer use. And I think Claude is doing a lot of this, where I expect to see, you know, MCP really oriented around the Claude desktop app for a reason, right? Like, I think a lot of these tools are going to run on your computer because it makes... I think it's breaking out. People are putting it on a server. Oh, really? Okay. Well, sweet. We'll see. We'll see that. I was surprised, though. I think that the Browser Company, too, with Dia Browser, it runs on your machine. You know, it's going to be...swyx [00:30:50]: What is it?Paul [00:30:51]: So, Dia Browser, as far as I understand... I used to use Arc. Yeah. I haven't used Arc. But I'm a big fan of the Browser Company. I think they're doing a lot of cool stuff in consumer. As far as I understand, it's a browser where you have a sidebar where you can chat with it, and it can control the local browser on your machine. So, if you imagine what a consumer web agent is, it lives alongside your browser. I think Google Chrome has Project Mariner, I think. I almost call it Project Marinara for some reason. I don't know why. It's...swyx [00:31:17]: No, I think it's someone really likes Waterworld. Oh, I see. The classic Kevin Costner. Yeah.Paul [00:31:22]: Okay. Project Marinara is a similar thing to the Dia Browser, in my mind, as far as I understand it. You have a browser that has an AI interface that will take over your mouse and keyboard and control the browser for you. Great for consumer use cases. But if you're building applications that rely on a browser, and it's more part of a greater AI app experience, you probably need something that's more like infrastructure, not a consumer app.swyx [00:31:44]: Just because I have explored a little bit in this area: do people want branching? So, I have the state of whatever my browser's in, and then I want, like, 100 clones of this state. Do people do that? Or...Paul [00:31:56]: People don't do it currently. Yeah. But it's definitely something we're thinking about. I think the idea of forking a browser is really cool. Technically, kind of hard. We're starting to see this in code execution, where people are forking code execution processes or branching tool calls. Haven't seen it at the browser level yet. But it makes sense. Like, if an AI agent is using a website and it's not sure what path it wants to take to crawl this website, to find the information it's looking for, it would make sense for it to explore both paths in parallel. And that'd be a very... A road not taken. Yeah. And hopefully find the right answer. And then say, okay, this was actually the right one. And memorize that. And go there in the future. On the roadmap. For sure. Don't make my roadmap, please. You know?Alessio [00:32:37]: How do you actually do that? Yeah. How do you fork? I feel like the browser is so stateful for so many things.swyx [00:32:42]: Serialize the state. Restore the state. I don't know.Paul [00:32:44]: So, it's one of the reasons why we haven't done it yet. It's hard. You know?
Like, to truly fork, it's actually quite difficult. The naive way is to open the same page in a new tab and then hope that it's in the same state. But if you have a form halfway filled, you may have to take the whole container, pause it, all the memory, duplicate it, and restart it from there. It could be very slow. So, we haven't solved it. The easy thing to fork is just copying the page object, you know? But I think there needs to be something a little bit more robust there. Yeah.swyx [00:33:12]: So, Morph Labs has this infinite branch thing. They wrote a custom fork of Linux or something that lets them save the system state and clone it. Morph Labs, hit me up. I'll be a customer. Yeah. I think that's the only way to do it. Yeah. Like, unless Chrome has some special API for you. Yeah.Paul [00:33:29]: There's probably something we'll reverse engineer one day. I don't know. Yeah.Alessio [00:33:32]: Let's talk about Stagehand, the AI web browsing framework. You have three core components: Observe, Extract, and Act. Pretty clean landing page. What was the idea behind making a framework? Yeah.Stagehand: AI web browsing frameworkPaul [00:33:43]: So, there are three frameworks that are very popular and already exist, right? Puppeteer, Playwright, Selenium. Those are for building hard-coded scripts to control websites. And as soon as I started to play with LLMs plus browsing, I caught myself code-genning Playwright code to control a website. I would take the DOM, I'd pass it to an LLM, and I'd say, can you generate the Playwright code to click the appropriate button here? And it would do that. And I was like, this really should be part of the frameworks themselves. And I became really obsessed with SDKs that take natural language as part of the API input. And that's what Stagehand is. Stagehand exposes three APIs, and it's a superset of Playwright. So, if you go to a page, you may want to take an action: click on the button, fill in the form, etc. That's what the act command is for. You may want to extract some data. This one takes natural language, like: extract the winner of the Super Bowl from this page. You can give it a Zod schema, so it returns a structured output. And then maybe you're building an agent. You can do an agent loop, and you want to see what actions are possible on this page before taking one. You can do observe. So, you can observe the actions on the page, and it will generate a list of actions. You can guide it, like: give me actions on this page related to buying an item. And you get back, like, buy it now, add to cart, view shipping options, and you can pass that to an LLM in an agent loop to say, what's the appropriate action given this high-level goal? So, Stagehand isn't a web agent. It's a framework for building web agents. And we think that agent loops are actually pretty close to the application layer, because every application probably has different goals or different ways it wants to take steps. I don't think I've seen a generic one. Maybe you guys are the experts here. I haven't seen, like, a really good AI agent framework here. Everyone kind of has their own special sauce, right? I see a lot of developers building their own agent loops, and they're using tools. And I view Stagehand as the browser tool. So, we expose act, extract, observe. Your agent can call these tools. And from that, you don't have to worry about it. You don't have to worry about generating Playwright code performantly.
You don't have to worry about running it. You can just integrate these three tool calls into your agent loop and reliably automate the web.swyx [00:35:48]: A special shout-out to Anirudh, who I met at your dinner, who I think listens to the pod. Yeah. Hey, Anirudh.Paul [00:35:54]: Anirudh's the man. He's a Stagehand guy.swyx [00:35:56]: I mean, the interesting thing about each of these APIs is they're kind of each a startup. Like, specifically extract: you know, Firecrawl is extract. There's, like, Expand AI. There's a whole bunch of extract companies that just focus on extract. I'm curious. I feel like you guys are going to collide at some point. Like, right now, it's friendly. Everyone's in a blue ocean. At some point, it's going to be valuable enough that there's some turf battle here. I don't think you have a dog in this fight. I think you can mock extract to use an external service if they're better at it than you. But it's just an observation that, in the same way that I see each option, each checkbox in the side of custom GPTs becoming a startup, or each box in the Karpathy chart being a startup, this is also becoming a thing. Yeah.Paul [00:36:41]: I mean, the way Stagehand works is that it's MIT-licensed, completely open source. You bring your own API key to your LLM of choice. You can choose your LLM. We don't really make any money off of extract. We only really make money if you choose to run it with our browser. You don't have to. You can actually use your own browser, a local browser. You know, Stagehand is completely open source for that reason. And, yeah, I think if you're building really complex web scraping workflows, I don't know if Stagehand is the tool for you. I think it's really more if you're building an AI agent that needs a few general tools, or if it's doing a lot of web-automation-intensive work. But if you're building a scraping company, Stagehand is not your thing. You probably want something that's going to get HTML content, convert that to Markdown, query it. That's not what Stagehand does. Stagehand is more about reliability. I think we focus a lot on reliability and less so on cost optimization and speed at this point.swyx [00:37:33]: I actually feel like Stagehand, so the way that Stagehand works, it's like, you know, page.act, click on the quick start. Yeah. It's kind of the integration test for the code that you would have to write anyway, like the Puppeteer code that you have to write anyway. And when the page structure changes, because it always does, then this is still the test. This is still the test that I would have to write. Yeah. So it's kind of like a testing framework that doesn't need implementation detail.Paul [00:37:56]: Well, yeah. I mean, Puppeteer, Playwright, and Selenium were all designed as testing frameworks, right? Yeah. And now people are hacking them together to automate the web. I would say, and maybe this is me being too specific, but when I write tests, if the page structure changes without me knowing, I want that test to fail. So I don't know about AI regenerating that. People are using Stagehand for testing, but it's more for usability testing, not testing of, like, has the front end changed or not. Okay. But generally, where we've seen people really take off is if they're building a feature in their application that's kind of like Operator or Deep Research, and they're using Stagehand to power that tool calling in their own agent loop. Okay. Cool.
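For reference, here's a minimal sketch of the three primitives as they appear in the open-source Stagehand SDK (github.com/browserbase/stagehand). Exact method shapes have shifted across versions, so treat the signatures as illustrative rather than copy-paste; the target URL and instructions are made up:

```ts
// Sketch: Stagehand's act / extract / observe, per the open-source SDK.
// Method shapes vary by version; instructions and URL are illustrative.
import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod";

async function main() {
  const stagehand = new Stagehand({ env: "LOCAL" }); // or point at a remote browser
  await stagehand.init();
  const page = stagehand.page;

  await page.goto("https://www.espn.com/nfl/");

  // act: a natural-language action, compiled to a concrete browser step.
  await page.act("click the link to the latest Super Bowl story");

  // extract: natural-language query plus a Zod schema for structured output.
  const result = await page.extract({
    instruction: "extract the winner of the Super Bowl from this page",
    schema: z.object({ winner: z.string() }),
  });
  console.log(result.winner);

  // observe: enumerate candidate actions for an agent loop to choose from.
  const actions = await page.observe("actions related to reading more coverage");
  console.log(actions);

  await stagehand.close();
}

main();
```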
swyx [00:38:37]: So let's go into Operator, the first big agent launch of the year from OpenAI. Seems like they have a whole bunch scheduled. You were on break and your phone blew up. What's your general view of computer use agents, as they're calling it? The overall category, before we go into Open Operator; just the overall promise of Operator. I will observe that I tried it once. It was okay. And I never tried it again.OpenAI's Operator and computer use agentsPaul [00:38:58]: That tracks with my experience, too. I'm a huge fan of the OpenAI team. I do not view Operator as a company killer for Browserbase at all. I think it actually shows people what's possible. I think computer use models make a lot of sense. And what I'm actually most excited about with computer use models is their ability to take screenshots, reason, and output steps. I think that using mouse coordinates, I've seen that prove to be less reliable than I would like. And I just wonder if that's the right form factor. What we've done with our framework is anchor it to the DOM itself, anchor it to the actual item. So if it's clicking on something, it's clicking on that thing, you know? It's more accurate. No matter where it is. Yeah, exactly. Because it really ties in nicely. And it can handle the whole viewport in one go, whereas Operator can only handle what it sees. Can you hover? Is hovering a thing that you can do? I don't know if we expose it as a tool directly, but I'm sure there's an API for hovering, like, move mouse to this position. Yeah, yeah, yeah. I think you can trigger hover via the JavaScript on the DOM itself. But, no, I think when we saw computer use, everyone's eyes lit up, because they realized, wow, AI is going to actually automate work for people. And I think seeing that kind of happen from both of the labs, and I'm sure we're going to see more labs launch computer use models, I'm excited to see all the stuff that people build with it. I'd love to see computer use power controlling a browser on Browserbase. And I think Open Operator, which was our open source version of OpenAI's Operator, was our first take on how we can integrate these models into Browserbase: we handle the infrastructure and let the labs do the models. I don't have a sense that Operator will be released as an API. I don't know. Maybe it will. I'm curious to see how well that works, because I think it's going to be really hard for a company like OpenAI to do things like support CAPTCHA solving or have proxies. I think it's hard for them structurally. Imagine this New York Times headline: OpenAI solves CAPTCHAs. That would be a pretty bad headline. This New York Times headline: Browserbase solves CAPTCHAs? No one cares. No one cares. And our investors are bored. We're all okay with this, you know? We're building this company knowing that the CAPTCHA solving is short-lived, until we figure out how to authenticate good bots. I think it's really hard for a company like OpenAI, who has this brand that's so, so good, to balance that with the icky parts of web automation, which can be kind of complex.
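A quick illustration of the contrast Paul is drawing, in plain Playwright (page and selector are illustrative): coordinate clicks are roughly what a vision-driven computer use model emits, while a DOM-anchored click targets the element itself, so it lands correctly even if the layout has shifted since the screenshot:

```ts
// Sketch: coordinate-based vs. DOM-anchored clicking in Playwright.
import { chromium } from "playwright";

async function main() {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto("https://example.com");

  // Coordinate-based: what "click at (x, y)" from a vision model becomes.
  // Breaks silently if the layout shifts between screenshot and action.
  await page.mouse.click(640, 360);

  // DOM-anchored: resolve the element, then click it. Playwright scrolls
  // it into view and waits for it to be actionable first.
  await page.getByRole("link", { name: /more information/i }).click();

  await browser.close();
}

main();
```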
I'm sure OpenAI knows who to call whenever they need you. Yeah, right. I'm sure they'll have a great partnership.Alessio [00:41:23]: And is Open Operator just, like, a marketing thing for you? How do you think about resource allocation? So, you can spin this up very quickly. And now there's all this open deep research, just open all these things that people are building. We started it, you know. You're the original Open. We're the original Open Operator, you know? Is it just, hey, look, this is a demo, but we'll help you build out an actual product for yourself? Are you interested in going more of a product route? That's kind of the OpenAI way, right? They started as a model provider and then…Paul [00:41:53]: Yeah, we're not interested in going the product route yet. I view Open Operator as a reference project, you know? Let's show people how to build these things using the infrastructure and models that are out there. And that's what it is. Open Operator is very simple. It's an agent loop. It says: take a high-level goal, break it down into steps, use tool calling to accomplish those steps. It takes screenshots and feeds those screenshots into an LLM with the step, to generate the right action. It uses Stagehand under the hood to actually execute this action. It doesn't use a computer use model. And it has a nice interface using the live view that we talked about, the iframe, to embed that into an application. So I felt like people on launch day wanted to figure out how to build their own version of this. And we turned that around really quickly to show them. And I hope we do that with other things, like deep research. We don't have a deep research launch yet. I think David from Aomni actually has an amazing open deep research that he launched. It has, like, 10K GitHub stars now. So he's crushing that. But I think if people want to build these features natively into their application, they need good reference projects. And I think Open Operator is a good example of that.swyx [00:42:52]: I don't know. Actually, I'm actually pretty bullish on API-driven Operator. Because that's the only way that you can sort of, once it's reliable enough, obviously, and now we're nowhere near, but give it five years, it'll happen, you know. And then you can sort of spin this up, and browsers are working in the background, and you don't necessarily have to know. And it just is booking restaurants for you, whatever. I can definitely see that future happening. I had this on the landing page here. This might be slightly out of order. But, you know, you have sort of three use cases for Browserbase. Open Operator, or this is the Operator sort of use case. It's kind of like the workflow automation use case. And it competes with UiPath in the sort of RPA category. Would you agree with that? Yeah, I would agree with that. And then there's Agents, which we talked about already. And web scraping, which I imagine would be the bulk of your workload right now, right?Paul [00:43:40]: No, not at all. I'd say actually the majority is browser automation. We're kind of expensive for web scraping. I think that if you're building a web scraping product, if you need to do occasional web scraping, or you have to do web scraping that works every single time, you want to use browser automation. Yeah. You want to use Browserbase. But if you're building web scraping workflows, what you should do is have a waterfall.
The first request should be a curl to the website. See if you can get it without even using a browser. And then the second request may be, like, a scraping-specific API. There's, like, a thousand scraping APIs out there that you can use to try and get data. ScrapingBee. ScrapingBee is a great example, right? Yeah. And then, if those two don't work, bring out the heavy hitter. Browserbase will 100% work, right? It will load the page in a real browser, hydrate it. I see.swyx [00:44:21]: Because a lot of pages don't render without JS.swyx [00:44:25]: Yeah, exactly.Paul [00:44:26]: So, I mean, the three big use cases, right? Automation, web data collection, and then, if you're building anything agentic that needs a browser tool, you want to use Browserbase.Alessio [00:44:35]: Is there any use case that you were super surprised by, that people might not even think about? Oh, yeah. Or, yeah, anything that you can share? The long tail is crazy. Yeah.Surprising use cases of BrowserbasePaul [00:44:44]: One of the case studies on our website that I think is the most interesting is this company called Benny. So, the way that it works is, if you're on food stamps in the United States, you can actually get rebates if you buy certain things. Yeah. You buy some vegetables. You submit your receipt to the government. They'll give you a little rebate back. Say, hey, thanks for buying vegetables. It's good for you. That process of submitting that receipt is very painful. And the way Benny works is, you use their app to take a photo of your receipt, and then Benny will go submit that receipt for you and then deposit the money into your account. That's actually using no AI at all. It's all hard-coded scripts. They maintain the scripts. They've been doing a great job. And they built this amazing consumer app. But it's an example of all these tedious workflows that people have to do to kind of go about their business. And they're doing it for the sake of their day-to-day lives. And I had never known about food stamp rebates, or the complex forms you have to fill to get them. But the world is powered by millions and millions of tedious forms. Visas, you know. Emirate Lighthouse is a customer, right? You know, they do the O-1 visa. Millions and millions of forms are taking away humans' time. And I hope that Browserbase can help power software that automates away the web forms that we don't need anymore. Yeah.swyx [00:45:49]: I mean, I'm very supportive of that. I mean, forms. I do think government itself is a big part of it. I think the government itself should embrace AI more to do more sort of human-friendly form filling. Mm-hmm. But I'm not optimistic. I'm not holding my breath. Yeah. We'll see. Okay. I think I'm about to zoom out. I have a little brief thing on computer use, and then we can talk about founder stuff, which is, I tend to think of developer tooling markets in impossible triangles, where everyone starts in a niche, and then they start to branch out. So I already hinted at a little bit of this, right? We mentioned Morph. We mentioned E2B. We mentioned Firecrawl. And then there's Browserbase. So there's all this stuff of, like, have a serverless virtual computer that you give to an agent and let them do stuff with it. And there's various ways of connecting it to the internet. You can just connect to a search API, like SerpAPI, whatever; Exa is another one. That's what you use for searching.
You can also have a JSON/Markdown extractor, which is Firecrawl. Or you can have a virtual browser like Browserbase, or you can have a virtual machine like Morph. And then there's also maybe a virtual sort of code environment, like Code Interpreter. So there's just a bunch of different ways to tackle the problem of giving a computer to an agent. And I'm just kind of wondering if you see everyone happily coexisting in their respective niches, and as a developer, I just go and pick a shopping basket of one of each. Or do you think that eventually people will collide?Future of browser automation and market competitionPaul [00:47:18]: I think that currently it's not a zero-sum market. I think we're talking about all of the knowledge work that people do that can be automated online, all of these trillions of hours that happen online where people are working. And I think that there's so much software to be built that I tend not to think about how these companies will collide. I just try to solve the problem as best as I can and make this specific piece of infrastructure, which I think is an important primitive, the best I possibly can. And yeah, I think there are players that are going to launch over-the-top platforms, like agent platforms that have all these tools built in, right? Like, who's building the Rippling for agent tools, that has the search tool, the browser tool, the operating system tool, right? There are some. There are some, right? And I think in the end, what I have seen in my time as a developer, and I look at all the favorite tools that I have, is that for tools and primitives with sufficient levels of complexity, you need to have a solution that's really bespoke to that primitive, you know? And I am sufficiently convinced that the browser is complex enough to deserve a primitive. Obviously, I have to say that. I'm the founder of Browserbase, right? I'm talking my book. But maybe I can give you one spicy take against, maybe, just running a whole OS. When I look at computer use when it first came out, I saw that the majority of use cases for computer use were controlling a browser. And do we really need to run an entire operating system just to control a browser? I don't think so. I don't think that's necessary. You know, Browserbase can run browsers way cheaper than you can if you're running a full-fledged operating system with a GUI. And I think that's just an advantage of the browser. Browsers are little OSs, and you can run them very efficiently if you orchestrate it well. And I think that allows us to offer 90% of the functionality needed at 10% of the cost of running a full OS. Yeah.Open Operator: Browserbase's Open-Source Alternativeswyx [00:49:16]: I definitely see the logic in that. There's a Marc Andreessen quote. I don't know if you know this one. Where he basically observed that the browser is turning the operating system into a poorly debugged set of device drivers, because most of the apps have moved from the OS to the browser. So you can just run browsers.Paul [00:49:31]: There's a place for OSs, too. I think that there are some applications that only run on Windows operating systems.
And Eric from pig.dev, in this upcoming YC batch, or last YC batch, he's building infrastructure that runs tons of Windows operating systems for you to control with your agent. And there are some legacy EHR systems that only run on Internet Explorer. Yeah.Paul [00:49:54]: I think that's it. I think there are use cases for specific operating systems, for specific legacy software. And I'm excited to see what he does with that. I just wanted to give a shout-out to the pig.dev website.swyx [00:50:06]: The pigs jump when you click on them. Yeah. That's great.Paul [00:50:08]: Eric, he's the former co-founder of banana.dev, too.swyx [00:50:11]: Oh, that Eric. Yeah. That Eric. Okay. Well, he abandoned bananas for pigs. I hope he doesn't start going around with pigs now.Alessio [00:50:18]: Like he was going around with bananas. A little toy pig. Yeah. Yeah. I love that. What else are we missing? I think we covered a lot of the Browserbase product history, but what do you wish people asked you? Yeah.Paul [00:50:29]: I wish people asked me more about what the future of software will look like, because that's really where I've spent a lot of time thinking about why to do Browserbase. For me, starting a company is a means of last resort. You shouldn't start a company unless you absolutely have to. And I remain convinced that the future of software is software that you're going to click a button and it's going to do stuff on your behalf. Right now, with software, you click a button and it maybe calls back an API and computes some numbers. It modifies some text, whatever. But the future of software is software using software. So, I may log into my accounting website for my business, click a button, and it's going to go load up my Gmail, search my emails, find the thing, upload the receipt, and then comment it for me. Right? And it may do that using APIs, maybe a browser. I don't know. I think it's a little bit of both. But that's completely different from how we've built software so far. And that future of software has different infrastructure requirements. It's going to require different UIs. It's going to require different pieces of infrastructure. I think the browser infrastructure is one piece that fits into that, along with all the other categories you mentioned. So, I think that it's going to require developers to think differently about how they've built software for, you know
In this final episode of the four-part series on hepatocellular carcinoma (HCC), hosted by the Oncology Brothers, Drs Rohit and Rahul Gosain, the discussion focuses on the evolving role of immunotherapy (IO) in intermediate HCC. The episode explores multimodal approaches that combine IO and IO-based therapies with loco-regional treatments and highlights the essential role of a multidisciplinary care team. Drs Nina Sanford (radiation oncologist), Mark Yarchoan (medical oncologist), and Ed Kim (interventional radiologist) join the Oncology Brothers to share their insights on: • Current treatment options for intermediate HCC, addressing its heterogeneity and standard treatment pathways • Latest clinical trial data (EMERALD-1, LEAP-012) on combining IO with loco-regional therapies, and the clinical implications • The importance of effective collaboration within the multidisciplinary team for delivering optimal patient care • Combining IO with loco-regional therapy and future perspectives in the field Clinical takeaways • IO and IO-based treatments are moving earlier in the treatment paradigm for patients with intermediate HCC. Earlier integration of these therapies aims to achieve improved systemic control, allowing loco-regional therapy to target oligoprogression or residual lesions, or to reduce tumour burden • Emerging data support combining systemic and loco-regional therapies for patients with intermediate HCC. EMERALD-1 and LEAP-012 show promising PFS data using IO-based combination regimens like durvalumab + bevacizumab or pembrolizumab + lenvatinib alongside TACE. Long-term OS data are awaited • Effective communication and coordinated care among specialists, such as medical oncologists, radiation oncologists, hepatologists, and interventional radiologists, are essential to developing optimal treatment strategies for patients with intermediate HCC Follow us on social media: • X/Twitter: https://twitter.com/oncbrothers • Instagram: https://www.instagram.com/oncbrothers • YouTube: https://www.youtube.com/channel/UCjfxKlVho5xWH5ltufj4F4A/ Subscribe to our channel for more insights on oncology treatments and patient care!
In this episode I interviewed Wee Kee, the co-founder of Virtuals. Throughout the conversation you can tell that the Virtuals team is a pragmatic and persevering one. To build a successful product in this fast-moving industry there's no fast track; you stay at the table and keep grinding. Shownotes: You guys have been dominating mindshare amongst all the crypto native folks lately. So first of all, congratulations! But I would still love to hear how you introduce yourself to our audience. There are quite a few platforms that empower developers to launch agents. What key factors do you think led you to where you are today? Do you think the diversity of the agents came from the no-code requirement? What are some of the agents that grew from Virtuals that left an impression on people? Also about multimodality: what are the best-performing categories for agents developed with GAME? How do you decide whether you're going to co-market with one of the ecosystem agents? I read on Messari that you have integrated Farcaster, but I haven't seen a lot of news that you're pushing on that front. Any plans? I was particularly interested to hear your answer on this one, because it looks like with a web3-native social network, an agent developed under the GAME framework will be capable of doing a lot more
Erum and Karl have an incredible chat with Nick Edwards, the innovative mind behind Potato AI. Nick goes into the inspiration behind the platform's unique name, rooted in childhood curiosity and scientific wonder, and shares his journey from neuroscience research to building a groundbreaking tool aimed at accelerating scientific discovery. He explains how Potato tackles the overwhelming deluge of scientific literature and the reproducibility crisis using AI to structure and analyze data in transformative ways. With stories from his own career, a stint in consulting, and running his podcast "Once a Scientist," Nick discusses the future of AI, science, and collaboration. This episode has amazing insights into how technology is reshaping the very fabric of research. Grow Everything brings the bioeconomy to life. Hosts Karl Schmieder and Erum Azeez Khan share stories and interview the leaders and influencers changing the world by growing everything. Biology is the oldest technology. And it can be engineered. What are we growing? Learn more at www.messaginglab.com/groweverything Chapters: 00:00:00 - Potatoes, Balloons, and a DIY Lab Setup 00:04:56 - Reproducibility Crisis: 70% of Research in Question 00:06:40 - Meet Nick Edwards: The Potato Visionary 00:10:00 - Organizing Scientific Chaos: How AI Helps 00:14:47 - AI Research Assistant vs. AI Scientist: The Journey 00:20:09 - From Neuroscience to Entrepreneurship: A Scientist's Leap 00:25:00 - Biopharma Meets Bioindustrials: Bridging Two Worlds 00:29:27 - Liquid Handlers and Shared Protocols: Optimizing Biotech 00:35:13 - Multimodality and the AI Scientist Dream 00:39:04 - Citizen Scientists and Accessible AI Tools 00:45:00 - The Next Scientific Revolution: AI and Decentralized Science Episode Links: Potato AI Once A Scientist Podcast Merck Digital Science Studio Wiley Elsevier Ginkgo Bioworks Topics Covered: Research, AI research assistant, AI Scientist, Biotech, Lab Automation, Reproducibility Have a question or comment? Message us here: Text or Call (804) 505-5553 Instagram / Twitter / LinkedIn / Youtube / Grow Everything Email: groweverything@messaginglab.com Music by: Nihilore Production by: Amplafy Media
Nathan welcomes back computational biochemist Amelie Schreiber for a fascinating update on AI's revolutionary impact in biology. In this episode of The Cognitive Revolution, we explore recent breakthroughs including AlphaFold3, ESM3, and new diffusion models transforming protein engineering and drug discovery. Join us for an insightful discussion about how AI is reshaping our understanding of molecular biology and making complex protein engineering tasks more accessible than ever before. Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse SPONSORS: Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. CHAPTERS: (00:00:00) Teaser (00:00:46) About the Episode (00:04:30) AI for Biology (00:07:14) David Baker's Impact (00:11:49) AlphaFold 3 & ESM3 (00:16:40) Protein Interaction Prediction (Part 1) (00:16:44) Sponsors: Shopify | SelectQuote (00:19:18) Protein Interaction Prediction (Part 2) (00:31:12) MSAs & Embeddings (Part 1) (00:32:32) Sponsors: Oracle Cloud Infrastructure (OCI) | Weights & Biases RAG++ (00:34:49) MSAs & Embeddings (Part 2) (00:35:57) Beyond Structure Prediction (00:51:13) Dynamics vs. Statics (00:57:24) In-Painting & Use Cases (00:59:48) Workflow & Platforms (01:06:45) Design Process & Success Rates (01:13:23) Ambition & Task Definition (01:19:25) New Models: PepFlow & GeoAB (01:28:23) Flow Matching vs. Diffusion (01:30:42) ESM3 & Multimodality (01:37:10) Summary & Future Directions (01:45:34) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431
Nathan interviews Google product managers Shrestha Basu Mallick and Logan Kilpatrick about the Gemini API and AI Studio. They discuss Google's new grounding feature, allowing Gemini models to access real-time web information via Google search. The conversation explores Gemini's rapid growth, its position in the AI landscape, and Google's competitive strategy. Nathan shares insights from integrating Gemini into his own application and ponders the future of large language model capabilities across providers. Tune in for an in-depth look at Google's AI API product strategy and the latest Gemini features. Be notified early when Turpentine drops new publications: https://www.turpentine.co/exclusiveaccess SPONSORS: Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution LMNT: LMNT is a zero-sugar electrolyte drink mix that's redefining hydration and performance. Ideal for those who fast or anyone looking to optimize their electrolyte intake. Support the show and get a free sample pack with any purchase at https://drinklmnt.com/tcr CHAPTERS: (00:00:00) About the Show (00:00:53) Sponsors: Weights & Biases RAG++ (00:01:28) About the Episode (00:04:15) Gemini API Growth (00:05:26) Intro to AI Studio (00:07:35) Vertex vs. AI Studio (00:09:33) Developer Adoption (00:14:23) Gemini Use Cases (Part 1) (00:17:41) Sponsors: Shopify | Notion (00:20:01) Gemini Use Cases (Part 2) (00:23:08) Multimodality & Flash (00:26:29) Free Tier & Costs (00:31:43) Inference Costs (00:32:55) Fine-tuning & Vision (00:36:59) Sponsors: LMNT (00:38:04) Search Grounding (00:44:42) Grounding Sources (00:46:58) Competitive Landscape (00:50:36) Design Decisions (00:54:54) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
In this podcast, we discuss The Society of Thoracic Surgeons/American Society for Radiation Oncology Updated Clinical Practice Guidelines on Multimodality Therapy for Locally Advanced Cancer of the Esophagus or Gastroesophageal Junction. Joining in the discussion are Dr. Stephanie Worrell, Associate Professor and Thoracic Section Chief in the Division of Cardiothoracic Surgery at the University of Arizona College of Medicine, and Dr. Karyn Goodman, Professor and Vice Chair for Research and Quality at the Icahn School of Medicine at Mount Sinai, and Associate Director for Clinical Research at The Tisch Cancer Institute, who served as chair and co-chair of the guideline panel, respectively. Together, we cover important updates and recommendations that incorporate surgical aspects into the multidisciplinary management of this disease, along with practical considerations for everyday practice. Additionally, we discuss in depth the ESOPEC trial, recently presented at the 2024 ASCO annual meeting, and how it has impacted the standard of care for esophageal cancers.
CardioNerds (Dr. Dan Ambinder and Dr. Rick Ferraro) join Dr. Mansi Oberoi and Dr. Mohan Gudiwada from the University of Nebraska Medical Center to discuss a case of constrictive pericarditis. Expert commentary is provided by Dr. Adam Burdorf, who serves as the Program Director for the Cardiovascular Medicine Fellowship at the University of Nebraska Medical Center. The case involves a 76-year-old woman with a history of monoclonal gammopathy of undetermined significance, chronic obstructive pulmonary disease, type 2 diabetes mellitus, and squamous cell carcinoma who was admitted to the hospital for worsening shortness of breath, swelling in the lower extremities, hyponatremia, and a urinary tract infection. A CT chest to evaluate for pulmonary embolism showed incidental pericardial calcifications; the heart failure team was consulted for the management of her decompensated heart failure. Echo images were nondiagnostic. Subsequent invasive hemodynamic monitoring showed elevated right- and left-sided filling pressures, diastolic equalization of LV and RV pressures, and a positive RV square root sign with ventricular interdependence. Cardiac MRI showed septal flattening on deep inspiration and septal bounce, suggestive of interventricular dependence. After a heart team discussion and shared decision-making, the patient opted for medical management owing to her comorbidities and frailty. Enjoy this 2024 JACC State-of-the-Art Review to learn more about pericardial diseases and best practices for pericardiectomy (Al-Kazac et al., JACC 2024). US Cardiology Review is now the official journal of CardioNerds! Submit your manuscript here. CardioNerds Case Reports Page | CardioNerds Episode Page | CardioNerds Academy | CardioNerds Healy Honor Roll | CardioNerds Journal Club | Subscribe to The Heartbeat Newsletter! | Check out CardioNerds SWAG! | Become a CardioNerds Patron! Case Media - Constrictive Pericarditis. Echo: Left ventricular ejection fraction = 55-60%; unclear septal motion in the setting of atrial fibrillation. MRI: Diastolic septal flattening with deep inspiration as well as a septal bounce, suggestive of interventricular dependence and constrictive physiology. References: Garcia, M. Constrictive Pericarditis Versus Restrictive Cardiomyopathy. Journal of the American College of Cardiology, vol. 67, no. 17, 2016, pp. 2061–2076. Pathophysiology and Diagnosis of Constrictive Pericarditis. American College of Cardiology, 2017. Geske, J., Anavekar, N., Nishimura, R., et al. Differentiation of Constriction and Restriction: Complex Cardiovascular Hemodynamics. Journal of the American College of Cardiology, vol. 68, no. 21, 2016, pp. 2329–2347. Constrictive Pericarditis. ScienceDirect. Constrictive Pericarditis. Journal of the American College of Cardiology, vol. 83, no. 12, 2024, pp. 1500–1512.
Descript CEO and founder Andrew Mason joins Lightspeed Partner and Host Michael Mignano on the podcast to talk about the future of content creation with AI tools. Michael and Andrew talk about the evolution of Descript as an AI product designed for podcast and video creators, navigating a world of synthetic content, and Descript's new features including Descript Rooms and Underlord. Andrew talks about raising a $50 Million Series C led by OpenAI Startup Fund and how seeing an early version of ChatGPT inspired confidence in Descript's foundational vision and goal to simplify media production. Episode Chapters (00:00) Introduction (00:09) Introducing Descript Rooms (01:10) Descript's Versatility with Media Creation (04:22) Social Clips and Longform Content (07:15) Craft and Control in AI in Content Creation (13:16) Descript AI Tools, OpenAI, and ChatGPT (17:30) Multimodality and Improving Quality (26:17) Trust and Adoption of AI Features (29:29) Detour and Groupon (37:31) Closing Thoughts Stay in touch: www.lsvp.com X: https://twitter.com/lightspeedvp LinkedIn: https://www.linkedin.com/company/lightspeed-venture-partners/ Instagram: https://www.instagram.com/lightspeedventurepartners/ Subscribe on your favorite podcast app: generativenow.co Email: generativenow@lsvp.com The content here does not constitute tax, legal, business or investment advice or an offer to provide such advice, should not be construed as advocating the purchase or sale of any security or investment or a recommendation of any company, and is not an offer, or solicitation of an offer, for the purchase or sale of any security or investment product. For more details please see lsvp.com/legal.
In this episode, experts discuss a crucial 2024 document outlining appropriate use criteria for multimodality imaging in cardiovascular evaluation before non-emergent non-cardiac surgery, addressing the rising annual surgeries and associated cardiac risks. They delve into balancing the necessity of imaging with cost-effectiveness while exploring the potential of artificial intelligence to enhance future evaluations.
Join Steve Lehmann & Jeremy Langsam from Portal's Stargaze team on a bimonthly segment of Lab Rats to Unicorns: Rising Stars. In this episode, they explore the groundbreaking work of Zachi Attia, the Director of Artificial Intelligence at Mayo Clinic. With a rich background in electrical engineering and a Ph.D. in Bioinformatics, Zachi discusses his pivotal role in advancing AI models that predict and screen cardiovascular diseases. From his innovative research to real-world applications that are saving lives, this episode offers an inspiring look into the future of healthcare.
In this episode, Lillian Erdahl, MD, FACS is joined by Todd Rosengart, MD, FACS, from the Baylor College of Medicine. They discuss Dr Rosengart's recent article, “Sustaining Lifelong Competency of Surgeons: Multimodality Empowerment Personal and Institutional Strategy,” which focuses on maintaining and ensuring the competency of an aging surgeon workforce. The study provides evidence-based guiding principles as part of a comprehensive “whole of career” strategy that can be adopted at a personal, institutional, and national level. Disclosure Information: Drs Erdahl and Rosengart have nothing to disclose. To earn 0.25 AMA PRA Category 1 Credits™ for this episode of the JACS Operative Word Podcast, click here to register for the course and complete the evaluation. Listeners can earn CME credit for this podcast for up to 2 years after the original air date. Learn more about the Journal of the American College of Surgeons, a monthly peer-reviewed journal publishing original contributions on all aspects of surgery, including scientific articles, collective reviews, experimental investigations, and more. #JACSOperativeWord
No Priors: Artificial Intelligence | Machine Learning | Technology | Startups
This week on No Priors, Sarah Guo and Elad Gil sit down with Karan Goel and Albert Gu from Cartesia. Karan and Albert first met as Stanford AI Lab PhDs, where their lab invented State Space Models, or SSMs, a fundamental new primitive for training large-scale foundation models. In 2023, they founded Cartesia to build real-time intelligence for every device. One year later, Cartesia released Sonic, which generates high-quality and lifelike speech with a model latency of 135ms—the fastest for a model of this class. Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @krandiash | @_albertgu Show Notes: (0:00) Introduction (0:28) Use Cases for Cartesia and Sonic (1:32) Karan Goel & Albert Gu's professional backgrounds (5:06) State Space Models (SSMs) versus Transformer-Based Architectures (11:51) Domain Applications for Hybrid Approaches (13:10) Text to Speech and Voice (17:29) Data, Size of Models and Efficiency (20:34) Recent Launch of Text to Speech Product (25:01) Multimodality & Building Blocks (25:54) What's Next at Cartesia? (28:28) Latency in Text to Speech (29:30) Choosing Research Problems Based on Aesthetic (31:23) Product Demo (32:48) Cartesia Team & Hiring
This conversation explores the different categories of technology available in Australia for clinical treatments with Nadine Dilong, Editor of Spa+Clinic. The importance of clinical imaging in the diagnosis and treatment planning process is emphasised. The discussion covers the use of lasers and light therapies for redness and pigmentation, with a focus on the different treatment options and considerations for different skin tones. The conversation also touches on the legislation and safety measures surrounding laser treatments. We discuss skin conditions such as pigmentation, acne and rosacea and offer technology solutions that may accelerate a multi-modality treatment approach. Key Takeaways: Clinical imaging is an important step in the diagnosis and treatment planning process for skin concerns. Laser and light therapies can be effective in treating redness and pigmentation, but the choice of treatment should be based on individual skin type and condition. Legislation and safety measures vary across different states in Australia, highlighting the importance of seeking treatment from qualified practitioners. Regular skincare and clinical imaging can help maintain skin health and prevent future concerns. A correct diagnosis is crucial for determining the appropriate treatment for different types of pigmentation. Multimodality treatments that combine various technologies and devices are becoming more popular in the field of aesthetics. Skincare plays a significant role in preparing the skin for laser and device treatments and maintaining the results. LED light therapy is a simple and effective at-home option for improving skin health. Different lasers and devices can be used to treat acne, acne scars, and pigmentation. CO2, erbium, and fractional lasers are commonly used for skin resurfacing. Laser coring and non-surgical lifting treatments offer alternatives to surgical procedures. Consumers should be cautious when navigating information on social media and seek expert advice. Chapters: 00:00 Introduction and the Importance of Clinical Imaging 03:16 Diagnosis and Technology in Skin Clinics 06:25 Body Composition Analysis 07:49 The Importance of Clinical Imaging for Treatment Results 09:13 Skincare and Clinical Imaging 13:15 Diagnosing Redness and Redness Treatment Options 19:11 IPL and Photorejuvenation for Redness 23:40 Legislation and Laser Treatment Safety 27:38 Laser Treatment for Pigmentation 31:45 Different Types of Pigmentation and the Importance of Correct Diagnosis 34:24 Multimodality Treatments and the Full Range of Options 35:20 The Partnership Between Skincare and Laser/Device Treatments 39:08 LED Light Therapy and At-Home Options 41:26 Treating Acne with Blue LED and Avicleer Laser 45:43 Treating Acne Scars and Pigmentation with Lasers and Devices 47:00 Resurfacing the Skin with CO2, Erbium, and Fractional Lasers 52:34 Laser Coring and Non-Surgical Lifting Treatments 58:25 RF Microneedling and Navigating Information on Social Media Watch the episode here: https://youtu.be/c5_6GcDHvAY See omnystudio.com/listener for privacy information.
This week, we are joined by Dr. Naima Bhana Lopez, an assistant professor of special education and BCBA-D at Niagara University in New York. Dr. Lopez specializes in enhancing social-communication opportunities for children with developmental disabilities. Her work also explores diversity and equity in special education, empowering natural communication partners to improve outcomes for diverse students and families. With expertise in ABA therapy and special education, Dr. Lopez will be discussing the intersection of these fields with diversity. Download to learn more! Resources https://www.abainternational.org/media/180194/abai_interprofessional_collaboration_resource_document.pdf https://aac-learning-center-moodle.psu.edu/ https://abavisualized.com/collections/for-parents/products/aba-en-imagenes-una-guia-visual-para-padres-y-maestros (also available on Amazon) https://abavisualized.com/collections/for-parents/products/aba-visualized-guidebook-2nd-edition (also on Amazon) ................................................................ Autism Weekly is now found on all of the major listening apps including Apple Podcasts, Google Podcasts, Stitcher, Spotify, Amazon Music, and more. Subscribe to be notified when we post a new podcast. Autism Weekly is produced by ABS Kids. ABS Kids is proud to provide diagnostic assessments and ABA therapy to children with developmental delays like Autism Spectrum Disorder. You can learn more about ABS Kids and the Autism Weekly podcast by visiting abskids.com.
TL;DR: You can now buy tickets, apply to speak, or join the expo for the biggest AI Engineer event of 2024. We're gathering *everyone* you want to meet - see you this June.

In last year's Rise of the AI Engineer we put our money where our mouth was and announced the AI Engineer Summit, which fortunately went well: With ~500 live attendees and over ~500k views online, the first iteration of the AI Engineer industry affair seemed to be well received. Competing in an expensive city with 3 other more established AI conferences in the fall calendar, we broke through in terms of in-person experience and online impact. So at the end of Day 2 we announced our second event: the AI Engineer World's Fair. The new website is now live, together with our new presenting sponsor.

We were delighted to invite both Ben Dunphy, co-organizer of the conference, and Sam Schillace, the deputy CTO of Microsoft who wrote some of the first Laws of AI Engineering while working with early releases of GPT-4, on the pod to talk about the conference and how Microsoft is all-in on AI Engineering.

Rise of the Planet of the AI Engineer

Since the first AI Engineer piece, AI Engineering has exploded, and the title has been adopted across OpenAI, Meta, IBM, and many, many other companies. 1 year on, it is clear that AI Engineering is not only in full swing, but is an emerging global industry that is successfully bridging the gap:
* between research and product,
* between general-purpose foundation models and in-context use-cases,
* and between the flashy weekend MVP (still great!) and the reliable, rigorously evaluated AI product deployed at massive scale, assisting hundreds of employees and driving millions in profit.

The greatly increased scope of the 2024 AI Engineer World's Fair (more stages, more talks, more speakers, more attendees, more expo…) helps us reflect the growth of AI Engineering in three major dimensions:
* Global Representation: the 2023 Summit was a mostly-American affair. This year we plan to have speakers from top AI companies across five continents, and explore the vast diversity of approaches to AI across global contexts.
* Topic Coverage:
* In 2023, the Summit focused on the initial questions that the community wrestled with - LLM frameworks, RAG and Vector Databases, Code Copilots and AI Agents. Those are evergreen problems that just got deeper.
* This year the AI Engineering field has also embraced new core disciplines with more explicit focus on Multimodality, Evals and Ops, Open Source Models and GPU/Inference Hardware providers.
* Maturity/Production-readiness: Two new tracks are dedicated toward AI in the Enterprise, government, education, finance, and more highly regulated industries, or AI deployed at larger scale:
* AI in the Fortune 500, covering at-scale production deployments of AI, and
* AI Leadership, a closed-door side event for technical AI leaders to discuss engineering and product leadership challenges as VPs and Heads of AI in their respective orgs.

We hope you will join Microsoft and the rest of us as either speaker, exhibitor, or attendee, in San Francisco this June.
Show Notes
* Ben Dunphy
* 2023 Summit
* GitHub confirmed $100m ARR on stage
* History of World's Fairs
* Sam Schillace
* Writely on Acquired.fm
* Early Lessons From GPT-4: The Schillace Laws
* Semantic Kernel
* Sam on Kevin Scott (Microsoft CTO)'s podcast in 2022
* AI Engineer World's Fair (SF, Jun 25-27)
* Buy Super Early Bird tickets (Listeners can use LATENTSPACE for $100 off any ticket until April 8, or use GROUP if coming in 4 or more)
* Submit talks and workshops for Speaker CFPs (by April 8)
* Enquire about Expo Sponsorship (Asap.. selling fast)

Timestamps
* [00:00:16] Intro
* [00:01:04] 2023 AI Engineer Summit
* [00:03:11] Vendor Neutral
* [00:05:33] 2024 AIE World's Fair
* [00:07:34] AIE World's Fair: 9 Tracks
* [00:08:58] AIE World's Fair Keynotes
* [00:09:33] Introducing Sam
* [00:12:17] AI in 2020s vs the Cloud in 2000s
* [00:13:46] Syntax vs Semantics
* [00:14:22] Bill Gates vs GPT-4
* [00:16:28] Semantic Kernel and Schillace's Laws of AI Engineering
* [00:17:29] Orchestration: Break it into pieces
* [00:19:52] Prompt Engineering: Ask Smart to Get Smart
* [00:21:57] Think with the model, Plan with Code
* [00:23:12] Metacognition vs Stochasticity
* [00:24:43] Generating Synthetic Textbooks
* [00:26:24] Trade leverage for precision; use interaction to mitigate
* [00:27:18] Code is for syntax and process; models are for semantics and intent.
* [00:28:46] Hands on AI Leadership
* [00:33:18] Multimodality vs "Text is the universal wire protocol"
* [00:35:46] Azure OpenAI vs Microsoft Research vs Microsoft AI Division
* [00:39:40] On Satya
* [00:40:44] Sam at AI Leadership Track
* [00:42:05] Final Plug for Tickets & CFP

Transcript

[00:00:00] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO in residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.

[00:00:16] Intro

[00:00:16] swyx: Hey, hey, we're back again with a very special episode, this time with two guests and talking about the very in person events rather than online stuff.

[00:00:27] swyx: So first I want to welcome Ben Dunphy, who is my co-organizer on AI engineer conferences. Hey, hey, how's it going? We have a very special guest. Anyone who's looking at the show notes and the title will preview this later. But I guess we want to set the context. We are effectively doing promo for the upcoming AI Engineer World's Fair that's happening in June.

[00:00:49] swyx: But maybe something that we haven't actually recapped much on the pod is just the origin of the AI Engineer Summit and why, what happens and what went down. Ben, I don't know if you'd like to start with the raw numbers that people should have in mind.

[00:01:04] 2023 AI Engineer Summit

[00:01:04] Ben Dunphy: Yeah, perhaps your listeners would like just a quick background on the summit.

[00:01:09] Ben Dunphy: I mean, I'm sure many folks have heard of our events.
You know, you launched, we launched the AI Engineer Summit last June with your, your article kind of coining the term that was on the tip of everyone's tongue, but curiously had not been actually coined, which is the term AI Engineer, which is now many people's job titles. You know, we're seeing a lot more people come to this event with the job description of AI engineer, with the job title of AI engineer. So this is an event that you and I really talked about since February of 2023, when we met at a hackathon you organized. We were both excited by this movement, and it hadn't really had a name yet.

[00:01:48] Ben Dunphy: We decided that an event was warranted, and that's why we moved forward with the AI Engineer Summit, which ended up being a great success. You know, we had over 5,000 people apply to attend in person. We had over 9,000 folks attend online, with over 20,000 on the live stream.

[00:02:06] Ben Dunphy: In person, we accepted about 400 attendees and had speakers, workshop instructors and sponsors, all congregating in San Francisco over, two days, um, two and a half days, with a, with a welcome reception. So it was quite the event to kick off kind of this movement that's turning into quite an exciting

[00:02:24] swyx: industry.

[00:02:25] swyx: The overall idea of this is that I kind of view AI engineering, at least in all my work in Latent Space and the other stuff, as starting an industry.

[00:02:34] swyx: And I think every industry, every new community, needs a place to congregate. And I definitely think that AI engineer, at least at the conference, is that it's meant to be like the biggest gathering of technical engineering people working with AI. Right. I think we kind of got that spot last year. There was a very competitive conference season, especially in San Francisco.

[00:02:54] swyx: But I think as far as I understand, in terms of cultural impact, online impact, and the speakers that people want to see, we, we got them all, and it was very important for us to be a vendor neutral type of event. Right. The reason I partnered with Ben is that Ben has a lot of experience, a lot more experience doing vendor neutral stuff.

[00:03:11] Vendor Neutral

[00:03:11] swyx: I first met you when I was speaking at one of your events, and now we're sort of business partners on that. And yeah, I mean, I don't know if you have any sort of thoughts on making things vendor neutral, making things more of a community industry conference rather than like something that's owned by one company. Yeah.

[00:03:25] Ben Dunphy: I mean, events that are owned by a company are great, but these are typically where you have product pitches and a smaller internet community. But if you want the truly internet community, if you want a more varied audience and, you know, frankly, better content for, especially for, a technical audience, you want a vendor neutral event.
And this is because when you have folks that are running the event that are focused on one thing and one thing alone, which is quality, quality of content, quality of speakers, quality of the in person experience, and just general relevance, it really elevates everything to the next level.

[00:04:01] Ben Dunphy: And when you have someone like yourself who's coming to this content curation, the role that you take at this event, and bringing that neutrality along with your experience, that really helps to take it to the next level. And then when you have someone like myself focusing on just the program curation and the in person experience, then both of our forces combined, we can, like, really create this epic event. And so, these vendor neutral events: if you've been to a small community event, typically these are vendor neutral, but also if you've been to a really, really popular industry event, many of the top industry events are actually vendor neutral.

[00:04:37] Ben Dunphy: And that's because of the fact that they're vendor neutral, not in spite of

[00:04:41] swyx: it. Yeah, I've been pretty open about the fact that my dream is to build the KubeCon of AI. So if anyone has been in the Kubernetes world, they'll understand what that means. And then, or, or instead of the NeurIPS, NeurIPS for engineers, where engineers are the stars and engineers are sharing their knowledge,

[00:04:57] swyx: perspectives, because I think AI is definitely moving over from research to engineering and production. I think one of my favorite parts was just honestly having GitHub and Microsoft support, which we'll cover in a bit, but you know, announcing finally that GitHub's Copilot was such a commercial success I think was the first time that was actually confirmed by anyone in public.

[00:05:17] swyx: For me, it's also interesting as sort of the conference curator to put Microsoft next to competitors, some of which might be much smaller AI startups, and to see what, where different companies are innovating in different areas.

[00:05:27] swyx: Well, they're next to

[00:05:27] Ben Dunphy: each other in the arena. So they can be next to each other on stage too.

[00:05:33] Why AIE World's Fair

[00:05:33] swyx: Okay, so this year, World's Fair, we are going a lot bigger. What details are we disclosing right now? Yeah,

[00:05:39] Ben Dunphy: I guess we should start with the name: why are we calling it the World's Fair? And I think we need to go back to what inspired this, what the original World's Fair actually was, which started in the late 1700s and went to the early 1900s.

[00:05:53] Ben Dunphy: And it was intended to showcase the incredible achievements of nation states, corporations, and individuals in these grand expos. So you have these miniature cities actually being built for these grand expos. In San Francisco, for example, you had the entire Marina District built up in absolutely new construction to showcase the achievements of industry, architecture, art, and culture.

[00:06:16] Ben Dunphy: And many of your listeners will know that in 1893, Nikola Tesla famously provided power to the Chicago World's Fair with his AC power generators. There's lots of great movies and documentaries about this.
That was the first electric World's Fair, which was thereafter referred to as the White City.

[00:06:33] Ben Dunphy: So in today's world we have technological change that's similar to what was experienced during the industrial revolution, in how it's just upending our entire life, how we live, work, and play. And so we have artificial intelligence, which has long been the dream of humanity. It's, it's finally here. And the pace of technological change is just accelerating.

[00:06:51] Ben Dunphy: So with this event, as you mentioned, we, we're aiming to create a singular event where the world's foremost experts, builders, and practitioners can come together to exchange and reflect. And we think this is not only good for business, but it's also good for our mental health.

[00:07:12] Ben Dunphy: It slows things down a bit from the Twitter news cycle to an in person festival of smiles, handshakes, connections, and in depth conversations that online media and online events can only ever dream of replicating. So this is an expo-led event where the world's top companies will mingle with the world's top founders and AI engineers who are building with and enhanced by AI.

[00:07:34] AIE World's Fair: 9 Tracks

[00:07:34] Ben Dunphy: And not to mention, we're featuring over a hundred talks and workshops across

[00:07:37] swyx: nine tracks. Yeah, I mean, those nine tracks will be fun. Actually, do we have a little preview of the tracks and the, the speakers?

[00:07:43] Ben Dunphy: We do. Folks can actually see them today at our website. We've updated that at ai.engineer.

[00:07:48] Ben Dunphy: So we'd encourage them to go there to see that. But for those just listening, we have nine tracks. So we have multimodality. We have retrieval augmented generation, featuring LLM frameworks and vector databases, evals and LLM ops, open source models, code gen and dev tools, GPUs and inference, AI agent applications, AI in the Fortune 500, and then we have a special track for AI leadership, which you can access by purchasing the VP pass, which is different from the, the other passes we have.

[00:08:20] Ben Dunphy: And I won't go into each of these tracks in depth, unless you want to, Swyx, but there's more details on the website at ai.engineer.

[00:08:28] swyx: I mean, I, I'm very much looking forward to talking to our special guests for the last track, I think, which is what a lot of, yeah, leaders are thinking about, which is how to inspire innovation in their companies, especially the sort of larger organizations that might not have the in house talent for that kind of stuff.

[00:08:47] swyx: So yeah, we can talk about the expo, but I'm very keen to talk about the presenting sponsor if you want to go slightly out of order from our original plan.

[00:08:58] AIE World's Fair Keynotes

[00:08:58] Ben Dunphy: Yeah, absolutely. So you know, for the stage of keynotes, we have talks confirmed from Microsoft, OpenAI, AWS, and Google.

[00:09:06] Ben Dunphy: And our presenting sponsor is joining the stage with those folks. And so that presenting sponsor this year is a dream sponsor. It's Microsoft. It's the company really helping to lead the charge into this wonderful new era that we're all taking part in.
So, yeah,

[00:09:20] swyx: you know, a bit of context, like when we first started planning this thing, I was kind of brainstorming, like, who would we like to get as the ideal presenting sponsors, as ideal partners long term, just in terms of encouraging the AI engineering industry, and it was Microsoft.

[00:09:33] Introducing Sam

[00:09:33] swyx: So Sam, I'm very excited to welcome you onto the podcast. You are CVP and Deputy CTO of Microsoft. Welcome.

[00:09:40] Sam Schillace: Nice to be here. I'm looking forward to, I was looking forward to Alessio saying my last name correctly this time. Oh

[00:09:45] swyx: yeah. So I, I studiously avoided saying, saying your last name, but apparently it's an Italian last name. Ski Lache. Ski

[00:09:51] Alessio: Lache. Yeah. No, that, that's great, Shawn. That's great as a musical person.

[00:09:54] swyx: And it, it's also, yeah, I pay attention to, like, the, the, the lilt. So it's ski lache, and the, the slowing of the "la" is, is what I focused

[00:10:03] Sam Schillace: on. You say both Ls. There's no silent letters, you say

[00:10:07] Alessio: both of those. And it's great to have you, Sam. You know, we've known each other now for a year and a half, two years, and our first conversation, well, it was at Lobby Conference, and then we had a really good one in the kind of parking lot of a Safeway, because we didn't want to go into Starbucks to meet, so we sat outside for about an hour, an hour and a half, and then you had to go to a bluegrass concert, so it was great.

[00:10:28] Alessio: Great meeting, and now, finally, we have you on Latent Space.

[00:10:31] Sam Schillace: Cool, cool. Yeah, I'm happy to be here. It's funny, I was just saying to Swyx before you joined that, like, it's kind of an intimidating podcast. Like, when I listen to this podcast, it seems to be, like, one of the more intelligent ones, like, more, more, like, deep technical folks on it.

[00:10:44] Sam Schillace: So, it's, like, it's kind of nice to be here. It's fun. Bring your A game. Hopefully I'll, I'll bring mine. I

[00:10:49] swyx: mean, you've been programming for longer than some of our listeners have been alive, so I don't think your technical chops are in any doubt. So you were responsible for Writely as one of your early wins in your career, which then became Google Docs, and obviously you were then responsible for a lot more of G Suite.

[00:11:07] swyx: But did you know that you were covered in Acquired.fm episode 9, which is one of the podcasts that we model ourselves after?

[00:11:13] Sam Schillace: Oh, cool. I didn't, I didn't realize that. The most fun way to say this is that I still have, to this day, in my personal GDocs account, the very first Google Doc. Like, I actually have it.

[00:11:24] Sam Schillace: And I looked it up, like, it occurred to me like six months ago that it was probably around, and I went and looked, and it's still there. So it's like, and it's kind of a funny thing, 'cause it's like the backend has been rewritten at least twice that I know of, the front end has been rewritten at least twice that I know of.

[00:11:38] Sam Schillace: So I'm not sure in what sense it's still the original one; it's sort of more the idea of the original one. Like, the NFT of it would probably be more authentic. I

[00:11:46] swyx: still have it. It's a Ship of Theseus thing.
Does it, does it say hello world or something more mundane?

[00:11:52] Sam Schillace: It's, it's, it's me and Steve Newman trying to figure out if some collaboration stuff is working, and also a picture of Edna from the Incredibles that I probably pasted in later, because that's, that's too early for that, I think.

[00:12:05] swyx: People can look up your LinkedIn, and we're going to link it on the show notes, but you're also SVP of engineering for Box, and then you went back to Google to lead Google Maps, and now you're Deputy CTO.

[00:12:17] AI in 2020s vs the Cloud in 2000s

[00:12:17] swyx: I mean, there's so many places to start, but maybe one place I like to start off with is: do you have a personal GPT-4 experience?

[00:12:25] swyx: Obviously being at Microsoft, you have, you had early access, and everyone talks about Bill Gates's

[00:12:30] Sam Schillace: demo. Yeah, it's kind of, yeah, that's, it's kind of interesting. Like, yeah, we got access, I got access to it like in September of 2022, I guess, like before it was really released. And it like almost instantly was just, like, mind-blowing to me how good it was.

[00:12:47] Sam Schillace: I would try experiments like very early on, like, I play music. There's this thing called ABC notation. That's like an ASCII way to represent music. And like, I was like, I wonder if it can, like, compose a fiddle tune. And like, it composed a fiddle tune. I'm like, I wonder if it can change key, change the key.

[00:13:01] Sam Schillace: Like, it's like really, it was like very astonishing. And I sort of, I'm very, like, abstract. My background is actually more math than CS. I'm a very abstract thinker and sort of categorical thinker. And the, the thing that occurred to me with, with GPT-4 the first time I saw it was: this is really like the beginning, it's the beginning of V2 of the computer industry completely.

[00:13:23] Sam Schillace: I had the same feeling, of like a category shifting, that I had when the cloud stuff happened with the GDocs stuff, right? Where it's just like, all of a sudden this like huge vista opens up of capabilities. And I think the way I characterized it, which is a little bit nerdy, but I'm a nerd, so lean into it, is like everything until now has been about syntax.

[00:13:46] Syntax vs Semantics

[00:13:46] Sam Schillace: Like, we have to do mediation. We have to describe the real world in forms that the digital world can manage. And so we're the mediation, and we, like, do that via things like syntax and schema and programming languages. And all of a sudden, like, this opens the door to semantics, where, like, you can express intention and meaning and nuance and fuzziness.

[00:14:04] Sam Schillace: And the machine itself is doing, the model itself is doing a bunch of the mediation for you. And like, that's obviously like complicated. We can talk about the limits and stuff, and it's getting better in some ways. And we're learning things and all kinds of stuff is going on around it, obviously.

[00:14:18] Sam Schillace: But like, that was my immediate reaction to it was just like, Oh my God.

[00:14:22] Bill Gates vs GPT-4

[00:14:22] Sam Schillace: Like, and then I heard about the build demo, where like Bill had been telling Kevin Scott this: This investment is a waste. It's never going to work. AI is blah, blah, blah.
And come back when it can pass, like, an AP bio exam. And they actually literally did that: at one point they brought in, like, the world champion of the AP bio test, or whatever the AP competition is, and ChatGPT or GPT-4 both did the AP bio exam, and GPT-4 beat her. So that was the moment that convinced Bill that this was actually real.

[00:14:53] Sam Schillace: Yeah, it's fun. I had a moment with him actually about three weeks after that, when we had been, so I started, like, diving in on developer tools almost immediately, and I built this thing with a small team that's called the Semantic Kernel, which is one of the very early orchestrators, just because I wanted to be able to put code and inference together.

[00:15:10] Sam Schillace: And that's probably something we should dig into more deeply, 'cause I think there's some good insights in there. But I, I had a bunch of stuff that we were building, and then I was asked to go meet with Bill Gates about it, and he's kind of famously skeptical, and, and so I was a little bit nervous to meet him the first time.

[00:15:25] Sam Schillace: And I started the conversation with, Hey, Bill, like three weeks ago, you would have called BS on everything I'm about to show you. And I would probably have agreed with you, but we've both seen this thing. And so we both know it's real. So let's skip that part and, like, talk about what's possible.

[00:15:39] Sam Schillace: And then we just had this kind of fun, open ended conversation and I showed him a bunch of stuff. So that was like a really nice, fun, fun moment as well. Well,

[00:15:46] swyx: that's a nice way to meet Bill Gates and impress

[00:15:48] Sam Schillace: him. A little funny. I mean, it's like, I wasn't sure what he would think of me, given what I've done to his

[00:15:54] Sam Schillace: crown jewel. But he was nice. I think he likes

[00:15:59] swyx: GDocs. Crown jewel as in Google Docs versus Microsoft Word? Office.

[00:16:03] Sam Schillace: Yeah. Yeah, versus Office. Yeah, like, I think, I mean, I can imagine him not liking it. I met Steven Sinofsky once and he sort of respectfully, but sort of grimaced at me. You know, like, because of how much trauma I had caused him.

[00:16:18] Sam Schillace: So Bill was very nice to

[00:16:20] swyx: me. In general it's like friendly competition, right? They keep you, they keep you sharp, you keep each

[00:16:24] Sam Schillace: other sharp. Yeah, no, I think that's, it's definitely respect, it's just kind of funny.

[00:16:28] Semantic Kernel and Schillace's Laws of AI Engineering

[00:16:28] Sam Schillace: Yeah,

[00:16:28] swyx: So, speaking of Semantic Kernel, I had no idea that you were that deeply involved, that you actually had laws named after you. This only came up after looking into you for a little bit. Schillace's Laws, how did those, what's the, what's the origin

[00:16:41] Sam Schillace: story? Hey! Yeah, that's kind of funny. I'm actually kind of a modest person, and so I'm not sure how I feel about having my name attached to them. Although I do agree with all, I believe all of them, because I wrote all of them.

[00:16:49] Sam Schillace: This is, like, a designer, John Maeda, who works with me, decided to stick my name on them and put them out there. Seriously, but like, well, but like, so this was just, I, I'm not, I don't build models. Like, I'm not an AI engineer in the sense of, of, like, an AI researcher that's, like, doing inference. Like, I'm somebody who's, like, consuming the models.

[00:17:09] Sam Schillace: Exactly.
So it's kind of funny, when you're talking about AI engineering, like, it's a good way of putting it, 'cause that's how, like, I think about myself. I'm like, I'm an app builder. I just want to build with this tool. Yep. And so we spent all of the fall and into the winter in that first year, like, just trying to build stuff and learn how this tool worked.

[00:17:29] Orchestration: Break it into pieces

[00:17:29] Sam Schillace: And I guess those are a little bit in the spirit of, like, Jon Bentley's Programming Pearls or something. I was just like, let's kind of distill some of these ideas down of, like, how does this thing work? I saw something I still see today with people: doing, like, inference is still kind of expensive.

[00:17:46] Sam Schillace: GPUs are still kind of scarce. And so people try to get everything done in, like, one shot. And so there's all this, like, prompt tuning to get things working. And one of the first laws was, like, break it into pieces. Like, if it's hard for you, it's going to be hard for the model. But there's this kind of weird thing where, like,

[00:18:02] Sam Schillace: it's absolutely not a human being, but starting to think about, like, how would I solve the problem is often a good way to figure out how to architect the program so that the model can solve the problem. So, like, that was one of the first laws. That came from me just trying to, like, replicate a test of a, like, a more complicated reasoning process that you have to go through, that, that was Google's ReAct thing, and I was trying to get GPT-4 to do it on its own.

[00:18:32] Sam Schillace: And, and so I'd ask it the question that was in this paper, and the answer to the question is, like, the year 2000. It's like, what year did this particular author who wrote this book live in this country? And you've kind of got to carefully reason through it. And, like, I could not get GPT-4 to just answer the question with the year 2000.

[00:18:50] Sam Schillace: And if you're thinking about this, the kernel is like a pipelined orchestrator, right? It's, like, very Unix-y, where, like, you have some kind of command and you pipe stuff to the next parameters and output to the next thing. So I'm thinking about this as, like, one module in, like, a pipeline, and I just want it to give me the answer.

[00:19:05] Sam Schillace: I don't want anything else. And I could not prompt engineer my way out of that. It was just giving me a paragraph of reasoning. And so I sort of, like, anthropomorphized a little bit, and I was like, well, the only way it can think about stuff is to think out loud, because there's nothing else that the model does.

[00:19:19] Sam Schillace: It's just doing token generation. And so it's not going to be able to do this reasoning if it can't think out loud. And that's why it's always producing this. But if you take that paragraph of output, which did get to the right answer, and you pipe it into a second prompt that just says, read this conversation and just extract the answer and report it back, that's an easier task.

[00:19:38] Sam Schillace: That would be an easier task for you to do or me to do. It's easier reasoning. And so it's an easier thing for the model to do, and it's much more accurate. And that's like 100 percent accurate. It always does that.
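(Editor's note: a minimal sketch of the reason-then-extract pipeline described above, assuming a generic `complete(prompt)` callable that wraps whatever chat-completion API is in use; the function name and prompts are illustrative, not the actual Semantic Kernel API.)

```python
# Sketch of the two-stage "break it into pieces" pipeline. `complete` is a
# stand-in for any chat-completion call (OpenAI, Azure OpenAI, etc.) that
# takes a prompt string and returns the model's text response.
from typing import Callable

def answer_question(question: str, complete: Callable[[str], str]) -> str:
    # Stage 1: let the model think out loud. Token generation is the only
    # mechanism it has, so demanding a bare answer here tends to fail.
    reasoning = complete(
        "Answer the following question, reasoning step by step:\n" + question
    )
    # Stage 2: an easier task -- read the reasoning and report only the
    # final answer. Piping stage 1's output into this prompt is far more
    # reliable than tuning a single one-shot prompt.
    return complete(
        "Read this reasoning and reply with only the final answer, "
        "nothing else:\n" + reasoning
    )
```

Each stage is one module in a Unix-style pipeline: the first produces the thinking, the second consumes it, and neither prompt has to do a job that is hard for the model.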
So, like, that was one of those, those insights that led to the, the Schillace Laws.

[00:19:52] Prompt Engineering: Ask Smart to Get Smart

[00:19:52] Sam Schillace: I think one of the other ones that's kind of interesting, that I think people still don't fully appreciate, is that GPT-4 is the rough equivalent of, like, a human being sitting down for centuries or millennia and reading all the books that they can find.

[00:20:14] Sam Schillace: It's this vast mind, right, and the embedding space, the latent space, is 100K, a 100,000-dimensional space, right? Like, it's this huge, high-dimensional space, and we don't have good, um, intuition about high-dimensional spaces; like, the topology works in really weird ways, connectivity works in weird ways.

[00:20:30] Sam Schillace: So a lot of what we're doing is, like, aiming the attention of a model into some part of this very weirdly connected space. That's kind of what prompt engineering is. But that kind of, like, what we observed to begin with that led to one of those laws was, you know, ask smart to get smart.

[00:20:50] Sam Schillace: And I think we've all, we all understand this now, right? Like, this is the whole field of prompt engineering. But, like, if you ask, like, a simple, a simplistic question of the model, you'll get kind of a simplistic answer, 'cause you're pointing it at a simplistic part of that high-dimensional space.

[00:21:04] Sam Schillace: And if you ask it a more intelligent question, you get more intelligent stuff back out. And so I think that's part of, like, how you think about programming as well. It's like, how are you directing the attention of the model? And I think we still don't have a good intuitive feel for that. To me,

[00:21:08] Alessio: the most interesting thing is how do you tie the ask smart, get smart with the syntax and semantics piece. I gave a talk at GDC last week about the rise of full stack employees and how these models are, like, semantic representations of tasks that people do.

[00:21:23] Alessio: But at the same time, we have code also becoming a semantic representation. You know, I give you the example of, like, Python's sort: it's, like, really a semantic function. It's not code, but it's actually code underneath. How do you think about tying the two together, where you have code

[00:21:39] Alessio: to then extract the smart parts, so that you don't have to, like, ask smart every time, and, like, kind of wrap them in, like, higher-level functions?

[00:21:46] Sam Schillace: Yeah, this is, this is actually, we're skipping ahead to kind of later in the conversation, but I like to, I usually like to distill stuff down into these little aphorisms that kind of help me remember them.

[00:21:57] Think with the model, Plan with Code

[00:21:57] Sam Schillace: You know, so we can dig into a bunch of them. One of them is pixels are free, one of them is bots are docs. But the one that's interesting here is think with the model, plan with code. And so one of the things, so one of the things we've realized, we've been trying to do lots of these, like, longer-running tasks.

[00:22:13] Sam Schillace: Like, we did this thing called the infinite chatbot, which was the successor to the Semantic Kernel, which is an internal project. It's a lot like GPTs, the OpenAI GPTs, but it's, like, a little bit more advanced in some ways, kind of a deep exploration of a RAG-based bot system.
[00:22:13] Sam Schillace: One of the things we've realized as we've been trying to do lots of these longer-running tasks: we did this thing called the infinite chatbot, the successor to Semantic Kernel, which is an internal project. It's a lot like the OpenAI GPTs, but a little more advanced in some ways, a deep exploration of a RAG-based bot system. And then we did multi-agents from that, trying to do some autonomy stuff, and we kept banging our heads against this thing.[00:22:34] Sam Schillace: One of the things I started to realize, and this is going to get nerdy for a second, I apologize, but let me dig in on it (no apology needed): again, this is a little bit of an anthropomorphism and an illusion we're having. When we look at these models, we think there's something continuous there.[00:22:51] Sam Schillace: We think we're having a conversation with ChatGPT, or with Azure OpenAI, or whatever. What's really happening is a little bit like watching claymation. When you watch claymation, you don't think the clay model is actually alive; you know there's a bunch of still, disconnected frames that your mind is connecting into a continuous experience.[00:23:12] Metacognition vs Stochasticity[00:23:12] Sam Schillace: And that's the same thing going on with these models: the prompts are all disconnected, no matter what. Which means you're putting a lot of weight on memory. This is the thing we talked about: you're putting a lot of weight on the precision and recall of your memory system.[00:23:27] Sam Schillace: And it turns out that because the models are stochastic, they're kind of random, they'll make stuff up if things are missing. If you're naive about your memory system, you'll get lots of accumulated similar memories that will clog the system, things like that. So there are lots of ways in which memory is hard to manage well, and that's okay.[00:23:47] Sam Schillace: But when you're doing plans and these longer-running things you're talking about, that second level, the metacognition, is very vulnerable to that stochastic noise. I totally want to put this on a bumper sticker: "metacognition is susceptible to stochasticity" would be a great bumper sticker.[00:24:07] Sam Schillace: So these things are very vulnerable to feedback loops when they're trying to do autonomy, and they're very vulnerable to getting lost. We've had multi-agent autonomous things get stuck complimenting each other, or get stuck being quote-unquote frustrated and go on strike. There are all kinds of weird feedback loops you get into.[00:24:22] Sam Schillace: So what we've learned, to answer your question of how you put all this together, is: the model's good at thinking, but it's not good at planning. So you do planning in code. You have to describe the larger process of what you're doing in code somehow,[00:24:38] Sam Schillace: the semantic intent or whatever, and then you let the model fill in the pieces.
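One concrete way to act on the memory failure mode Sam just described is to check a candidate memory against what is already stored and refuse near-duplicate writes. A toy sketch under stated assumptions: `embed()` stands in for any sentence-embedding model, and the 0.95 cosine threshold is an arbitrary illustrative value.

```python
import numpy as np

# Gate memory writes on similarity to what is already stored, so
# "accumulated similar memories" don't clog recall. Toy sketch only.

def embed(text: str) -> np.ndarray:
    raise NotImplementedError("wire this to any sentence-embedding model")

class MemoryStore:
    def __init__(self, dedup_threshold: float = 0.95):
        self.texts: list = []
        self.vecs: list = []
        self.threshold = dedup_threshold

    @staticmethod
    def _cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def add(self, text: str) -> bool:
        v = embed(text)
        if any(self._cos(v, w) >= self.threshold for w in self.vecs):
            return False  # near-duplicate: skip instead of accumulating
        self.texts.append(text)
        self.vecs.append(v)
        return True

    def recall(self, query: str, k: int = 3) -> list:
        # Precision and recall of memory carry the whole illusion of
        # continuity, so retrieval quality matters more than volume.
        q = embed(query)
        ranked = sorted(zip(self.texts, self.vecs),
                        key=lambda tv: self._cos(q, tv[1]), reverse=True)
        return [t for t, _ in ranked[:k]]
```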
[00:24:43] Generating Synthetic Textbooks[00:24:43] Sam Schillace: I'll give a less abstract example. It's a little bit of an old one, from last year, but at one point I wanted to see if I could generate textbooks. So I wrote this thing called the textbook factory.[00:24:53] Sam Schillace: And it's tiny: it's a Jupyter notebook with about 200 lines of Python and six very short prompts. You basically give it a sentence, and it pulls the topic and the level out of that sentence. So you say "I would like fifth grade reading," "I would like eighth grade English," "ninth grade US history," whatever.[00:25:11] Sam Schillace: That step, by the way, all by itself would have been an almost impossible job three years ago, which is totally amazing on its own. Just parsing an arbitrary natural-language sentence to get those two pieces of information out is almost trivial now.[00:25:27] Sam Schillace: So it takes that, makes something like a thousand calls to the API, and goes and builds a full-year textbook. It decides what the curriculum is with one of the prompts. It breaks it into chapters. It writes all the lessons and lesson plans, and builds a teacher's guide with all the answers to all the questions.[00:25:42] Sam Schillace: It builds a table of contents, all that stuff. It's super reliable: you always get a textbook. And it's super brittle: you never get a cookbook or a novel. But you can define that domain pretty carefully, because I can describe the metacognition, the high-level plan, for how you write a textbook:[00:25:59] Sam Schillace: you decide the curriculum, then you write all the chapters, then you write the teacher's guide, then you write the table of contents. You can describe that out pretty well. And having that code exoskeleton wrapped around the model is really helpful: it keeps the model from drifting off, and you don't have as many of those memory vulnerabilities you would normally have.[00:26:19] Sam Schillace: So that's where I think the syntax and the semantics come together right now.
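The textbook factory is the cleanest illustration of the code exoskeleton, so here is a hedged guess at its shape: the plan lives in ordinary Python and the model fills in the leaves. The prompts and structure are reconstructed from the description above, not taken from Sam's actual notebook.

```python
# Hedged sketch of the "textbook factory" shape: the plan is ordinary
# Python; the model only fills in the leaves. Prompts and structure are
# guesses from the description, not Sam's actual code. `llm()` is the
# hypothetical completion helper from the earlier sketch.

def make_textbook(request: str) -> dict:
    # 1. Parse the free-form request into (subject, grade level);
    #    "almost trivial now", as Sam says.
    spec = llm(f"Extract the subject and grade level from: {request!r}. "
               "Reply exactly as: subject | grade")
    subject, grade = [s.strip() for s in spec.split("|", 1)]

    # 2. The plan itself is code: curriculum -> chapters -> lessons -> guide.
    curriculum = llm(f"Design a one-year {grade} curriculum for {subject} "
                     "as a plain list of chapter titles, one per line.")
    chapters = [t.strip() for t in curriculum.splitlines() if t.strip()]

    book = {"subject": subject, "grade": grade, "chapters": []}
    for title in chapters:
        lesson = llm(f"Write the lesson text for the chapter {title!r} "
                     f"of a {grade} {subject} textbook.")
        guide = llm("Write a teacher's guide, with answers to every "
                    f"question, for this lesson:\n{lesson}")
        book["chapters"].append(
            {"title": title, "lesson": lesson, "teachers_guide": guide})
    return book  # reliably a textbook, never a novel: the exoskeleton holds
```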
[00:26:24] Trade leverage for precision; use interaction to mitigate[00:26:24] Sam Schillace: And then I think the question for all of us is: how do you get more leverage out of that? One of the things I don't love about virtually everything anyone's built over the last year and a half is that people are holding the model's hand on everything.[00:26:37] Sam Schillace: The leverage is very low. You can't turn these things loose to do anything really interesting for very long. And the places where people are getting more work out per unit of work in are usually where somebody has done exactly what I just described: they've figured out the pattern of the problem well enough that they can write some code for it.[00:26:59] Sam Schillace: I've seen sales-support stuff, I've seen codebase-tuning stuff. There are lots of things people are doing where you can get a lot of value in some relatively well-defined domain, using a little bit of the model's ability to think for you and a little bit of code.[00:27:18] Code is for syntax and process; models are for semantics and intent.[00:27:18] Sam Schillace: And then I think the next wave is: okay, do we do things like domain-specific languages to make the planning capabilities better? Do we start to build more sophisticated primitives? We're starting to think and talk about Power Automate and a bunch of stuff inside Microsoft that we're going to wrap in these building blocks,[00:27:34] Sam Schillace: so the models have chunks of reliable functionality that they can invoke as part of these plans. Because if you're going to ask the model to go do something, and the output is going to be a hundred thousand lines of code, and it has to generate that code every time, the randomness, the stochasticity, is going to make that basically unreliable.[00:27:54] Sam Schillace: You want it to generate a 10- or 20-line high-level semantic plan that gets handed to some markup executor that runs it, and the executor invokes that API with the 100,000 lines of code behind it. That's a really nice, robust system for now. And then as the models get smarter and new models emerge, we get better plans and more sophistication[00:28:17] Sam Schillace: in terms of what they can choose, things like that. So that feels like the path forward for a little while, at least.
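The plan-plus-executor split is easy to sketch: the model emits a few lines of constrained markup, and deterministic code dispatches each step into large, pre-built functionality. The tool registry, step names, and JSON plan format below are assumptions for illustration.

```python
import json

# Sketch of the plan-and-executor split described above. The registry,
# step names, and plan format are illustrative assumptions; `llm()` is
# the hypothetical completion helper from the earlier sketch.

# "Chunks of reliable functionality": each name can hide arbitrarily
# large deterministic code, so the model never has to regenerate it.
TOOLS = {
    "fetch_report": lambda args: f"<report for {args['quarter']}>",
    "summarize":    lambda args: llm(f"Summarize:\n{args['text']}"),
    "send_email":   lambda args: f"<sent to {args['to']}>",
}

def run_task(task: str) -> list:
    # The model emits a short semantic plan, not 100,000 lines of code.
    plan = llm(
        "Produce a JSON list of steps for the task. Each step is "
        f'{{"tool": <one of {sorted(TOOLS)}>, "args": {{...}}}}. '
        "Reply with JSON only.\n"
        f"Task: {task}"
    )
    # Deterministic code, not the model, does the dispatching.
    return [TOOLS[step["tool"]](step["args"]) for step in json.loads(plan)]
```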
There was a lot there. Sorry, you can tell I've been thinking about this a lot; it's kind of all I think about: how do you build[00:28:31] Sam Schillace: really high-value stuff out of this, and where do we go?[00:28:35] swyx: Yeah. The intermixing of code and LLMs is a lot of the role of the AI engineer, and in a very real way you were one of the first to do it, because you obviously had early access. Honestly, I'm surprised.[00:28:46] Hands on AI Leadership[00:28:46] swyx: How are you so hands-on? How do you choose to dedicate your time? How do you advise other tech leaders? You have people working for you; you could not be hands-on, but you seem to be. What's the allocation people should have, especially senior tech leaders?[00:29:04] Sam Schillace: It's mostly just fun. I'm a maker, and I like to build stuff. I'm a little bit idiosyncratic: I've got ADHD, so I won't work on anything I'm bored with. I have no discipline; if I'm not actually interested in the thing, I can't force myself to do it.[00:29:17] Sam Schillace: But I mean, if you're not interested in what's going on right now in the industry, go find a different industry, honestly. It's funny, and I don't mean to be snarky, but I was at a dinner maybe six months ago, sitting next to the CTO of a very large Japanese technical company (I won't name the corporation because it would name the person), and he said: nothing has been interesting since the internet, and this is interesting now. This is fun again.[00:29:46] Sam Schillace: And I'm like, yeah, totally. This is the most interesting thing that's happened in the 35 years of my career. We can play with semantics and natural language, and we can have these things that are sort of active, that can be independent in certain ways, can do stuff for us, and can reach all of these interesting problems.[00:30:02] Sam Schillace: So part of it is just that it's fun to build stuff; I can't resist. I'm not crazy hands-on, though. My engineering team is listening right now, and they're probably laughing, because I don't really touch code directly, because I'm so obsessive.[00:30:17] Sam Schillace: I told them: if I start writing code, that's all I'm going to do, and it's probably better if I stay a little bit high-level. I've got a bunch of really great engineers and designers underneath me, good folks we bounce ideas off of back and forth, and it's really fun.[00:30:35] Sam Schillace: That's the role I came to Microsoft to do, really: to bring some energy around innovation, some energy around consumer. We didn't know this was coming when I joined; I joined about eight months before it hit us, though I think Kevin might have had an idea it was coming. And when it hit, I dove in with both feet, because it's just so much fun.[00:30:55] Sam Schillace: Just to tie it back a little bit to the Google Docs stuff: when we built Writely originally, it's not like I built it in jQuery or anything. I built that thing on bare metal, back before there were decent JavaScript VMs.[00:31:10] Sam Schillace: I was just telling somebody today: you were rate-limited, so even just computing the diff when you typed something, the string diff, I had to write a binary search on each end of the string diff, because you didn't have enough iterations of a for loop to search character by character.[00:31:24] Sam Schillace: That's how rough it was; none of the browsers implemented this stuff directly. It was just really messy.
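That trick translates directly: binary-search the prefix and suffix lengths, where each probe is a single native substring comparison rather than a character-by-character loop. A hedged reconstruction of the idea in Python (the original was JavaScript), not Writely's actual code.

```python
# Hedged reconstruction of the rationed-iterations diff trick. Each probe
# below is ONE native slice comparison, so the visible loop runs O(log n)
# times instead of once per character.

def common_prefix_len(a: str, b: str) -> int:
    lo, hi = 0, min(len(a), len(b))
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if a[:mid] == b[:mid]:   # one native comparison per probe
            lo = mid
        else:
            hi = mid - 1
    return lo

def common_suffix_len(a: str, b: str) -> int:
    lo, hi = 0, min(len(a), len(b))
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if a[len(a) - mid:] == b[len(b) - mid:]:
            lo = mid
        else:
            hi = mid - 1
    return lo

def string_diff(old: str, new: str) -> tuple:
    """Return (start, removed, inserted) for the minimal edited span."""
    p = common_prefix_len(old, new)
    # Don't let the suffix overlap the prefix we already matched.
    s = min(common_suffix_len(old, new), min(len(old), len(new)) - p)
    return p, old[p:len(old) - s], new[p:len(new) - s]

# string_diff("the cat sat", "the dog sat") -> (4, "cat", "dog")
```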
And as somebody who's been doing this for a long time, that's the place where you want to engage. If things are easy, and it's easy to go do something, it's too late.[00:31:42] Sam Schillace: Even if it's not too late, it's going to be crowded. The right time to do something new and disruptive and technical is, first of all, while it's still controversial, and second of all, when you can see the future: you ask the "what if" question and you can see where it's going, but you have this pit in your stomach as an engineer about how crappy it's going to be to do.[00:32:04] Sam Schillace: That's really the right moment to engage with stuff, when you're thinking: this is going to suck, it's going to be messy, I don't know what the path is, I'm going to get sticks and thorns in my hair, there will be false starts. This is why the Schillace Laws are kind of funny: I wrote them down at one point because they were my best guess, and I figured half of them were probably wrong. I think they've all held up pretty well, but I'm just guessing along with everybody else. We're all still trying to figure this thing out, and I think the only way to do that is to engage with it.[00:32:34] Sam Schillace: You just have to build stuff. I can't tell you the number of execs I've talked to who have opinions about AI and have not sat down with anything for more than 10 minutes to actually try to get something done. It's incomprehensible to me that you can watch this stuff through the lens of the press, and, forgive me, podcasts, and feel like you actually know what you're talking about.[00:32:59] Sam Schillace: You have to build stuff. Break your nose on stuff, and figure out what doesn't work.[00:33:04] swyx: Yeah, I mean, I view us as a starting point, a way for people to get exposure to what they should be looking at, and they still have to do the work, as do we. I'll basically endorse most of the laws.[00:33:18] Multimodality vs "Text is the universal wire protocol"[00:33:18] swyx: I think the one I question the most now is "text is the universal wire protocol." There was a very popular article, "Text Is the Universal Interface," by Roon, who now works at OpenAI. And actually, we just dropped a podcast with David Luan, who's CEO of Adept now, but was VP of Engineering at OpenAI and pitched Kevin Scott on the original Microsoft investment in OpenAI.[00:33:40] swyx: He's basically betting very hard on multimodality. I think that's something we don't position very well; this year we're all trying to figure it out. I don't know if you have an updated perspective on multimodal models and how that affects agents or not.[00:33:55] Sam Schillace: Yeah, I think multimodality is really important, and I think it's only going to get better from here. On "text is the universal wire protocol": you're probably right; I don't know that I would defend that one entirely. Note that it doesn't say English, though.[00:34:09] Sam Schillace: It's not even natural language. There's stuff like TypeChat, from Steve Lucco, one of the creators of TypeScript: a way to get LLMs to be very precise and return syntactically correct, well-typed JSON.
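TypeChat's core move, asking for machine-checkable output, validating it, and feeding failures back to the model, transfers to any language. A minimal sketch in Python follows; the retry loop and prompt wording are illustrative assumptions, not TypeChat's implementation. `llm()` is the hypothetical helper from earlier.

```python
import json

# The TypeChat idea in miniature: ask for machine-checkable output,
# validate it, and feed concrete failures back to the model. The retry
# loop and wording are assumptions, not TypeChat's actual implementation.

def typed_llm_call(prompt: str, required_keys: set, retries: int = 2) -> dict:
    ask = (f"{prompt}\nRespond with ONLY a JSON object containing exactly "
           f"these keys: {sorted(required_keys)}.")
    for _ in range(retries + 1):
        raw = llm(ask)
        try:
            obj = json.loads(raw)
            if set(obj) == set(required_keys):
                return obj
            error = f"wrong keys: {sorted(obj)}"
        except json.JSONDecodeError as exc:
            error = f"invalid JSON: {exc}"
        # Re-prompt with the concrete validation failure.
        ask += f"\nYour previous reply failed validation ({error}). Try again."
    raise ValueError("model never produced valid typed output")

# e.g. interacting with an image textually, as in the masking example below:
# typed_llm_call("Propose a mask box around the dog described here: ...",
#                {"x", "y", "width", "height"})
```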
So I think part of the challenge with multimodality is that it's still a little harder to access programmatically.[00:34:30] Sam Schillace: When DALL-E and things like it started to come out, I thought, oh, Photoshop's in trouble, because I'm just going to describe images and you won't need Photoshop anymore. It hasn't played out that way; they're actually adding a bunch of tools. For multimodality to be really supercharged, you need to be able to do things descriptively:[00:34:58] Sam Schillace: okay, find the dog in this picture and mask around it; okay, now make it larger; and so on. You need to be able to interact with this stuff textually, and we're starting to be able to do some of that. But there's probably a whole bunch of new capabilities coming that will make it more interesting.[00:35:11] Sam Schillace: So I suspect we're going to wind up looking kind of like Unix at the end of the day: there are pipes, stuff goes over the pipes, some of the pipes are character pipes and some of them are binary pipes, and that's going to be compatible with a lot of the systems we have out there. I think there's still a lot to be gotten from text as a language, but I suspect you're right:[00:35:37] Sam Schillace: that particular law is not going to hold up super well. We didn't have multimodality going when I wrote it. I'll take that one out as well.[00:35:46] Azure OpenAI vs Microsoft Research vs Microsoft AI Division[00:35:46] swyx: I know. Yeah, I mean, the innovations keep coming out of Microsoft. You mentioned multi-agent; I think you're talking about AutoGen. But there's always research coming out of MSR: Phi-1, Phi-2.[00:35:57] Sam Schillace: Yeah, there's a bunch of stuff.[00:35:59] swyx: As a final word: how should the outsider, or the AI engineer, view the Microsoft portfolio? I know you're not here to be a salesman, but how do you explain Microsoft's AI work to people?[00:36:13] Sam Schillace: There's a lot of stuff going on. First of all, I'll be a tiny bit of a salesman for two seconds and point out that one of the things we have is the Microsoft for Startups Founders Hub. You can get Azure credits and things from us, up to something like 150 grand, I think, over four years.[00:36:29] Sam Schillace: It's actually pretty easy to get credits to start, something like 500 bucks to begin with, with very little other than an idea. So that's pretty cool. Microsoft is very much all-in on AI, at many levels. You mentioned AutoGen: I sit in the office of the CTO, and Microsoft Research sits under him, under the office of the CTO as well.[00:36:51] Sam Schillace: The AutoGen group came out of somebody in MSR, in that group. So there's a spectrum: very researchy things going on in MSR, where we're doing things like Phi, the small-language-model efficiency exploration, which is really, really interesting, with lots of very technical folks building different kinds of models.[00:37:10] Sam Schillace: Then there are groups like mine that are a little bit in the middle, that straddle product and research, have a foot in both worlds, and try to be a bridge into the product world. And then there's a whole bunch of stuff on the product side:[00:37:23] Sam Schillace: all the Azure OpenAI stuff, and then everything that's in Office and Windows. So I think the way to think about Microsoft is that we're powering AI at every level we can, and making it as accessible as we can to both end users and developers.[00:37:42] Sam Schillace: There's this really nice research arm at one end of that spectrum that's really driving the cutting edge. The Phi stuff is really amazing; it broke the Chinchilla scaling curves.
That's the "Textbooks Are All You Need" paper, and it's still somewhat controversial, but it was a really surprising result that came out of MSR.[00:37:58] Sam Schillace: So I think Microsoft is being a thought leader on one end, and on the other end, with all the Azure OpenAI tooling we have, it's very much developer-centric, the tinkerer's paradise Microsoft always was. It's a great place to come and consume all these things.[00:38:14] Sam Schillace: There are really amazing ideas we've had, like these very rich, long-running, RAG-based chatbots we didn't talk about, which are now possible to just go build with Azure AI Studio for yourself. You can build and deploy a chatbot that's trained on your data specifically, very easily, things like that.[00:38:31] Sam Schillace: So there's that end of things, and then there's everything in Office, where you can use the copilots both in Bing and in your daily work. It's just kind of everywhere at this point; everyone in the company thinks about it all the time.[00:38:43] Sam Schillace: There's no single answer to that question. That was way more salesy than I thought I was capable of, but it is actually the genuine truth: it is all the time, it is all levels, all the way from really pragmatic, approachable stuff for somebody starting out who doesn't know things, to absolutely cutting-edge research, silicon, models, AI for science. We didn't talk about any of the AI-for-science stuff; I've seen magical stuff coming out of the research group on that topic, just crazy cool stuff that's coming.[00:39:14] swyx: You've called this since you joined Microsoft. I'll point listeners to the podcast you did in 2022, pre-ChatGPT, with Kevin Scott. You've been saying this from the beginning, so this is not a new talk track for you; you've been a genuine believer for a long time.[00:39:28] Sam Schillace: And just to be clear, I haven't been at Microsoft that long. I've only been here a little over two years, and it's a little bit weird for me, because for a lot of my career they were the competitor and the enemy. It's kind of funny to be here, but it's really remarkable what's going on.[00:39:40] On Satya[00:39:40] Sam Schillace: I really, really like Satya. I've met and worked with a bunch of big-tech CEOs, and I think he's a genuinely awesome person; he's fun to work with and has a really great vision. And I obviously really like Kevin; we've been friends for a long time. So it's a cool place, and I think there's a lot of interesting stuff.[00:39:57] swyx: We have some awareness that Satya is a listener, so obviously he's super welcome on the pod anytime. You can just drop in a good word for us.[00:40:05] Sam Schillace: He's fun to talk to. It's interesting, because CEOs can be lots of different personalities, but, to your earlier question about how hands-on and engaged I am:[00:40:14] Sam Schillace: I'm amazed at how hands-on and engaged he can be given the scale of his job. He's super engaged with stuff, super in the details, and understands a lot of what's going on.
He understands the science side of things as well as the product and the business side; it's really remarkable. And I don't say that because he's listening, or because I'm trying to pump the company. I'm genuinely really impressed. I look at him and think: I love this stuff, and I spend all my time thinking about it, and I could not do what he's doing. It's just incredible how much he can get into his head.[00:40:44] Sam at AI Leadership Track[00:40:44] Ben Dunphy: Sam, it's been an absolute pleasure to hear from you here, to hear the war stories. So thank you so much for coming on. A quick question, though: you're here on the podcast as the presenting sponsor for the AI Engineer World's Fair. Will you be taking the stage there, or are we going to defer that to Satya?[00:41:02] Sam Schillace: I'm happy to talk to folks, and I'm happy to be there. It's always fun. I like talking to people more than talking at people, so I don't love giving keynotes, but I love doing Q&As and engaging with engineers. I really am, at heart, just a builder and an engineer; that's what I'm happiest doing: being creative, building things, and figuring stuff out.[00:41:22] Sam Schillace: That would be really fun to do, and I'll probably go just to hang out with people and hear what they're working on and worrying about.[00:41:28] swyx: The AI Leadership track is just AI leaders, and it's closed doors, so it's more of an unconference style where people just talk about their issues.[00:41:35] Sam Schillace: Yeah, that would be much more fun, because we really are all wrestling with this, trying to figure out what it means. The reason the Schillace Laws give me the willies a little bit is that I was joking we should just call them the Schillace Best Guesses, because I don't want people to think they're some iron law.[00:41:52] Sam Schillace: We're all trying to figure this stuff out. Some of it's right, some of it's not right. It's going to be messy, we'll have false starts, but we're all working it out. So that's the fun conversation.[00:42:02] Sam Schillace: All right, thanks for having me.[00:42:02] Ben Dunphy: Yeah, thanks so much for coming on.[00:42:05] Final Plug for Tickets & CFP[00:42:05] Ben Dunphy: For those of you listening who are interested in attending the AI Engineer World's Fair, you can purchase your tickets today. Learn more about the event at ai.engineer. There are even group discounts: if you purchase four or more tickets, use the code GROUP, and one of those four tickets will be free. If you want to speak at the event, the CFP closes April 8th, so check out the link at ai.engineer and send us your proposals for talks, workshops, or discussion groups.[00:42:33] Ben Dunphy: So if you want to come to THE event of the year for AI engineers, the technical event of the year for AI engineers, it's June 25, 26, and 27 in San Francisco. That's it! Get full access to Latent Space at www.latent.space/subscribe
Guest host Dr. Sarah Bastawrous summarizes the article titled "Multimodality Imaging in Metabolic Syndrome: State-of-the-Art Review" from the March 2024 RadioGraphics issue. Multimodality Imaging in Metabolic Syndrome: State-of-the-Art Review. Kalisz et al. RadioGraphics 2024; 44(3):e230083.
In this week's episode, Katherine Forrest and Anna Gressel take a deeper dive into large language models (LLMs) and multimodal large language models (MLLMs). What do you need to know about LLMs and MLLMs, and should you buy one or build your own? ## Learn More About Paul, Weiss's Artificial Intelligence Practice: https://www.paulweiss.com/practices/litigation/artificial-intelligence
The emergence of ChatGPT has sent shockwaves through many secondary and post-secondary English departments. There's no shortage of doomsaying and prognosticating about the future of writing instruction, even the discipline itself, in the wake of the large language model revolution. Luckily for us, my guest today is Dr. J Palmeri, Professor of English and Director of the Writing Program at Georgetown University. J's work exploring the past, present, and future of multimodal composition is some of the richest, most comprehensive scholarship I've seen. Better still, J practices what they preach in the classroom. Over the course of our dialogue, J details the ways they use new media pedagogy to learn with students, embrace play, compose for real audiences, hack technology, center learning, and ultimately to rethink teaching and learning. There is no shortage of philosophical questions and practical suggestions, but my favorite part of this episode is the way J situates their work on multimodality within a broader story, one that will likely resonate with many of you. This episode is a powerful reminder of why technology is only a tool. Whether that technology is tactile, digital, or artificial intelligence, there is no replacing the deeply human parts of teaching, learning, and communicating alongside others.
Faculty Page
100 Years of New Media Pedagogy (Open Source Book)
Academic Research (Google Scholar)
Support the show
The Role of Multimodality Imaging for Pericarditis Guest: Prajwal (Praj) Reddy, MD Host: Malcolm R. Bell, MD Imaging can play a crucial role in the diagnosis and management of pericarditis, particularly when it is recurrent. Echocardiography, cardiac computed tomography (CCT), and cardiac magnetic resonance imaging (CMR) offer complementary evaluation of pericardial disease in its various presentations. In this podcast, we review the characteristic signs and features on multimodality imaging in patients with acute and recurrent pericardial inflammation and its utility in tailoring therapy. Topics Discussed: What is the role of imaging in the initial diagnosis of pericarditis? How is multimodality cardiac imaging utilized in recurrent pericarditis? Follow-up question: Is serial imaging with cardiac MRI helpful in the treatment of recurrent pericarditis? Connect with Mayo Clinic's Cardiovascular Continuing Medical Education online at https://cveducation.mayo.edu or on Twitter @MayoClinicCV and @MayoCVservices. LinkedIn: Mayo Clinic Cardiovascular Services Cardiovascular Education App: The Mayo Clinic Cardiovascular CME App is an innovative educational platform that features cardiology-focused continuing medical education wherever and whenever you need it. Use this app to access other free content and browse upcoming courses. Download it for free in the Apple or Google stores today! No CME credit offered for this episode. Podcast episode transcript found here.
In this new podcast episode, we dive deep into the realms of somatic psychotherapy, interdisciplinary practices, and the art of embodying multi-modality in therapy with our distinguished guest, Dr. Brian Tierney. Known as the Somatic Doctor, Brian brings a unique blend of expertise in somatic psychotherapy, dance, and bodywork, shining a light on the intricate connections between the body, mind, and spirit. Throughout our conversation, Brian shares his journey from a business student with a budding interest in somatic practices to becoming a leader in the field of integrative therapy. His eclectic approach, combining talk therapy, dance, and structural integration bodywork, offers a fresh perspective on therapeutic practices. Brian's work emphasizes the importance of navigating multiple therapeutic angles, including trauma resolution, dance and movement, and the integration of various bodywork modalities, to support human development and well-being. A highlight of our discussion revolves around Brian's deep commitment to interdisciplinary practice and research. He challenges the trend towards excessive specialization in therapy, advocating for a broader, more holistic approach to understanding and treating the human psyche. Brian's insights into the role of dance, movement, and body awareness in therapy provide valuable lessons on the power of embodying multi-modality to foster healing and personal growth. Listeners are invited to explore the full episode to delve into Brian's profound insights and experiences. For more information on Dr. Brian Tierney's work, visit his website at somaticdoctor.com and follow him on Instagram. Engage with his teachings and join us in exploring the vast potential of integrative and somatic therapies to enrich our lives and therapeutic practices.
Key Highlights:
00:00 Introduction to Dr. Brian Tierney
02:12 Interdisciplinary practice and research
02:54 Origin story and path to somatic psychotherapy
04:26 The transformative power of bodywork and dance
06:23 The journey through graduate studies and beyond
08:30 The significance of men's work in therapy
11:59 An integrative approach to psychology
17:01 The tension between opposites in therapy
21:51 The future of multimodal therapeutic practices
25:39 Self-regulation and the role of dynamic experiences
29:04 Embracing playfulness in therapy
33:01 The challenge of othering in therapeutic communities
37:06 Brian's clinical practice and teaching philosophy
41:27 The importance of facial work in therapy
46:18 Polyvagal theory and its place in modern therapy
50:43 The evolving landscape of somatic psychotherapy
For those interested in the intersection of somatic practices, dance, and psychotherapy, this episode is a must-listen. Dive into the full conversation to uncover the depths of Dr. Brian Tierney's work and the transformative potential of integrating multiple modalities into therapeutic practice.
Links and Resources Mentioned:
Website: https://somaticdoctor.com/
Facebook: https://www.facebook.com/SomaticDoctor/
Instagram: https://www.instagram.com/thesomaticdoctor
Luke's ENGLISH Podcast - Learn British English with Luke Thompson
This episode is all about the different modes of communication that we use beyond the four linguistic skills of reading, writing, listening, and speaking. My guest is Nik Peachey, who has helped to write a new paper published by OUP called "Multimodality in ELT: Communication Skills for Today's Generation". Listen to Nik and me chatting about the importance of multimodal literacy in our social interactions and in the ways we consume and produce media online.
Note for Latent Space Community members: we have now soft-launched meetups in Singapore, as well as two new virtual paper club/meetups for AI in Action and LLM Paper Club. We're also running Latent Space: Final Frontiers, our second annual demo day hackathon, following last year's. For the first time, we are doing an audio version of the monthly AI Engineering recap that we publish on Latent Space! This month it's "The Four Wars of the AI Stack"; you can find the full recap with all the show notes here: https://latent.space/p/dec-2023* [00:00:00] Intro* [00:01:42] The Four Wars of the AI stack: Data quality, GPU rich vs poor, Multimodality, and Rag/Ops war* [00:03:17] Selection process for the four wars and notable mentions* [00:06:58] The end of low background tokens and the impact on data engineering* [00:08:36] The Quality Data Wars (UGC, licensing, synthetic data, and more)* [00:14:51] Synthetic Data* [00:17:49] The GPU Rich/Poors War* [00:18:21] Anyscale benchmark drama* [00:22:00] The math behind Mixtral inference costs* [00:28:48] Transformer alternatives and why they matter* [00:34:40] The Multimodality Wars* [00:38:10] Multiverse vs Metaverse* [00:45:00] The RAG/Ops Wars* [00:50:00] Will frameworks expand up, or will cloud providers expand down?* [00:54:32] Syntax to Semantics* [00:56:41] Outer Loop vs Inner Loop* [00:59:54] Highlight of the month Get full access to Latent Space at www.latent.space/subscribe
Happy 2024! We appreciated all the feedback on the listener survey (still open, link here)! Surprising to see that some people's favorite episodes were others' least, but we'll always work on improving our audio quality and booking great guests. Help us out by leaving reviews on Twitter, YouTube, and Apple Podcasts!
We're sharing a few of Nathan's favorite AI scouting episodes from other shows. Today, Shane Legg, cofounder of DeepMind and its current Chief AGI Scientist, shares his insights with Dwarkesh Patel on AGI's timeline, the new architectures needed for AGI, and why multimodality will be the next big landmark. If you need an ecommerce platform, check out our sponsor Shopify: https://shopify.com/cognitive for a $1/month trial period. You can subscribe to The Dwarkesh Podcast here: https://www.youtube.com/@DwarkeshPatel We're hiring across the board at Turpentine and for Erik's personal team on other projects he's incubating. He's hiring a Chief of Staff, EA, Head of Special Projects, Investment Associate, and more. For a list of JDs, check out: eriktorenberg.com. --- SPONSORS: Shopify is the global commerce platform that helps you sell at every stage of your business. Shopify powers 10% of ALL eCommerce in the US. And Shopify's the global force behind Allbirds, Rothy's, and Brooklinen, and 1,000,000s of other entrepreneurs across 175 countries. From their all-in-one e-commerce platform to their in-person POS system, wherever and whatever you're selling, Shopify's got you covered. With free Shopify Magic, sell more with less effort by whipping up captivating content that converts, from blog posts to product descriptions, using AI. Sign up for a $1/month trial period: https://shopify.com/cognitive Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off www.omneky.com NetSuite has 25 years of providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist. X/SOCIAL: @labenz (Nathan) @dwarkesh_sp (Dwarkesh) @shanelegg (Shane) @CogRev_Podcast (Cognitive Revolution) TIMESTAMPS: (00:00:00) - Episode Preview with Nathan's Intro (00:02:45) - Conversation with Dwarkesh and Shane begins (00:14:26) - Do we need new architectures? (00:17:31) - Sponsors: Shopify (00:19:40) - Is search needed for creativity? (00:31:46) - Impact of DeepMind on safety vs capabilities (00:32:48) - Sponsors: Netsuite | Omneky (00:37:10) - Timelines (00:45:18) - Multimodality
CardioNerds cofounder Dr. Amit Goyal and cardiology fellows from the Cleveland Clinic (Drs. Alejandro Duran Crane, Gary Parizher, and Simrat Kaur) discuss the following case: A 61-year-old man presented with symptoms of heart failure and left ventricular hypertrophy. He was given a diagnosis of obstructive hypertrophic cardiomyopathy. He eventually underwent septal myectomy, mitral valve replacement, aortic aneurysm repair, and aortic valve replacement, with findings of Fabry disease on surgical pathology. The case discussion focuses on the differential diagnosis for LVH and covers Fabry disease as an HCM mimic. Expert commentary was provided by Dr. Angelika Erwin. The episode audio was edited by student Dr. Diane Masket. US Cardiology Review is now the official journal of CardioNerds! Submit your manuscript here. CardioNerds Case Reports PageCardioNerds Episode PageCardioNerds AcademyCardionerds Healy Honor Roll CardioNerds Journal ClubSubscribe to The Heartbeat Newsletter!Check out CardioNerds SWAG!Become a CardioNerds Patron! Case Media - An Unusual Cause of Hypertrophic Cardiomyopathy – Cleveland Clinic Pearls - An Unusual Cause of Hypertrophic Cardiomyopathy – Cleveland Clinic Left ventricular hypertrophy is a cardiac manifestation of several different systemic and cardiac processes, and its etiology should be clarified to avoid missed diagnosis and treatment opportunities. Fabry disease is a rare, X-linked inherited disease that can present with cardiac and extra-cardiac manifestations; the former include hypertrophic cardiomyopathy, coronary artery disease, conduction abnormalities, arrhythmias, and heart failure. The diagnosis of Fabry disease includes measurement of alpha-galactosidase enzyme activity as well as genetic testing to evaluate for pathogenic variants or variants of unknown significance in the GLA gene. Family members of patients diagnosed with Fabry disease should be screened based on the inheritance pattern. Multimodality imaging can be helpful in the diagnosis of Fabry disease. Echocardiography can show left ventricular hypertrophy (LVH), reduced global strain, aortic and mitral valve thickening, and aortic root dilation with associated mild to moderate aortic regurgitation. Cardiac MRI can show hypertrophy of the papillary muscles, mid-wall late gadolinium enhancement, and low native T1 signal. The treatment of Fabry disease involves a multidisciplinary approach with geneticists, nephrologists, cardiologists, and primary care doctors. Enzyme replacement therapy can delay the progression of cardiac disease. Show Notes - An Unusual Cause of Hypertrophic Cardiomyopathy – Cleveland Clinic What are the causes of left ventricular hypertrophy? LVH is extremely common. It is present in 15-20% of the general population, and is more common in Black individuals, the elderly, and obese or hypertensive individuals, with most cases being secondary to hypertension and aortic valve stenosis. In general terms, it is helpful to divide the causes of LVH into three main groups: high afterload states, obstruction to LV ejection, and intrinsic myocardial problems. Increased afterload states include both primary and secondary hypertension and renal artery stenosis. Mechanical obstruction includes aortic stenosis, subaortic stenosis, and coarctation of the aorta.
Lastly, several intrinsic problems of the myocardium can cause LV hypertrophy, such as athletic heart with physiological LVH, hypertrophic cardiomyopathy with or without outflow obstruction, and infiltrative or storage diseases such as cardiac amyloidosis, Fabry disease, or Danon disease, among others. How does Fabry disease present? Fabry disease is present in all races and is an X-linked lysosomal storage disorder caused by pathogenic variants in the GLA gene that result in reduced alpha-galactosidase enzyme activity, leading to accumulation of globotriaosylceramide in tissues, including the myocardium.
What new possibilities do you see emerging with voice technology? How might it influence our interactions with businesses and services in the future? What if your voice could transform the way we interact with technology? Imagine a world where your voice can effortlessly interact with devices and transform the way we navigate our surroundings. In this episode of This Anthro Life, we explore the world of future technology with guest Tobias Dengel, a leading expert in digital transformation, and discuss the power of voice technology and its potential to transform how we interact with devices and the world around us. Dengel sheds light on the reasons why integrating voice technology into existing platforms is perceived as a safer approach compared to building entirely new platforms from scratch. He emphasizes the importance of leveraging the familiarity and trust already established with these platforms, enabling a smoother transition for users. Additionally, Dengel delves into the widespread adoption of voice assistants such as Alexa and Siri, highlighting their increasing presence in our daily lives. Furthermore, the discussion extends to the role of voice technology in banking applications, where it plays a crucial role in enhancing security measures and making our lives safer. The exploration of voice technology in this episode showcases its transformative potential and the various ways it is revolutionizing our interactions with devices and services. Tune in to discover Dengel's captivating insights and expertise as we envision the transformative power of voice technology.
Key takeaways:
Voice technology is evolving and becoming increasingly sophisticated, with the adoption of voice assistants like Alexa and Siri skyrocketing.
Adding voice to existing platforms feels safer than creating new ones altogether, as users are already familiar with the platform and trust it.
Voice technology solves the problem of faster communication, as humans speak three times faster than they type.
The interface of voice technology needs to be redesigned to be more efficient, as listening to machines is slower than reading or interacting with visuals.
The more human-like voice assistants become, the less users trust them, as they feel like they are being tricked.
Multimodality is important in voice technology, as it allows for a combination of voice, visuals, and other forms of communication to enhance the user experience.
Voice technology has applications in various industries, such as law enforcement, warehouses, retail, and safety in industrial settings.
The combination of generative AI and conversational AI is where the magic happens in voice technology, allowing for more accurate interpretation and response.
Conversational designers will play a crucial role in designing effective voice experiences, considering factors like speed, efficiency, and user preferences.
Voice technology has the potential to reshape business processes and models, such as centralized restaurants, telemedicine, and global healthcare access.
Timestamps:
00:00:07 Voice technology is evolving.
00:05:14 Design voice experiences in multimodal.
00:09:37 Voice is a powerful interface.
00:18:38 Conversational AI and generative AI.
00:21:08 Context is crucial for conversation.
00:28:31 The blend of generative AI and conversational AI is creating a user experience breakthrough.
00:29:15 Voice experiences are becoming multimodal.
00:34:06 Voice technology revolutionizes business processes.
00:39:03 The future of technology is voice-based.
00:43:30 Spread anthropological thinking to audiences.
Tobias Dengel is a seasoned technology executive with over 20 years of experience in mobility, digital media, and interactive marketing. He currently holds the position of President at WillowTree, a TELUS International Company, a global leader in digital product design and development. Dengel's expertise and leadership have contributed to WillowTree's continuous growth and recognition as one of America's fastest-growing companies, as listed by Inc. magazine for 11 consecutive years. He is also the author of the book "The Sound of the Future: The Coming Age of AI-Enabled Voice Technology," where he explores the transformative potential of voice technology in various aspects of business and society.
About This Anthro Life: This Anthro Life is a thought-provoking podcast that explores the human side of technology, culture, and business. Hosted by Adam Gamwell, we unravel fascinating narratives and connect them to the wider context of our lives. Tune in to https://thisanthrolife.org and subscribe to our Substack at https://thisanthrolife.substack.com for more captivating episodes and engaging content.
Connect with Tobias Dengel:
LinkedIn: https://www.linkedin.com/in/tobiasdengel/
Twitter: https://x.com/TobiasDengel?s=20
Website: https://www.tobiasdengel.com/
Facebook: https://www.facebook.com/tobias.denge.7/
Connect with This Anthro Life:
Instagram: https://www.instagram.com/thisanthrolife/
Facebook: https://www.facebook.com/thisanthrolife
LinkedIn: https://www.linkedin.com/company/this-anthro-life-podcast/
This Anthro Life website: https://www.thisanthrolife.org/
Substack blog: https://thisanthrolife.substack.com
This show is part of the Spreaker Prime Network. If you are interested in advertising on this podcast, contact us at https://www.spreaker.com/show/5168968/advertisement
In this episode, CardioNerds co-founder Amit Goyal joins Dr. Iva Minga, Dr. Kevin Lee, and Dr. Juan Pablo Salazar Adum from the University of Chicago - Northshore in Evanston, IL to discuss a case of primary cardiac diffuse large B-cell lymphoma. The ECPR for this episode is provided by Dr. Amit Pursnani (Advanced Cardiac Imaging, Fellowship program director, NorthShore University HealthSystem). Audio editing by CardioNerds Academy Intern, Dr. Akiva Rosenzveig. Case synopsis: A 77-year-old man with no significant medical history presents to the emergency department with progressive shortness of breath for 1 week. He reports an unintentional 15-pound weight loss in the prior month, as well as constipation and abdominal/flank pain. On examination he was found to be tachycardic with a regular rhythm, and further evaluation with a chest X-ray and chest CT scan demonstrated a large pericardial effusion. This was further investigated with an urgent echocardiogram that revealed a large pericardial effusion with a large mass attached to the pericardial side of the RV free wall, as well as signs of early cardiac tamponade. A pericardiocentesis was performed and 550 mL of bloody fluid was withdrawn. The fluid was sent for laboratory analysis and cytology. A cardiac MRI demonstrated a large invasive mass in the pericardium and RV wall consistent with cardiac lymphoma. Cytology confirmed diffuse large B-cell lymphoma. Subsequent CT and PET scans did not find any other site of malignancy, giving the patient a diagnosis of primary cardiac diffuse large B-cell lymphoma. The patient underwent R-CHOP chemotherapy and was followed closely with repeat cardiac MRI and PET scans, which demonstrated resolution of the cardiac mass at his one-year surveillance follow-up. This case was published in US Cardiology Review, the official journal of CardioNerds. To learn more, access the case report article here. CardioNerds is collaborating with Radcliffe Cardiology and US Cardiology Review journal (USC) for a 'call for cases', with the intention to co-publish high-impact cardiovascular case reports, subject to double-blind peer review. Case Reports that are accepted in the USC journal and published as the version of record (VOR) will also be indexed in Scopus and the Directory of Open Access Journals (DOAJ). CardioNerds Case Reports PageCardioNerds Episode PageCardioNerds AcademyCardionerds Healy Honor Roll CardioNerds Journal ClubSubscribe to The Heartbeat Newsletter!Check out CardioNerds SWAG!Become a CardioNerds Patron! Pearls - A Mystery Mass in the Heart - Cardiac Lymphoma The most common cause of malignant cardiac masses is metastasis. Primary cardiac tumors are rare. Cardiac tumors are separated into 2 categories: benign and malignant. They are often differentiated based on their location and their degree of tissue invasion. Multimodality imaging is essential in the diagnosis, management, and surveillance of cardiac masses. A multidisciplinary team approach is invaluable for management of patients with cardiac tumors. Show Notes - A Mystery Mass in the Heart - Cardiac Lymphoma 1. What is the clinical presentation of cardiac masses? Cardiac masses can have a variable presentation. They can present with arrhythmias, angina, heart failure symptoms, or pericardial effusion. Patients can also be asymptomatic; the masses can be found incidentally on cardiac or chest imaging. 2. What is the differential diagnosis for cardiac masses? Cardiac masses are separated into benign and malignant.
The most common malignant cardiac masses are metastases from a distant source. The location of the mass is important in narrowing the differential. 3. What imaging modalities are used to diagnose cardiac masses? Multimodality imaging is needed to describe the mass in detail and guide diagnosis. An echocardiogram is usually the first imaging modality. Cardiac MRI is a great modality that allows for the...