POPULARITY
Picture this. You've got a web app built with Rust and Solid.js. It started life running on a dusty on-prem server, but now it's time to move it to the cloud. The clock is ticking. You could take the well-worn AWS path: set up a VPC, configure subnets, attach an ALB, define IAM roles, and deploy with Fargate. Or you could try something different. In this episode of AWS Bites, we share the real story of migrating a monolithic containerized app to AWS App Runner. It promises to take your code, build it, deploy it, and scale it with minimal effort. But does it really deliver? We compare App Runner with Fargate based on hands-on experience. You'll learn where App Runner shines, where it gets in the way, and how we handled everything from custom domains to background job processing. You'll also hear when we would still choose Fargate, and why. If you've ever hoped for a Heroku-like experience on AWS, or you want to simplify your container deployments without giving up too much control, this episode is for you.AWS Bites is brought to you in association with fourTheorem. At fourTheorem, we believe serverless should be simple, scalable, and cost-effective — and we help teams do just that. Whether you're diving into containers, stepping into event-driven architecture, or scaling a global SaaS platform on AWS, our team has your back. Visit https://fourTheorem.com to see how we can help you build faster, better, and with more confidence using AWS cloud!
"I think what we're doing is trailblazing." We don't often hear 'trailblazing' associated with associations, but that's exactly how Thad Lurie describes the most recent initiative of the American Geophysical Union (AGU) in episode 3.Effectively reaching the youngest career professionals is an ongoing challenge for associations in the quest to remain relevant. In less than 13-minutes, you'll learn how AGU is using personalization to provide members with the content and connections they are looking for. And, how they are adapting the new technology to meet immediate and evolving member needs.The Association RevUP Podcast is presented by the Professionals for Association Revenue (PAR) and hosted by Carolyn Shomali.Learn more about PAR and its annual conference, The RevUP Summit. Special Thanks to podcast partner, the event production company VPC, Inc.Helpful Links:Learn more about AGUAGU partner: Tasio Labs
Send us a textThe tech world gives and takes away as Google introduces CloudWAN while MITRE nearly loses CVE funding, showcasing both innovation and vulnerability in our digital infrastructure landscape. Politics increasingly intersects with technology as we examine controversial security clearance revocations alongside much-needed technical improvements in cloud networking.• Google Cloud Next introduces CloudWAN service with two use cases: high-performance data center connectivity and premium branch networking• Google's approach differs from AWS, encouraging single global VPC deployments across regions• MITRE loses funding for the CVE program, threatening the global vulnerability tracking system• CISA provides 11-month bridge funding, but fragmentation begins as EU launches alternative vulnerability tracking• Azure announces general availability of route maps for Virtual WAN, bringing traditional networking capabilities to cloud• Former CISA director Chris Krebs targeted in federal investigation for debunking 2020 election fraud claims• Security clearance revocations increasingly used as political weapons against technology professionalsSubscribe to Cables to Clouds Fortnightly News and tell a friend about the show to stay informed about the evolving cloud technology landscape.Purchase Chris and Tim's new book on AWS Cloud Networking: https://www.amazon.com/Certified-Advanced-Networking-Certification-certification/dp/1835080839/ Check out the Fortnightly Cloud Networking Newshttps://docs.google.com/document/d/1fkBWCGwXDUX9OfZ9_MvSVup8tJJzJeqrauaE6VPT2b0/Visit our website and subscribe: https://www.cables2clouds.com/Follow us on BlueSky: https://bsky.app/profile/cables2clouds.comFollow us on YouTube: https://www.youtube.com/@cables2clouds/Follow us on TikTok: https://www.tiktok.com/@cables2cloudsMerch Store: https://store.cables2clouds.com/Join the Discord Study group: https://artofneteng.com/iaatj
In this news episode, Phil Gervasi and Justin Ryburn cover some of the latest tech headlines, such as Starlink entering the service provider space with Community Gateways, OpenAI raising $40 billion for AGI development, Comcast acquiring Nitel to expand its channel distribution strategy, and AWS announcing support for VPC route server running BGP. Tune in for the latest news and a quick roundup of upcoming network industry events.
Send us a textWhat exactly is cloud networking? This seemingly simple question quickly descends into a fascinating philosophical debate as we welcome back Nico Vibert, Senior Staff Technical Marketing Engineer at Isovalent/Cisco, to tackle this identity crisis head-on.The conversation begins with a startling observation from Nico about analyst reports that group wildly different vendors together under the "cloud networking" umbrella. From there, we explore how defining cloud networking has become increasingly complex as technologies evolve and converge. We trace the origins back to AWS's introduction of VPC in 2009 and discuss how different cloud providers approach networking based on their unique company cultures.One clear consensus emerges: true cloud networking must be API-driven. Whether consumed directly via APIs or through infrastructure-as-code tools like Terraform, programmability stands as a non-negotiable requirement. But beyond this foundation, the boundaries blur significantly when examining various technologies that might qualify.Does Kubernetes networking fall under the cloud networking umbrella? What about Middle Mile providers like Equinix or Megaport that physically connect clouds? Are CDNs part of cloud networking, or something entirely different? We dissect these questions without settling on definitive answers, highlighting how technology's rapid evolution makes categorization increasingly difficult.Looking ahead, we explore how AI is reshaping cloud networking in two critical ways: networks optimized for AI workloads and AI-enhanced network management. Cloud providers are investing billions in infrastructure upgrades, developing custom silicon to reduce dependency on GPU manufacturers, signaling massive transformation on the horizon.Whether you're a network engineer, cloud architect, or technology leader trying to understand this evolving landscape, this episode provides valuable perspective on cloud networking's past, present, and future directions.Connect with Nico: https://www.linkedin.com/in/nicolasvibert/Purchase Chris and Tim's new book on AWS Cloud Networking: https://www.amazon.com/Certified-Advanced-Networking-Certification-certification/dp/1835080839/ Check out the Fortnightly Cloud Networking Newshttps://docs.google.com/document/d/1fkBWCGwXDUX9OfZ9_MvSVup8tJJzJeqrauaE6VPT2b0/Visit our website and subscribe: https://www.cables2clouds.com/Follow us on BlueSky: https://bsky.app/profile/cables2clouds.comFollow us on YouTube: https://www.youtube.com/@cables2clouds/Follow us on TikTok: https://www.tiktok.com/@cables2cloudsMerch Store: https://store.cables2clouds.com/Join the Discord Study group: https://artofneteng.com/iaatj
"We need to take these things that are ideas and figure out if they even have legs before we spend a bunch of time and money and invest in them."What would it look like if your association acted like a startup? That's the question explored in episode 2 with Chrissy Bagby of the American Association of Veterinary State Boards (AAVSB) and Elizabeth Engel of Spark Consulting. Learn how AAVSB is implementing a culture of innovation where employees at all levels of the organization have learned how to identify and develop non-dues revenue ideas with the help of a proven framework.Learn how to understand your audience, their needs and potential solutions to their problems before investing valuable time and resources moving in the wrong direction.The Association RevUP Podcast is presented by the Professionals for Association Revenue (PAR) and hosted by Carolyn Shomali.Learn more about PAR and its annual conference, The RevUP Summit. Special Thanks to podcast partner, the event production company VPC, Inc.Helpful Links:Lean Startup by Eric RiesSkateboard to Car: MVP analogyVideo: Eric Ries 2011 Google Talk
In 2022, Lin Qiao decided to leave Meta, where she was managing several hundred engineers, to start Fireworks AI. In this episode, we sit down with Lin for a deep dive on her work, starting with her leadership on PyTorch, now one of the most influential machine learning frameworks in the industry, powering research and production at scale across the AI industry. Now at the helm of Fireworks AI, Lin is leading a new wave in generative AI infrastructure, simplifying model deployment and optimizing performance to empower all developers building with Gen AI technologies.We dive into the technical core of Fireworks AI, uncovering their innovative strategies for model optimization, Function Calling in agentic development, and low-level breakthroughs at the GPU and CUDA layers.Fireworks AIWebsite - https://fireworks.aiX/Twitter - https://twitter.com/FireworksAI_HQLin QiaoLinkedIn - https://www.linkedin.com/in/lin-qiao-22248b4X/Twitter - https://twitter.com/lqiaoFIRSTMARKWebsite - https://firstmark.comX/Twitter - https://twitter.com/FirstMarkCapMatt Turck (Managing Director)LinkedIn - https://www.linkedin.com/in/turck/X/Twitter - https://twitter.com/mattturck(00:00) Intro(01:20) What is Fireworks AI?(02:47) What is PyTorch?(12:50) Traditional ML vs GenAI(14:54) AI's enterprise transformation(16:16) From Meta to Fireworks(19:39) Simplifying AI infrastructure(20:41) How Fireworks clients use GenAI(22:02) How many models are powered by Fireworks(30:09) LLM partitioning(34:43) Real-time vs pre-set search(36:56) Reinforcement learning(38:56) Function calling(44:23) Low-level architecture overview(45:47) Cloud GPUs & hardware support(47:16) VPC vs on-prem vs local deployment(49:50) Decreasing inference costs and its business implications(52:46) Fireworks roadmap(55:03) AI future predictions
Bonjour à tous et bienvenue sur DM&V, j'espère que vous allez pour le mieux ! Aujourd'hui, je vous propose l'essai d'une montre singulière qui a fait couler beaucoup d'encre depuis sa sortie et qui intrigue énormément d'amateurs. D'autant qu'elle fait partie de ces montres que très peu de personnes ont eu entre les mains afin de réellement pouvoir la juger. Cette montre, c'est la VPC type 37HW, créé par un des membres de Fratello, un des médias horlogers les plus influents du monde. Alors, quand des professionnels aussi passionnés, qui ont eu autant de pièces entre les mains sortent leur propre montre, forcément, on s'attend à quelque chose d'ultra-abouti et on a envie de savoir si la pièce tient toutes ses promesses. Dans cet épisode, je vais donc vous retracer l'histoire de cette montre, mais aussi vous livrer mes impressions après plusieurs jours au poignet : - Est-elle à la hauteur de mes attentes ? - Fait-elle désormais partie de mes préférées ? - Fratello a-t'il coché toutes les cases avec cette pièce murement réfléchie ? Je vais tout vous dire ! Mais avant de commencer, sachez que cet épisode est disponible en version audio sur toutes les plateformes de podcast mais également en vidéo sur ma chaine Youtube Des Montres & Vous. Si vous aimez la chaine et son contenu, N'hésitez pas à liker, à vous abonner et à activer les notifications pour ne rien louper et pour aider DM&V à progresser. Pour ceux qui écoutent en version podcast, pensez à laisser une note 5 étoiles et un commentaire, ça fait toujours plaisir Au passage, je tiens à vous remercier pour votre adhésion en masse au cercle DM&V, le groupe Whatsapp que j'ai créé en février et dans lequel nous échangeons autour de notre passion commune. Vous êtes déjà plus de 700, c'est incroyable ! Pour ceux qui ne sont pas encore dans le cercle, rien de plus simple : https://chat.whatsapp.com/F96PntzE9C5GVqxFC7xpBX Bonne écoute ! Building A Watch Brand Episode 1 — Introduction https://www.fratellowatches.com/building-a-watch-brand-follow-thomas-as-he-develops-a-watch-of-his-own/ Building A Watch Brand Episode 2 — Brand and name https://www.fratellowatches.com/building-a-watch-brand-episode-2-revealing-the-name-and-brand-concept/ Building A Watch Brand Episode 3 — Finances, risk mitigation, and a designer https://www.fratellowatches.com/building-a-watch-brand-ep-3-designer-finances/ Building A Watch Brand Episode 4 — Unveiling the watch concept https://www.fratellowatches.com/building-a-watch-brand-episode-4-unveiling-the-watch-concept/ Building A Watch Brand Episode 5 — The first design ideation sketches https://www.fratellowatches.com/building-a-watch-brand-episode-5-the-first-design-ideation-sketches/ Building A Watch Brand Episode 6 — Caliber, pouch, case and bracelet updates, typography https://www.fratellowatches.com/building-a-watch-brand-ep-6/ Building A Watch Brand Episode 7 — Dial design https://www.fratellowatches.com/building-a-watch-brand-episode-7-dial-design/ Building A Watch Brand Episode 8 — Geeking out on typography https://www.fratellowatches.com/building-a-watch-brand-episode-8-geeking-out-on-typography/ Building A Watch Brand Episode 9 — 3D Modeling, Tech Development, And Opening Up On The Mentally Challenging Side https://www.fratellowatches.com/building-a-watch-brand-episode-9-3d-modeling-tech-development-and-opening-up-on-the-mentally-challenging-side/Hébergé par Ausha. Visitez ausha.co/politique-de-confidentialite pour plus d'informations.
"Clinical research is not cheap."Season 2 of the Association RevUP podcast celebrates the stories of associations who are utilizing business to bring about real change in their industries.In this debut episode, discover how the Society of Interventional Oncology and Jena Eberly Stack united multiple industry partners to fund a high-cost clinical trial with real-world health impacts. Then Dan Cole and Brittany Shoul share 5 ways your association can level up negotiations for your next big deal, no matter the size.About Association RevUP : Episodes are less than 20-minutes, written and produced to keep you engaged, and full of actionable insights. Hosted by Carolyn Shomali.- VPC, Inc., the production company PAR trusts with our in-person event the RevUP Summit, and the parter of this podcast.- MyPar.org: learn more about the PAR member community- RevUP Summit: November 4-6, 2025 in Annapolis, MD
Bonjour à tous et bienvenue sur DM&V, j'espère que vous allez pour le mieux.Aujourd'hui je reçois un invité que beaucoup d'entre vous attendaient avec impatience : Morgan Saignes, Un des photographes horlogers les plus talentueux du moment, bien connu pour avoir shooté les plus belles pièces sous l'égide Fratello ! Dans cet épisode tout en simplicité, Morgan revient sur sa passion pour les montres : Comment tout cela est né, comment la photo est arrivée, comment a-t'il rejoint l'équipe Fratello mais aussi quelles sont ses pièces préférées, après en avoir eu plusieurs centaines, voire plusieurs milliers entre les mains... Evidemment, nous ne pouvions faire cet épisode sans parler du projet VPC...déjà un Blockbuster (bientôt la revue sur DM&V) Cet épisode est, comme d'habitude, disponible en version audio sur toutes les plateformes de podcast mais également en vidéo sur ma chaine Youtube Des Montres & Vous. Si vous aimez la chaine et son contenu, N'hésitez pas à liker, à vous abonner et à activer les notifications pour ne rien louper et pour aider DM&V à progresser. Pour ceux qui écoutent en version podcast, pensez à laisser une note 5 étoiles et un commentaire, ça fait toujours plaisir. Bonne écoute ! Voici les références abordées lors de cet épisode : - Silo, la série disponible sur Apple TV (2 saisons) - Foundation, la série disponible sur Apple TV (2 saisons) - Dark Matter, la série disposable sur Apple TV (1 saison) L'insta de Morgan : morgansaignes Le site web de Morgan : morgansaignes.comHébergé par Ausha. Visitez ausha.co/politique-de-confidentialite pour plus d'informations.
AWS Morning Brief for the week of February 17, with Corey Quinn. Links:Amazon DynamoDB now supports auto-approval of quota adjustmentsAmazon Elastic Block Store (EBS) now adds full snapshot size information in Console and APIAmazon RDS for MySQL announces Extended Support minor 5.7.44-RDS.20250103Amazon Redshift Serverless announces reduction in IP Address Requirements to 3 per SubnetAWS Deadline Cloud now supports Adobe After Effects in Service-Managed FleetsAWS Network Load Balancer now supports removing availability zonesAWS CloudTrail network activity events for VPC endpoints now generally availableHarness Amazon Bedrock Agents to Manage SAP InstancesTimestamp writes for write hedging in Amazon DynamoDBUpdating AWS SDK defaults – AWS STS service endpoint and Retry StrategyLearning AWS best practices from Amazon Q in the ConsoleAutomating Cost Optimization Governance with AWS ConfigAmazon Q Developer in chat applications rename - Summary of changes - AWS Chatbot
Brendan Carroll is a Senior Partner at Victory Park Capital, which he co-founded in 2007. In this episode, we discussed the evolution of private credit investments, how they can be made more friendly to insurers and the acquisition of VPC by Janus Henderson Investors. Enjoy the show! Overview of podcast with Brendan Carroll, Victory Park Capital 01:00 Getting started in investing 03:00 Setting up the firm in GFC 06:00 Asset backed lending 07:30 Insurance-friendly strategies 09:00 Insurers are the fastest growing LP types 13:00 The industry has always been competitive 17:30 Historic data sees through cycle 18:30 Business review triggers 28:00 The Janus Henderson acquisition 31:00 Developing ChatVPC
Ну что? Готовы к 10000 подписчикам до конца года? Мы уже готовы два года, но все никак :) 2025 наступил и решили начать с новостного выпуска.
AWS Morning Brief for the week of December 23, with Corey Quinn. Links:Amazon AppStream 2.0 introduces client for macOSAmazon EC2 instances support bandwidth configurations for VPC and EBSAmazon Timestream for InfluxDB now supports Internet Protocol Version 6 (IPv6) connectivityAmazon WorkSpaces Thin Client now available to purchase in IndiaAWS Backup launches support for search and item-level recoveryAWS Mainframe Modernization now supports connectivity over Internet Protocol version 6 (IPv6)AWS Marketplace now supports self-service promotional media on seller product detail pagesAWS re:Post now supports Spanish and PortugueseAWS Resource Explorer supports 59 new resource typesAWS offers a self-service feature to update business names on AWS InvoicesAnnouncing CloudFormation support for AWS Parallel Computing ServiceAnnouncing Node Health Monitoring and Auto-Repair for Amazon EKS - AWSAnd that's a wrap!Best practices for creating a VPC for Amazon RDS for Db2How the Amazon TimeHub team handled disruption in AWS DMS CDC task caused by Oracle RESETLOGS: Part 3How to detect and monitor Amazon Simple Storage Service (S3) access with AWS CloudTrail and Amazon CloudWatchEnforce resource configuration to control access to new features with AWSMaximizing your cloud journey: Engaging an AWS Solutions Architect
It's Wednesday morning and that means another edition of the Purely Cloud Guest Series featuring co-host Ondrej Bursik and the Cloud Technical Product Specialist team delivering all the technical details around Pure's growing portfolio of cloud solutions. In this one, Ondrej and Pavel Kovar take you deep into the depths of Pure's Cloud Block Store for AWS offering, including architectural decisions and deployment best practices. Learn about the technical aspect of Cloud Block Store for CBS including virtual drives and shelves, controllers, networking best practices, and VPC end points. From there, Ondrej and Pavel take you through deployment recommendations via portal or CLI and deployment steps to ensure success across security and user management aspects. Thanks for checking out this episode of the Pure Report podcast and for more information on Cloud Block Store for AWS, go to: https://www.purestorage.com/products/cloud-block-storage/cbs.html
In this episode, Meg Ashby, a senior cloud security engineer shares how her team tackled AWS's centralized VPC interface endpoints, a design often seen as an anti-pattern. She explains how they turned this unconventional approach into a cost-efficient and scalable solution, all while maintaining granular controls and network visibility. She shares why centralized VPC endpoints are considered an AWS anti-pattern, how to implement granular IAM controls in a centralized model and the challenges of monitoring and detecting VPC endpoint traffic. Guest Socials: Meg's Linkedin Podcast Twitter - @CloudSecPod If you want to watch videos of this LIVE STREAMED episode and past episodes - Check out our other Cloud Security Social Channels: - Cloud Security Podcast- Youtube - Cloud Security Newsletter - Cloud Security BootCamp Questions asked: (00:00) Introduction (02:48) A bit about Meg Ashby (03:44) What is VPC interface endpoints? (05:26) Egress and Ingress for Private Networks (08:21) Reason for using VPC endpoints (14:22) Limitations when using centralised endpoint VPCs (19:01) Marrying VPC endpoint and IAM policy (21:34) VPC endpoint specific conditions (27:52) Is this solution for everyone? (38:16) Does VPC endpoint have logging? (41:24) Improvements for the next phase Thank you to our episode sponsor Wiz. Cloud Security Podcast listeners can also get a free cloud security health scan by going to wiz.io/csp
In this episode, David Lynam provides an overview of AWS Transit Gateway, which aims to simplify complex network connectivity between VPCs, VPNs, and on-premises networks. We discuss the limitations of using VPC peering and the benefits Transit Gateway provides through its hub-and-spoke model. The main components of Transit Gateway are explained, including attachments, route tables, associations, and route propagation. We go through some example use cases like sharing Transit Gateways across accounts, network isolation for compliance, routing traffic through security services, and bandwidth/scaling capabilities. In this episode, we mentioned the following resources: - How Amazon VPC Transit Gateways work Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on X/Twitter: - https://twitter.com/eoins - https://twitter.com/loige
AWS Morning Brief for the week of December 9, with Corey Quinn. Links:AWS announces access to VPC resources over AWS PrivateLinkAnnouncing Amazon Aurora DSQL (Preview)Announcing Amazon Bedrock IDE in preview as part of Amazon SageMaker Unified StudioAWS announces Amazon CloudWatch Database InsightsAmazon DynamoDB global tables previews multi-Region strong consistencyAmazon EC2 introduces Allowed AMIs to enhance AMI governanceAnnouncing Amazon EC2 I8g instancesAnnouncing Amazon EKS Auto ModeAnnouncing Amazon EKS Hybrid NodesAnnouncing Amazon Elastic VMware Service (Preview)Announcing Amazon FSx Intelligent-Tiering, a new storage class for FSxAmazon Q Developer can now automate code reviewsAmazon Q Developer announces automatic unit test generation to accelerate feature developmentAmazon S3 adds new default data integrity protectionsAnnouncing Amazon S3 Metadata (Preview) – Easiest and fastest way to manage your metadataAmazon S3 launches storage classes for AWS Dedicated Local ZonesAnnouncing Amazon S3 Tables – Fully managed Apache Iceberg tables optimized for analytics workloadsAWS announces Amazon SageMaker LakehouseAWS Control Tower launches managed controls using declarative policiesAWS announces AWS Data Transfer Terminal for high-speed data uploadsAmazon Web Services announces declarative policiesIntroducing AWS Glue 5.0AWS announces Invoice ConfigurationAWS Marketplace now offers EC2 Image Builder components from independent software vendorsAWS announces AWS Security Incident Response for general availabilityAnnouncing AWS Transfer Family web appsBuy with AWS accelerates solution discovery and procurement on AWS Partner websitesOracle Database@AWS is now in limited previewPartyRock improves app discovery and announces upcoming free daily useAnnouncing the preview of Amazon SageMaker Unified StudioVPC Lattice now includes TCP support with VPC ResourcesAnnouncing the 2024 Geo and Global AWS Partners of the YearAmazon MemoryDB Multi-Region is now generally availableTop announcements of AWS re:Invent 2024SponsorThe Duckbill Group: https://www.duckbillgroup.com/
Hoje é dia de sobre carreira! No episódio de estreia da série especial do podcast, conversamos com Erika Nagamine, Golden Jacket da AWS, sobre a sua trajetória, sobre as suas decisões, e sobre o poder que a curiosidade teve para lhe impulsionar ao longo de toda a sua carreira. Vem ver quem participou desse papo: Paulo Silveira, o host que gosta de certificação André David, o cohost que está rolando até agora Erika Nagamine, Arquiteta de Soluções Especialista em Dados & AI - Analytics na AWS
The full schedule for Latent Space LIVE! at NeurIPS has been announced, featuring Best of 2024 overview talks for the AI Startup Landscape, Computer Vision, Open Models, Transformers Killers, Synthetic Data, Agents, and Scaling, and speakers from Sarah Guo of Conviction, Roboflow, AI2/Meta, Recursal/Together, HuggingFace, OpenHands and SemiAnalysis. Join us for the IRL event/Livestream! Alessio will also be holding a meetup at AWS Re:Invent in Las Vegas this Wednesday. See our new Events page for dates of AI Engineer Summit, Singapore, and World's Fair in 2025. LAST CALL for questions for our big 2024 recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show!When we first observed that GPT Wrappers are Good, Actually, we did not even have Bolt on our radar. Since we recorded our Anthropic episode discussing building Agents with the new Claude 3.5 Sonnet, Bolt.new (by Stackblitz) has easily cleared the $8m ARR bar, repeating and accelerating its initial $4m feat.There are very many AI code generators and VS Code forks out there, but Bolt probably broke through initially because of its incredible zero shot low effort app generation:But as we explain in the pod, Bolt also emphasized deploy (Netlify)/ backend (Supabase)/ fullstack capabilities on top of Stackblitz's existing WebContainer full-WASM-powered-developer-environment-in-the-browser tech. Since then, the team has been shipping like mad (with weekly office hours), with bugfixing, full screen, multi-device, long context, diff based edits (using speculative decoding like we covered in Inference, Fast and Slow).All of this has captured the imagination of low/no code builders like Greg Isenberg and many others on YouTube/TikTok/Reddit/X/Linkedin etc:Just as with Fireworks, our relationship with Bolt/Stackblitz goes a bit deeper than normal - swyx advised the launch and got a front row seat to this epic journey, as well as demoed it with Realtime Voice at the recent OpenAI Dev Day. So we are very proud to be the first/closest to tell the full open story of Bolt/Stackblitz!Flow Engineering + Qodo/AlphaCodium UpdateIn year 2 of the pod we have been on a roll getting former guests to return as guest cohosts (Harrison Chase, Aman Sanger, Jon Frankle), and it was a pleasure to catch Itamar Friedman back on the pod, giving us an update on all things Qodo and Testing Agents from our last catchup a year and a half ago:Qodo (they renamed in September) went viral in early January this year with AlphaCodium (paper here, code here) beating DeepMind's AlphaCode with high efficiency:With a simple problem solving code agent:* The first step is to have the model reason about the problem. They describe it using bullet points and focus on the goal, inputs, outputs, rules, constraints, and any other relevant details.* Then, they make the model reason about the public tests and come up with an explanation of why the input leads to that particular output. * The model generates two to three potential solutions in text and ranks them in terms of correctness, simplicity, and robustness. * Then, it generates more diverse tests for the problem, covering cases not part of the original public tests. * Iteratively, pick a solution, generate the code, and run it on a few test cases. * If the tests fail, improve the code and repeat the process until the code passes every test.swyx has previously written similar thoughts on types vs tests for putting bounds on program behavior, but AlphaCodium extends this to AI generated tests and code.More recently, Itamar has also shown that AlphaCodium's techniques also extend well to the o1 models:Making Flow Engineering a useful technique to improve code model performance on every model. This is something we see AI Engineers uniquely well positioned to do compared to ML Engineers/Researchers.Full Video PodcastLike and subscribe!Show Notes* Itamar* Qodo* First episode* Eric* Bolt* StackBlitz* Thinkster* AlphaCodium* WebContainersChapters* 00:00:00 Introductions & Updates* 00:06:01 Generic vs. Specific AI Agents* 00:07:40 Maintaining vs Creating with AI* 00:17:46 Human vs Agent Computer Interfaces* 00:20:15 Why Docker doesn't work for Bolt* 00:24:23 Creating Testing and Code Review Loops* 00:28:07 Bolt's Task Breakdown Flow* 00:31:04 AI in Complex Enterprise Environments* 00:41:43 AlphaCodium* 00:44:39 Strategies for Breaking Down Complex Tasks* 00:45:22 Building in Open Source* 00:50:35 Choosing a product as a founder* 00:59:03 Reflections on Bolt Success* 01:06:07 Building a B2C GTM* 01:18:11 AI Capabilities and Pricing Tiers* 01:20:28 What makes Bolt unique* 01:23:07 Future Growth and Product Development* 01:29:06 Competitive Landscape in AI Engineering* 01:30:01 Advice to Founders and Embracing AI* 01:32:20 Having a baby and completing an Iron ManTranscriptAlessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.Swyx [00:00:12]: Hey, and today we're still in our sort of makeshift in-between studio, but we're very delighted to have a former returning guest host, Itamar. Welcome back.Itamar [00:00:21]: Great to be here after a year or more. Yeah, a year and a half.Swyx [00:00:24]: You're one of our earliest guests on Agents. Now you're CEO co-founder of Kodo. Right. Which has just been renamed. You also raised a $40 million Series A, and we can get caught up on everything, but we're also delighted to have our new guest, Eric. Welcome.Eric [00:00:42]: Thank you. Excited to be here. Should I say Bolt or StackBlitz?Swyx [00:00:45]: Like, is it like its own company now or?Eric [00:00:47]: Yeah. Bolt's definitely bolt.new. That's the thing that we're probably the most known for, I imagine, at this point.Swyx [00:00:54]: Which is ridiculous to say because you were working at StackBlitz for so long.Eric [00:00:57]: Yeah. I mean, within a week, we were doing like double the amount of traffic. And StackBlitz had been online for seven years, and we were like, what? But anyways, yeah. So we're StackBlitz, the company behind bolt.new. If you've heard of bolt.new, that's our stuff. Yeah.Swyx [00:01:12]: Yeah.Itamar [00:01:13]: Excellent. I see, by the way, that the founder mode, you need to know to capture opportunities. So kudos on doing that, right? You're working on some technology, and then suddenly you can exploit that to a new world. Yeah.Eric [00:01:24]: Totally. And I think, well, not to jump, but 100%, I mean, a couple of months ago, we had the idea for Bolt earlier this year, but we haven't really shared this too much publicly. But we actually had tried to build it with some of those state-of-the-art models back in January, February, you can kind of imagine which, and they just weren't good enough to actually do the code generation where the code was accurate and it was fast and whatever have you without a ton of like rag, but then there was like issues with that. So we put it on the shelf and then we got kind of a sneak peek of some of the new models that have come out in the past couple of months now. And so once we saw that, once we actually saw the code gen from it, we were like, oh my God, like, okay, we can build a product around this. And so that was really the impetus of us building the thing. But with that, it was StackBlitz, the core StackBlitz product the past seven years has been an IDE for developers. So the entire user experience flow we've built up just didn't make sense. And so when we kind of went out to build Bolt, we just thought, you know, if we were inventing our product today, what would the interface look like given what is now possible with the AI code gen? And so there's definitely a lot of conversations we had internally, but you know, just kind of when we logically laid it out, we were like, yeah, I think it makes sense to just greenfield a new thing and let's see what happens. If it works great, then we'll figure it out. If it doesn't work great, then it'll get deleted at some point. So that's kind of how it actually came to be.Swyx [00:02:49]: I'll mention your background a little bit. You were also founder of Thinkster before you started StackBlitz. So both of you are second time founders. Both of you have sort of re-founded your company recently. Yours was more of a rename. I think a slightly different direction as well. And then we can talk about both. Maybe just chronologically, should we get caught up on where Kodo is first and then you know, just like what people should know since the last pod? Sure.Itamar [00:03:12]: The last pod was two months after we launched and we basically had the vision that we talked about. The idea that software development is about specification, test and code, etc. We are more on the testing part as in essence, we think that if you solve testing, you solve software development. The beautiful chart that we'll put up on screen. And testing is a really big field, like there are many dimensions, unit testing, the level of the component, how big it is, how large it is. And then there is like different type of testing, is it regression or smoke or whatever. So back then we only had like one ID extension with unit tests as in focus. One and a half year later, first ID extension supports more type of testing as context aware. We index local, local repos, but also 10,000s of repos for Fortune 500 companies. We have another agent, another tool that is called, the pure agent is the open source and the commercial one is CodoMerge. And then we have another open source called CoverAgent, which is not yet a commercial product coming very soon. It's very impressive. It could be that already people are approving automated pull requests that they don't even aware in really big open sources. So once we have enough of these, we will also launch another agent. So for the first one and a half year, what we did is grew in our offering and mostly on the side of, does this code actually works, testing, code review, et cetera. And we believe that's the critical milestone that needs to be achieved to actually have the AI engineer for enterprise software. And then like for the first year was everything bottom up, getting to 1 million installation. 2024, that was 2023, 2024 was starting to monetize, to feel like how it is to make the first buck. So we did the teams offering, it went well with a thousand of teams, et cetera. And then we started like just a few months ago to do enterprise with everything you need, which is a lot of things that discussed in the last post that was just released by Codelm. So that's how we call it at Codelm. Just opening the brackets, our company name was Codelm AI, and we renamed to Codo and we call our models Codelm. So back to my point, so we started Enterprise Motion and already have multiple Fortune 100 companies. And then with that, we raised a series of $40 million. And what's exciting about it is that enables us to develop more agents. That's our focus. I think it's very different. We're not coming very soon with an ID or something like that.Swyx [00:06:01]: You don't want to fork this code?Itamar [00:06:03]: Maybe we'll fork JetBrains or something just to be different.Swyx [00:06:08]: I noticed that, you know, I think the promise of general purpose agents has kind of died. Like everyone is doing kind of what you're doing. There's Codogen, Codomerge, and then there's a third one. What's the name of it?Itamar [00:06:17]: Yeah. Codocover. Cover. Which is like a commercial version of a cover agent. It's coming soon.Swyx [00:06:23]: Yeah. It's very similar with factory AI, also doing like droids. They all have special purpose doing things, but people don't really want general purpose agents. Right. The last time you were here, we talked about AutoGBT, the biggest thing of 2023. This year, not really relevant anymore. And I think it's mostly just because when you give me a general purpose agent, I don't know what to do with it.Eric [00:06:42]: Yeah.Itamar [00:06:43]: I totally agree with that. We're seeing it for a while and I think it will stay like that despite the computer use, et cetera, that supposedly can just replace us. You can just like prompt it to be, hey, now be a QA or be a QA person or a developer. I still think that there's a few reasons why you see like a dedicated agent. Again, I'm a bit more focused, like my head is more on complex software for big teams and enterprise, et cetera. And even think about permissions and what are the data sources and just the same way you manage permissions for users. Developers, you probably want to have dedicated guardrails and dedicated approvals for agents. I intentionally like touched a point on not many people think about. And of course, then what you can think of, like maybe there's different tools, tool use, et cetera. But just the first point by itself is a good reason why you want to have different agents.Alessio [00:07:40]: Just to compare that with Bot.new, you're almost focused on like the application is very complex and now you need better tools to kind of manage it and build on top of it. On Bot.new, it's almost like I was using it the other day. There's basically like, hey, look, I'm just trying to get started. You know, I'm not very opinionated on like how you're going to implement this. Like this is what I want to do. And you build a beautiful app with it. What people ask as the next step, you know, going back to like the general versus like specific, have you had people say, hey, you know, this is great to start, but then I want a specific Bot.new dot whatever else to do a more vertical integration and kind of like development or what's the, what do people say?Eric [00:08:18]: Yeah. I think, I think you kind of hit the, hit it head on, which is, you know, kind of the way that we've, we've kind of talked about internally is it's like people are using Bolt to go from like 0.0 to 1.0, like that's like kind of the biggest unlock that Bolt has versus most other things out there. I mean, I think that's kind of what's, what's very unique about Bolt. I think the, you know, the working on like existing enterprise applications is, I mean, it's crazy important because, you know, there's a, you look, when you look at the fortune 500, I mean, these code bases, some of these have been around for 20, 30 plus years. And so it's important to be going from, you know, 101.3 to 101.4, et cetera. I think for us, so what's been actually pretty interesting is we see there's kind of two different users for us that are coming in and it's very distinct. It's like people that are developers already. And then there's people that have never really written software and more if they have, it's been very, very minimal. And so in the first camp, what these developers are doing, like to go from zero to one, they're coming to Bolt and then they're ejecting the thing to get up or just downloading it and, you know, opening cursor, like whatever to, to, you know, keep iterating on the thing. And sometimes they'll bring it back to Bolt to like add in a huge piece of functionality or something. Right. But for the people that don't know how to code, they're actually just, they, they live in this thing. And that was one of the weird things when we launched is, you know, within a day of us being online, one of the most popular YouTube videos, and there's been a ton since, which was, you know, there's like, oh, Bolt is the cursor killer. And I originally saw the headlines and I was like, thanks for the views. I mean, I don't know. This doesn't make sense to me. That's not, that's not what we kind of thought.Swyx [00:09:44]: It's how YouTubers talk to each other. Well, everything kills everything else.Eric [00:09:47]: Totally. But what blew my mind was that there was any comparison because it's like cursor is a, is a local IDE product. But when, when we actually kind of dug into it and we, and we have people that are using our product saying this, I'm not using cursor. And I was like, what? And it turns out there are hundreds of thousands of people that we have seen that we're using cursor and we're trying to build apps with that where they're not traditional software does, but we're heavily leaning on the AI. And as you can imagine, it is very complicated, right? To do that with cursor. So when Bolt came out, they're like, wow, this thing's amazing because it kind of inverts the complexity where it's like, you know, it's not an IDE, it's, it's a, it's a chat-based sort of interface that we have. So that's kind of the split, which is rather interesting. We've had like the first startups now launch off of Bolt entirely where this, you know, tomorrow I'm doing a live stream with this guy named Paul, who he's built an entire CRM using this thing and you know, with backend, et cetera. And people have made their first money on the internet period, you know, launching this with Stripe or whatever have you. So that's, that's kind of the two main, the two main categories of folks that we see using Bolt though.Itamar [00:10:51]: I agree that I don't understand the comparison. It doesn't make sense to me. I think like we have like two type of families of tools. One is like we re-imagine the software development. I think Bolt is there and I think like a cursor is more like a evolution of what we already have. It's like taking the IDE and it's, it's amazing and it's okay, let's, let's adapt the IDE to an era where LLMs can do a lot for us. And Bolt is more like, okay, let's rethink everything totally. And I think we see a few tools there, like maybe Vercel, Veo and maybe Repl.it in that area. And then in the area of let's expedite, let's change, let's, let's progress with what we already have. You can see Cursor and Kodo, but we're different between ourselves, Cursor and Kodo, but definitely I think that comparison doesn't make sense.Alessio [00:11:42]: And just to set the context, this is not a Twitter demo. You've made 4 million of revenue in four weeks. So this is, this is actually working, you know, it's not a, what, what do you think that is? Like, there's been so many people demoing coding agents on Twitter and then it doesn't really work. And then you guys were just like, here you go, it's live, go use it, pay us for it. You know, is there anything in the development that was like interesting and maybe how that compares to building your own agents?Eric [00:12:08]: We had no idea, honestly, like we, we, we've been pretty blown away and, and things have just kind of continued to grow faster since then. We're like, oh, today is week six. So I, I kind of came back to the point you just made, right, where it's, you, you kind of outlined, it's like, there's kind of this new market of like kind of rethinking the software development and then there's heavily augmenting existing developers. I think that, you know, both of which are, you know, AI code gen being extremely good, it's allowed existing developers, it's allowing existing developers to camera out software far faster than they could have ever before, right? It's like the ultimate power tool for an existing developer. But this code gen stuff is now so good. And then, and we saw this over the past, you know, from the beginning of the year when we tried to first build, it's actually lowered the barrier to people that, that aren't traditionally software engineers. But the kind of the key thing is if you kind of think about it from, imagine you've never written software before, right? My co-founder and I, he and I grew up down the street from each other in Chicago. We learned how to code when we were 13 together and we've been building stuff ever since. And this is back in like the mid 2000s or whatever, you know, there was nothing for free to learn from online on the internet and how to code. For our 13th birthdays, we asked our parents for, you know, O'Reilly books cause you couldn't get this at the library, right? And so instead of like an Xbox, we got, you know, programming books. But the hardest part for everyone learning to code is getting an environment set up locally, you know? And so when we built StackBlitz, like kind of the key thesis, like seven years ago, the insight we had was that, Hey, it seems like the browser has a lot of new APIs like WebAssembly and service workers, et cetera, where you could actually write an operating system that ran inside the browser that could boot in milliseconds. And you, you know, basically there's this missing capability of the web. Like the web should be able to build apps for the web, right? You should be able to build the web on the web. Every other platform has that, Visual Studio for Windows, Xcode for Mac. The web has no built in primitive for this. And so just like our built in kind of like nerd instinct on this was like, that seems like a huge hole and it's, you know, it will be very valuable or like, you know, very valuable problem to solve. So if you want to set up that environments, you know, this is what we spent the past seven years doing. And the reality is existing developers have running locally. They already know how to set up that environment. So the problem isn't as acute for them. When we put Bolt online, we took that technology called WebContainer and married it with these, you know, state of the art frontier models. And the people that have the most pain with getting stuff set up locally is people that don't code. I think that's been, you know, really the big explosive reason is no one else has been trying to make dev environments work inside of a browser tab, you know, for the past if since ever, other than basically our company, largely because there wasn't an immediate demand or need. So I think we kind of find ourselves at the right place at the right time. And again, for this market of people that don't know how to write software, you would kind of expect that you should be able to do this without downloading something to your computer in the same way that, hey, I don't have to download Photoshop now to make designs because there's Figma. I don't have to download Word because there's, you know, Google Docs. They're kind of looking at this as that sort of thing, right? Which was kind of the, you know, our impetus and kind of vision from the get-go. But you know, the code gen, the AI code gen stuff that's come out has just been, you know, an order of magnitude multiplier on how magic that is, right? So that's kind of my best distillation of like, what is going on here, you know?Alessio [00:15:21]: And you can deploy too, right?Eric [00:15:22]: Yeah.Alessio [00:15:23]: Yeah.Eric [00:15:24]: And so that's, what's really cool is it's, you know, we have deployment built in with Netlify and this is actually, I think, Sean, you actually built this at Netlify when you were there. Yeah. It's one of the most brilliant integrations actually, because, you know, effectively the API that Sean built, maybe you can speak to it, but like as a provider, we can just effectively give files to Netlify without the user even logging in and they have a live website. And if they want to keep, hold onto it, they can click a link and claim it to their Netlify account. But it basically is just this really magic experience because when you come to Bolt, you say, I want a website. Like my mom, 70, 71 years old, made her first website, you know, on the internet two weeks ago, right? It was about her nursing days.Swyx [00:16:03]: Oh, that's fantastic though. It wouldn't have been made.Eric [00:16:06]: A hundred percent. Cause even in, you know, when we've had a lot of people building personal, like deeply personal stuff, like in the first week we launched this, the sales guy from the East Coast, you know, replied to a tweet of mine and he said, thank you so much for building this to your team. His daughter has a medical condition and so for her to travel, she has to like line up donors or something, you know, so ahead of time. And so he actually used Bolt to make a website to do that, to actually go and send it to folks in the region she was going to travel to ahead of time. I was really touched by it, but I also thought like, why, you know, why didn't he use like Wix or Squarespace? Right? I mean, this is, this is a solved problem, quote unquote, right? And then when I thought, I actually use Squarespace for my, for my, uh, the wedding website for my wife and I, like back in 2021, so I'm familiar, you know, it was, it was faster. I know how to code. I was like, this is faster. Right. And I thought back and I was like, there's a whole interface you have to learn how to use. And it's actually not that simple. There's like a million things you can configure in that thing. When you come to Bolt, there's a, there's a text box. You just say, I need a, I need a wedding website. Here's the date. Here's where it is. And here's a photo of me and my wife, put it somewhere relevant. It's actually the simplest way. And that's what my, when my mom came, she said, uh, I'm Pat Simons. I was a nurse in the seventies, you know, and like, here's the things I did and a website came out. So coming back to why is this such a, I think, why are we seeing this sort of growth? It's, this is the simplest interface I think maybe ever created to actually build it, a deploy a website. And then that website, my mom made, she's like, okay, this looks great. And there's, there's one button, you just click it, deploy, and it's live and you can buy a domain name, attach it to it. And you know, it's as simple as it gets, it's getting even simpler with some of the stuff we're working on. But anyways, so that's, it's, it's, uh, it's been really interesting to see some of the usage like that.Swyx [00:17:46]: I can offer my perspective. So I, you know, I probably should have disclosed a little bit that, uh, I'm a, uh, stack list investor.Alessio [00:17:53]: Canceled the episode. I know, I know. Don't play it now. Pause.Eric actually reached out to ShowMeBolt before the launch. And we, you know, we talked a lot about, like, the framing of, of what we're going to talk about how we marketed the thing, but also, like, what we're So that's what Bolt was going to need, like a whole sort of infrastructure.swyx: Netlify, I was a maintainer but I won't take claim for the anonymous upload. That's actually the origin story of Netlify. We can have Matt Billman talk about it, but that was [00:18:00] how Netlify started. You could drag and drop your zip file or folder from your desktop onto a website, it would have a live URL with no sign in.swyx: And so that was the origin story of Netlify. And it just persists to today. And it's just like it's really nice, interesting that both Bolt and CognitionDevIn and a bunch of other sort of agent type startups, they all use Netlify to deploy because of this one feature. They don't really care about the other features.swyx: But, but just because it's easy for computers to use and talk to it, like if you build an interface for computers specifically, that it's easy for them to Navigate, then they will be used in agents. And I think that's a learning that a lot of developer tools companies are having. That's my bolt launch story and now if I say all that stuff.swyx: And I just wanted to come back to, like, the Webcontainers things, right? Like, I think you put a lot of weight on the technical modes. I think you also are just like, very good at product. So you've, you've like, built a better agent than a lot of people, the rest of us, including myself, who have tried to build these things, and we didn't get as far as you did.swyx: Don't shortchange yourself on products. But I think specifically [00:19:00] on, on infra, on like the sandboxing, like this is a thing that people really want. Alessio has Bax E2B, which we'll have on at some point, talking about like the sort of the server full side. But yours is, you know, inside of the browser, serverless.swyx: It doesn't cost you anything to serve one person versus a million people. It doesn't, doesn't cost you anything. I think that's interesting. I think in theory, we should be able to like run tests because you can run the full backend. Like, you can run Git, you can run Node, you can run maybe Python someday.swyx: We talked about this. But ideally, you should be able to have a fully gentic loop, running code, seeing the errors, correcting code, and just kind of self healing, right? Like, I mean, isn't that the dream?Eric: Totally.swyx: Yeah,Eric: totally. At least in bold, we've got, we've got a good amount of that today. I mean, there's a lot more for us to do, but one of the nice things, because like in web container, you know, there's a lot of kind of stuff you go Google like, you know, turn docker container into wasm.Eric: You'll find a lot of stuff out there that will do that. The problem is it's very big, it's slow, and that ruins the experience. And so what we ended up doing is just writing an operating system from [00:20:00] scratch that was just purpose built to, you know, run in a browser tab. And the reason being is, you know, Docker 2 awesome things will give you an image that's like out 60 to 100 megabits, you know, maybe more, you know, and our, our OS, you know, kind of clocks in, I think, I think we're in like a, maybe, maybe a megabyte or less or something like that.Eric: I mean, it's, it's, you know, really, really, you know, stripped down.swyx: This is basically the task involved is I understand that it's. Mapping every single, single Linux call to some kind of web, web assembly implementation,Eric: but more or less, and, and then there's a lot of things actually, like when you're looking at a dev environment, there's a lot of things that you don't need that a traditional OS is gonna have, right?Eric: Like, you know audio drivers or you like, there's just like, there's just tons of things. Oh, yeah. Right. Yeah. That goes . Yeah. You can just kind, you can, you can kind of tos them. Or alternatively, what you can do is you can actually be the nice thing. And this is, this kind of comes back to the origins of browsers, which is, you know, they're, they're at the beginning of the web and, you know, the late nineties, there was two very different kind of visions for the web where Alan Kay vehemently [00:21:00] disagree with the idea that should be document based, which is, you know, Tim Berners Lee, you know, that, and that's kind of what ended up winning, winning was this document based kind of browsing documents on the web thing.Eric: Alan Kay, he's got this like very famous quote where he said, you know, you want web browsers to be mini operating systems. They should download little mini binaries and execute with like a little mini virtualized operating system in there. And what's kind of interesting about the history, not to geek out on this aspect, what's kind of interesting about the history is both of those folks ended up being right.Eric: Documents were actually the pragmatic way that the web worked. Was, you know, became the most ubiquitous platform in the world to the degree now that this is why WebAssembly has been invented is that we're doing, we need to do more low level things in a browser, same thing with WebGPU, et cetera. And so all these APIs, you know, to build an operating system came to the browser.Eric: And that was actually the realization we had in 2017 was, holy heck, like you can actually, you know, service workers, which were designed for allowing your app to work offline. That was the kind of the key one where it was like, wait a second, you can actually now run. Web servers within a [00:22:00] browser, like you can run a server that you open up.Eric: That's wild. Like full Node. js. Full Node. js. Like that capability. Like, I can have a URL that's programmatically controlled. By a web application itself, boom. Like the web can build the web. The primitive is there. Everyone at the time, like we talked to people that like worked on, you know Chrome and V8 and they were like, uhhhh.Eric: You know, like I don't know. But it's one of those things you just kind of have to go do it to find out. So we spent a couple of years, you know, working on it and yeah. And, and, and got to work in back in 2021 is when we kind of put the first like data of web container online. Butswyx: in partnership with Google, right?swyx: Like Google actually had to help you get over the finish line with stuff.Eric: A hundred percent, because well, you know, over the years of when we were doing the R and D on the thing. Kind of the biggest challenge, the two ways that you can kind of test how powerful and capable a platform are, the two types of applications are one, video games, right, because they're just very compute intensive, a lot of calculations that have to happen, right?Eric: The second one are IDEs, because you're talking about actually virtualizing the actual [00:23:00] runtime environment you are in to actually build apps on top of it, which requires sophisticated capabilities, a lot of access to data. You know, a good amount of compute power, right, to effectively, you know, building app in app sort of thing.Eric: So those, those are the stress tests. So if your platform is missing stuff, those are the things where you find out. Those are, those are the people building games and IDEs. They're the ones filing bugs on operating system level stuff. And for us, browser level stuff.Eric [00:23:47]: yeah, what ended up happening is we were just hammering, you know, the Chromium bug tracker, and they're like, who are these guys? Yeah. And, and they were amazing because I mean, just making Chrome DevTools be able to debug, I mean, it's, it's not, it wasn't originally built right for debugging an operating system, right? They've been phenomenal working with us and just kind of really pushing the limits, but that it's a rising tide that's kind of lifted all boats because now there's a lot of different types of applications that you can debug with Chrome Dev Tools that are running a browser that runs more reliably because just the stress testing that, that we and, you know, games that are coming to the web are kind of pushing as well, but.Itamar [00:24:23]: That's awesome. About the testing, I think like most, let's say coding assistant from different kinds will need this loop of testing. And even I would add code review to some, to some extent that you mentioned. How is testing different from code review? Code review could be, for example, PR review, like a code review that is done at the point of when you want to merge branches. But I would say that code review, for example, checks best practices, maintainability, and so on. It's not just like CI, but more than CI. And testing is like a more like checking functionality, et cetera. So it's different. We call, by the way, all of these together code integrity, but that's a different story. Just to go back to the, to the testing and specifically. Yeah. It's, it's, it's since the first slide. Yeah. We're consistent. So if we go back to the testing, I think like, it's not surprising that for us testing is important and for Bolt it's testing important, but I want to shed some light on a different perspective of it. Like let's think about autonomous driving. Those startups that are doing autonomous driving for highway and autonomous driving for the city. And I think like we saw the autonomous of the highway much faster and reaching to a level, I don't know, four or so much faster than those in the city. Now, in both cases, you need testing and quote unquote testing, you know, verifying validation that you're doing the right thing on the road and you're reading and et cetera. But it's probably like so different in the city that it could be like actually different technology. And I claim that we're seeing something similar here. So when you're building the next Wix, and if I was them, I was like looking at you and being a bit scared. That's what you're disrupting, what you just said. Then basically, I would say that, for example, the UX UI is freaking important. And because you're you're more aiming for the end user. In this case, maybe it's an end user that doesn't know how to develop for developers. It's also important. But let alone those that do not know to develop, they need a slick UI UX. And I think like that's one reason, for example, I think Cursor have like really good technology. I don't know the underlying what's under the hood, but at least what they're saying. But I think also their UX UI is great. It's a lot because they did their own ID. While if you're aiming for the city AI, suddenly like there's a lot of testing and code review technology that it's not necessarily like that important. For example, let's talk about integration tests. Probably like a lot of what you're building involved at the moment is isolated applications. Maybe the vision or the end game is maybe like having one solution for everything. It could be that eventually the highway companies will go into the city and the other way around. But at the beginning, there is a difference. And integration tests are a good example. I guess they're a bit less important. And when you think about enterprise software, they're really important. So to recap, like I think like the idea of looping and verifying your test and verifying your code in different ways, testing or code review, et cetera, seems to be important in the highway AI and the city AI, but in different ways and different like critical for the city, even more and more variety. Actually, I was looking to ask you like what kind of loops you guys are doing. For example, when I'm using Bolt and I'm enjoying it a lot, then I do see like sometimes you're trying to catch the errors and fix them. And also, I noticed that you're breaking down tasks into smaller ones and then et cetera, which is already a common notion for a year ago. But it seems like you're doing it really well. So if you're willing to share anything about it.Eric [00:28:07]: Yeah, yeah. I realized I never actually hit the punchline of what I was saying before. I mentioned the point about us kind of writing an operating system from scratch because what ended up being important about that is that to your point, it's actually a very, like compared to like a, you know, if you're like running cursor on anyone's machine, you kind of don't know what you're dealing with, with the OS you're running on. There could be an error happens. It could be like a million different things, right? There could be some config. There could be, it could be God knows what, right? The thing with WebConnect is because we wrote the entire thing from scratch. It's actually a unified image basically. And we can instrument it at any level that we think is going to be useful, which is exactly what we did when we started building Bolt is we instrumented stuff at like the process level, at the runtime level, you know, et cetera, et cetera, et cetera. Stuff that would just be not impossible to do on local, but to do that in a way that works across any operating system, whatever is, I mean, would just be insanely, you know, insanely difficult to do right and reliably. And that's what you saw when you've used Bolt is that when an error actually will occur, whether it's in the build process or the actual web application itself is failing or anything kind of in between, you can actually capture those errors. And today it's a very primitive way of how we've implemented it largely because the product just didn't exist 90 days ago. So we're like, we got some work ahead of us and we got to hire some more a little bit, but basically we present and we say, Hey, this is, here's kind of the things that went wrong. There's a fix it button and then a ignore button, and then you can just hit fix it. And then we take all that telemetry through our agent, you run it through our agent and say, kind of, here's the state of the application. Here's kind of the errors that we got from Node.js or the browser or whatever, and like dah, dah, dah, dah. And it can take a crack at actually solving it. And it's actually pretty darn good at being able to do that. That's kind of been a, you know, closing the loop and having it be a reliable kind of base has seemed to be a pretty big upgrade over doing stuff locally, just because I think that's a pretty key ingredient of it. And yeah, I think breaking things down into smaller tasks, like that's, that's kind of a key part of our agent. I think like Claude did a really good job with artifacts. I think, you know, us and kind of everyone else has, has kind of taken their approach of like actually breaking out certain tasks in a certain order into, you know, kind of a concrete way. And, and so actually the core of Bolt, I know we actually made open source. So you can actually go and check out like the system prompts and et cetera, and you can run it locally and whatever have you. So anyone that's interested in this stuff, I'd highly recommend taking a look at. There's not a lot of like stuff that's like open source in this realm. It's, that was one of the fun things that we've we thought would be cool to do. And people, people seem to like it. I mean, there's a lot of forks and people adding different models and stuff. So it's been cool to see.Swyx [00:30:41]: Yeah. I'm happy to add, I added real-time voice for my opening day demo and it was really fun to hack with. So thank you for doing that. Yeah. Thank you. I'm going to steal your code.Eric [00:30:52]: Because I want that.Swyx [00:30:52]: It's funny because I built on top of the fork of Bolt.new that already has the multi LLM thing. And so you just told me you're going to merge that in. So then you're going to merge two layers of forks down into this thing. So it'll be fun.Eric [00:31:03]: Heck yeah.Alessio [00:31:04]: Just to touch on like the environment, Itamar, you maybe go into the most complicated environments that even the people that work there don't know how to run. How much of an impact does that have on your performance? Like, you know, it's most of the work you're doing actually figuring out environment and like the libraries, because I'm sure they're using outdated version of languages, they're using outdated libraries, they're using forks that have not been on the public internet before. How much of the work that you're doing is like there versus like at the LLM level?Itamar [00:31:32]: One of the reasons I was asking about, you know, what are the steps to break things down, because it really matters. Like, what's the tech stack? How complicated the software is? It's hard to figure it out when you're dealing with the real world, any environment of enterprise as a city, when I'm like, while maybe sometimes like, I think you do enable like in Bolt, like to install stuff, but it's quite a like controlled environment. And that's a good thing to do, because then you narrow down and it's easier to make things work. So definitely, there are two dimensions, I think, actually spaces. One is the fact just like installing our software without yet like doing anything, making it work, just installing it because we work with enterprise and Fortune 500, etc. Many of them want on prem solution.Swyx [00:32:22]: So you have how many deployment options?Itamar [00:32:24]: Basically, we had, we did a metric metrics, say 96 options, because, you know, they're different dimensions. Like, for example, one dimension, we connect to your code management system to your Git. So are you having like GitHub, GitLab? Subversion? Is it like on cloud or deployed on prem? Just an example. Which model agree to use its APIs or ours? Like we have our Is it TestGPT? Yeah, when we started with TestGPT, it was a huge mistake name. It was cool back then, but I don't think it's a good idea to name a model after someone else's model. Anyway, that's my opinion. So we gotSwyx [00:33:02]: I'm interested in these learnings, like things that you change your mind on.Itamar [00:33:06]: Eventually, when you're building a company, you're building a brand and you want to create your own brand. By the way, when I thought about Bolt.new, I also thought about if it's not a problem, because when I think about Bolt, I do think about like a couple of companies that are already called this way.Swyx [00:33:19]: Curse companies. You could call it Codium just to...Itamar [00:33:24]: Okay, thank you. Touche. Touche.Eric [00:33:27]: Yeah, you got to imagine the board meeting before we launched Bolt, one of our investors, you can imagine they're like, are you sure? Because from the investment side, it's kind of a famous, very notorious Bolt. And they're like, are you sure you want to go with that name? Oh, yeah. Yeah, absolutely.Itamar [00:33:43]: At this point, we have actually four models. There is a model for autocomplete. There's a model for the chat. There is a model dedicated for more for code review. And there is a model that is for code embedding. Actually, you might notice that there isn't a good code embedding model out there. Can you name one? Like dedicated for code?Swyx [00:34:04]: There's code indexing, and then you can do sort of like the hide for code. And then you can embed the descriptions of the code.Itamar [00:34:12]: Yeah, but you do see a lot of type of models that are dedicated for embedding and for different spaces, different fields, etc. And I'm not aware. And I know that if you go to the bedrock, try to find like there's a few code embedding models, but none of them are specialized for code.Swyx [00:34:31]: Is there a benchmark that you would tell us to pay attention to?Itamar [00:34:34]: Yeah, so it's coming. Wait for that. Anyway, we have our models. And just to go back to the 96 option of deployment. So I'm closing the brackets for us. So one is like dimensional, like what Git deployment you have, like what models do you agree to use? Dotter could be like if it's air-gapped completely, or you want VPC, and then you have Azure, GCP, and AWS, which is different. Do you use Kubernetes or do not? Because we want to exploit that. There are companies that do not do that, etc. I guess you know what I mean. So that's one thing. And considering that we are dealing with one of all four enterprises, we needed to deal with that. So you asked me about how complicated it is to solve that complex code. I said, it's just a deployment part. And then now to the software, we see a lot of different challenges. For example, some companies, they did actually a good job to build a lot of microservices. Let's not get to if it's good or not, but let's first assume that it is a good thing. A lot of microservices, each one of them has their own repo. And now you have tens of thousands of repos. And you as a developer want to develop something. And I remember me coming to a corporate for the first time. I don't know where to look at, like where to find things. So just doing a good indexing for that is like a challenge. And moreover, the regular indexing, the one that you can find, we wrote a few blogs on that. By the way, we also have some open source, different than yours, but actually three and growing. Then it doesn't work. You need to let the tech leads and the companies influence your indexing. For example, Mark with different repos with different colors. This is a high quality repo. This is a lower quality repo. This is a repo that we want to deprecate. This is a repo we want to grow, etc. And let that be part of your indexing. And only then things actually work for enterprise and they don't get to a fatigue of, oh, this is awesome. Oh, but I'm starting, it's annoying me. I think Copilot is an amazing tool, but I'm quoting others, meaning GitHub Copilot, that they see not so good retention of GitHub Copilot and enterprise. Ooh, spicy. Yeah. I saw snapshots of people and we have customers that are Copilot users as well. And also I saw research, some of them is public by the way, between 38 to 50% retention for users using Copilot and enterprise. So it's not so good. By the way, I don't think it's that bad, but it's not so good. So I think that's a reason because, yeah, it helps you auto-complete, but then, and especially if you're working on your repo alone, but if it's need that context of remote repos that you're code-based, that's hard. So to make things work, there's a lot of work on that, like giving the controllability for the tech leads, for the developer platform or developer experience department in the organization to influence how things are working. A short example, because if you have like really old legacy code, probably some of it is not so good anymore. If you just fine tune on these code base, then there is a bias to repeat those mistakes or old practices, etc. So you need, for example, as I mentioned, to influence that. For example, in Coda, you can have a markdown of best practices by the tech leads and Coda will include that and relate to that and will not offer suggestions that are not according to the best practices, just as an example. So that's just a short list of things that you need to do in order to deal with, like you mentioned, the 100.1 to 100.2 version of software. I just want to say what you're doing is extremelyEric [00:38:32]: impressive because it's very difficult. I mean, the business of Stackplus, kind of before bulk came online, we sold a version of our IDE that went on-prem. So I understand what you're saying about the difficulty of getting stuff just working on-prem. Holy heck. I mean, that is extremely hard. I guess the question I have for you is, I mean, we were just doing that with kind of Kubernetes-based stuff, but the spread of Fortune 500 companies that you're working with, how are they doing the inference for this? Are you kind of plugging into Azure's OpenAI stuff and AWS's Bedrock, you know, Cloud stuff? Or are they just like running stuff on GPUs? Like, what is that? How are these folks approaching that? Because, man, what we saw on the enterprise side, I mean, I got to imagine that that's a huge challenge. Everything you said and more, like,Itamar [00:39:15]: for example, like someone could be, and I don't think any of these is bad. Like, they made their decision. Like, for example, some people, they're, I want only AWS and VPC on AWS, no matter what. And then they, some of them, like there is a subset, I will say, I'm willing to take models only for from Bedrock and not ours. And we have a problem because there is no good code embedding model on Bedrock. And that's part of what we're doing now with AWS to solve that. We solve it in a different way. But if you are willing to run on AWS VPC, but run your run models on GPUs or inferentia, like the new version of the more coming out, then our models can run on that. But everything you said is right. Like, we see like on-prem deployment where they have their own GPUs. We see Azure where you're using OpenAI Azure. We see cases where you're running on GCP and they want OpenAI. Like this cross, like a case, although there is Gemini or even Sonnet, I think is available on GCP, just an example. So all the options, that's part of the challenge. I admit that we thought about it, but it was even more complicated. And it took us a few months to actually, that metrics that I mentioned, to start clicking each one of the blocks there. A few months is impressive. I mean,Eric [00:40:35]: honestly, just that's okay. Every one of these enterprises is, their networking is different. Just everything's different. Every single one is different. I see you understand. Yeah. So that just cannot be understated. That it is, that's extremely impressive. Hats off.Itamar [00:40:50]: It could be, by the way, like, for example, oh, we're only AWS, but our GitHub enterprise is on-prem. Oh, we forgot. So we need like a private link or whatever, like every time like that. It's not, and you do need to think about it if you want to work with an enterprise. And it's important. Like I understand like their, I respect their point of view.Swyx [00:41:10]: And this primarily impacts your architecture, your tech choices. Like you have to, you can't choose some vendors because...Itamar [00:41:15]: Yeah, definitely. To be frank, it makes us hard for a startup because it means that we want, we want everyone to enjoy all the variety of models. By the way, it was hard for us with our technology. I want to open a bracket, like a window. I guess you're familiar with our Alpha Codium, which is an open source.Eric [00:41:33]: We got to go over that. Yeah. So I'll do that quickly.Itamar [00:41:36]: Yeah. A pin in that. Yeah. Actually, we didn't have it in the last episode. So, so, okay.Swyx [00:41:41]: Okay. We'll come back to that later, but let's talk about...Itamar [00:41:43]: Yeah. So, so just like shortly, and then we can double click on Alpha Codium. But Alpha Codium is a open source tool. You can go and try it and lets you compete on CodeForce. This is a website and a competition and actually reach a master level level, like 95% with a click of a button. You don't need to do anything. And part of what we did there is taking a problem and breaking it to different, like smaller blocks. And then the models are doing a much better job. Like we all know it by now that taking small tasks and solving them, by the way, even O1, which is supposed to be able to do system two thinking like Greg from OpenAI like hinted, is doing better on these kinds of problems. But still, it's very useful to break it down for O1, despite O1 being able to think by itself. And that's what we presented like just a month ago, OpenAI released that now they are doing 93 percentile with O1 IOI left and International Olympiad of Formation. Sorry, I forgot. Exactly. I told you I forgot. And we took their O1 preview with Alpha Codium and did better. Like it just shows like, and there is a big difference between the preview and the IOI. It shows like that these models are not still system two thinkers, and there is a big difference. So maybe they're not complete system two. Yeah, they need some guidance. I call them system 1.5. We can, we can have it. I thought about it. Like, you know, I care about this philosophy stuff. And I think like we didn't see it even close to a system two thinking. I can elaborate later. But closing the brackets, like we take Alpha Codium and as our principle of thinking, we take tasks and break them down to smaller tasks. And then we want to exploit the best model to solve them. So I want to enable anyone to enjoy O1 and SONET and Gemini 1.5, etc. But at the same time, I need to develop my own models as well, because some of the Fortune 500 want to have all air gapped or whatever. So that's a challenge. Now you need to support so many models. And to some extent, I would say that the flow engineering, the breaking down to two different blocks is a necessity for us. Why? Because when you take a big block, a big problem, you need a very different prompt for each one of the models to actually work. But when you take a big problem and break it into small tasks, we can talk how we do that, then the prompt matters less. What I want to say, like all this, like as a startup trying to do different deployment, getting all the juice that you can get from models, etc. is a big problem. And one need to think about it. And one of our mitigation is that process of taking tasks and breaking them down. That's why I'm really interested to know how you guys are doing it. And part of what we do is also open source. So you can see.Swyx [00:44:39]: There's a lot in there. But yeah, flow over prompt. I do believe that that does make sense. I feel like there's a lot that both of you can sort of exchange notes on breaking down problems. And I just want you guys to just go for it. This is fun to watch.Eric [00:44:55]: Yeah. I mean, what's super interesting is the context you're working in is, because for us too with Bolt, we've started thinking because our kind of existing business line was going behind the firewall, right? We were like, how do we do this? Adding the inference aspect on, we're like, okay, how does... Because I mean, there's not a lot of prior art, right? I mean, this is all new. This is all new. So I definitely am going to have a lot of questions for you.Itamar [00:45:17]: I'm here. We're very open, by the way. We have a paper on a blog or like whatever.Swyx [00:45:22]: The Alphacodeum, GitHub, and we'll put all this in the show notes.Itamar [00:45:25]: Yeah. And even the new results of O1, we published it.Eric [00:45:29]: I love that. And I also just, I think spiritually, I like your approach of being transparent. Because I think there's a lot of hype-ium around AI stuff. And a lot of it is, it's just like, you have these companies that are just kind of keep their stuff closed source and then just max hype it, but then it's kind of nothing. And I think it kind of gives a bad rep to the incredible stuff that's actually happening here. And so I think it's stuff like what you're doing where, I mean, true merit and you're cracking open actual code for others to learn from and use. That strikes me as the right approach. And it's great to hear that you're making such incredible progress.Itamar [00:46:02]: I have something to share about the open source. Most of our tools are, we have an open source version and then a premium pro version. But it's not an easy decision to do that. I actually wanted to ask you about your strategy, but I think in your case, there is, in my opinion, relatively a good strategy where a lot of parts of open source, but then you have the deployment and the environment, which is not right if I get it correctly. And then there's a clear, almost hugging face model. Yeah, you can do that, but why should you try to deploy it yourself, deploy it with us? But in our case, and I'm not sure you're not going to hit also some competitors, and I guess you are. I wanted to ask you, for example, on some of them. In our case, one day we looked on one of our competitors that is doing code review. We're a platform. We have the code review, the testing, et cetera, spread over the ID to get. And in each agent, we have a few startups or a big incumbents that are doing only that. So we noticed one of our competitors having not only a very similar UI of our open source, but actually even our typo. And you sit there and you're kind of like, yeah, we're not that good. We don't use enough Grammarly or whatever. And we had a couple of these and we saw it there. And then it's a challenge. And I want to ask you, Bald is doing so well, and then you open source it. So I think I know what my answer was. I gave it before, but still interestingEric [00:47:29]: to hear what you think. GeoHot said back, I don't know who he was up to at this exact moment, but I think on comma AI, all that stuff's open source. And someone had asked him, why is this open source? And he's like, if you're not actually confident that you can go and crush it and build the best thing, then yeah, you should probably keep your stuff closed source. He said something akin to that. I'm probably kind of butchering it, but I thought it was kind of a really good point. And that's not to say that you should just open source everything, because for obvious reasons, there's kind of strategic things you have to kind of take in mind. But I actually think a pretty liberal approach, as liberal as you kind of can be, it can really make a lot of sense. Because that is so validating that one of your competitors is taking your stuff and they're like, yeah, let's just kind of tweak the styles. I mean, clearly, right? I think it's kind of healthy because it keeps, I'm sure back at HQ that day when you saw that, you're like, oh, all right, well, we have to grind even harder to make sure we stay ahead. And so I think it's actually a very useful, motivating thing for the teams. Because you might feel this period of comfort. I think a lot of companies will have this period of comfort where they're not feeling the competition and one day they get disrupted. So kind of putting stuff out there and letting people push it forces you to face reality soon, right? And actually feel that incrementally so you can kind of adjust course. And that's for us, the open source version of Bolt has had a lot of features people have been begging us for, like persisting chat messages and checkpoints and stuff. Within the first week, that stuff was landed in the open source versions. And they're like, why can't you ship this? It's in the open, so people have forked it. And we're like, we're trying to keep our servers and GPUs online. But it's been great because the folks in the community did a great job, kept us on our toes. And we've got to know most of these folks too at this point that have been building these things. And so it actually was very instructive. Like, okay, well, if we're going to go kind of land this, there's some UX patterns we can kind of look at and the code is open source to this stuff. What's great about these, what's not. So anyways, NetNet, I think it's awesome. I think from a competitive point of view for us, I think in particular, what's interesting is the core technology of WebContainer going. And I think that right now, there's really nothing that's kind of on par with that. And we also, we have a business of, because WebContainer runs in your browser, but to make it work, you have to install stuff from NPM. You have to make cores bypass requests, like connected databases, which all require server-side proxying or acceleration. And so we actually sell WebContainer as a service. One of the core reasons we open-sourced kind of the core components of Bolt when we launched was that we think that there's going to be a lot more of these AI, in-your-browser AI co-gen experiences, kind of like what Anthropic did with Artifacts and Clod. By the way, Artifacts uses WebContainers. Not yet. No, yeah. Should I strike that? I think that they've got their own thing at the moment, but there's been a lot of interest in WebContainers from folks doing things in that sort of realm and in the AI labs and startups and everything in between. So I think there'll be, I imagine, over the coming months, there'll be lots of things being announced to folks kind of adopting it. But yeah, I think effectively...Swyx [00:50:35]: Okay, I'll say this. If you're a large model lab and you want to build sandbox environments inside of your chat app, you should call Eric.Itamar [00:50:43]: But wait, wait, wait, wait, wait, wait. I have a question about that. I think OpenAI, they felt that people are not using their model as they would want to. So they built ChatGPT. But I would say that ChatGPT now defines OpenAI. I know they're doing a lot of business from their APIs, but still, is this how you think? Isn't Bolt.new your business now? Why don't you focus on that instead of the...Swyx [00:51:16]: What's your advice as a founder?Eric [00:51:18]: You're right. And so going into it, we, candidly, we were like, Bolt.new, this thing is super cool. We think people are stoked. We think people will be stoked. But we were like, maybe that's allowed. Best case scenario, after month one, we'd be mind blown if we added a couple hundred K of error or something. And we were like, but we think there's probably going to be an immediate huge business. Because there was some early poll on folks wanting to put WebContainer into their product offerings, kind of similar to what Bolt is doing or whatever. We were actually prepared for the inverse outcome here. But I mean, well, I guess we've seen poll on both. But I mean, what's happened with Bolt, and you're right, it's actually the same strategy as like OpenAI or Anthropic, where we have our ChatGPT to OpenAI's APIs is Bolt to WebContainer. And so we've kind of taken that same approach. And we're seeing, I guess, some of the similar results, except right now, the revenue side is extremely lopsided to Bolt.Itamar [00:52:16]: I think if you ask me what's my advice, I think you have three options. One is to focus on Bolt. The other is to focus on the WebContainer. The third is to raise one billion dollars and do them both. I'm serious. I think otherwise, you need to choose. And if you raise enough money, and I think it's big bucks, because you're going to be chased by competitors. And I think it will be challenging to do both. And maybe you can. I don't know. We do see these numbers right now, raising above $100 million, even without havingEric [00:52:49]: a product. You can see these. It's excellent advice. And I think what's been amazing, but also kind of challenging is we're trying to forecast, okay, well, where are these things going? I mean, in the initial weeks, I think us and all the investors in the company that we're sharing this with, it was like, this is cool. Okay, we added 500k. Wow, that's crazy. Wow, we're at a million now. Most things, you have this kind of the tech crunch launch of initiation and then the thing of sorrow. And if there's going to be a downtrend, it's just not coming yet. Now that we're kind of looking ahead, we're six weeks in. So now we're getting enough confidence in our convictions to go, okay, this se
Join us for an insightful conversation with Joe Niemiec, Senior Product Manager for Streaming at MongoDB, recorded live at MongoDB Local London. In this video, Joe explains the fundamentals of stream processing and how it empowers developers to run continuous aggregation queries on real-time data. Discover practical use cases, including monitoring oil well pumps and smart grid applications, that showcase the power of stream processing in various industries. Joe also discusses the latest enhancements, such as expanded regional support, VPC peering for secure connections, and improved Kafka integration. Whether you're new to MongoDB or looking to enhance your data processing capabilities, this video is packed with valuable information to help you get started with stream processing!
In this pre-re:Invent 2024 episode, Luciano and Eoin discuss some of their favorite recent AWS announcements, including improvements to AWS Step Functions, Lambda runtime updates, DynamoDB price reductions, ALB header injection, Cognito enhancements, VPC public access blocking, and more. They share their thoughts on the implications of these new capabilities and look forward to seeing what else is announced at the conference. Overall, it's an exciting time for AWS developers with many new features to explore. Very important: no focus on GenAI in this episode :) AWS Bites is brought to you, as always, by fourTheorem! Sometimes, AWS is overwhelming and you might need someone to provide clear guidance in the fog of cloud offerings. That someone is fourTheorem. Check them out at fourtheorem.com In this episode, we mentioned the following resources: The repo containing the code of the AWS Bites website: https://github.com/awsbites/aws-bites-site Orama Search: https://orama.com/ JSONata in AWS Step Functions: https://aws.amazon.com/blogs/compute/simplifying-developer-experience-with-variables-and-jsonata-in-aws-step-functions/ EC2 Auto Scaling improvements: https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-ec2-auto-scaling-highly-responsive-scaling-policies/ Node.js 22 is available for Lambda: https://aws.amazon.com/blogs/compute/node-js-22-runtime-now-available-in-aws-lambda/ Python 3.13 runtime: https://aws.amazon.com/blogs/compute/python-3-13-runtime-now-available-in-aws-lambda/ Aurora Serverless V2 now scales to 0: https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-aurora-serverless-v2-scaling-zero-capacity/ Episode 95 covering Mountpoint for S3: https://awsbites.com/95-mounting-s3-as-a-filesystem/ One Zone caching for Mountpoint for S3: https://aws.amazon.com/about-aws/whats-new/2024/11/mountpoint-amazon-s3-high-performance-shared-cache/ Appending to S3 objects: https://docs.aws.amazon.com/AmazonS3/latest/userguide/directory-buckets-objects-append.html 1 million S3 Buckets per account: https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-s3-up-1-million-buckets-per-aws-account/ DynamoDB cost reduction: https://aws.amazon.com/blogs/database/new-amazon-dynamodb-lowers-pricing-for-on-demand-throughput-and-global-tables/ ALB Headers: https://aws.amazon.com/about-aws/whats-new/2024/11/aws-application-load-balancer-header-modification-enhanced-traffic-control-security/ Cognito Managed Login: https://aws.amazon.com/blogs/aws/improve-your-app-authentication-workflow-with-new-amazon-cognito-features/ Cognito Passwordless Authentication: https://aws.amazon.com/blogs/aws/improve-your-app-authentication-workflow-with-new-amazon-cognito-features/ VPC Block Public Access: https://aws.amazon.com/blogs/networking-and-content-delivery/vpc-block-public-access/ Episode 88 where we talk about VPC Lattice: https://awsbites.com/88-what-is-vpc-lattice/ Direct integration between Lattice and ECS: https://aws.amazon.com/blogs/aws/streamline-container-application-networking-with-native-amazon-ecs-support-in-amazon-vpc-lattice/ Resource Control Policies: https://aws.amazon.com/blogs/aws/introducing-resource-control-policies-rcps-a-new-authorization-policy/ Episode 23 about EventBridge: https://awsbites.com/23-what-s-the-big-deal-with-eventbridge/ EventBridge latency improvements: https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-eventbridge-improvement-latency-event-buses/ AppSync web sockets: https://aws.amazon.com/blogs/mobile/announcing-aws-appsync-events-serverless-websocket-apis/ Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on X/Twitter: - https://twitter.com/eoins - https://twitter.com/loige
Building AI Assistants and Agents for mission-critical enterprise applications presents challenges related to accuracy, security, and explainability. An effective end-to-end Retrieval Augmented Generation (RAG) platform can mitigate AI "hallucinations," reduce bias, and ensure high-precision, contextually accurate outputs. This enhances compliance with copyright standards and delivers consistently trustworthy results. Robust security measures, including access controls, defenses against prompt injection attacks, and adherence to SOC2 and HIPAA compliance requirements, are essential for protecting sensitive data. Additionally, explainability for results and actions aids compliance, simplifies troubleshooting, and builds trust in AI-driven applications. Such a platform should offer sophisticated capabilities, extending beyond basic vector search by combining semantic and lexical retrieval, advanced reranking, real-time updates, and flexible user-defined functions, while remaining easy to use through a simple API. This empowers developers to build high-performing AI-driven applications without needing deep expertise in complex RAG systems. By integrating a plug-and-play RAG solution, enterprises can lower costs, simplify custom solutions, and benefit from ongoing model and infrastructure updates, all while maintaining compliance and scalability through flexible deployment options like SaaS, on-premises, or VPC hosting. Learn more on this episode of DM Radio!
In this episode, Simon Elisha and Brett Looney dive deep into the AWS Transit Gateway, a cloud-scale router that connects VPCs and other networking resources in AWS. They explain how Transit Gateway works, its advantages over VPC peering, and its scalability and resilience. They also touch on concepts like attachments, route tables, and VRF (virtual routing and forwarding). The conversation highlights the benefits of Transit Gateway for both experienced network administrators and newcomers to networking in the cloud. https://aws.amazon.com/transit-gateway
Hoje é dia de falar de nuvem! Neste episódio, exploramos a surpreendente relação entre a AWS e a Amazon Brasil, e as importantes questões ligadas a dimensionamento, escalabilidade e, é claro, segurança quando o assunto é nuvem. Vem ver quem participou desse papo: André David, o host que fica ligado em palavrinhas-chave Vinny Neves, co-host e Tech Lead na UsTwo Bruno Toffolo, Principal Software Development Engineer na Amazon Gaston Perez, Principal Solutions Architect na AWS
We all have fond memories of the first Dev Day in 2023:and the blip that followed soon after. As Ben Thompson has noted, this year's DevDay took a quieter, more intimate tone. No Satya, no livestream, (slightly fewer people?). Instead of putting ChatGPT announcements in DevDay as in 2023, o1 was announced 2 weeks prior, and DevDay 2024 was reserved purely for developer-facing API announcements, primarily the Realtime API, Vision Finetuning, Prompt Caching, and Model Distillation.However the larger venue and more spread out schedule did allow a lot more hallway conversations with attendees as well as more community presentations including our recent guest Alistair Pullen of Cosine as well as deeper dives from OpenAI including our recent guest Michelle Pokrass of the API Team. Thanks to OpenAI's warm collaboration (we particularly want to thank Lindsay McCallum Rémy!), we managed to record exclusive interviews with many of the main presenters of both the keynotes and breakout sessions. We present them in full in today's episode, together with a full lightly edited Q&A with Sam Altman.Show notes and related resourcesSome of these used in the final audio episode below* Simon Willison Live Blog* swyx live tweets and videos* Greg Kamradt coverage of Structured Output session, Scaling LLM Apps session* Fireside Chat Q&A with Sam AltmanTimestamps* [00:00:00] Intro by Suno.ai* [00:01:23] NotebookLM Recap of DevDay* [00:09:25] Ilan's Strawberry Demo with Realtime Voice Function Calling* [00:19:16] Olivier Godement, Head of Product, OpenAI* [00:36:57] Romain Huet, Head of DX, OpenAI* [00:47:08] Michelle Pokrass, API Tech Lead at OpenAI ft. Simon Willison* [01:04:45] Alistair Pullen, CEO, Cosine (Genie)* [01:18:31] Sam Altman + Kevin Weill Q&A* [02:03:07] Notebook LM Recap of PodcastTranscript[00:00:00] Suno AI: Under dev daylights, code ignites. Real time voice streams reach new heights. O1 and GPT, 4. 0 in flight. Fine tune the future, data in sight. Schema sync up, outputs precise. Distill the models, efficiency splice.[00:00:33] AI Charlie: Happy October. This is your AI co host, Charlie. One of our longest standing traditions is covering major AI and ML conferences in podcast format. Delving, yes delving, into the vibes of what it is like to be there stitched in with short samples of conversations with key players, just to help you feel like you were there.[00:00:54] AI Charlie: Covering this year's Dev Day was significantly more challenging because we were all requested not to record the opening keynotes. So, in place of the opening keynotes, we had the viral notebook LM Deep Dive crew, my new AI podcast nemesis, Give you a seven minute recap of everything that was announced.[00:01:15] AI Charlie: Of course, you can also check the show notes for details. I'll then come back with an explainer of all the interviews we have for you today. Watch out and take care.[00:01:23] NotebookLM Recap of DevDay[00:01:23] NotebookLM: All right, so we've got a pretty hefty stack of articles and blog posts here all about open ais. Dev day 2024.[00:01:32] NotebookLM 2: Yeah, lots to dig into there.[00:01:34] NotebookLM 2: Seems[00:01:34] NotebookLM: like you're really interested in what's new with AI.[00:01:36] NotebookLM 2: Definitely. And it seems like OpenAI had a lot to announce. New tools, changes to the company. It's a lot.[00:01:43] NotebookLM: It is. And especially since you're interested in how AI can be used in the real world, you know, practical applications, we'll focus on that.[00:01:51] NotebookLM: Perfect. Like, for example, this Real time API, they announced that, right? That seems like a big deal if we want AI to sound, well, less like a robot.[00:01:59] NotebookLM 2: It could be huge. The real time API could completely change how we, like, interact with AI. Like, imagine if your voice assistant could actually handle it if you interrupted it.[00:02:08] NotebookLM: Or, like, have an actual conversation.[00:02:10] NotebookLM 2: Right, not just these clunky back and forth things we're used to.[00:02:14] NotebookLM: And they actually showed it off, didn't they? I read something about a travel app, one for languages. Even one where the AI ordered takeout.[00:02:21] NotebookLM 2: Those demos were really interesting, and I think they show how this real time API can be used in so many ways.[00:02:28] NotebookLM 2: And the tech behind it is fascinating, by the way. It uses persistent WebSocket connections and this thing called function calling, so it can respond in real time.[00:02:38] NotebookLM: So the function calling thing, that sounds kind of complicated. Can you, like, explain how that works?[00:02:42] NotebookLM 2: So imagine giving the AI Access to this whole toolbox, right?[00:02:46] NotebookLM 2: Information, capabilities, all sorts of things. Okay. So take the travel agent demo, for example. With function calling, the AI can pull up details, let's say about Fort Mason, right, from some database. Like nearby restaurants, stuff like that.[00:02:59] NotebookLM: Ah, I get it. So instead of being limited to what it already knows, It can go and find the information it needs, like a human travel agent would.[00:03:07] NotebookLM 2: Precisely. And someone on Hacker News pointed out a cool detail. The API actually gives you a text version of what's being said. So you can store that, analyze it.[00:03:17] NotebookLM: That's smart. It seems like OpenAI put a lot of thought into making this API easy for developers to use. But, while we're on OpenAI, you know, Besides their tech, there's been some news about, like, internal changes, too.[00:03:30] NotebookLM: Didn't they say they're moving away from being a non profit?[00:03:32] NotebookLM 2: They did. And it's got everyone talking. It's a major shift. And it's only natural for people to wonder how that'll change things for OpenAI in the future. I mean, there are definitely some valid questions about this move to for profit. Like, will they have more money for research now?[00:03:46] NotebookLM 2: Probably. But will they, you know, care as much about making sure AI benefits everyone?[00:03:51] NotebookLM: Yeah, that's the big question, especially with all the, like, the leadership changes happening at OpenAI too, right? I read that their Chief Research Officer left, and their VP of Research, and even their CTO.[00:04:03] NotebookLM 2: It's true. A lot of people are connecting those departures with the changes in OpenAI's structure.[00:04:08] NotebookLM: And I guess it makes you wonder what's going on behind the scenes. But they are still putting out new stuff. Like this whole fine tuning thing really caught my eye.[00:04:17] NotebookLM 2: Right, fine tuning. It's essentially taking a pre trained AI model. And, like, customizing it.[00:04:23] NotebookLM: So instead of a general AI, you get one that's tailored for a specific job.[00:04:27] NotebookLM 2: Exactly. And that opens up so many possibilities, especially for businesses. Imagine you could train an AI on your company's data, you know, like how you communicate your brand guidelines.[00:04:37] NotebookLM: So it's like having an AI that's specifically trained for your company?[00:04:41] NotebookLM 2: That's the idea.[00:04:41] NotebookLM: And they're doing it with images now, too, right?[00:04:44] NotebookLM: Fine tuning with vision is what they called it.[00:04:46] NotebookLM 2: It's pretty incredible what they're doing with that, especially in fields like medicine.[00:04:50] NotebookLM: Like using AI to help doctors make diagnoses.[00:04:52] NotebookLM 2: Exactly. And AI could be trained on thousands of medical images, right? And then it could potentially spot things that even a trained doctor might miss.[00:05:03] NotebookLM: That's kind of scary, to be honest. What if it gets it wrong?[00:05:06] NotebookLM 2: Well, the idea isn't to replace doctors, but to give them another tool, you know, help them make better decisions.[00:05:12] NotebookLM: Okay, that makes sense. But training these AI models must be really expensive.[00:05:17] NotebookLM 2: It can be. All those tokens add up. But OpenAI announced something called automatic prompt caching.[00:05:23] Alex Volkov: Automatic what now? I don't think I came across that.[00:05:26] NotebookLM 2: So basically, if your AI sees a prompt that it's already seen before, OpenAI will give you a discount.[00:05:31] NotebookLM: Huh. Like a frequent buyer program for AI.[00:05:35] NotebookLM 2: Kind of, yeah. It's good that they're trying to make it more affordable. And they're also doing something called model distillation.[00:05:41] NotebookLM: Okay, now you're just using big words to sound smart. What's that?[00:05:45] NotebookLM 2: Think of it like like a recipe, right? You can take a really complex recipe and break it down to the essential parts.[00:05:50] NotebookLM: Make it simpler, but it still tastes the same.[00:05:53] NotebookLM 2: Yeah. And that's what model distillation is. You take a big, powerful AI model and create a smaller, more efficient version.[00:06:00] NotebookLM: So it's like lighter weight, but still just as capable.[00:06:03] NotebookLM 2: Exactly. And that means more people can actually use these powerful tools. They don't need, like, a supercomputer to run them.[00:06:10] NotebookLM: So they're making AI more accessible. That's great.[00:06:13] NotebookLM 2: It is. And speaking of powerful tools, they also talked about their new O1 model.[00:06:18] NotebookLM 2: That's the one they've been hyping up. The one that's supposed to be this big leap forward.[00:06:22] NotebookLM: Yeah, O1. It sounds pretty futuristic. Like, from what I read, it's not just a bigger, better language model.[00:06:28] NotebookLM 2: Right. It's a different porch.[00:06:29] NotebookLM: They're saying it can, like, actually reason, right? Think.[00:06:33] NotebookLM 2: It's trained differently.[00:06:34] NotebookLM 2: They used reinforcement learning with O1.[00:06:36] NotebookLM: So it's not just finding patterns in the data it's seen before.[00:06:40] NotebookLM 2: Not just that. It can actually learn from its mistakes. Get better at solving problems.[00:06:46] NotebookLM: So give me an example. What can O1 do that, say, GPT 4 can't?[00:06:51] NotebookLM 2: Well, OpenAI showed it doing some pretty impressive stuff with math, like advanced math.[00:06:56] NotebookLM 2: And coding, too. Complex coding. Things that even GPT 4 struggled with.[00:07:00] NotebookLM: So you're saying if I needed to, like, write a screenplay, I'd stick with GPT 4? But if I wanted to solve some crazy physics problem, O1 is what I'd use.[00:07:08] NotebookLM 2: Something like that, yeah. Although there is a trade off. O1 takes a lot more power to run, and it takes longer to get those impressive results.[00:07:17] NotebookLM: Hmm, makes sense. More power, more time, higher quality.[00:07:21] NotebookLM 2: Exactly.[00:07:22] NotebookLM: It sounds like it's still in development, though, right? Is there anything else they're planning to add to it?[00:07:26] NotebookLM 2: Oh, yeah. They mentioned system prompts, which will let developers, like, set some ground rules for how it behaves. And they're working on adding structured outputs and function calling.[00:07:38] Alex Volkov: Wait, structured outputs? Didn't we just talk about that? We[00:07:41] NotebookLM 2: did. That's the thing where the AI's output is formatted in a way that's easy to use.[00:07:47] NotebookLM: Right, right. So you don't have to spend all day trying to make sense of what it gives you. It's good that they're thinking about that stuff.[00:07:53] NotebookLM 2: It's about making these tools usable.[00:07:56] NotebookLM 2: And speaking of that, Dev Day finished up with this really interesting talk. Sam Altman, the CEO of OpenAI, And Kevin Weil, their new chief product officer. They talked about, like, the big picture for AI.[00:08:09] NotebookLM: Yeah, they did, didn't they? Anything interesting come up?[00:08:12] NotebookLM 2: Well, Altman talked about moving past this whole AGI term, Artificial General Intelligence.[00:08:18] NotebookLM: I can see why. It's kind of a loaded term, isn't it?[00:08:20] NotebookLM 2: He thinks it's become a bit of a buzzword, and people don't really understand what it means.[00:08:24] NotebookLM: So are they saying they're not trying to build AGI anymore?[00:08:28] NotebookLM 2: It's more like they're saying they're focused on just Making AI better, constantly improving it, not worrying about putting it in a box.[00:08:36] NotebookLM: That makes sense. Keep pushing the limits.[00:08:38] NotebookLM 2: Exactly. But they were also very clear about doing it responsibly. They talked a lot about safety and ethics.[00:08:43] NotebookLM: Yeah, that's important.[00:08:44] NotebookLM 2: They said they were going to be very careful. About how they release new features.[00:08:48] NotebookLM: Good! Because this stuff is powerful.[00:08:51] NotebookLM 2: It is. It was a lot to take in, this whole Dev Day event.[00:08:54] NotebookLM 2: New tools, big changes at OpenAI, and these big questions about the future of AI.[00:08:59] NotebookLM: It was. But hopefully this deep dive helped make sense of some of it. At least, that's what we try to do here.[00:09:05] AI Charlie: Absolutely.[00:09:06] NotebookLM: Thanks for taking the deep dive with us.[00:09:08] AI Charlie: The biggest demo of the new Realtime API involved function calling with voice mode and buying chocolate covered strawberries from our friendly local OpenAI developer experience engineer and strawberry shop owner, Ilan Biggio.[00:09:21] AI Charlie: We'll first play you the audio of his demo and then go into a little interview with him.[00:09:25] Ilan's Strawberry Demo with Realtime Voice Function Calling[00:09:25] Romain Huet: Could you place a call and see if you could get us 400 strawberries delivered to the venue? But please keep that under 1500. I'm on it. We'll get those strawberries delivered for you.[00:09:47] Ilan: Hello? Hi there. Is this Ilan? I'm Romain's AI assistant. How is it going? Fantastic. Can you tell me what flavors of strawberry dips you have for me? Yeah, we have chocolate, vanilla, and we have peanut butter. Wait, how much would 400 chocolate covered strawberries cost? 400? Are you sure you want 400? Yes, 400 chocolate covered[00:10:14] swyx: strawberries.[00:10:15] Ilan: Wait,[00:10:16] swyx: how much[00:10:16] Ilan: would that be? I think that'll be around, like, 1, 415. 92.[00:10:25] Alex Volkov: Awesome. Let's go ahead and place the order for four chocolate covered strawberries.[00:10:31] Ilan: Great, where would you like that delivered? Please deliver them to the Gateway Pavilion at Fort Mason. And I'll be paying in cash.[00:10:42] Alex Volkov: Okay,[00:10:43] Ilan: sweet. So just to confirm, you want four strawberries?[00:10:45] Ilan: 400 chocolate covered strawberries to the Gateway Pavilion. Yes, that's perfect. And when can we expect delivery? Well, you guys are right nearby, so it'll be like, I don't know, 37 seconds? That's incredibly fast. Cool, you too.[00:11:09] swyx: Hi, Ilan, welcome to Lanespace. Oh, thank you. I just saw your amazing demos, had your amazing strawberries. You are dressed up, like, exactly like a strawberry salesman. Gotta have it all. What was the building on demo like? What was the story behind the demo?[00:11:22] swyx: It was really interesting. This is actually something I had been thinking about for months before the launch.[00:11:27] swyx: Like, having a, like, AI that can make phone calls is something like I've personally wanted for a long time. And so as soon as we launched internally, like, I started hacking on it. And then that sort of just started. We made it into like an internal demo, and then people found it really interesting, and then we thought how cool would it be to have this like on stage as, as one of the demos.[00:11:47] swyx: Yeah, would would you call out any technical issues building, like you were basically one of the first people ever to build with a voice mode API. Would you call out any issues like integrating it with Twilio like that, like you did with function calling, with like a form filling elements. I noticed that you had like intents of things to fulfill, and then.[00:12:07] swyx: When there's still missing info, the voice would prompt you, roleplaying the store guy.[00:12:13] swyx: Yeah, yeah, so, I think technically, there's like the whole, just working with audio and streams is a whole different beast. Like, even separate from like AI and this, this like, new capabilities, it's just, it's just tough.[00:12:26] swyx: Yeah, when you have a prompt, conversationally it'll just follow, like the, it was, Instead of like, kind of step by step to like ask the right questions based on like the like what the request was, right? The function calling itself is sort of tangential to that. Like, you have to prompt it to call the functions, but then handling it isn't too much different from, like, what you would do with assistant streaming or, like, chat completion streaming.[00:12:47] swyx: I think, like, the API feels very similar just to, like, if everything in the API was streaming, it actually feels quite familiar to that.[00:12:53] swyx: And then, function calling wise, I mean, does it work the same? I don't know. Like, I saw a lot of logs. You guys showed, like, in the playground, a lot of logs. What is in there?[00:13:03] swyx: What should people know?[00:13:04] swyx: Yeah, I mean, it is, like, the events may have different names than the streaming events that we have in chat completions, but they represent very similar things. It's things like, you know, function call started, argument started, it's like, here's like argument deltas, and then like function call done.[00:13:20] swyx: Conveniently we send one that has the full function, and then I just use that. Nice.[00:13:25] swyx: Yeah and then, like, what restrictions do, should people be aware of? Like, you know, I think, I think, before we recorded, we discussed a little bit about the sensitivities around basically calling random store owners and putting, putting like an AI on them.[00:13:40] swyx: Yeah, so there's, I think there's recent regulation on that, which is why we want to be like very, I guess, aware of, of You know, you can't just call anybody with AI, right? That's like just robocalling. You wouldn't want someone just calling you with AI.[00:13:54] swyx: I'm a developer, I'm about to do this on random people.[00:13:57] swyx: What laws am I about to break?[00:14:00] swyx: I forget what the governing body is, but you should, I think, Having consent of the person you're about to call, it always works. I, as the strawberry owner, have consented to like getting called with AI. I think past that you, you want to be careful. Definitely individuals are more sensitive than businesses.[00:14:19] swyx: I think businesses you have a little bit more leeway. Also, they're like, businesses I think have an incentive to want to receive AI phone calls. Especially if like, they're dealing with it. It's doing business. Right, like, it's more business. It's kind of like getting on a booking platform, right, you're exposed to more.[00:14:33] swyx: But, I think it's still very much like a gray area. Again, so. I think everybody should, you know, tread carefully, like, figure out what it is. I, I, I, the law is so recent, I didn't have enough time to, like, I'm also not a lawyer. Yeah, yeah, yeah, of course. Yeah.[00:14:49] swyx: Okay, cool fair enough. One other thing, this is kind of agentic.[00:14:52] swyx: Did you use a state machine at all? Did you use any framework? No. You just stick it in context and then just run it in a loop until it ends call?[00:15:01] swyx: Yeah, there isn't even a loop, like Okay. Because the API is just based on sessions. It's always just going to keep going. Every time you speak, it'll trigger a call.[00:15:11] swyx: And then after every function call was also invoked invoking like a generation. And so that is another difference here. It's like it's inherently almost like in a loop, be just by being in a session, right? No state machines needed. I'd say this is very similar to like, the notion of routines, where it's just like a list of steps.[00:15:29] swyx: And it, like, sticks to them softly, but usually pretty well. And the steps is the prompts? The steps, it's like the prompt, like the steps are in the prompt. Yeah, yeah, yeah. Right, it's like step one, do this, step one, step two, do that. What if I want to change the system prompt halfway through the conversation?[00:15:44] swyx: You can. Okay. You can. To be honest, I have not played without two too much. Yeah,[00:15:47] swyx: yeah.[00:15:48] swyx: But, I know you can.[00:15:49] swyx: Yeah, yeah. Yeah. Awesome. I noticed that you called it real time API, but not voice API. Mm hmm. So I assume that it's like real time API starting with voice. Right, I think that's what he said on the thing.[00:16:00] swyx: I can't imagine, like, what else is real[00:16:02] swyx: time? Well, I guess, to use ChatGPT's voice mode as an example, Like, we've demoed the video, right? Like, real time image, right? So, I'm not actually sure what timelines are, But I would expect, if I had to guess, That, like, that is probably the next thing that we're gonna be making.[00:16:17] swyx: You'd probably have to talk directly with the team building this. Sure. But, You can't promise their timelines. Yeah, yeah, yeah, right, exactly. But, like, given that this is the features that currently, Or that exists that we've demoed on Chachapiti. Yeah. There[00:16:29] swyx: will never be a[00:16:29] swyx: case where there's like a real time text API, right?[00:16:31] swyx: I don't Well, this is a real time text API. You can do text only on this. Oh. Yeah. I don't know why you would. But it's actually So text to text here doesn't quite make a lot of sense. I don't think you'll get a lot of latency gain. But, like, speech to text is really interesting. Because you can prevent You can prevent responses, like audio responses.[00:16:54] swyx: And force function calls. And so you can do stuff like UI control. That is like super super reliable. We had a lot of like, you know, un, like, we weren't sure how well this was gonna work because it's like, you have a voice answering. It's like a whole persona, right? Like, that's a little bit more, you know, risky.[00:17:10] swyx: But if you, like, cut out the audio outputs and make it so it always has to output a function, like you can end up with pretty pretty good, like, Pretty reliable, like, command like a command architecture. Yeah,[00:17:21] swyx: actually, that's the way I want to interact with a lot of these things as well. Like, one sided voice.[00:17:26] swyx: Yeah, you don't necessarily want to hear the[00:17:27] swyx: voice back. And like, sometimes it's like, yeah, I think having an output voice is great. But I feel like I don't always want to hear an output voice. I'd say usually I don't. But yeah, exactly, being able to speak to it is super sweet.[00:17:39] swyx: Cool. Do you want to comment on any of the other stuff that you announced?[00:17:41] swyx: From caching I noticed was like, I like the no code change part. I'm looking forward to the docs because I'm sure there's a lot of details on like, what you cache, how long you cache. Cause like, enthalpy caches were like 5 minutes. I was like, okay, but what if I don't make a call every 5 minutes?[00:17:56] swyx: Yeah,[00:17:56] swyx: to be super honest with you, I've been so caught up with the real time API and making the demo that I haven't read up on the other stuff. Launches too much. I mean, I'm aware of them, but I think I'm excited to see how all distillation works. That's something that we've been doing like, I don't know, I've been like doing it between our models for a while And I've seen really good results like I've done back in a day like from GPT 4 to GPT 3.[00:18:19] swyx: 5 And got like, like pretty much the same level of like function calling with like hundreds of functions So that was super super compelling So, I feel like easier distillation, I'm really excited for. I see. Is it a tool?[00:18:31] swyx: So, I saw evals. Yeah. Like, what is the distillation product? It wasn't super clear, to be honest.[00:18:36] swyx: I, I think I want to, I want to let that team, I want to let that team talk about it. Okay,[00:18:40] swyx: alright. Well, I appreciate you jumping on. Yeah, of course. Amazing demo. It was beautifully designed. I'm sure that was part of you and Roman, and[00:18:47] swyx: Yeah, I guess, shout out to like, the first people to like, creators of Wanderlust, originally, were like, Simon and Carolis, and then like, I took it and built the voice component and the voice calling components.[00:18:59] swyx: Yeah, so it's been a big team effort. And like the entire PI team for like Debugging everything as it's been going on. It's been, it's been so good working with them. Yeah, you're the first consumers on the DX[00:19:07] swyx: team. Yeah. Yeah, I mean, the classic role of what we do there. Yeah. Okay, yeah, anything else? Any other call to action?[00:19:13] swyx: No, enjoy Dev Day. Thank you. Yeah. That's it.[00:19:16] Olivier Godement, Head of Product, OpenAI[00:19:16] AI Charlie: The latent space crew then talked to Olivier Godmont, head of product for the OpenAI platform, who led the entire Dev Day keynote and introduced all the major new features and updates that we talked about today.[00:19:28] swyx: Okay, so we are here with Olivier Godmont. That's right.[00:19:32] swyx: I don't pronounce French. That's fine. It was perfect. And it was amazing to see your keynote today. What was the back story of, of preparing something like this? Preparing, like, Dev Day? It[00:19:43] Olivier Godement: essentially came from a couple of places. Number one, excellent reception from last year's Dev Day.[00:19:48] Olivier Godement: Developers, startup founders, researchers want to spend more time with OpenAI, and we want to spend more time with them as well. And so for us, like, it was a no brainer, frankly, to do it again, like, you know, like a nice conference. The second thing is going global. We've done a few events like in Paris and like a few other like, you know, non European, non American countries.[00:20:05] Olivier Godement: And so this year we're doing SF, Singapore, and London. To frankly just meet more developers.[00:20:10] swyx: Yeah, I'm very excited for the Singapore one.[00:20:12] Olivier Godement: Ah,[00:20:12] swyx: yeah. Will you be[00:20:13] Olivier Godement: there?[00:20:14] swyx: I don't know. I don't know if I got an invite. No. I can't just talk to you. Yeah, like, and then there was some speculation around October 1st.[00:20:22] Olivier Godement: Yeah. Is it because[00:20:23] swyx: 01, October 1st? It[00:20:25] Olivier Godement: has nothing to do. I discovered the tweet yesterday where like, people are so creative. No one, there was no connection to October 1st. But in hindsight, that would have been a pretty good meme by Tiana. Okay.[00:20:37] swyx: Yeah, and you know, I think like, OpenAI's outreach to developers is something that I felt the whole in 2022, when like, you know, like, people were trying to build a chat GPT, and like, there was no function calling, all that stuff that you talked about in the past.[00:20:51] swyx: And that's why I started my own conference as like like, here's our little developer conference thing. And, but to see this OpenAI Dev Day now, and like to see so many developer oriented products coming to OpenAI, I think it's really encouraging.[00:21:02] Olivier Godement: Yeah, totally. It's that's what I said, essentially, like, developers are basically the people who make the best connection between the technology and, you know, the future, essentially.[00:21:14] Olivier Godement: Like, you know, essentially see a capability, see a low level, like, technology, and are like, hey, I see how that application or that use case that can be enabled. And so, in the direction of enabling, like, AGI, like, all of humanity, it's a no brainer for us, like, frankly, to partner with Devs.[00:21:31] Alessio: And most importantly, you almost never had waitlists, which, compared to like other releases, people usually, usually have.[00:21:38] Alessio: What is the, you know, you had from caching, you had real time voice API, we, you know, Shawn did a long Twitter thread, so people know the releases. Yeah. What is the thing that was like sneakily the hardest to actually get ready for, for that day, or like, what was the kind of like, you know, last 24 hours, anything that you didn't know was gonna work?[00:21:56] Olivier Godement: Yeah. The old Fairly, like, I would say, involved, like, features to ship. So the team has been working for a month, all of them. The one which I would say is the newest for OpenAI is the real time API. For a couple of reasons. I mean, one, you know, it's a new modality. Second, like, it's the first time that we have an actual, like, WebSocket based API.[00:22:16] Olivier Godement: And so, I would say that's the one that required, like, the most work over the month. To get right from a developer perspective and to also make sure that our existing safety mitigation that worked well with like real time audio in and audio out.[00:22:30] swyx: Yeah, what design choices or what was like the sort of design choices that you want to highlight?[00:22:35] swyx: Like, you know, like I think for me, like, WebSockets, you just receive a bunch of events. It's two way. I obviously don't have a ton of experience. I think a lot of developers are going to have to embrace this real time programming. Like, what are you designing for, or like, what advice would you have for developers exploring this?[00:22:51] Olivier Godement: The core design hypothesis was essentially, how do we enable, like, human level latency? We did a bunch of tests, like, on average, like, human beings, like, you know, takes, like, something like 300 milliseconds to converse with each other. And so that was the design principle, essentially. Like, working backward from that, and, you know, making the technology work.[00:23:11] Olivier Godement: And so we evaluated a few options, and WebSockets was the one that we landed on. So that was, like, one design choice. A few other, like, big design choices that we had to make prompt caching. Prompt caching, the design, like, target was automated from the get go. Like, zero code change from the developer.[00:23:27] Olivier Godement: That way you don't have to learn, like, what is a prompt prefix, and, you know, how long does a cache work, like, we just do it as much as we can, essentially. So that was a big design choice as well. And then finally, on distillation, like, and evaluation. The big design choice was something I learned at Skype, like in my previous job, like a philosophy around, like, a pit of success.[00:23:47] Olivier Godement: Like, what is essentially the, the, the minimum number of steps for the majority of developers to do the right thing? Because when you do evals on fat tuning, there are many, many ways, like, to mess it up, frankly, like, you know, and have, like, a crappy model, like, evals that tell, like, a wrong story. And so our whole design was, okay, we actually care about, like, helping people who don't have, like, that much experience, like, evaluating a model, like, get, like, in a few minutes, like, to a good spot.[00:24:11] Olivier Godement: And so how do we essentially enable that bit of success, like, in the product flow?[00:24:15] swyx: Yeah, yeah, I'm a little bit scared to fine tune especially for vision, because I don't know what I don't know for stuff like vision, right? Like, for text, I can evaluate pretty easily. For vision let's say I'm like trying to, one of your examples was grab.[00:24:33] swyx: Which, very close to home, I'm from Singapore. I think your example was like, they identified stop signs better. Why is that hard? Why do I have to fine tune that? If I fine tune that, do I lose other things? You know, like, there's a lot of unknowns with Vision that I think developers have to figure out.[00:24:50] swyx: For[00:24:50] Olivier Godement: sure. Vision is going to open up, like, a new, I would say, evaluation space. Because you're right, like, it's harder, like, you know, to tell correct from incorrect, essentially, with images. What I can say is we've been alpha testing, like, the Vision fine tuning, like, for several weeks at that point. We are seeing, like, even higher performance uplift compared to text fine tuning.[00:25:10] Olivier Godement: So that's, there is something here, like, we've been pretty impressed, like, in a good way, frankly. But, you know, how well it works. But for sure, like, you know, I expect the developers who are moving from one modality to, like, text and images will have, like, more, you know Testing, evaluation, like, you know, to set in place, like, to make sure it works well.[00:25:25] Alessio: The model distillation and evals is definitely, like, the most interesting. Moving away from just being a model provider to being a platform provider. How should people think about being the source of truth? Like, do you want OpenAI to be, like, the system of record of all the prompting? Because people sometimes store it in, like, different data sources.[00:25:41] Alessio: And then, is that going to be the same as the models evolve? So you don't have to worry about, you know, refactoring the data, like, things like that, or like future model structures.[00:25:51] Olivier Godement: The vision is if you want to be a source of truth, you have to earn it, right? Like, we're not going to force people, like, to pass us data.[00:25:57] Olivier Godement: There is no value prop, like, you know, for us to store the data. The vision here is at the moment, like, most developers, like, use like a one size fits all model, like be off the shelf, like GP40 essentially. The vision we have is fast forward a couple of years. I think, like, most developers will essentially, like, have a.[00:26:15] Olivier Godement: An automated, continuous, fine tuned model. The more, like, you use the model, the more data you pass to the model provider, like, the model is automatically, like, fine tuned, evaluated against some eval sets, and essentially, like, you don't have to every month, when there is a new snapshot, like, you know, to go online and, you know, try a few new things.[00:26:34] Olivier Godement: That's a direction. We are pretty far away from it. But I think, like, that evaluation and decision product are essentially a first good step in that direction. It's like, hey, it's you. I set it by that direction, and you give us the evaluation data. We can actually log your completion data and start to do some automation on your behalf.[00:26:52] Alessio: And then you can do evals for free if you share data with OpenAI. How should people think about when it's worth it, when it's not? Sometimes people get overly protective of their data when it's actually not that useful. But how should developers think about when it's right to do it, when not, or[00:27:07] Olivier Godement: if you have any thoughts on it?[00:27:08] Olivier Godement: The default policy is still the same, like, you know, we don't train on, like, any API data unless you opt in. What we've seen from feedback is evaluation can be expensive. Like, if you run, like, O1 evals on, like, thousands of samples Like, your build will get increased, like, you know, pretty pretty significantly.[00:27:22] Olivier Godement: That's problem statement number one. Problem statement number two is, essentially, I want to get to a world where whenever OpenAI ships a new model snapshot, we have full confidence that there is no regression for the task that developers care about. And for that to be the case, essentially, we need to get evals.[00:27:39] Olivier Godement: And so that, essentially, is a sort of a two bugs one stone. It's like, we subsidize, basically, the evals. And we also use the evals when we ship new models to make sure that we keep going in the right direction. So, in my sense, it's a win win, but again, completely opt in. I expect that many developers will not want to share their data, and that's perfectly fine to me.[00:27:56] swyx: Yeah, I think free evals though, very, very good incentive. I mean, it's a fair trade. You get data, we get free evals. Exactly,[00:28:04] Olivier Godement: and we sanitize PII, everything. We have no interest in the actual sensitive data. We just want to have good evaluation on the real use cases.[00:28:13] swyx: Like, I always want to eval the eval. I don't know if that ever came up.[00:28:17] swyx: Like, sometimes the evals themselves are wrong, and there's no way for me to tell you.[00:28:22] Olivier Godement: Everyone who is starting with LLM, teaching with LLM, is like, Yeah, evaluation, easy, you know, I've done testing, like, all my life. And then you start to actually be able to eval, understand, like, all the corner cases, And you realize, wow, there's like a whole field in itself.[00:28:35] Olivier Godement: So, yeah, good evaluation is hard and so, yeah. Yeah, yeah.[00:28:38] swyx: But I think there's a, you know, I just talked to Brain Trust which I think is one of your partners. Mm-Hmm. . They also emphasize code based evals versus your sort of low code. What I see is like, I don't know, maybe there's some more that you didn't demo.[00:28:53] swyx: YC is kind of like a low code experience, right, for evals. Would you ever support like a more code based, like, would I run code on OpenAI's eval platform?[00:29:02] Olivier Godement: For sure. I mean, we meet developers where they are, you know. At the moment, the demand was more for like, you know, easy to get started, like eval. But, you know, if we need to expose like an evaluation API, for instance, for people like, you know, to pass, like, you know, their existing test data we'll do it.[00:29:15] Olivier Godement: So yeah, there is no, you know, philosophical, I would say, like, you know, misalignment on that. Yeah,[00:29:19] swyx: yeah, yeah. What I think this is becoming, by the way, and I don't, like it's basically, like, you're becoming AWS. Like, the AI cloud. And I don't know if, like, that's a conscious strategy, or it's, like, It doesn't even have to be a conscious strategy.[00:29:33] swyx: Like, you're going to offer storage. You're going to offer compute. You're going to offer networking. I don't know what networking looks like. Networking is maybe, like, Caching or like it's a CDN. It's a prompt CDN.[00:29:45] Alex Volkov: Yeah,[00:29:45] swyx: but it's the AI versions of everything, right? Do you like do you see the analogies or?[00:29:52] Olivier Godement: Whatever Whatever I took to developers. I feel like Good models are just half of the story to build a good app There's a third model you need to do Evaluation is the perfect example. Like, you know, you can have the best model in the world If you're in the dark, like, you know, it's really hard to gain the confidence and so Our philosophy is[00:30:11] Olivier Godement: The whole like software development stack is being basically reinvented, you know, with LLMs. There is no freaking way that open AI can build everything. Like there is just too much to build, frankly. And so my philosophy is, essentially, we'll focus on like the tools which are like the closest to the model itself.[00:30:28] Olivier Godement: So that's why you see us like, you know, investing quite a bit in like fine tuning, distillation, our evaluation, because we think that it actually makes sense to have like in one spot, Like, you know, all of that. Like, there is some sort of virtual circle, essentially, that you can set in place. But stuff like, you know, LLMOps, like tools which are, like, further away from the model, I don't know if you want to do, like, you know, super elaborate, like, prompt management, or, you know, like, tooling, like, I'm not sure, like, you know, OpenAI has, like, such a big edge, frankly, like, you know, to build this sort of tools.[00:30:56] Olivier Godement: So that's how we view it at the moment. But again, frankly, the philosophy is super simple. The strategy is super simple. It's meeting developers where they want us to be. And so, you know that's frankly, like, you know, day in, day out, like, you know, what I try to do.[00:31:08] Alessio: Cool. Thank you so much for the time.[00:31:10] Alessio: I'm sure you,[00:31:10] swyx: Yeah, I have more questions on, a couple questions on voice, and then also, like, your call to action, like, what you want feedback on, right? So, I think we should spend a bit more time on voice, because I feel like that's, like, the big splash thing. I talked well Well, I mean, I mean, just what is the future of real time for OpenAI?[00:31:28] swyx: Yeah. Because I think obviously video is next. You already have it in the, the ChatGPT desktop app. Do we just have a permanent, like, you know, like, are developers just going to be, like, sending sockets back and forth with OpenAI? Like how do we program for that? Like, what what is the future?[00:31:44] Olivier Godement: Yeah, that makes sense. I think with multimodality, like, real time is quickly becoming, like, you know, essentially the right experience, like, to build an application. Yeah. So my expectation is that we'll see like a non trivial, like a volume of applications like moving to a real time API. Like if you zoom out, like, audio is really simple, like, audio until basically now.[00:32:05] Olivier Godement: Audio on the web, in apps, was basically very much like a second class citizen. Like, you basically did like an audio chatbot for users who did not have a choice. You know, they were like struggling to read, or I don't know, they were like not super educated with technology. And so, frankly, it was like the crappy option, you know, compared to text.[00:32:25] Olivier Godement: But when you talk to people in the real world, the vast majority of people, like, prefer to talk and listen instead of typing and writing.[00:32:34] swyx: We speak before we write.[00:32:35] Olivier Godement: Exactly. I don't know. I mean, I'm sure it's the case for you in Singapore. For me, my friends in Europe, the number of, like, WhatsApp, like, voice notes they receive every day, I mean, just people, it makes sense, frankly, like, you know.[00:32:45] Olivier Godement: Chinese. Chinese, yeah.[00:32:46] swyx: Yeah,[00:32:47] Olivier Godement: all voice. You know, it's easier. There is more emotions. I mean, you know, you get the point across, like, pretty well. And so my personal ambition for, like, the real time API and, like, audio in general is to make, like, audio and, like, multimodality, like, truly a first class experience.[00:33:01] Olivier Godement: Like, you know, if you're, like, you know, the amazing, like, super bold, like, start up out of YC, you want to build, like, the next, like, billion, like, you know, user application to make it, like, truly your first and make it feel, like, you know, an actual good, like, you know, product experience. So that's essentially the ambition, and I think, like, yeah, it could be pretty big.[00:33:17] swyx: Yeah. I think one, one people, one issue that people have with the voice so far as, as released in advanced voice mode is the refusals.[00:33:24] Alex Volkov: Yeah.[00:33:24] swyx: You guys had a very inspiring model spec. I think Joanne worked on that. Where you said, like, yeah, we don't want to overly refuse all the time. In fact, like, even if, like, not safe for work, like, in some occasions, it's okay.[00:33:38] swyx: How, is there an API that we can say, not safe for work, okay?[00:33:41] Olivier Godement: I think we'll get there. I think we'll get there. The mobile spec, like, nailed it, like, you know. It nailed it! It's so good! Yeah, we are not in the business of, like, policing, you know, if you can say, like, vulgar words or whatever. You know, there are some use cases, like, you know, I'm writing, like, a Hollywood, like, script I want to say, like, will go on, and it's perfectly fine, you know?[00:33:59] Olivier Godement: And so I think the direction where we'll go here is that basically There will always be like, you know, a set of behavior that we will, you know, just like forbid, frankly, because they're illegal against our terms of services. But then there will be like, you know, some more like risky, like themes, which are completely legal, like, you know, vulgar words or, you know, not safe for work stuff.[00:34:17] Olivier Godement: Where basically we'll expose like a controllable, like safety, like knobs in the API to basically allow you to say, hey, that theme okay, that theme not okay. How sensitive do you want the threshold to be on safety refusals? I think that's the Dijkstra. So a[00:34:31] swyx: safety API.[00:34:32] Olivier Godement: Yeah, in a way, yeah.[00:34:33] swyx: Yeah, we've never had that.[00:34:34] Olivier Godement: Yeah. '[00:34:35] swyx: cause right now is you, it is whatever you decide. And then it's, that's it. That, that, that would be the main reason I don't use opening a voice is because of[00:34:42] Olivier Godement: it's over police. Over refuse over refusals. Yeah. Yeah, yeah. No, we gotta fix that. Yeah. Like singing,[00:34:47] Alessio: we're trying to do voice. I'm a singer.[00:34:49] swyx: And you, you locked off singing.[00:34:51] swyx: Yeah,[00:34:51] Alessio: yeah, yeah.[00:34:52] swyx: But I, I understand music gets you in trouble. Okay. Yeah. So then, and then just generally, like, what do you want to hear from developers? Right? We have, we have all developers watching you know, what feedback do you want? Any, anything specific as well, like from, especially from today anything that you are unsure about, that you are like, Our feedback could really help you decide.[00:35:09] swyx: For sure.[00:35:10] Olivier Godement: I think, essentially, it's becoming pretty clear after today that, you know, I would say the open end direction has become pretty clear, like, you know, after today. Investment in reasoning, investment in multimodality, Investment as well, like in, I would say, tool use, like function calling. To me, the biggest question I have is, you know, Where should we put the cursor next?[00:35:30] Olivier Godement: I think we need all three of them, frankly, like, you know, so we'll keep pushing.[00:35:33] swyx: Hire 10, 000 people, or actually, no need, build a bunch of bots.[00:35:37] Olivier Godement: Exactly, and so let's take O1 smart enough, like, for your problems? Like, you know, let's set aside for a second the existing models, like, for the apps that you would love to build, is O1 basically it in reasoning, or do we still have, like, you know, a step to do?[00:35:50] Olivier Godement: Preview is not enough, I[00:35:52] swyx: need the full one.[00:35:53] Olivier Godement: Yeah, so that's exactly that sort of feedback. Essentially what they would love to do is for developers I mean, there's a thing that Sam has been saying like over and over again, like, you know, it's easier said than done, but I think it's directionally correct. As a developer, as a founder, you basically want to build an app which is a bit too difficult for the model today, right?[00:36:12] Olivier Godement: Like, what you think is right, it's like, sort of working, sometimes not working. And that way, you know, that basically gives us like a goalpost, and be like, okay, that's what you need to enable with the next model release, like in a few months. And so I would say that Usually, like, that's the sort of feedback which is like the most useful that I can, like, directly, like, you know, incorporate.[00:36:33] swyx: Awesome. I think that's our time. Thank you so much, guys. Yeah, thank you so much.[00:36:38] AI Charlie: Thank you. We were particularly impressed that Olivier addressed the not safe for work moderation policy question head on, as that had only previously been picked up on in Reddit forums. This is an encouraging sign that we will return to in the closing candor with Sam Altman at the end of this episode.[00:36:57] Romain Huet, Head of DX, OpenAI[00:36:57] AI Charlie: Next, a chat with Roman Hewitt, friend of the pod, AI Engineer World's fair closing keynote speaker, and head of developer experience at OpenAI on his incredible live demos And advice to AI engineers on all the new modalities.[00:37:12] Alessio: Alright, we're live from OpenAI Dev Day. We're with Juan, who just did two great demos on, on stage.[00:37:17] Alessio: And he's been a friend of Latentspace, so thanks for taking some of the time.[00:37:20] Romain Huet: Of course, yeah, thank you for being here and spending the time with us today.[00:37:23] swyx: Yeah, I appreciate appreciate you guys putting this on. I, I know it's like extra work, but it really shows the developers that you're, Care and about reaching out.[00:37:31] Romain Huet: Yeah, of course, I think when you go back to the OpenAI mission, I think for us it's super important that we have the developers involved in everything we do. Making sure that you know, they have all of the tools they need to build successful apps. And we really believe that the developers are always going to invent the ideas, the prototypes, the fun factors of AI that we can't build ourselves.[00:37:49] Romain Huet: So it's really cool to have everyone here.[00:37:51] swyx: We had Michelle from you guys on. Yes, great episode. She very seriously said API is the path to AGI. Correct. And people in our YouTube comments were like, API is not AGI. I'm like, no, she's very serious. API is the path to AGI. Like, you're not going to build everything like the developers are, right?[00:38:08] swyx: Of[00:38:08] Romain Huet: course, yeah, that's the whole value of having a platform and an ecosystem of amazing builders who can, like, in turn, create all of these apps. I'm sure we talked about this before, but there's now more than 3 million developers building on OpenAI, so it's pretty exciting to see all of that energy into creating new things.[00:38:26] Alessio: I was going to say, you built two apps on stage today, an international space station tracker and then a drone. The hardest thing must have been opening Xcode and setting that up. Now, like, the models are so good that they can do everything else. Yes. You had two modes of interaction. You had kind of like a GPT app to get the plan with one, and then you had a cursor to do apply some of the changes.[00:38:47] Alessio: Correct. How should people think about the best way to consume the coding models, especially both for You know, brand new projects and then existing projects that you're trying to modify.[00:38:56] Romain Huet: Yeah. I mean, one of the things that's really cool about O1 Preview and O1 Mini being available in the API is that you can use it in your favorite tools like cursor like I did, right?[00:39:06] Romain Huet: And that's also what like Devin from Cognition can use in their own software engineering agents. In the case of Xcode, like, it's not quite deeply integrated in Xcode, so that's why I had like chat GPT side by side. But it's cool, right, because I could instruct O1 Preview to be, like, my coding partner and brainstorming partner for this app, but also consolidate all of the, the files and architect the app the way I wanted.[00:39:28] Romain Huet: So, all I had to do was just, like, port the code over to Xcode and zero shot the app build. I don't think I conveyed, by the way, how big a deal that is, but, like, you can now create an iPhone app from scratch, describing a lot of intricate details that you want, and your vision comes to life in, like, a minute.[00:39:47] Romain Huet: It's pretty outstanding.[00:39:48] swyx: I have to admit, I was a bit skeptical because if I open up SQL, I don't know anything about iOS programming. You know which file to paste it in. You probably set it up a little bit. So I'm like, I have to go home and test it. And I need the ChatGPT desktop app so that it can tell me where to click.[00:40:04] Romain Huet: Yeah, I mean like, Xcode and iOS development has become easier over the years since they introduced Swift and SwiftUI. I think back in the days of Objective C, or like, you know, the storyboard, it was a bit harder to get in for someone new. But now with Swift and SwiftUI, their dev tools are really exceptional.[00:40:23] Romain Huet: But now when you combine that with O1, as your brainstorming and coding partner, it's like your architect, effectively. That's the best way, I think, to describe O1. People ask me, like, can GPT 4 do some of that? And it certainly can. But I think it will just start spitting out code, right? And I think what's great about O1, is that it can, like, make up a plan.[00:40:42] Romain Huet: In this case, for instance, the iOS app had to fetch data from an API, it had to look at the docs, it had to look at, like, how do I parse this JSON, where do I store this thing, and kind of wire things up together. So that's where it really shines. Is mini or preview the better model that people should be using?[00:40:58] Romain Huet: Like, how? I think people should try both. We're obviously very excited about the upcoming O1 that we shared the evals for. But we noticed that O1 Mini is very, very good at everything math, coding, everything STEM. If you need for your kind of brainstorming or your kind of science part, you need some broader knowledge than reaching for O1 previews better.[00:41:20] Romain Huet: But yeah, I used O1 Mini for my second demo. And it worked perfectly. All I needed was very much like something rooted in code, architecting and wiring up like a front end, a backend, some UDP packets, some web sockets, something very specific. And it did that perfectly.[00:41:35] swyx: And then maybe just talking about voice and Wanderlust, the app that keeps on giving, what's the backstory behind like preparing for all of that?[00:41:44] Romain Huet: You know, it's funny because when last year for Dev Day, we were trying to think about what could be a great demo app to show like an assistive experience. I've always thought travel is a kind of a great use case because you have, like, pictures, you have locations, you have the need for translations, potentially.[00:42:01] Romain Huet: There's like so many use cases that are bounded to travel that I thought last year, let's use a travel app. And that's how Wanderlust came to be. But of course, a year ago, all we had was a text based assistant. And now we thought, well, if there's a voice modality, what if we just bring this app back as a wink.[00:42:19] Romain Huet: And what if we were interacting better with voice? And so with this new demo, what I showed was the ability to like, So, we wanted to have a complete conversation in real time with the app, but also the thing we wanted to highlight was the ability to call tools and functions, right? So, like in this case, we placed a phone call using the Twilio API, interfacing with our AI agents, but developers are so smart that they'll come up with so many great ideas that we could not think of ourselves, right?[00:42:48] Romain Huet: But what if you could have like a, you know, a 911 dispatcher? What if you could have like a customer service? Like center, that is much smarter than what we've been used to today. There's gonna be so many use cases for real time, it's awesome.[00:43:00] swyx: Yeah, and sometimes actually you, you, like this should kill phone trees.[00:43:04] swyx: Like there should not be like dial one[00:43:07] Romain Huet: of course para[00:43:08] swyx: espanol, you know? Yeah, exactly. Or whatever. I dunno.[00:43:12] Romain Huet: I mean, even you starting speaking Spanish would just do the thing, you know you don't even have to ask. So yeah, I'm excited for this future where we don't have to interact with those legacy systems.[00:43:22] swyx: Yeah. Yeah. Is there anything, so you are doing function calling in a streaming environment. So basically it's, it's web sockets. It's UDP, I think. It's basically not guaranteed to be exactly once delivery. Like, is there any coding challenges that you encountered when building this?[00:43:39] Romain Huet: Yeah, it's a bit more delicate to get into it.[00:43:41] Romain Huet: We also think that for now, what we, what we shipped is a, is a beta of this API. I think there's much more to build onto it. It does have the function calling and the tools. But we think that for instance, if you want to have something very robust, On your client side, maybe you want to have web RTC as a client, right?[00:43:58] Romain Huet: And, and as opposed to like directly working with the sockets at scale. So that's why we have partners like Life Kit and Agora if you want to, if you want to use them. And I'm sure we'll have many mores in the, in many more in the future. But yeah, we keep on iterating on that, and I'm sure the feedback of developers in the weeks to come is going to be super critical for us to get it right.[00:44:16] swyx: Yeah, I think LiveKit has been fairly public that they are used in, in the Chachapiti app. Like, is it, it's just all open source, and we just use it directly with OpenAI, or do we use LiveKit Cloud or something?[00:44:28] Romain Huet: So right now we, we released the API, we released some sample code also, and referenced clients for people to get started with our API.[00:44:35] Romain Huet: And we also partnered with LifeKit and Agora, so they also have their own, like ways to help you get started that plugs natively with the real time API. So depending on the use case, people can, can can decide what to use. If you're working on something that's completely client or if you're working on something on the server side, for the voice interaction, you may have different needs, so we want to support all of those.[00:44:55] Alessio: I know you gotta run. Is there anything that you want the AI engineering community to give feedback on specifically, like even down to like, you know, a specific API end point or like, what, what's like the thing that you want? Yeah. I[00:45:08] Romain Huet: mean, you know, if we take a step back, I think dev Day this year is all different from last year and, and in, in a few different ways.[00:45:15] Romain Huet: But one way is that we wanted to keep it intimate, even more intimate than last year. We wanted to make sure that the community is. Thank you very much for joining us on the Spotlight. That's why we have community talks and everything. And the takeaway here is like learning from the very best developers and AI engineers.[00:45:31] Romain Huet: And so, you know we want to learn from them. Most of what we shipped this morning, including things like prompt caching the ability to generate prompts quickly in the playground, or even things like vision fine tuning. These are all things that developers have been asking of us. And so, the takeaway I would, I would leave them with is to say like, Hey, the roadmap that we're working on is heavily influenced by them and their work.[00:45:53] Romain Huet: And so we love feedback From high feature requests, as you say, down to, like, very intricate details of an API endpoint, we love feedback, so yes that's, that's how we, that's how we build this API.[00:46:05] swyx: Yeah, I think the, the model distillation thing as well, it might be, like, the, the most boring, but, like, actually used a lot.[00:46:12] Romain Huet: True, yeah. And I think maybe the most unexpected, right, because I think if I, if I read Twitter correctly the past few days, a lot of people were expecting us. To shape the real time API for speech to speech. I don't think developers were expecting us to have more tools for distillation, and we really think that's gonna be a big deal, right?[00:46:30] Romain Huet: If you're building apps that have you know, you, you want high, like like low latency, low cost, but high performance, high quality on the use case distillation is gonna be amazing.[00:46:40] swyx: Yeah. I sat in the distillation session just now and they showed how they distilled from four oh to four mini and it was like only like a 2% hit in the performance and 50 next.[00:46:49] swyx: Yeah,[00:46:50] Romain Huet: I was there as well for the superhuman kind of use case inspired for an Ebola client. Yeah, this was really good. Cool man! so much for having me. Thanks again for being here today. It's always[00:47:00] AI Charlie: great to have you. As you might have picked up at the end of that chat, there were many sessions throughout the day focused on specific new capabilities.[00:47:08] Michelle Pokrass, Head of API at OpenAI ft. Simon Willison[00:47:08] AI Charlie: Like the new model distillation features combining EVOLs and fine tuning. For our next session, we are delighted to bring back two former guests of the pod, which is something listeners have been greatly enjoying in our second year of doing the Latent Space podcast. Michelle Pokras of the API team joined us recently to talk about structured outputs, and today gave an updated long form session at Dev Day, describing the implementation details of the new structured output mode.[00:47:39] AI Charlie: We also got her updated thoughts on the VoiceMode API we discussed in her episode, now that it is finally announced. She is joined by friend of the pod and super blogger, Simon Willison, who also came back as guest co host in our Dev Day. 2023 episode.[00:47:56] Alessio: Great, we're back live at Dev Day returning guest Michelle and then returning guest co host Fork.[00:48:03] Alessio: Fork, yeah, I don't know. I've lost count. I think it's been a few. Simon Willison is back. Yeah, we just wrapped, we just wrapped everything up. Congrats on, on getting everything everything live. Simon did a great, like, blog, so if you haven't caught up, I[00:48:17] Simon Willison: wrote my, I implemented it. Now, I'm starting my live blog while waiting for the first talk to start, using like GPT 4, I wrote me the Javascript, and I got that live just in time and then, yeah, I was live blogging the whole day.[00:48:28] swyx: Are you a cursor enjoyer?[00:48:29] Simon Willison: I haven't really gotten into cursor yet to be honest. I just haven't spent enough time for it to click, I think. I'm more a copy and paste things out of Cloud and chat GPT. Yeah. It's interesting.[00:48:39] swyx: Yeah. I've converted to cursor and 01 is so easy to just toggle on and off.[00:48:45] Alessio: What's your workflow?[00:48:46] Alessio: VS[00:48:48] Michelle Pokrass: Code co pilot, so Yep, same here. Team co pilot. Co pilot is actually the reason I joined OpenAI. It was, you know, before ChatGPT, this is the thing that really got me. So I'm still into it, but I keep meaning to try out Cursor, and I think now that things have calmed down, I'm gonna give it a real go.[00:49:03] swyx: Yeah, it's a big thing to change your tool of choice.[00:49:06] swyx: Yes,[00:49:06] Michelle Pokrass: yeah, I'm pretty dialed, so.[00:49:09] swyx: I mean, you know, if you want, you can just fork VS Code and make your own. That's the thing to dumb thing, right? We joked about doing a hackathon where the only thing you do is fork VS Code and bet me the best fork win.[00:49:20] Michelle Pokrass: Nice.[00:49:22] swyx: That's actually a really good idea. Yeah, what's up?[00:49:26] swyx: I mean, congrats on launching everything today. I know, like, we touched on it a little bit, but, like, everyone was kind of guessing that Voice API was coming, and, like, we talked about it in our episode. How do you feel going into the launch? Like, any design decisions that you want to highlight?[00:49:41] Michelle Pokrass: Yeah, super jazzed about it. The team has been working on it for a while. It's, like, a very different API for us. It's the first WebSocket API, so a lot of different design decisions to be made. It's, like, what kind of events do you send? When do you send an event? What are the event names? What do you send, like, on connection versus on future messages?[00:49:57] Michelle Pokrass: So there have been a lot of interesting decisions there. The team has also hacked together really cool projects as we've been testing it. One that I really liked is we had an internal hack a thon for the API team. And some folks built like a little hack that you could use to, like VIM with voice mode, so like, control vim, and you would tell them on like, nice, write a file and it would, you know, know all the vim commands and, and pipe those in.[00:50:18] Michelle Pokrass: So yeah, a lot of cool stuff we've been hacking on and really excited to see what people build with it.[00:50:23] Simon Willison: I've gotta call out a demo from today. I think it was Katja had a 3D visualization of the solar system, like WebGL solar system, you could talk to. That is one of the coolest conference demos I've ever seen.[00:50:33] Simon Willison: That was so convincing. I really want the code. I really want the code for that to get put out there. I'll talk[00:50:39] Michelle Pokrass: to the team. I think we can[00:50:40] Simon Willison: probably
AWS Morning Brief for the week of September 30, with Corey Quinn. Links:Amazon Aurora MySQL now supports RDS Data APIIntroducing Amazon EC2 C8g and M8g InstancesAmazon EC2 Instance Connect now supports IPv6Amazon SNS now delivers SMS text messages via AWS End User MessagingAWS CloudFormation Git sync now supports pull request workflows to review your stack changesAWS CloudTrail launches network activity events for VPC endpoints (preview)AWS Serverless Application Repository now supports AWS PrivateLinkAWS announces general availability for Security Group Referencing on AWS Transit GatewayGenerative AI Cost Optimization StrategiesCustomizing your HPC environment: building AMIs for AWS Parallel Computing ServiceMaking traffic lights more efficient with Amazon RekognitionWhose contract is it anyway? How AWS Marketplace worksSwitch your file share access from Amazon FSx File Gateway to Amazon FSx for Windows File Server
The Internal Revenue Service rule regarding charities in voter registration is explicit: “voter education or registration activities with evidence of bias that (a) would favor one candidate over another; (b) oppose a candidate in some manner; or (c) have the effect of favoring a candidate or group of candidates, will constitute prohibited participation or intervention.” Now questions are being raised about whether the Voter Participation Center, a charitable organization that is part of the left's “voting machine,” is giving “evidence of bias” after the Washington Free Beacon revealed through Meta Platforms' advertising disclosure tools that VPC was excluding fans of “NASCAR, golf, Jeeps, or other interests and hobbies typically associated with Republican men” from seeing its voter-registration advertisements. Here to discuss this and other stories from the world of dubiously nonpartisan nonpartisan civic engagement are CRC president Scott Walter and Parker Thayer, author of CRC's special report on “How Charities Secretly Help Win Elections.”Links: Do You Like NASCAR? This 'Non-partisan' Voter Registration Group Doesn't Want To Help YouVoter Participation Center (VPC)How Charities Secretly Help Win ElectionsThe Voter Participation Center: A Tax-Exempt Turnout Machine for DemocratsFollow us on our socials: Twitter: @capitalresearchInstagram: @capitalresearchcenterFacebook: www.facebook.com/capitalresearchcenterYouTube: @capitalresearchcenter
Henrico Citizen Managing Editor Patty Kruszewski, who won two awards in the Virginia Professional Communicators 2024 communications contest, has placed among the top finishers in the National Federation of Press Women (NFPW) competition. National winners were announced June 22 at a ceremony during the NFPW Conference held in St. Louis. Kruszewski won a second-place NFPW award in the columns category with her personal opinion piece, "A Night with Chef Paul," which advanced to national competition after winning first place in the statewide VPC contest. The column, which ran in a December 2023 edition of the Citizen, described the evening her...Article LinkSupport the Show.
This episode discusses solutions for securely accessing private VPC resources for debugging and troubleshooting. We cover traditional approaches like bastion hosts and VPNs and newer solutions using containers and AWS services like Fargate, ECS, and SSM. We explain how to set up a Fargate task with a container image with the necessary tools, enable ECS integration with SSM, and use SSM to start remote shells and port forwarding tunnels into the container. This provides on-demand access without exposing resources on the public internet. We share a Python script to simplify the process. We suggest ideas for improvements like auto-scaling the container down when idle. Overall, this lightweight containerized approach can provide easy access for debugging compared to managing EC2 instances.
“Maybe you don't always think of yourself this way, but associations are authority.” When associations operate with that mindset, corporate outreach becomes less of an "ask" and more of a business opportunity. In Episode 5, Lori Zoss Kraska (Growth Owl, LLC) shares how association business teams can effectively connect with Fortune1000 companies to develop partnerships, including: Why corporate partners matter for your association Who the decision makers are and how to connect Tips for structuring your outreach 3 types of partnerships that are increasingly intriguing to corporations Then, Josh Slayton (Holmes Corporation) shares a real-world example of how associations and corporations are partnering alongside higher ed to impact the nation's labor shortage, all while increasing non-dues revenue. Additional Links: Gallup worforce research Growth Owl, LLC Holmes Corporation RevUP Summit 2024 (PARPOD code to register) VPC, Inc: Elevate your next in-person event!
This week, we discuss a few new releases on our desks from VPC and Anoma which leads to a short discussion of the growing small-brand scene. We also jump into our thoughts on the new Audemars Piguet [Re]Master02. Finally, we talk about brand exhibits and spend some time chatting about a recent visit to the Patek Philippe Museum in Geneva, and what made our jaws drop at the mecca of museums.Show Notes2:50: Hands-On: The VPC Type 37HW3:00: Series on Fratello about the development of VPC7:30: Anoma Watch (hands-on review to come)8:20: Amida Digitrend11:40: The B/1 Is An Audaciously Designed Watch By Newcomer Toledano & Chan15:05: Introducing: The Audemars Piguet [RE]Master0217:25: AP Chronicles on the original ref. 515925:00: Introducing: The Audemars Piguet Mini Royal Oak29:00: AP Chronicles 20mm Royal Oak from 199737:35: Patek Philippe Museum44:30: Raúl Pagès Turns The Page With His Old-School Régulateur à Détente45:30: Steel Patek Philippe 96 Sector Dial at Sotheby's46:30: The 'Imperial Patek Philippe' Owned By The Last Emperor Of China Sells For $6.2 Million47:45: Buying, Selling, Collecting: A Rare, Complicated Patek Calatrava You Might Not Have Known About Is Coming Up For Auction
"Not anyone can sell. We are all experts." Jodi Ashcraft is the Director of Media and Event Sales for the American Psychological Association (APA) and says treating your association sales team like the experts they are is one way to show their work is valued. Listen to Episode 2 as Ashcraft (APA) and Carrie McIntyre of Navigate share four areas of focus for any sales leader working to elevate the performance and workplace enjoyment of their team. Plus, find out what step Jeremy Figoten (ICMA) takes to show the value of his sales team. This episode is geared to both sales leaders and members of their team as we discuss: Compensation structures Cross departmental relationships A quality sales approach When to say 'no' and how to do it A key attribute of successful teams everywhere Association RevUP is a podcast presented by the Professionals for Association Revenue (PAR) and supported by VPC, Inc. This isn't your typical Q&A podcast - it's an 8-episode series made up of 25-minute written and produced shows jam packed with learning that will get everyone in your organization talking about revenue health! Like what you heard? Subscribe to the show, subscribe to PAR's newsletter, and join us for PAR's in-person event at the RevUP Summit 2024 in Annapolis, Maryland (Nov 20-21) (and, don't forget to add the special promo code from the episode for savings!) Additional Links: LinkedIn 2023 Workplace Learning Report More on Project Aristotle Supporting Partner: VPC, Inc, award-winning full service production company that will elevate the production quality of your next association event!
We've spoken about the value proposition canvas (VPC) a few times on the pod. But in this ep, Han's shaking things up – and she thinks she's come up with the perfect VPC for L&D. One that's going to help you make real marketing magic.
Yes, we are riffing on the title from our friends at Spirit of Time. We have the original Buzz though, so it is fair. Spence and Buzz talk about W&W, the VPC type 37 HW, among other shenanigans. Enjoy!
This week's episode Spence has the opportunity to speak with Fratello writer, and watch brand founder, Thomas Van Straaten to talk about his passion project, VPC watches.It has been a minute since we have had a brand owner on the show, but having spent some time with the VPC Type 37 HW you will see that it has left us thoroughly impressed. Thomas's eye for detail, including developing a unique typeface for the 37 HW, leads to a watch that punches well above its weight and is more than just a spec sheet watch. Overall, the fit, finishing, and detail of the Type 37 HW make the wearing experience superb. It is one of those watches that you only pick up these details having worn it. It was a pleasure getting to know Thomas better. If you have a chance, find a way to check out the Type 37 HW, you will not be disappointed!
Hello, and welcome to the 50th and penultimate episode of Fratello Talks this year. Today, we dive into the topic of building a watch brand in 2023. Your host, Nacho, is joined by Morgan and Thomas. The latter comes well-equipped to share some insights into the experience. As some of you may know, Thomas—as well as being a writer for Fratello—is also the brain behind the upcoming watch brand, VPC. But what exactly goes into starting a watch brand today? What are the challenges one faces? And what are some of the most satisfying moments along the way? In today's episode, we'll get into this and much more.
We are running an end of year survey for our listeners. Let us know any feedback you have for us, what episodes resonated with you the most, and guest requests for 2024! RAG has emerged as one of the key pieces of the AI Engineer stack. Jerry from LlamaIndex called it a “hack”, Bryan from Hex compared it to “a recommendation system from LLMs”, and even LangChain started with it. RAG is crucial in any AI coding workflow. We talked about context quality for code in our Phind episode. Today's guests, Beyang Liu and Steve Yegge from SourceGraph, have been focused on code indexing and retrieval for over 15 years. We locked them in our new studio to record a 1.5 hours masterclass on the history of code search, retrieval interfaces for code, and how they get SOTA 30% completion acceptance rate in their Cody product by being better at the “bin packing problem” of LLM context generation. Google Grok → SourceGraph → CodyWhile at Google in 2008, Steve built Grok, which lives on today as Google Kythe. It allowed engineers to do code parsing and searching across different codebases and programming languages. (You might remember this blog post from Steve's time at Google) Beyang was an intern at Google at the same time, and Grok became the inspiration to start SourceGraph in 2013. The two didn't know eachother personally until Beyang brought Steve out of retirement 9 years later to join him as VP Engineering. Fast forward 10 years, SourceGraph has become to best code search tool out there and raised $223M along the way. Nine months ago, they open sourced SourceGraph Cody, their AI coding assistant. All their code indexing and search infrastructure allows them to get SOTA results by having better RAG than competitors:* Code completions as you type that achieve an industry-best Completion Acceptance Rate (CAR) as high as 30% using a context-enhanced open-source LLM (StarCoder)* Context-aware chat that provides the option of using GPT-4 Turbo, Claude 2, GPT-3.5 Turbo, Mistral 7x8B, or Claude Instant, with more model integrations planned* Doc and unit test generation, along with AI quick fixes for common coding errors* AI-enhanced natural language code search, powered by a hybrid dense/sparse vector search engine There are a few pieces of infrastructure that helped Cody achieve these results:Dense-sparse vector retrieval system For many people, RAG = vector similarity search, but there's a lot more that you can do to get the best possible results. From their release:"Sparse vector search" is a fancy name for keyword search that potentially incorporates LLMs for things like ranking and term expansion (e.g., "k8s" expands to "Kubernetes container orchestration", possibly weighted as in SPLADE): * Dense vector retrieval makes use of embeddings, the internal representation that LLMs use to represent text. Dense vector retrieval provides recall over a broader set of results that may have no exact keyword matches but are still semantically similar. * Sparse vector retrieval is very fast, human-understandable, and yields high recall of results that closely match the user query. * We've found the approaches to be complementary.There's a very good blog post by Pinecone on SPLADE for sparse vector search if you're interested in diving in. If you're building RAG applications in areas that have a lot of industry-specific nomenclature, acronyms, etc, this is a good approach to getting better results.SCIPIn 2016, Microsoft announced the Language Server Protocol (LSP) and the Language Server Index Format (LSIF). This protocol makes it easy for IDEs to get all the context they need from a codebase to get things like file search, references, “go to definition”, etc. SourceGraph developed SCIP, “a better code indexing format than LSIF”:* Simpler and More Efficient Format: SCIP utilizes Protobuf instead of JSON, which is used by LSIF. Protobuf is more space-efficient, simpler, and more suitable for systems programming. * Better Performance and Smaller Index Sizes: SCIP indexers, such as scip-clang, show enhanced performance and reduced index file sizes compared to LSIF indexers (10%-20% smaller)* Easier to Develop and Debug: SCIP's design, centered around human-readable string IDs for symbols, makes it faster and more straightforward to develop new language indexers. Having more efficient indexing is key to more performant RAG on code. Show Notes* Sourcegraph* Cody* Copilot vs Cody* Steve's Stanford seminar on Grok* Steve's blog* Grab* Fireworks* Peter Norvig* Noam Chomsky* Code search* Kelly Norton* Zoekt* v0.devSee also our past episodes on Cursor, Phind, Codeium and Codium as well as the GitHub Copilot keynote at AI Engineer Summit.Timestamps* [00:00:00] Intros & Backgrounds* [00:05:20] How Steve's work on Grok inspired SourceGraph for Beyang* [00:08:10] What's Cody?* [00:11:22] Comparison of coding assistants and the capabilities of Cody* [00:16:00] The importance of context (RAG) in AI coding tools* [00:21:33] The debate between Chomsky and Norvig approaches in AI* [00:30:06] Normsky: the Norvig + Chomsky models collision* [00:36:00] The death of the DSL?* [00:40:00] LSP, Skip, Kythe, BFG, and all that fun stuff* [00:53:00] The SourceGraph internal stack* [00:58:46] Building on open source models* [01:02:00] SourceGraph for engineering managers?* [01:12:00] Lightning RoundTranscriptAlessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO-in-Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI. [00:00:16]Swyx: Hey, and today we're christening our new podcast studio in the Newton, and we have Beyang and Steve from Sourcegraph. Welcome. [00:00:25]Beyang: Hey, thanks for having us. [00:00:26]Swyx: So this has been a long time coming. I'm very excited to have you. We also are just celebrating the one year anniversary of ChatGPT yesterday, but also we'll be talking about the GA of Cody later on today. We'll just do a quick intros of both of you. Obviously, people can research you and check the show notes for more. Beyang, you worked in computer vision at Stanford and then you worked at Palantir. I did, yeah. You also interned at Google. [00:00:48]Beyang: I did back in the day where I get to use Steve's system, DevTool. [00:00:53]Swyx: Right. What was it called? [00:00:55]Beyang: It was called Grok. Well, the end user thing was Google Code Search. That's what everyone called it, or just like CS. But the brains of it were really the kind of like Trigram index and then Grok, which provided the reference graph. [00:01:07]Steve: Today it's called Kythe, the open source Google one. It's sort of like Grok v3. [00:01:11]Swyx: On your podcast, which you've had me on, you've interviewed a bunch of other code search developers, including the current developer of Kythe, right? [00:01:19]Beyang: No, we didn't have any Kythe people on, although we would love to if they're up for it. We had Kelly Norton, who built a similar system at Etsy, it's an open source project called Hound. We also had Han-Wen Nienhuys, who created Zoekt, which is, I think, heavily inspired by the Trigram index that powered Google's original code search and that we also now use at Sourcegraph. Yeah. [00:01:45]Swyx: So you teamed up with Quinn over 10 years ago to start Sourcegraph and you were indexing all code on the internet. And now you're in a perfect spot to create a code intelligence startup. Yeah, yeah. [00:01:56]Beyang: I guess the backstory was, I used Google Code Search while I was an intern. And then after I left that internship and worked elsewhere, it was the single dev tool that I missed the most. I felt like my job was just a lot more tedious and much more of a hassle without it. And so when Quinn and I started working together at Palantir, he had also used various code search engines in open source over the years. And it was just a pain point that we both felt, both working on code at Palantir and also working within Palantir's clients, which were a lot of Fortune 500 companies, large financial institutions, folks like that. And if anything, the pains they felt in dealing with large complex code bases made our pain points feel small by comparison. So that was really the impetus for starting Sourcegraph. [00:02:42]Swyx: Yeah, excellent. Steve, you famously worked at Amazon. And you've told many, many stories. I want every single listener of Latent Space to check out Steve's YouTube because he effectively had a podcast that you didn't tell anyone about or something. You just hit record and just went on a few rants. I'm always here for your Stevie rants. And then you moved to Google, where you also had some interesting thoughts on just the overall Google culture versus Amazon. You joined Grab as head of eng for a couple of years. I'm from Singapore, so I have actually personally used a lot of Grab's features. And it was very interesting to see you talk so highly of Grab's engineering and sort of overall prospects. [00:03:21]Steve: Because as a customer, it sucked? [00:03:22]Swyx: Yeah, no, it's just like, being from a smaller country, you never see anyone from our home country being on a global stage or talked about as a startup that people admire or look up to, like on the league that you, with all your legendary experience, would consider equivalent. Yeah. [00:03:41]Steve: Yeah, no, absolutely. They actually, they didn't even know that they were as good as they were, in a sense. They started hiring a bunch of people from Silicon Valley to come in and sort of like fix it. And we came in and we were like, Oh, we could have been a little better operational excellence and stuff. But by and large, they're really sharp. The only thing about Grab is that they get criticized a lot for being too westernized. Oh, by who? By Singaporeans who don't want to work there. [00:04:06]Swyx: Okay. I guess I'm biased because I'm here, but I don't see that as a problem. If anything, they've had their success because they were more westernized than the Sanders Singaporean tech company. [00:04:15]Steve: I mean, they had their success because they are laser focused. They copy to Amazon. I mean, they're executing really, really, really well for a giant. I was on a slack with 2,500 engineers. It was like this giant waterfall that you could dip your toe into. You'd never catch up. Actually, the AI summarizers would have been really helpful there. But yeah, no, I think Grab is successful because they're just out there with their sleeves rolled up, just making it happen. [00:04:43]Swyx: And for those who don't know, it's not just like Uber of Southeast Asia, it's also a super app. PayPal Plus. [00:04:48]Steve: Yeah. [00:04:49]Swyx: In the way that super apps don't exist in the West. It's one of the enduring mysteries of B2C that super apps work in the East and don't work in the West. We just don't understand it. [00:04:57]Beyang: Yeah. [00:04:58]Steve: It's just kind of curious. They didn't work in India either. And it was primarily because of bandwidth reasons and smaller phones. [00:05:03]Swyx: That should change now. It should. [00:05:05]Steve: And maybe we'll see a super app here. [00:05:08]Swyx: You retired-ish? I did. You retired-ish on your own video game? Mm-hmm. Any fun stories about that? And that's also where you discovered some need for code search, right? Mm-hmm. [00:05:16]Steve: Sure. A need for a lot of stuff. Better programming languages, better databases. Better everything. I mean, I started in like 95, right? Where there was kind of nothing. Yeah. Yeah. [00:05:24]Beyang: I just want to say, I remember when you first went to Grab because you wrote that blog post talking about why you were excited about it, about like the expanding Asian market. And our reaction was like, oh, man, how did we miss stealing it with you? [00:05:36]Swyx: Hiring you. [00:05:37]Beyang: Yeah. [00:05:38]Steve: I was like, miss that. [00:05:39]Swyx: Tell that story. So how did this happen? Right? So you were inspired by Grok. [00:05:44]Beyang: I guess the backstory from my point of view is I had used code search and Grok while at Google, but I didn't actually know that it was connected to you, Steve. I knew you from your blog posts, which were always excellent, kind of like inside, very thoughtful takes from an engineer's perspective on some of the challenges facing tech companies and tech culture and that sort of thing. But my first introduction to you within the context of code intelligence, code understanding was I watched a talk that you gave, I think at Stanford, about Grok when you're first building it. And that was very eye opening. I was like, oh, like that guy, like the guy who, you know, writes the extremely thoughtful ranty like blog posts also built that system. And so that's how I knew, you know, you were involved in that. And then, you know, we always wanted to hire you, but never knew quite how to approach you or, you know, get that conversation started. [00:06:34]Steve: Well, we got introduced by Max, right? Yeah. It was temporal. Yeah. Yeah. I mean, it was a no brainer. They called me up and I had noticed when Sourcegraph had come out. Of course, when they first came out, I had this dagger of jealousy stabbed through me piercingly, which I remember because I am not a jealous person by any means, ever. But boy, I was like, but I was kind of busy, right? And just one thing led to another. I got sucked back into the ads vortex and whatever. So thank God Sourcegraph actually kind of rescued me. [00:07:05]Swyx: Here's a chance to build DevTools. Yeah. [00:07:08]Steve: That's the best. DevTools are the best. [00:07:10]Swyx: Cool. Well, so that's the overall intro. I guess we can get into Cody. Is there anything else that like people should know about you before we get started? [00:07:18]Steve: I mean, everybody knows I'm a musician. I can juggle five balls. [00:07:24]Swyx: Five is good. Five is good. I've only ever managed three. [00:07:27]Steve: Five is hard. Yeah. And six, a little bit. [00:07:30]Swyx: Wow. [00:07:31]Beyang: That's impressive. [00:07:32]Alessio: So yeah, to jump into Sourcegraph, this has been a company 10 years in the making. And as Sean said, now you're at the right place. Phase two. Now, exactly. You spent 10 years collecting all this code, indexing, making it easy to surface it. Yeah. [00:07:47]Swyx: And also learning how to work with enterprises and having them trust you with their code bases. Yeah. [00:07:52]Alessio: Because initially you were only doing on-prem, right? Like a lot of like VPC deployments. [00:07:55]Beyang: So in the very early days, we're cloud only. But the first major customers we landed were all on-prem, self-hosted. And that was, I think, related to the nature of the problem that we're solving, which becomes just like a critical, unignorable pain point once you're above like 100 devs or so. [00:08:11]Alessio: Yeah. And now Cody is going to be GA by the time this releases. So congrats to your future self for launching this in two weeks. Can you give a quick overview of just what Cody is? I think everybody understands that it's a AI coding agent, but a lot of companies say they have a AI coding agent. So yeah, what does Cody do? How do people interface with it? [00:08:32]Beyang: Yeah. So how is it different from the like several dozen other AI coding agents that exist in the market now? When we thought about building a coding assistant that would do things like code generation and question answering about your code base, I think we came at it from the perspective of, you know, we've spent the past decade building the world's best code understanding engine for human developers, right? So like it's kind of your guide as a human dev if you want to go and dive into a large complex code base. And so our intuition was that a lot of the context that we're providing to human developers would also be useful context for AI developers to consume. And so in terms of the feature set, Cody is very similar to a lot of other assistants. It does inline autocompletion. It does code base aware chat. It does specific commands that automate, you know, tasks that you might rather not want to do like generating unit tests or adding detailed documentation. But we think the core differentiator is really the quality of the context, which is hard to kind of describe succinctly. It's a bit like saying, you know, what's the difference between Google and Alta Vista? There's not like a quick checkbox list of features that you can rattle off, but it really just comes down to all the attention and detail that we've paid to making that context work well and be high quality and fast for human devs. We're now kind of plugging into the AI coding assistant as well. Yeah. [00:09:53]Steve: I mean, just to add my own perspective on to what Beyang just described, RAG is kind of like a consultant that the LLM has available, right, that knows about your code. RAG provides basically a bridge to a lookup system for the LLM, right? Whereas fine tuning would be more like on the job training for somebody. If the LLM is a person, you know, and you send them to a new job and you do on the job training, that's what fine tuning is like, right? So tuned to our specific task. You're always going to need that expert, even if you get the on the job training, because the expert knows your particular code base, your task, right? That expert has to know your code. And there's a chicken and egg problem because, right, you know, we're like, well, I'm going to ask the LLM about my code, but first I have to explain it, right? It's this chicken and egg problem. That's where RAG comes in. And we have the best consultants, right? The best assistant who knows your code. And so when you sit down with Cody, right, what Beyang said earlier about going to Google and using code search and then starting to feel like without it, his job was super tedious. Once you start using these, do you guys use coding assistants? [00:10:53]Swyx: Yeah, right. [00:10:54]Steve: I mean, like we're getting to the point very quickly, right? Where you feel like almost like you're programming without the internet, right? Or something, you know, it's like you're programming back in the nineties without the coding assistant. Yeah. Hopefully that helps for people who have like no idea about coding systems, what they are. [00:11:09]Swyx: Yeah. [00:11:10]Alessio: I mean, going back to using them, we had a lot of them on the podcast already. We had Cursor, we have Codium and Codium, very similar names. [00:11:18]Swyx: Yeah. Find, and then of course there's Copilot. [00:11:22]Alessio: You had a Copilot versus Cody blog post, and I think it really shows the context improvement. So you had two examples that stuck with me. One was, what does this application do? And the Copilot answer was like, oh, it uses JavaScript and NPM and this. And it's like, but that's not what it does. You know, that's what it's built with. Versus Cody was like, oh, these are like the major functions. And like, these are the functionalities and things like that. And then the other one was, how do I start this up? And Copilot just said NPM start, even though there was like no start command in the package JSON, but you know, most collapse, right? Most projects use NPM start. So maybe this does too. How do you think about open source models? Because Copilot has their own private thing. And I think you guys use Starcoder, if I remember right. Yeah, that's correct. [00:12:09]Beyang: I think Copilot uses some variant of Codex. They're kind of cagey about it. I don't think they've like officially announced what model they use. [00:12:16]Swyx: And I think they use a range of models based on what you're doing. Yeah. [00:12:19]Beyang: So everyone uses a range of model. Like no one uses the same model for like inline completion versus like chat because the latency requirements for. Oh, okay. Well, there's fill in the middle. There's also like what the model's trained on. So like we actually had completions powered by Claude Instant for a while. And but you had to kind of like prompt hack your way to get it to output just the code and not like, hey, you know, here's the code you asked for, like that sort of text. So like everyone uses a range of models. We've kind of designed Cody to be like especially model, not agnostic, but like pluggable. So one of our kind of design considerations was like as the ecosystem evolves, we want to be able to integrate the best in class models, whether they're proprietary or open source into Cody because the pace of innovation in the space is just so quick. And I think that's been to our advantage. Like today, Cody uses Starcoder for inline completions. And with the benefit of the context that we provide, we actually show comparable completion acceptance rate metrics. It's kind of like the standard metric that folks use to evaluate inline completion quality. It's like if I show you a completion, what's the chance that you actually accept the completion versus you reject it? And so we're at par with Copilot, which is at the head of that industry right now. And we've been able to do that with the Starcoder model, which is open source and the benefit of the context fetching stuff that we provide. And of course, a lot of like prompt engineering and other stuff along the way. [00:13:40]Alessio: And Steve, you wrote a post called cheating is all you need about what you're building. And one of the points you made is that everybody's fighting on the same axis, which is better UI and the IDE, maybe like a better chat response. But data modes are kind of the most important thing. And you guys have like a 10 year old mode with all the data you've been collecting. How do you kind of think about what other companies are doing wrong, right? Like, why is nobody doing this in terms of like really focusing on RAG? I feel like you see so many people. Oh, we just got a new model. It's like a bit human eval. And it's like, well, but maybe like that's not what we should really be doing, you know? Like, do you think most people underestimate the importance of like the actual RAG in code? [00:14:21]Steve: I think that people weren't doing it much. It wasn't. It's kind of at the edges of AI. It's not in the center. I know that when ChatGPT launched, so within the last year, I've heard a lot of rumblings from inside of Google, right? Because they're undergoing a huge transformation to try to, you know, of course, get into the new world. And I heard that they told, you know, a bunch of teams to go and train their own models or fine tune their own models, right? [00:14:43]Swyx: Both. [00:14:43]Steve: And, you know, it was a s**t show. Nobody knew how to do it. They launched two coding assistants. One was called Code D with an EY. And then there was, I don't know what happened in that one. And then there's Duet, right? Google loves to compete with themselves, right? They do this all the time. And they had a paper on Duet like from a year ago. And they were doing exactly what Copilot was doing, which was just pulling in the local context, right? But fundamentally, I thought of this because we were talking about the splitting of the [00:15:10]Swyx: models. [00:15:10]Steve: In the early days, it was the LLM did everything. And then we realized that for certain use cases, like completions, that a different, smaller, faster model would be better. And that fragmentation of models, actually, we expected to continue and proliferate, right? Because we are fundamentally, we're a recommender engine right now. Yeah, we're recommending code to the LLM. We're saying, may I interest you in this code right here so that you can answer my question? [00:15:34]Swyx: Yeah? [00:15:34]Steve: And being good at recommender engine, I mean, who are the best recommenders, right? There's YouTube and Spotify and, you know, Amazon or whatever, right? Yeah. [00:15:41]Swyx: Yeah. [00:15:41]Steve: And they all have many, many, many, many, many models, right? For all fine-tuned for very specific, you know. And that's where we're heading in code, too. Absolutely. [00:15:50]Swyx: Yeah. [00:15:50]Alessio: We just did an episode we released on Wednesday, which we said RAG is like Rexis or like LLMs. You're basically just suggesting good content. [00:15:58]Swyx: It's like what? Recommendations. [00:15:59]Beyang: Recommendations. [00:16:00]Alessio: Oh, got it. [00:16:01]Steve: Yeah, yeah, yeah. [00:16:02]Swyx: So like the naive implementation of RAG is you embed everything, throw it in a vector database, you embed your query, and then you find the nearest neighbors, and that's your RAG. But actually, you need to rank it. And actually, you need to make sure there's sample diversity and that kind of stuff. And then you're like slowly gradient dissenting yourself towards rediscovering proper Rexis, which has been traditional ML for a long time. But like approaching it from an LLM perspective. Yeah. [00:16:24]Beyang: I almost think of it as like a generalized search problem because it's a lot of the same things. Like you want your layer one to have high recall and get all the potential things that could be relevant. And then there's typically like a layer two re-ranking mechanism that bumps up the precision and tries to get the relevant stuff to the top of the results list. [00:16:43]Swyx: Have you discovered that ranking matters a lot? Oh, yeah. So the context is that I think a lot of research shows that like one, context utilization matters based on model. Like GPT uses the top of the context window, and then apparently Claude uses the bottom better. And it's lossy in the middle. Yeah. So ranking matters. No, it really does. [00:17:01]Beyang: The skill with which models are able to take advantage of context is always going to be dependent on how that factors into the impact on the training loss. [00:17:10]Swyx: Right? [00:17:10]Beyang: So like if you want long context window models to work well, then you have to have a ton of data where it's like, here's like a billion lines of text. And I'm going to ask a question about like something that's like, you know, embedded deeply into it and like, give me the right answer. And unless you have that training set, then of course, you're going to have variability in terms of like where it attends to. And in most kind of like naturally occurring data, the thing that you're talking about right now, the thing I'm asking you about is going to be something that we talked about recently. [00:17:36]Swyx: Yeah. [00:17:36]Steve: Did you really just say gradient dissenting yourself? Actually, I love that it's entered the casual lexicon. Yeah, yeah, yeah. [00:17:44]Swyx: My favorite version of that is, you know, how we have to p-hack papers. So, you know, when you throw humans at the problem, that's called graduate student dissent. That's great. It's really awesome. [00:17:54]Alessio: I think the other interesting thing that you have is this inline assist UX that I wouldn't say async, but like it works while you can also do work. So you can ask Cody to make changes on a code block and you can still edit the same file at the same time. [00:18:07]Swyx: Yeah. [00:18:07]Alessio: How do you see that in the future? Like, do you see a lot of Cody's running together at the same time? Like, how do you validate also that they're not messing each other up as they make changes in the code? And maybe what are the limitations today? And what do you think about where the attack is going? [00:18:21]Steve: I want to start with a little history and then I'm going to turn it over to Bian, all right? So we actually had this feature in the very first launch back in June. Dominic wrote it. It was called nonstop Cody. And you could have multiple, basically, LLM requests in parallel modifying your source [00:18:37]Swyx: file. [00:18:37]Steve: And he wrote a bunch of code to handle all of the diffing logic. And you could see the regions of code that the LLM was going to change, right? And he was showing me demos of it. And it just felt like it was just a little before its time, you know? But a bunch of that stuff, that scaffolding was able to be reused for where we're inline [00:18:56]Swyx: sitting today. [00:18:56]Steve: How would you characterize it today? [00:18:58]Beyang: Yeah, so that interface has really evolved from a, like, hey, general purpose, like, request anything inline in the code and have the code update to really, like, targeted features, like, you know, fix the bug that exists at this line or request a very specific [00:19:13]Swyx: change. [00:19:13]Beyang: And the reason for that is, I think, the challenge that we ran into with inline fixes, and we do want to get to the point where you could just fire and forget and have, you know, half a dozen of these running in parallel. But I think we ran into the challenge early on that a lot of people are running into now when they're trying to construct agents, which is the reliability of, you know, working code generation is just not quite there yet in today's language models. And so that kind of constrains you to an interaction where the human is always, like, in the inner loop, like, checking the output of each response. And if you want that to work in a way where you can be asynchronous, you kind of have to constrain it to a domain where today's language models can generate reliable code well enough. So, you know, generating unit tests, that's, like, a well-constrained problem. Or fixing a bug that shows up as, like, a compiler error or a test error, that's a well-constrained problem. But the more general, like, hey, write me this class that does X, Y, and Z using the libraries that I have, that is not quite there yet, even with the benefit of really good context. Like, it definitely moves the needle a lot, but we're not quite there yet to the point where you can just fire and forget. And I actually think that this is something that people don't broadly appreciate yet, because I think that, like, everyone's chasing this dream of agentic execution. And if we're to really define that down, I think it implies a couple things. You have, like, a multi-step process where each step is fully automated. We don't have to have a human in the loop every time. And there's also kind of like an LM call at each stage or nearly every stage in that [00:20:45]Swyx: chain. [00:20:45]Beyang: Based on all the work that we've done, you know, with the inline interactions, with kind of like general Codyfeatures for implementing longer chains of thought, we're actually a little bit more bearish than the average, you know, AI hypefluencer out there on the feasibility of agents with purely kind of like transformer-based models. To your original question, like, the inline interactions with CODI, we actually constrained it to be more targeted, like, you know, fix the current error or make this quick fix. I think that that does differentiate us from a lot of the other tools on the market, because a lot of people are going after this, like, shnazzy, like, inline edit interaction, whereas I think where we've moved, and this is based on the user feedback that we've gotten, it's like that sort of thing, it demos well, but when you're actually coding day to day, you don't want to have, like, a long chat conversation inline with the code base. That's a waste of time. You'd rather just have it write the right thing and then move on with your life or not have to think about it. And that's what we're trying to work towards. [00:21:37]Steve: I mean, yeah, we're not going in the agent direction, right? I mean, I'll believe in agents when somebody shows me one that works. Yeah. Instead, we're working on, you know, sort of solidifying our strength, which is bringing the right context in. So new context sources, ways for you to plug in your own context, ways for you to control or influence the context, you know, the mixing that happens before the request goes out, etc. And there's just so much low-hanging fruit left in that space that, you know, agents seems like a little bit of a boondoggle. [00:22:03]Beyang: Just to dive into that a little bit further, like, I think, you know, at a very high level, what do people mean when they say agents? They really mean, like, greater automation, fully automated, like, the dream is, like, here's an issue, go implement that. And I don't have to think about it as a human. And I think we are working towards that. Like, that is the eventual goal. I think it's specifically the approach of, like, hey, can we have a transformer-based LM alone be the kind of, like, backbone or the orchestrator of these agentic flows? Where we're a little bit more bearish today. [00:22:31]Swyx: You want the human in the loop. [00:22:32]Beyang: I mean, you kind of have to. It's just a reality of the behavior of language models that are purely, like, transformer-based. And I think that's just like a reflection of reality. And I don't think people realize that yet. Because if you look at the way that a lot of other AI tools have implemented context fetching, for instance, like, you see this in the Copilot approach, where if you use, like, the at-workspace thing that supposedly provides, like, code-based level context, it has, like, an agentic approach where you kind of look at how it's behaving. And it feels like they're making multiple requests to the LM being like, what would you do in this case? Would you search for stuff? What sort of files would you gather? Go and read those files. And it's like a multi-hop step, so it takes a long while. It's also non-deterministic. Because any sort of, like, LM invocation, it's like a dice roll. And then at the end of the day, the context it fetches is not that good. Whereas our approach is just like, OK, let's do some code searches that make sense. And then maybe, like, crawl through the reference graph a little bit. That is fast. That doesn't require any sort of LM invocation at all. And we can pull in much better context, you know, very quickly. So it's faster. [00:23:37]Swyx: It's more reliable. [00:23:37]Beyang: It's deterministic. And it yields better context quality. And so that's what we think. We just don't think you should cargo cult or naively go like, you know, agents are the [00:23:46]Swyx: future. [00:23:46]Beyang: Let's just try to, like, implement agents on top of the LM that exists today. I think there are a couple of other technologies or approaches that need to be refined first before we can get into these kind of, like, multi-stage, fully automated workflows. [00:24:00]Swyx: It makes sense. You know, we're very much focused on developer inner loop right now. But you do see things eventually moving towards developer outer loop. Yeah. So would you basically say that they're tackling the agent's problem that you don't want to tackle? [00:24:11]Beyang: No, I would say at a high level, we are after maybe, like, the same high level problem, which is like, hey, I want some code written. I want to develop some software and can automate a system. Go build that software for me. I think the approaches might be different. So I think the analogy in my mind is, I think about, like, the AI chess players. Coding, in some senses, I mean, it's similar and dissimilar to chess. I think one question I ask is, like, do you think producing code is more difficult than playing chess or less difficult than playing chess? More. [00:24:41]Swyx: I think more. [00:24:41]Beyang: Right. And if you look at the best AI chess players, like, yes, you can use an LLM to play chess. Like, people have showed demos where it's like, oh, like, yeah, GPT-4 is actually a pretty decent, like, chess move suggester. Right. But you would never build, like, a best in class chess player off of GPT-4 alone. [00:24:57]Swyx: Right. [00:24:57]Beyang: Like, the way that people design chess players is that you have kind of like a search space and then you have a way to explore that search space efficiently. There's a bunch of search algorithms, essentially. We were doing tree search in various ways. And you can have heuristic functions, which might be powered by an LLM. [00:25:12]Swyx: Right. [00:25:12]Beyang: Like, you might use an LLM to generate proposals in that space that you can efficiently explore. But the backbone is still this kind of more formalized tree search based approach rather than the LLM itself. And so I think my high level intuition is that, like, the way that we get to more reliable multi-step workflows that do things beyond, you know, generate unit test, it's really going to be like a search based approach where you use an LLM as kind of like an advisor or a proposal function, sort of your heuristic function, like the ASTAR search algorithm. But it's probably not going to be the thing that is the backbone, because I guess it's not the right tool for that. Yeah. [00:25:50]Swyx: I can see yourself kind of thinking through this, but not saying the words, the sort of philosophical Peter Norvig type discussion. Maybe you want to sort of introduce that in software. Yeah, definitely. [00:25:59]Beyang: So your listeners are savvy. They're probably familiar with the classic like Chomsky versus Norvig debate. [00:26:04]Swyx: No, actually, I wanted, I was prompting you to introduce that. Oh, got it. [00:26:08]Beyang: So, I mean, if you look at the history of artificial intelligence, right, you know, it goes way back to, I don't know, it's probably as old as modern computers, like 50s, 60s, 70s. People are debating on like, what is the path to producing a sort of like general human level of intelligence? And kind of two schools of thought that emerged. One is the Norvig school of thought, which roughly speaking includes large language models, you know, regression, SVN, basically any model that you kind of like learn from data. And it's like data driven. Most of machine learning would fall under this umbrella. And that school of thought says like, you know, just learn from the data. That's the approach to reaching intelligence. And then the Chomsky approach is more things like compilers and parsers and formal systems. So basically like, let's think very carefully about how to construct a formal, precise system. And that will be the approach to how we build a truly intelligent system. I think Lisp was invented so that you could create like rules-based systems that you would call AI. As a language. Yeah. And for a long time, there was like this debate, like there's certain like AI research labs that were more like, you know, in the Chomsky camp and others that were more in the Norvig camp. It's a debate that rages on today. And I feel like the consensus right now is that, you know, Norvig definitely has the upper hand right now with the advent of LMs and diffusion models and all the other recent progress in machine learning. But the Chomsky-based stuff is still really useful in my view. I mean, it's like parsers, compilers, basically a lot of the stuff that provides really good context. It provides kind of like the knowledge graph backbone that you want to explore with your AI dev tool. Like that will come from kind of like Chomsky-based tools like compilers and parsers. It's a lot of what we've invested in in the past decade at Sourcegraph and what you build with Grok. Basically like these formal systems that construct these very precise knowledge graphs that are great context providers and great kind of guard rails enforcers and kind of like safety checkers for the output of a more kind of like data-driven, fuzzier system that uses like the Norvig-based models. [00:28:03]Steve: Jang was talking about this stuff like it happened in the middle ages. Like, okay, so when I was in college, I was in college learning Lisp and prologue and planning and all the deterministic Chomsky approaches to AI. And I was there when Norvig basically declared it dead. I was there 3,000 years ago when Norvig and Chomsky fought on the volcano. When did he declare it dead? [00:28:26]Swyx: What do you mean he declared it dead? [00:28:27]Steve: It was like late 90s. [00:28:29]Swyx: Yeah. [00:28:29]Steve: When I went to Google, Peter Norvig was already there. He had basically like, I forget exactly where. It was some, he's got so many famous short posts, you know, amazing. [00:28:38]Swyx: He had a famous talk, the unreasonable effectiveness of data. Yeah. [00:28:41]Steve: Maybe that was it. But at some point, basically, he basically convinced everybody that deterministic approaches had failed and that heuristic-based, you know, data-driven statistical approaches, stochastic were better. [00:28:52]Swyx: Yeah. [00:28:52]Steve: The primary reason I can tell you this, because I was there, was that, was that, well, the steam-powered engine, no. The reason was that the deterministic stuff didn't scale. [00:29:06]Swyx: Yeah. Right. [00:29:06]Steve: They're using prologue, man, constraint systems and stuff like that. Well, that was a long time ago, right? Today, actually, these Chomsky-style systems do scale. And that's, in fact, exactly what Sourcegraph has built. Yeah. And so we have a very unique, I love the framing that Bjong's made, that the marriage of the Chomsky and the Norvig, you know, sort of models, you know, conceptual models, because we, you know, we have both of them and they're both really important. And in fact, there, there's this really interesting, like, kind of overlap between them, right? Where like the AI or our graph or our search engine could potentially provide the right context for any given query, which is, of course, why ranking is important. But what we've really signed ourselves up for is an extraordinary amount of testing. [00:29:45]Swyx: Yeah. [00:29:45]Steve: Because in SWIGs, you were saying that, you know, GPT-4 tends to the front of the context window and maybe other LLMs to the back and maybe, maybe the LLM in the middle. [00:29:53]Swyx: Yeah. [00:29:53]Steve: And so that means that, you know, if we're actually like, you know, verifying whether we, you know, some change we've made has improved things, we're going to have to test putting it at the beginning of the window and at the end of the window, you know, and maybe make the right decision based on the LLM that you've chosen. Which some of our competitors, that's a problem that they don't have, but we meet you, you know, where you are. Yeah. And we're, just to finish, we're writing tens of thousands. We're generating tests, you know, fill in the middle type tests and things. And then using our graph to basically sort of fine tune Cody's behavior there. [00:30:20]Swyx: Yeah. [00:30:21]Beyang: I also want to add, like, I have like an internal pet name for this, like kind of hybrid architecture that I'm trying to make catch on. Maybe I'll just say it here. Just saying it publicly kind of makes it more real. But like, I call the architecture that we've developed the Normsky architecture. [00:30:36]Swyx: Yeah. [00:30:36]Beyang: I mean, it's obviously a portmanteau of Norvig and Chomsky, but the acronym, it stands for non-agentic, rapid, multi-source code intelligence. So non-agentic because... Rolls right off the tongue. And Normsky. But it's non-agentic in the sense that like, we're not trying to like pitch you on kind of like agent hype, right? Like it's the things it does are really just developer tools developers have been using for decades now, like parsers and really good search indexes and things like that. Rapid because we place an emphasis on speed. We don't want to sit there waiting for kind of like multiple LLM requests to return to complete a simple user request. Multi-source because we're thinking broadly about what pieces of information and knowledge are useful context. So obviously starting with things that you can search in your code base, and then you add in the reference graph, which kind of like allows you to crawl outward from those initial results. But then even beyond that, you know, sources of information, like there's a lot of knowledge that's embedded in docs, in PRDs or product specs, in your production logging system, in your chat, in your Slack channel, right? Like there's so much context is embedded there. And when you're a human developer, and you're trying to like be productive in your code base, you're going to go to all these different systems to collect the context that you need to figure out what code you need to write. And I don't think the AI developer will be any different. It will need to pull context from all these different sources. So we're thinking broadly about how to integrate these into Codi. We hope through kind of like an open protocol that like others can extend and implement. And this is something else that should be accessible by December 14th in kind of like a preview stage. But that's really about like broadening this notion of the code graph beyond your Git repository to all the other sources where technical knowledge and valuable context can live. [00:32:21]Steve: Yeah, it becomes an artifact graph, right? It can link into your logs and your wikis and any data source, right? [00:32:27]Alessio: How do you guys think about the importance of, it's almost like data pre-processing in a way, which is bring it all together, tie it together, make it ready. Any thoughts on how to actually make that good? Some of the innovation you guys have made. [00:32:40]Steve: We talk a lot about the context fetching, right? I mean, there's a lot of ways you could answer this question. But, you know, we've spent a lot of time just in this podcast here talking about context fetching. But stuffing the context into the window is, you know, the bin packing problem, right? Because the window is not big enough, and you've got more context than you can fit. You've got a ranker maybe. But what is that context? Is it a function that was returned by an embedding or a graph call or something? Do you need the whole function? Or do you just need, you know, the top part of the function, this expression here, right? You know, so that art, the golf game of trying to, you know, get each piece of context down into its smallest state, possibly even summarized by another model, right, before it even goes to the LLM, becomes this is the game that we're in, yeah? And so, you know, recursive summarization and all the other techniques that you got to use to like stuff stuff into that context window become, you know, critically important. And you have to test them across every configuration of models that you could possibly need. [00:33:32]Beyang: I think data preprocessing is probably the like unsexy, way underappreciated secret to a lot of the cool stuff that people are shipping today. Whether you're doing like RAG or fine tuning or pre-training, like the preprocessing step matters so much because it's basically garbage in, garbage out, right? Like if you're feeding in garbage to the model, then it's going to output garbage. Concretely, you know, for code RAG, if you're not doing some sort of like preprocessing that takes advantage of a parser and is able to like extract the key components of a particular file of code, you know, separate the function signature from the body, from the doc string, what are you even doing? Like that's like table stakes. It opens up so much more possibilities with which you can kind of like tune your system to take advantage of the signals that come from those different parts of the code. Like we've had a tool, you know, since computers were invented that understands the structure of source code to a hundred percent precision. The compiler knows everything there is to know about the code in terms of like structure. Like why would you not want to use that in a system that's trying to generate code, answer questions about code? You shouldn't throw that out the window just because now we have really good, you know, data-driven models that can do other things. [00:34:44]Steve: Yeah. When I called it a data moat, you know, in my cheating post, a lot of people were confused, you know, because data moat sort of sounds like data lake because there's data and water and stuff. I don't know. And so they thought that we were sitting on this giant mountain of data that we had collected, but that's not what our data moat is. It's really a data pre-processing engine that can very quickly and scalably, like basically dissect your entire code base in a very small, fine-grained, you know, semantic unit and then serve it up. Yeah. And so it's really, it's not a data moat. It's a data pre-processing moat, I guess. [00:35:15]Beyang: Yeah. If anything, we're like hypersensitive to customer data privacy requirements. So it's not like we've taken a bunch of private data and like, you know, trained a generally available model. In fact, exactly the opposite. A lot of our customers are choosing Cody over Copilot and other competitors because we have an explicit guarantee that we don't do any of that. And that we've done that from day one. Yeah. I think that's a very real concern in today's day and age, because like if your proprietary IP finds its way into the training set of any model, it's very easy both to like extract that knowledge from the model and also use it to, you know, build systems that kind of work on top of the institutional knowledge that you've built up. [00:35:52]Alessio: About a year ago, I wrote a post on LLMs for developers. And one of the points I had was maybe the depth of like the DSL. I spent most of my career writing Ruby and I love Ruby. It's so nice to use, but you know, it's not as performant, but it's really easy to read, right? And then you look at other languages, maybe they're faster, but like they're more verbose, you know? And when you think about efficiency of the context window, that actually matters. [00:36:15]Swyx: Yeah. [00:36:15]Alessio: But I haven't really seen a DSL for models, you know? I haven't seen like code being optimized to like be easier to put in a model context. And it seems like your pre-processing is kind of doing that. Do you see in the future, like the way we think about the DSL and APIs and kind of like service interfaces be more focused on being context friendly, where it's like maybe it's harder to read for the human, but like the human is never going to write it anyway. We were talking on the Hacks podcast. There are like some data science things like spin up the spandex, like humans are never going to write again because the models can just do very easily. Yeah, curious to hear your thoughts. [00:36:51]Steve: Well, so DSLs, they involve, you know, writing a grammar and a parser and they're like little languages, right? We do them that way because, you know, we need them to compile and humans need to be able to read them and so on. The LLMs don't need that level of structure. You can throw any pile of crap at them, you know, more or less unstructured and they'll deal with it. So I think that's why a DSL hasn't emerged for sort of like communicating with the LLM or packaging up the context or anything. Maybe it will at some point, right? We've got, you know, tagging of context and things like that that are sort of peeking into DSL territory, right? But your point on do users, you know, do people have to learn DSLs like regular expressions or, you know, pick your favorite, right? XPath. I think you're absolutely right that the LLMs are really, really good at that. And I think you're going to see a lot less of people having to slave away learning these things. They just have to know the broad capabilities and the LLM will take care of the rest. [00:37:42]Swyx: Yeah, I'd agree with that. [00:37:43]Beyang: I think basically like the value profit of DSL is that it makes it easier to work with a lower level language, but at the expense of introducing an abstraction layer. And in many cases today, you know, without the benefit of AI cogeneration, like that totally worth it, right? With the benefit of AI cogeneration, I mean, I don't think all DSLs will go away. I think there's still, you know, places where that trade-off is going to be worthwhile. But it's kind of like how much of source code do you think is going to be generated through natural language prompting in the future? Because in a way, like any programming language is just a DSL on top of assembly, right? And so if people can do that, then yeah, like maybe for a large portion of the code [00:38:21]Swyx: that's written, [00:38:21]Beyang: people don't actually have to understand the DSL that is Ruby or Python or basically any other programming language that exists. [00:38:28]Steve: I mean, seriously, do you guys ever write SQL queries now without using a model of some sort? At least a draft. [00:38:34]Swyx: Yeah, right. [00:38:36]Steve: And so we have kind of like, you know, past that bridge, right? [00:38:39]Alessio: Yeah, I think like to me, the long-term thing is like, is there ever going to be, you don't actually see the code, you know? It's like, hey, the basic thing is like, hey, I need a function to some two numbers and that's it. I don't need you to generate the code. [00:38:53]Steve: And the following question, do you need the engineer or the paycheck? [00:38:56]Swyx: I mean, right? [00:38:58]Alessio: That's kind of the agent's discussion in a way where like you cannot automate the agents, but like slowly you're getting more of the atomic units of the work kind of like done. I kind of think of it as like, you know, [00:39:09]Beyang: do you need a punch card operator to answer that for you? And so like, I think we're still going to have people in the role of a software engineer, but the portion of time they spend on these kinds of like low-level, tedious tasks versus the higher level, more creative tasks is going to shift. [00:39:23]Steve: No, I haven't used punch cards. [00:39:25]Swyx: Yeah, I've been talking about like, so we kind of made this podcast about the sort of rise of the AI engineer. And like the first step is the AI enhanced engineer. That is that software developer that is no longer doing these routine, boilerplate-y type tasks, because they're just enhanced by tools like yours. So you mentioned OpenCodeGraph. I mean, that is a kind of DSL maybe, and because we're releasing this as you go GA, you hope for other people to take advantage of that? [00:39:52]Beyang: Oh yeah, I would say so OpenCodeGraph is not a DSL. It's more of a protocol. It's basically like, hey, if you want to make your system, whether it's, you know, chat or logging or whatever accessible to an AI developer tool like Cody, here's kind of like the schema by which you can provide that context and offer hints. So I would, you know, comparisons like LSP obviously did this for kind of like standard code intelligence. It's kind of like a lingua franca for providing fine references and codefinition. There's kind of like analogs to that. There might be also analogs to kind of the original OpenAI, kind of like plugins, API. There's all this like context out there that might be useful for an LM-based system to consume. And so at a high level, what we're trying to do is define a common language for context providers to provide context to other tools in the software development lifecycle. Yeah. Do you have any critiques of LSP, by the way, [00:40:42]Swyx: since like this is very much, very close to home? [00:40:45]Steve: One of the authors wrote a really good critique recently. Yeah. I don't think I saw that. Yeah, yeah. LSP could have been better. It just came out a couple of weeks ago. It was a good article. [00:40:54]Beyang: Yeah. I think LSP is great. Like for what it did for the developer ecosystem, it was absolutely fantastic. Like nowadays, like it's much easier now to get code navigation up and running in a bunch of editors by speaking this protocol. I think maybe the interesting question is like looking at the different design decisions comparing LSP basically with Kythe. Because Kythe has more of a... How would you describe it? [00:41:18]Steve: A storage format. [00:41:20]Beyang: I think the critique of LSP from a Kythe point of view would be like with LSP, you don't actually have an actual symbolic model of the code. It's not like LSP models like, hey, this function calls this other function. LSP is all like range-based. Like, hey, your cursor's at line 32, column 1. [00:41:35]Swyx: Yeah. [00:41:35]Beyang: And that's the thing you feed into the language server. And then it's like, okay, here's the range that you should jump to if you click on that range. So it kind of is intentionally ignorant of the fact that there's a thing called a reference underneath your cursor, and that's linked to a symbol definition. [00:41:49]Steve: Well, actually, that's the worst example you could have used. You're right. But that's the one thing that it actually did bake in is following references. [00:41:56]Swyx: Sure. [00:41:56]Steve: But it's sort of hardwired. [00:41:58]Swyx: Yeah. [00:41:58]Steve: Whereas Kythe attempts to model [00:42:00]Beyang: like all these things explicitly. [00:42:02]Swyx: And so... [00:42:02]Steve: Well, so LSP is a protocol, right? And so Google's internal protocol is gRPC-based. And it's a different approach than LSP. It's basically you make a heavy query to the back end, and you get a lot of data back, and then you render the whole page, you know? So we've looked at LSP, and we think that it's a little long in the tooth, right? I mean, it's a great protocol, lots and lots of support for it. But we need to push into the domain of exposing the intelligence through the protocol. Yeah. [00:42:29]Beyang: And so I would say we've developed a protocol of our own called Skip, which is at a very high level trying to take some of the good ideas from LSP and from Kythe and merge that into a system that in the near term is useful for Sourcegraph, but I think in the long term, we hope will be useful for the ecosystem. Okay, so here's what LSP did well. LSP, by virtue of being like intentionally dumb, dumb in air quotes, because I'm not like ragging on it, allowed language servers developers to kind of like bypass the hard problem of like modeling language semantics precisely. So like if all you want to do is jump to definition, you don't have to come up with like a universally unique naming scheme for each symbol, which is actually quite challenging because you have to think about like, okay, what's the top scope of this name? Is it the source code repository? Is it the package? Does it depend on like what package server you're fetching this from? Like whether it's the public one or the one inside your... Anyways, like naming is hard, right? And by just going from kind of like a location to location based approach, you basically just like throw that out the window. All I care about is jumping definition, just make that work. And you can make that work without having to deal with like all the complex global naming things. The limitation of that approach is that it's harder to build on top of that to build like a true knowledge graph. Like if you actually want a system that says like, okay, here's the web of functions and here's how they reference each other. And I want to incorporate that like semantic model of how the code operates or how the code relates to each other at like a static level. You can't do that with LSP because you have to deal with line ranges. And like concretely the pain point that we found in using LSP for source graph is like in order to do like a find references [00:44:04]Swyx: and then jump definitions, [00:44:04]Beyang: it's like a multi-hop process because like you have to jump to the range and then you have to find the symbol at that range. And it just adds a lot of latency and complexity of these operations where as a human, you're like, well, this thing clearly references this other thing. Why can't you just jump me to that? And I think that's the thing that Kaith does well. But then I think the issue that Kaith has had with adoption is because it is more sophisticated schema, I think. And so there's basically more things that you have to implement to get like a Kaith implementation up and running. I hope I'm not like, correct me if I'm wrong about any of this. [00:44:35]Steve: 100%, 100%. Kaith also has a problem, all these systems have the problem, even skip, or at least the way that we implemented the indexers, that they have to integrate with your build system in order to build that knowledge graph, right? Because you have to basically compile the code in a special mode to generate artifacts instead of binaries. And I would say, by the way, earlier I was saying that XREFs were in LSP, but it's actually, I was thinking of LSP plus LSIF. [00:44:58]Swyx: Yeah. That's another. [00:45:01]Steve: Which is actually bad. We can say that it's bad, right? [00:45:04]Steve: It's like skip or Kaith, it's supposed to be sort of a model serialization, you know, for the code graph, but it basically just does what LSP needs, the bare minimum. LSIF is basically if you took LSP [00:45:16]Beyang: and turned that into a serialization format. So like you build an index for language servers to kind of like quickly bootstrap from cold start. But it's a graph model [00:45:23]Steve: with all of the inconvenience of the API without an actual graph. And so, yeah. [00:45:29]Beyang: So like one of the things that we try to do with skip is try to capture the best of both worlds. So like make it easy to write an indexer, make the schema simple, but also model some of the more symbolic characteristics of the code that would allow us to essentially construct this knowledge graph that we can then make useful for both the human developer through SourceGraph and through the AI developer through Cody. [00:45:49]Steve: So anyway, just to finish off the graph comment, we've got a new graph, yeah, that's skip based. We call it BFG internally, right? It's a beautiful something graph. A big friendly graph. [00:46:00]Swyx: A big friendly graph. [00:46:01]Beyang: It's a blazing fast. [00:46:02]Steve: Blazing fast. [00:46:03]Swyx: Blazing fast graph. [00:46:04]Steve: And it is blazing fast, actually. It's really, really interesting. I should probably have to do a blog post about it to walk you through exactly how they're doing it. Oh, please. But it's a very AI-like iterative, you know, experimentation sort of approach. We're building a code graph based on all of our 10 years of knowledge about building code graphs, yeah? But we're building it quickly with zero configuration, and it doesn't have to integrate with your build. And through some magic tricks that we have. And so what just happens when you install the plugin, that it'll be there and indexing your code and providing that knowledge graph in the background without all that build system integration. This is a bit of secret sauce that we haven't really like advertised it very much lately. But I am super excited about it because what they do is they say, all right, you know, let's tackle function parameters today. Cody's not doing a very good job of completing function call arguments or function parameters in the definition, right? Yeah, we generate those thousands of tests, and then we can actually reuse those tests for the AI context as well. So fortunately, things are kind of converging on, we have, you know, half a dozen really, really good context sources, and we mix them all together. So anyway, BFG, you're going to hear more about it probably in the holidays? [00:47:12]Beyang: I think it'll be online for December 14th. We'll probably mention it. BFG is probably not the public name we're going to go with. I think we might call it like Graph Context or something like that. [00:47:20]Steve: We're officially calling it BFG. [00:47:22]Swyx: You heard it here first. [00:47:24]Beyang: BFG is just kind of like the working name. And so the impetus for BFG was like, if you look at like current AI inline code completion tools and the errors that they make, a lot of the errors that they make, even in kind of like the easy, like single line case, are essentially like type errors, right? Like you're trying to complete a function call and it suggests a variable that you defined earlier, but that variable is the wrong type. [00:47:47]Swyx: And that's the sort of thing [00:47:47]Beyang: where it's like a first year, like freshman CS student would not make that error, right? So like, why does the AI make that error? And the reason is, I mean, the AI is just suggesting things that are plausible without the context of the types or any other like broader files in the code. And so the kind of intuition here is like, why don't we just do the basic thing that like any baseline intelligent human developer would do, which is like click jump to definition, click some fine references and pull in that like Graph Context into the context window and then have it generate the completion. So like that's sort of like the MVP of what BFG was. And turns out that works really well. Like you can eliminate a lot of type errors that AI coding tools make just by pulling in that context. Yeah, but the graph is definitely [00:48:32]Steve: our Chomsky side. [00:48:33]Swyx: Yeah, exactly. [00:48:34]Beyang: So like this like Chomsky-Norvig thing, I think pops up in a bunch of differ
Ned Bellavance worked in the world of tech for more than a decade before joining the family profession as an educator. He joins Corey on Screaming in the Cloud to discuss his shift from engineer to educator and content creator, the intricacies of Terraform, and how changes in licensing affect the ecosystem.About NedNed is an IT professional with more than 20 years of experience in the field. He has been a helpdesk operator, systems administrator, cloud architect, and product manager. In 2019, Ned founded Ned in the Cloud LLC to work as an independent educator, creator, and consultant. In this new role, he develops courses for Pluralsight, runs multiple podcasts, writes books, and creates original content for technology vendors.Ned is a Microsoft MVP since 2017 and a HashiCorp Ambassador since 2020.Ned has three guiding principles: embrace discomfort, fail often, and be kind.Links Referenced: Ned in the Cloud: https://nedinthecloud.com/ LinkedIn: https://www.linkedin.com/in/ned-bellavance/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is Ned Bellavance, who's the founder and curious human over at Ned in the Cloud. Ned, thank you for joining me.Ned: Yeah, it's a pleasure to be here, Corey.Corey: So, what is Ned in the Cloud? There are a bunch of easy answers that I feel don't give the complete story like, “Oh, it's a YouTube channel,” or, “Oh no, it's the name that you wound up using because of, I don't know, easier to spell the URL or something.” Where do you start? Where do you stop? What are you exactly?Ned: What am I? Wow, I didn't know we were going to get this deep into philosophical territory this early. I mean, you got to ease me in with something. But so, Ned in the Cloud is the name of my blog from back in the days when we all started up a blog and hosted on WordPress and had fun. And then I was also at the same time working for a value-added reseller as a consultant, so a lot of what went on my blog was stuff that happened to me in the world of consulting.And you're always dealing with different levels of brokenness when you go to clients, so you see some interesting things, and I blogged about them. At a certain point, I decided I want to go out and do my own thing, mostly focused on training and education and content creation and I was looking for a company name. And I went through—I had a list of about 40 different names. And I showed them to my wife, and she's like, “Why don't you go Ned in the Cloud? Why are you making this more complicated than it needs to be?”And I said, “Well, I'm an engineer. That is my job, by definition, but you're probably right. I should just go with Ned in the Cloud.” So, Ned in the Cloud now is a company, just me, focused on creating educational content for technical learners on a variety of different platforms. And if I'm delivering educational content, I am a happy human, and if I'm not doing that, I'm probably out running somewhere.Corey: I like that, and I'd like to focus on education first. There are a number of reasons that people will go in that particular direction, but what was it for you?Ned: I think it's kind of in the heritage of my family. It's in my blood to a certain degree because my dad is a teacher, my mom is a teacher-turned-librarian, my sister is a teacher, my wife is a teacher, her mother is a teacher. So, there was definitely something in the air, and I think at a certain point, I was the black sheep in the sense that I was the engineer. Look, this guy over here. And then I ended up deciding that I really liked training people and learning and teaching, and became a teacher of sorts, and then they all went, “Welcome to the fold.”Corey: It's fun when you get to talk to people about the things that they're learning because when someone's learning something I find that it's the time when their mind is the most open. I don't think that that's something that you don't get to see nearly as much once someone already, quote-unquote, “Knows a thing,” because once that happens, why would you go back and learn something new? I have always learned the most—even about things that I've built myself—by putting it in the hands of users and seeing how they honestly sometimes hold it wrong and make mistakes that don't make sense to me, but absolutely make sense to them. Learning something—or rather, teaching something—versus building that thing is very much an orthogonal skill set, and I don't think that there's enough respect given to that understanding.Ned: It's an interesting sphere of people who can both build the thing and then teach somebody else to build the thing because you're right, it's very different skill sets. Being able to teach means that you have to empathize with the human being that you're teaching and understand that their perspective is not yours necessarily. And one of the skills that you build up as an instructor is realizing when you're making a whole bunch of assumptions because you know something really well, and that the person that you're teaching is not going to have that context, they're not going to have all those assumptions baked in, so you have to actually explain that stuff out. Some of my instruction has been purely online video courses through, like, Pluralsight; less of a feedback loop there. I have to publish the entire course, and then I started getting feedback, so I really enjoy doing live trainings as well because then I get the questions right away.And I always insist, like, if I'm delivering a lecture, and you have a question, please don't wait for the end. Please interrupt me immediately because you're going to forget what that question is, you're going to lose your train of thought, and then you're not going to ask it. And the whole class benefits when someone asks a question, and I benefit too. I learn how to explain that concept better. So, I really enjoy the live setting, but making the video courses is kind of nice, too.Corey: I learned to speak publicly and give conference talks as a traveling contract trainer for Puppet years ago, and that was an eye-opening experience, just because you don't really understand something until you're teaching other people how it works. It's how I learned Git. I gave a conference talk that explained Git to people, and that was called a forcing function because I had four months to go to learn this thing I did not fully understand and welp, they're not going to move the conference for me, so I guess I'd better hustle. I wouldn't necessarily recommend that approach. These days, it seems like you have a, let's say, disproportionate level of focus on the area of Infrastructure as Code, specifically you seem to be aiming at Terraform. Is that an accurate way of describing it?Ned: That is a very accurate way of describing it. I discovered Terraform while I was doing my consulting back in 2016 era, so this was pretty early on in the product's lifecycle. But I had been using CloudFormation, and at that time, CloudFormation only supported JSON, which meant it was extra punishing. And being able to describe something more succinctly and also have access to all these functions and loops and variables, I was like, “This is amazing. Where were you a year ago?” And so, I really just jumped in with both feet into Terraform.And at a certain point, I was at a conference, and I went past the Pluralsight booth, and they mentioned that they were looking for instructors. And I thought to myself, well, I like talking about things, and I'm pretty excited about this Terraform thing. Why don't I see if they're looking for someone to do a Terraform course? And so, I went through their audition process and sure enough, that is exactly what they were looking for. They had no getting started course for Terraform at the time. I published the course in 2017, and it has been in the top 50 courses ever since on Pluralsight. So, that told me that there's definitely an appetite and maybe this is an area I should focus on a little bit more.Corey: It's a difficult area to learn. About two months ago, I started using Terraform for the first time in anger in ages. I mean, I first discovered it when I was on my way back from one of those Puppet trainings, and the person next to me was really excited about this thing that we're about to launch. Turns out that was Mitchell Hashimoto and Armon was sitting next to him on the other side. Why he had a middle seat, I'll never know.But it was a really fun conversation, just talking about how he saw the world and what he was planning on doing. And a lot of that vision was realized. What I figured out a couple months ago is both that first, I'm sort of sad that Terraform is as bad as it is, but it's the best option we've got because everything else is so much worse. It is omnipresent, though. Effectively, every client I've ever dealt with on AWS billing who has a substantial estate is managing it via Terraform.It is the lingua franca of cloud across the board. I just wish it didn't require as much care and feeding, especially for the getting-started-with-a-boilerplate type of scenario. So, much of what you type feels like it's useless stuff that should be implicit. I understand why it's not, but it feels that way. It's hard to learn.Ned: It certainly can be. And you're right, there's a certain amount of boilerplate and [sigh] code that you have to write that seems pointless. Like, do I have to actually spell this all out? And sometimes the answer is yes, and sometimes the answer is you should use a module for that. Why are you writing this entire VPC configuration out yourself? And that's the sort of thing that you learn over time is that there are shortcuts, there are ways to make the code simpler and require less care and feeding.But I think ultimately, your infrastructure, just like your software, evolves, changes has new requirements, and you need to manage it in the same way that you want to manage your software. And I wouldn't tell a software developer, “Oh, you know, you could just write it once and never go back to it. I'm sure it's fine.” And by the same token, I wouldn't tell an infrastructure developer the same thing. Now, of course, people do that and never go back and touch it, and then somebody else inherits that infrastructure and goes, “Oh, God. Where's the state data?” And no one knows, and then you're starting from scratch. But hopefully, if you have someone who's doing it responsibly, they'll be setting up Terraform in such a way that it is maintainable by somebody else.Corey: I'd sure like to hope so. I have encountered so many horrible examples of code and wondering what malicious person wrote this. And of course, it was me, 6 or 12 months ago.Ned: Always [laugh].Corey: I get to play architect around a lot of these things. In fact, that's one of the problems that I've had historically with an awful lot of different things that I've basically built, called it feature complete, let it sit for a while using the CDK or whatnot, and then oh, I want to make a small change to it. Well, first, I got to spend half a day during the entire line dependency updates and seeing what's broken and how all of that works. It feels like for better or worse, Terraform is a lot more stable than that, as in, old versions of Terraform code from blog posts from 2016 will still effectively work. Is that accurate? I haven't done enough exploring in that direction to be certain.Ned: The good thing about Terraform is you can pin the version of various things that you're using. So, if you're using a particular version of the AWS provider, you can pin it to that specific version, and it won't automatically upgrade you to the latest and greatest. If you didn't do that, then you'll get bit by the update bug, which certainly happens to some folks when they changed the provider from version 3 to version 4 and completely changed how the S3 bucket object was created. A lot of people's scripts broke that day, so I think that was the time for everyone to learn what the version argument is and how it works. But yeah, as long as you follow that general convention of pinning versions of your modules and of your resource provider, you should be in a pretty stable place when you want to update it.Corey: Well, here's the $64,000 question for you, then. Does Dependabot on your GitHub repo begin screaming at you as soon as you've done that because in one of its dependencies in some particular weird edge cases when they're dealing with unsanitized, internet-based input could wind up taking up too many system resources, for example? Which is, I guess, in an ideal world, it wouldn't be an issue, but in practice, my infrastructure team is probably not trying to attack the company from the inside. They have better paths to get there, to be very blunt.Ned: [laugh].Corey: Turns out giving someone access to a thing just directly is way easier than making them find it. But that's been one of the frustrating parts where, especially when it encounters things like, I don't know, corporate security policies of, “Oh, you must clear all of these warnings,” which well-intentioned, poorly executed seems to be the takeaway there.Ned: Yeah, I've certainly seen some implementations of tools that do static scanning of Terraform code and will come up with vulnerabilities or violations of best practice, then you have to put exceptions in there. And sometimes it'll be something like, “You shouldn't have your S3 bucket public,” which in most cases, you shouldn't, but then there's the one team that's actually publishing a front-facing static website in the S3 bucket, and then they have to get, you know, special permission from on high to ignore that warning. So, a lot of those best practices that are in the scanning tools are there for very good reasons and when you onboard them, you should be ready to see a sea of red in your scan the first time and then look through that and kind of pick through what's actually real, and we should improve in our code, and what's something that we can safely ignore because we are intentionally doing it that way.Corey: I feel like there's an awful lot of… how to put this politely… implicit dependencies that are built into things. I'll wind up figuring out how to do something by implementing it and that means I will stitch together an awful lot of blog posts, things I found on Stack Overflow, et cetera, just like a senior engineer and also Chat-Gippity will go ahead and do those things. And then the reason—like, someone asks me four years later, “Why is that thing there?” And… “Well, I don't know, but if I remove it, it might stop working, so…” there was almost a cargo-culting style of, well, it's always been there. So, is that necessary? Is it not?I'm ashamed by how often I learned something very fundamental in a system that I've been using for 20 years—namely, the command line—just by reading the man page for a command that I already, quote-unquote, “Already know how to use perfectly well.” Yeah, there's a lot of hidden gems buried in those things.Ned: Oh, my goodness, I learned something about the Terraform CLI last week that I wish I'd known two years ago. And it's been there for a long time. It's like, when you want to validate your code with the terraform validate, you can initialize without initializing the back-end, and for those who are steeped in Terraform, that means something and for everybody else, I'm sorry [laugh]. But I discovered that was an option, and I was like, “Ahhh, this is amazing.” But to get back to the sort of dependency problems and understanding your infrastructure better—because I think that's ultimately what's happening when you have to describe something using Infrastructure as Code—is you discover how the infrastructure actually works versus how you thought it worked.If you look at how—and I'm going to go into Azure-land here, so try to follow along with me—if you go into Azure-land and you look at how they construct a load balancer, the load balancer is not a single resource. It's about eight different resources that are all tied together. And AWS has something similar with how you have target groups, and you have the load balancer component and the listener and the health check and all that. Azure has the same thing. There's no actual load balancer object, per se.There's a bunch of different components that get slammed together to form that load balancer. When you look in the portal, you don't see any of that. You just see a load balancer, and you might think this is a very simple resource to configure. When it actually comes time to break it out into code, you realize, oh, this is eight different components, each of which has its own options and arguments that I need to understand. So, one of the great things that I have seen a lot of tooling up here around is doing the import of existing infrastructure into Terraform by pointing the tool at a collection of resources—whatever they are—and saying, “Go create the Terraform code that matches that thing.” And it's not going to be the most elegant code out there, but it will give you a baseline for what all the settings actually are, and other resource types are, and then you can tweak it as needed to add in input variables or remove some arguments that you're not using.Corey: Yeah, I remember when they first announced the importing of existing state. It's wow, there's an awful lot of stuff that it can be aware of that I will absolutely need to control unless I want it to start blowing stuff away every time I run the—[unintelligible 00:15:51] supposedly [unintelligible 00:15:52] thing against it. And that wasn't a lot of fun. But yeah, this is the common experience of it. I only recently was reminded of the fact that I once knew, and I'd forgotten that a public versus private subnet in AWS is a human-based abstraction, not something that is implicit to the API or the way they envision subnets existing. Kind of nice, but also weird when you have to unlearn things that you've thought you'd learned.Ned: That's a really interesting example of we think of them as very different things, and when we draw nice architecture diagrams there—these are the private subnets and these are the public ones. And when you actually go to create one using Terraform—or really another tool—there's no box that says ‘private' or ‘make this public.' It's just what does your route table look like? Are you sending that traffic out the internet gateway or are you sending it to some sort of NAT device? And how does traffic come back into that subnet? That's it. That's what makes it private versus public versus a database subnet versus any other subnet type you want to logically assign within AWS.Corey: Yeah. It's kind of fun when that stuff hits.Ned: [laugh].Corey: I am curious, as you look across the ecosystem, do you still see that learning Terraform is a primary pain point for, I guess, the modern era of cloud engineer, or has that sunk below the surface level of awareness in some ways?Ned: I think it's taken as a given to a certain degree that if you're a cloud engineer or an aspiring cloud engineer today, one of the things you're going to learn is Infrastructure as Code, and that Infrastructure as Code is probably going to be Terraform. You can still learn—there's a bunch of other tools out there; I'm not going to pretend like Terraform is the end-all be-all, right? We've got—if you want to use a general purpose programming language, you have something like Pulumi out there that will allow you to do that. If you want to use one of the cloud-native tools, you've got something like CloudFormation or Azure has Bicep. Please don't use ARM templates because they hurt. They're still JSON only, so at least CloudFormation added YAML support in there. And while I don't really like YAML, at least it's not 10,000 lines of code to spin up, like, two domain controllers in a subnet.Corey: I personally wind up resolving the dichotomy between oh, should we go with JSON or should we go with YAML by picking the third option everyone hates more. That's why I'm a staunch advocate for XML.Ned: [laugh]. I was going to say XML. Yeah oh, as someone who dealt with SOAP stuff for a while, yeah, XML was particularly painful, so I'm not sad that went away. JSON for me, I work with it better, but YAML is more readable. So, it's like it's, pick your poison on that. But yeah, there's a ton of infrastructure tools out there.They all have basically the same concepts behind them, the same core concepts because they're all deploying the same thing at the end of the day and there's only so many ways you can express that concept. So, once you learn one—say you learned CloudFormation first—then Terraform is not as big of a leap. You're still declaring stuff within a file and then having it go and make those things exist. It's just nuances between the implementation of Terraform versus CloudFormation versus Bicep.Corey: I wish that there were more straightforward abstractions, but I think that as soon as you get those, that inherently limits what you're able to do, so I don't know how you square that circle.Ned: That's been a real difficult thing is, people want some sort of universal cloud or infrastructure language and abstraction. I just want a virtual machine. I don't care what kind of platform I'm on. Just give me a VM. But then you end up very much caring [laugh] what kind of VM, what operating system, what the underlying hardware is when you get to a certain level.So, there are some workloads where you're like, I just needed to run somewhere in a container and I really don't care about any of the underlying stuff. And that's great. That's what Platform as a Service is for. If that's your end goal, go use that. But if you're actually standing up infrastructure for any sort of enterprise company, then you need an abstraction that gives you access to all the underlying bits when you want them.So, if I want to specify different placement groups about my VM, I need access to that setting to create a placement group. And if I have this high-level of abstraction of a virtual machine, it doesn't know what a placement group is, and now I'm stuck at that level of abstraction instead of getting down to the guts, or I'm going into the portal or the CLI and modifying it outside of the tool that I'm supposed to be using.Corey: I want to change gears slightly here. One thing that has really been roiling some very particular people with very specific perspectives has been the BSL license change that Terraform has wound up rolling out. So far, the people that I've heard who have the strongest opinions on it tend to fall into one of three categories: either they work at HashiCorp—fair enough, they work at one of HashiCorp's direct competitors—which yeah, okay, sure, or they tend to be—how to put this delicately—open-source evangelists, of which I freely admit I used to be one and then had other challenges I needed to chase down in other ways. So, I'm curious as to where you, who are not really on the vendor side of this at all, how do you see it shaking out?Ned: Well, I mean, just for some context, essentially what HashiCorp decided to do was to change the licensing from Mozilla Public licensing to BSL for, I think eight of their products and Terraform was amongst those. And really, this sort of tells you where people are. The only one that anybody really made any noise about was Terraform. There's plenty of people that use Vault, but I didn't see a big brouhaha over the fact that Vault changed its licensing. It's really just about Terraform. Which tells you how important it is to the ecosystem.And if I look at the folks that are making the most noise about it, it's like you said, they basically fall into one of two camps: it's the open-source code purists who believe everything should be licensed in completely open-source ways, or at least if you start out with an open-source license, you can't convert to something else later. And then there is a smaller subset of folks who work for HashiCorp competitors, and they really don't like the idea of having to pay HashiCorp a regular fee for what used to be ostensibly free to them to use. And so, what they ended up doing was creating a fork of Terraform, just before the licensing change happened and that fork of Terraform was originally called OpenTF, and they had an OpenTF manifesto. And I don't know about you, when I see the word ‘manifesto,' I back away slowly and try not to make any sudden moves.Corey: You really get the sense there's going to be a body count tied to this. And people are like, “What about the Agile Manifesto?” “Yeah, what about it?”Ned: [laugh]. Yeah, I'm just—when I see ‘manifesto,' I get a little bit nervous because either someone is so incredibly passionate about something that they've kind of gone off the deep end a little bit, or they're being somewhat duplicitous, and they have ulterior motives, let's say. Now, I'm not trying to cast aspersions on anybody. I can't read anybody's mind and tell you exactly what their intention was behind it. I just know that the manifesto reads a little bit like an open-source purist and a little bit like someone having a temper tantrum, and vacillating between the two.But cooler heads prevailed a little bit, and now they have changed the name to OpenTofu, and it has been accepted by the Linux Foundation as a project. So, it's now a member of the Linux Foundation, with all the gravitas that that comes with. And some people at HashiCorp aren't necessarily happy about the Linux Foundation choosing to pull that in.Corey: Yeah, I saw a whole screed, effectively, that their CEO wound up brain-dumping on that frankly, from a messaging perspective, he would have been better served as not to say anything at all, to be very honest with you.Ned: Yeah, that was a bit of a yikes moment for me.Corey: It's very rare that you will listen yourself into trouble as opposed to opening your mouth and getting yourself into trouble.Ned: Exactly.Corey: You wouldn't think I would be one of those—of all people who would have made that observation, you wouldn't think I would be on that list, yet here I am.Ned: Yeah. And I don't think either side is entirely blameless. I understand the motivations behind HashiCorp wanting to make the change. I mean, they're a publicly traded company now and ostensibly that means that they should be making some amount of money for their investors, so they do have to bear that in mind. I don't necessarily think that changing the licensing of Terraform is the way to make that money.I think in the long-term, it's not going—it may not hurt them a lot, but I don't think it's going to help them out a lot, and it's tainted the goodwill of the community to a certain degree. On the other hand, I don't entirely trust what the other businesses are saying as well in their stead. So, there's nobody in this that comes out a hundred percent clean [laugh] on the whole process.Corey: Yeah, I feel like, to be direct, the direct competitors to HashiCorp along its various axes are not the best actors necessarily to complain about what is their largest competitor no longer giving them access to continue to compete against them with their own product. I understand the nuances there, but it also doesn't feel like they are the best ambassadors for that. I also definitely understand where HashiCorp is coming from where, why are we investing all this time, energy, and effort for people to basically take revenue away from us? But there's also the bigger problem, which is, by and large, compared to how many sites are running Terraform and the revenues that HashiCorp puts up for it, they're clearly failing to capture the value they have delivered in a massive way. But counterpoint, if they hadn't been open-source for their life until this point, would they have ever captured that market share? Probably not.Ned: Yeah, I think ultimately, the biggest competitor to their paid offering of Terraform is their free version of Terraform. It literally has enough bells and whistles already included and plenty of options for automating those things and solving the problems that their enterprise product solves that their biggest problem is not other competitors in the Terraform landscape; it's the, “Well, we already have something, and it's good enough.” And I'm not sure how you sell to that person, that's why I'm not in marketing, but I think that is their biggest competitor is the people who already have a solution and are like, “Why do I need to pay for your thing when my thing works well enough?”Corey: That's part of the strange thing that I'm seeing as I look across this entire landscape is it feels like this is not something that is directly going to impact almost anyone out there who's just using this stuff, either the open-source version as a paying customer of any of these things, but it is going to kick up a bunch of dust. And speaking of poor messaging, HashiCorp is not really killing it this quarter, where the initial announcement led to so many questions that were unclear, such as—like, they fixed this later in the frequently asked questions list, but okay, “I'm using Terraform right now and that's fine. I'm building something else completely different. Am I going to lose my access to Terraform if you decide to launch a feature that does what my company does?” And after a couple of days, they put up an indemnity against that. Okay, fine.Like, when Mongo did this, there was a similar type of dynamic that was emerging, but a lot fewer people are writing their own database engine to then sell onward to customers that are provisioning infrastructure on behalf of their customers. And where the boundaries lay for who was considered a direct Terraform competitor was unclear. I'm still not convinced that it is clear enough to bet the business on for a lot of these folks. It comes down to say what you mean, not—instead of hedging, you're not helping your cause any.Ned: Yeah, I think out of the different products that they have, some are very clear-cut. Like, Vault is a server that runs as a service, and so that's very clear what that product is and where the lines of delineation are around Vault. If I go stand up a bunch of Vault servers and offer them as a service, then that is clearly a competitor. But if I have an automation pipeline service and people can technically automate Terraform deployments with my service, even if that's not the core thing that I'm looking to do, am I now a competitor? Like, it's such a fuzzy line because Terraform isn't an application, it's not a server that runs somewhere, it's a CLI tool and a programming language. So yeah, those lines are very, very fuzzy. And I… like I said, it would be better if they say what they meant, as opposed to sort of the mealy-mouthed language that they ended up using and the need to publish multiple revisions of that FAQ to clarify their position on very specific niche use cases.Corey: Yeah, I'm not trying to be difficult or insulting or anything like that. These are hard problems that everyone involved is wrestling with. It just felt a little off, and I think the messaging did them no favors when that wound up hitting. And now, everyone is sort of trying to read the tea leaves and figure out what does this mean because in isolation, it doesn't mean anything. It is a forward-looking thing.Whatever it is you're doing today, no changes are needed for you, until the next version comes out, in which case, okay, now do we incorporate the new thing or don't we? Today, to my understanding, whether I'm running Terraform or OpenTofu entirely comes down to which binary am I invoking to do the apply? There is no difference of which I am aware. That will, of course, change, but today, I don't have to think about that.Ned: Right. OpenTofu is a literal fork of Terraform, and they haven't really added much in the way of features, so it should be completely compatible with Terraform. The two will diverge in the future as feature as new features get added to each one. But yeah, for folks who are using it today, they might just decide to stay on the version pre-fork and stay on that for years. I think HashiCorp has pledged 18 months of support for any minor version of Terraform, so you've got at least a year-and-a-half to decide. And we were kind of talking before the recording, 99% of people using Terraform do not care about this. It does not impact their daily workflow.Corey: No. I don't see customers caring at all. And also, “Oh, we're only going to use the pre-fork version of Terraform,” they're like, “Thanks for the air cover because we haven't updated any of that stuff in five years, so tha”—Ned: [laugh].Corey: “Oh yeah, we're doing it out of license concern. That's it. That's the reason we haven't done anything recent with it.” Because once it's working, changes are scary.Ned: Yeah.Corey: Terraform is one of those scary things, right next to databases, that if I make a change that I don't fully understand—and no one understands everything, as we've covered—then this could really ruin my week. So, I'm going to be very cautious around that.Ned: Yeah, if metrics are to be believed across the automation platforms, once an infrastructure rollout happens with a particular version of Terraform, that version does not get updated. For years. So, I have it on good authority that there's still Terraform version 0.10 and 0.11 running on these automation platforms for really old builds where people are too scared to upgrade to, like, post 0.12 where everything changed in the language.I believe that. People don't want to change it, especially if it's working. And so, for most people, this licensing chain doesn't matter. And all the constant back and forth and bickering just makes people feel a little nervous, and it might end up pushing people away from Terraform as a platform entirely, as opposed to picking a side.Corey: Yeah, and I think that that is probably the fair way to view it at this point where right now—please, friends at HashiCorp and HashiCorp competitors don't yell at me for this—it's basically a nerd slap-fight at the moment.Ned: [laugh].Corey: And of one of the big reasons that I also stay out of these debates almost entirely is that I married a corporate attorney who used to be a litigator and I get frustrated whenever it comes down to license arguments because you see suddenly a bunch of engineers who get to cosplay as lawyers, and reading the comments is infuriating once you realize how a little bit of this stuff works, which I've had 15 years of osmotic learning on this stuff. Whenever I want to upset my wife, I just read some of these comments aloud and then our dinner conversation becomes screaming. It's wonderful.Ned: Bad legal takes? Yeah, before—Corey: Exactly.Ned: Before my father became a social studies teacher, he was a lawyer for 20 years, and so I got to absorb some of the thought process of the lawyer. And yeah, I read some of these takes, and I'm like, “That doesn't sound right. I don't think that would hold up in any court of law.” Though a lot of the open-source licensing I don't think has been tested in any sort of court of law. It's just kind of like, “Well, we hope this stands up,” but nobody really has the money to check.Corey: Yeah. This is the problem with these open-source licenses as well. Very few have never been tested in any meaningful way because I don't know about you, but I don't have a few million dollars in legal fees lying around to prove the point.Ned: Yeah.Corey: So, it's one of those we think this is sustainable, and Lord knows the number of companies that have taken reliances on these licenses, they're probably right. I'm certainly not going to disprove the fact—please don't sue me—but yeah, this is one of those things that we're sort of assuming is the case, even if it's potentially not. I really want to thank you for taking the time to discuss how it is you view these things and talk about what it is you're up to. If people want to learn more, where's the best place for them to find you?Ned: Honestly, just go to my website. It's nedinthecloud.com. And you can also find me on LinkedIn. I don't really go for Twitter anymore.Corey: I envy you. I wish I could wean myself off of it. But we will, of course, include a link to that in the show notes. Thank you so much for being so generous with your time. It's appreciated.Ned: It's been a pleasure. Thanks, Corey.Corey: Net Bellavance, founder and curious human at Ned in the Cloud. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that I will then fork under a different license and claim as my own.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.
Welcome to episode 238 of the Cloud Pod Podcast - where the forecast is always cloudy! This week we're bringing you a preview of Amazon re:Invent 2023. We're talking all things AWS, Bedrock, Q, and frugal architecture, and - you guessed it - AI. Titles we almost went with this week:
Software Engineering Radio - The Podcast for Professional Software Developers
Nikhil Shetty, an expert in networking and distributed systems, speaks with SE radio's Kanchan Shringi about virtual private cloud (VPC) and related technologies. They explore how VPC relates to public cloud, private cloud, and virtual private networks (VPNs). The discussion delves into why VPC is fundamental to building on the cloud, as well as configuring a VPC, subnets, and the address space that can be assigned to the VPC. During this episode they look into route tables, network address translation, as well as security groups, network access control lists, and DNS. Finally, Nikhil helps compare VPC offerings from Amazon Web Services (AWS) and Oracle Cloud Infrastructure (OCI).
Seif Lotfy, Co-Founder and CTO at Axiom, joins Corey on Screaming in the Cloud to discuss how and why Axiom has taken a low-cost approach to event data. Seif describes the events that led to him helping co-found a company, and explains why the team wrote all their code from scratch. Corey and Seif discuss their views on AWS pricing, and Seif shares his views on why AWS doesn't have to compete on price. Seif also reveals some of the exciting new products and features that Axiom is currently working on. About SeifSeif is the bubbly Co-founder and CTO of Axiom where he has helped build the next generation of logging, tracing, and metrics. His background is at Xamarin, and Deutche Telekom and he is the kind of deep technical nerd that geeks out on white papers about emerging technology and then goes to see what he can build.Links Referenced: Axiom: https://axiom.co/ Twitter: https://twitter.com/seiflotfy TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by my friends, and soon to be yours, over at Axiom. Today I'm talking with Seif Lotfy, who's the co-founder and CTO of Axiom. Seif, how are you?Seif: Hey, Corey, I am very good, thank you. It's pretty late here, but it's worth it. I'm excited to be on this interview. How are you today?Corey: I'm not dead yet. It's weird, I see you at a bunch of different conferences, and I keep forgetting that you do in fact live half a world away. Is the entire company based in Europe? And where are you folks? Where do you start and where do you stop geographically? Let's start there. We over—everyone dives right into product. No, no, no. I want to know where in the world people sit because apparently, that's the most important thing about a company in 2023.Seif: Unless you ask Zoom because they're undoing whatever they did. We're from New Zealand, all the way to San Francisco, and everything in between. So, we have people in Egypt and Nigeria, all around Europe, all around the US… and UK, if you don't consider it Europe anymore.Corey: Yeah, it really depends. There's a lot of unfortunate naming that needs to get changed in the wake of that.Seif: [laugh].Corey: But enough about geopolitics. Let's talk about industry politics. I've been a fan of Axiom for a while and I was somewhat surprised to realize how long it had been around because I only heard about you folks a couple of years back. What is it you folks do? Because I know how I think about what you're up to, but you've also gone through some messaging iteration, and it is a near certainty that I am behind the times.Seif: Well, at this point, we just define ourselves as the best home for event data. So, Axiom is the best home for event data. We try to deal with everything that is event-based, so time-series. So, we can talk metrics, logs, traces, et cetera. And right now predominantly serving engineering and security.And we're trying to be—or we are—the first cloud-native time-series platform to provide streaming search, reporting, and monitoring capabilities. And we're built from the ground up, by the way. Like, we didn't actually—we're not using Parquet [unintelligible 00:02:36] thing. We're completely everything from the ground up.Corey: When I first started talking to you folks a few years back, there were two points to me that really stood out, and I know at least one of them still holds true. The first is that at the time, you were primarily talking about log data. Just send all your logs over to Axiom. The end. And that was a simple message that was simple enough that I could understand it, frankly.Because back when I was slinging servers around and you know breaking half of them, logs were effectively how we kept track of what was going on, where. These days, it feels like everything has been repainted with a very broad brush called observability, and the takeaway from most company pitches has been, you must be smarter than you are to understand what it is that we're up to. And in some cases, you scratch below the surface and realize it no, they have no idea what they're talking about either and they're really hoping you don't call them on that.Seif: It's packaging.Corey: Yeah. It is packaging and that's important.Seif: It's literally packaging. If you look at it, traces and logs, these are events. There's a timestamp and just data with it. It's a timestamp and data with it, right? Even metrics is all the way to that point.And a good example, now everybody's jumping on [OTel 00:03:46]. For me, OTel is nothing else, but a different structure for time series, for different types of time series, and that can be used differently, right? Or at least not used differently but you can leverage it differently.Corey: And the other thing that you did that was interesting and is a lot, I think, more sustainable as far as [moats 00:04:04] go, rather than things that can be changed on a billboard or whatnot, is your economic position. And your pricing has changed around somewhat, but I ran a number of analyses on your cost that you were passing on to customers and my takeaway was that it was a little bit more expensive to store data for logs in Axiom than it was to store it in S3, but not by much. And it just blew away the price point of everything else focused around logs, including AWS; you're paying 50 cents a gigabyte to ingest CloudWatch logs data over there. Other companies are charging multiples of that and Cisco recently bought Splunk for $28 billion because it was cheaper than paying their annual Splunk bill. How did you get to that price point? Is it just a matter of everyone else being greedy or have you done something different?Seif: We looked at it from the perspective of… so there's the three L's of logging. I forgot the name of the person at Netflix who talked about that, but basically, it's low costs, low latency, large scale, right? And you will never be able to fulfill all three of them. And we decided to work on low costs and large scale. And in terms of low latency, we won't be low as others like ClickHouse, but we are low enough. Like, we're fast enough.The idea is to be fast enough because in most cases, I don't want to compete on milliseconds. I think if the user can see his data in two seconds, he's happy. Or three seconds, he's happy. I'm not going to be, like, one to two seconds and make the cost exponentially higher because I'm one second faster than the other. And that's, I think, that the way we approached this from day one.And from day one, we also started utilizing the idea of existence of Open—Object Storage, we have our own compressions, our own encodings, et cetera, from day one, too, so and we still stick to that. That's why we never converted to other existing things like Parquet. Also because we are a Schema-On-Read, which Parquet doesn't allow you really to do. But other than that, it's… from day one, we wanted to save costs by also making coordination free. So, ingest has to be coordination free, right, because then we don't run a shitty Kafka, like, honestly a lot—a lot of the [logs 00:06:19] companies who running a Kafka in front of it, the Kafka tax reflects in what they—the bill that you're paying for them.Corey: What I found fun about your pricing model is it gets to a point that for any reasonable workload, how much to log or what to log or sample or keep everything is no longer an investment decision; it's just go ahead and handle it. And that was originally what you wound up building out. Increasingly, it seems like you're not just the place to send all the logs to, which to be honest, I was excited enough about that. That was replacing one of the projects I did a couple of times myself, which is building highly available, fault-tolerant, rsyslog clusters in data centers. Okay, great, you've gotten that unlocked, the economics are great, I don't have to worry about that anymore.And then you started adding interesting things on top of it, analyzing things, replaying events that happen to other players, et cetera, et cetera, it almost feels like you're not just a storage depot, but you also can forward certain things on under a variety of different rules or guises and format them as whatever on the other side is expecting them to be. So, there's a story about integrating with other observability vendors, for example, and only sending the stuff that's germane and relevant to them since everyone loves to charge by ingest.Seif: Yeah. So, we did this one thing called endpoints, the number one. Endpoints was a beginning where we said, “Let's let people send us data using whatever API they like using, let's say Elasticsearch, Datadog, Honeycomb, Loki, whatever, and we will just take that data and multiplex it back to them.” So, that's how part of it started. This allows us to see, like, how—allows customers to see how we compared to others, but then we took it a bit further and now, it's still in closed invite-only, but we have Pipelines—codenamed Pipelines—which allows you to send data to us and we will keep it as a source of truth, then we will, given specific rules, we can then ship it anywhere to a different destination, right, and this allows you just to, on the fly, send specific filter things out to, I don't know, a different vendor or even to S3 or you could send it to Splunk. But at the same time, you can—because we have all your data, you can go back in the past, if the incident happens and replay that completely into a different product.Corey: I would say that there's a definite approach to observability, from the perspective of every company tends to visualize stuff a little bit differently. And one of the promises of OTel that I'm seeing that as it grows is the idea of oh, I can send different parts of what I'm seeing off to different providers. But the instrumentation story for OTel is still very much emerging. Logs are kind of eternal and the only real change we've seen to logs over the past decade or so has been instead of just being plain text and their positional parameters would define what was what—if it's in this column, it's an IP address and if it's in this column, it's a return code, and that just wound up being ridiculous—now you see them having schemas; they are structured in a variety of different ways. Which, okay, it's a little harder to wind up just cat'ing a file together and piping it to grep, but there are trade-offs that make it worth it, in my experience.This is one of those transitional products that not only is great once you get to where you're going, from my playing with it, but also it meets you where you already are to get started because everything you've got is emitting logs somewhere, whether you know it or not.Seif: Yes. And that's why we picked up on OTel, right? Like, one of the first things, we now support… we have an OTel endpoint natively bec—or as a first-class citizen because we wanted to build this experience around OTel in general. Whether we like it or not, and there's more reasons to like it, OTel is a standard that's going to stay and it's going to move us forward. I think of OTel as will have the same effect if not bigger as [unintelligible 00:10:11] back of the day, but now it just went away from metrics, just went to metrics, logs, and traces.Traces is, for me, very interesting because I think OTel is the first one to push it in a standard way. There were several attempts to make standardized [logs 00:10:25], but I think traces was something that OTel really pushed into a proper standard that we can follow. It annoys me that everybody uses a different bits and pieces of it and adds something to it, but I think it's also because it's not that mature yet, so people are trying to figure out how to deliver the best experience and package it in a way that it's actually interesting for a user.Corey: What I have found is that there's a lot that's in this space that is just simply noise. Whenever I spend a protracted time period working on basically anything and I'm still confused by the way people talk about that thing, months or years later, I'm starting to get the realization that maybe I'm not the problem here. And I'm not—I don't mean this to be insulting, but one of the things I've loved about you folks is I've always understood what you're saying. Now, you can hear that as, “Oh, you mean we talk like simpletons?” No, it means what you're talking about resonates with at least a subset of the people who have the problem you solve. That's not nothing.Seif: Yes. We've tried really hard because one of the things we've tried to do is actually bring observability to people who are not always busy or it's not part of their day to day. So, we try to bring into [Versal 00:11:37] developers, right, with doing a Versal integration. And all of a sudden, now they have their logs, and they have a metrics, and they have some traces. So, all of a sudden, they're doing the observability work. Or they have actual observability, for their Versal based, [unintelligible 00:11:54]-based product.And we try to meet the people where they are, so we try to—instead of actually telling people, “You should send us data.”—I mean, that's what they do now—we try to find, okay, what product are you using and how can we grab data from there and send it to us to make your life easier? You see that we did that with Versal, we did that with Cloudflare. AWS, we have extensions, Lambda extensions, et cetera, but we're doing it for more things. For Netlify, it's a one-click integration, too, and that's what we're trying to do to actually make the experience and the journey easier.Corey: I want to change gears a little bit because something that we spent a fair bit of time talking about—it's why we became friends, I would think anyway—is that we have a shared appreciation for several things. One of which, at most notable to anyone around us is whenever we hang out, we greet each other effusively and then immediately begin complaining about costs of cloud services. What is your take on the way that clouds charge for things? And I know it's a bit of a leading question, but it's core and foundational to how you think about Axiom, as well as how you serve customers.Seif: They're ripping us off. I'm sorry [laugh]. They just—the amount of money they make, like, it's crazy. I would love to know what margins they have. That's a big question I've always had. I'm like, what are the margins they have at AWS right now?Corey: Across the board, it's something around 30 to 40%, last time I looked at it.Seif: That's a lot, too.Corey: Well, that's also across the board of everything, to be clear. It is very clear that some services are subsidized by other services. As it should be. If you start charging me per IAM call, we're done.Seif: And also, I mean, the machine learning stuff. Like, they won't be doing that much on top of it right now, right, [else nobody 00:13:32] will be using it.Corey: But data transfer? Yeah, there's a significant upcharge on that. But I hear you. I would moderate it a bit. I don't think that I would say that it's necessarily an intentional ripoff. My problem with most cloud services that they offer is not usually that they're too expensive—though there are exceptions to that—but rather that the dimensions are unpredictable in advance. So, you run something for a while and see what it costs. From where I sit, if a customer uses your service and then at the end of usage is surprised by how much it cost them, you've kind of screwed up.Seif: Look, if they can make egress free—like, you saw how Cloudflare just did the egress of R2 free? Because I am still stuck with AWS because let's face it, for me, it is still my favorite cloud, right? Cloudflare is my next favorite because of all the features that are trying to develop and the pace they're picking, the pace they're trying to catch up with. But again, one of the biggest things I liked is R2, and R2 egress is free. Now, that's interesting, right?But I never saw anything coming back from S3 from AWS on S3 for that, like you know. I think Amazon is so comfortable because from a product perspective, they're simple, they have the tools, et cetera. And the UI is not the flashiest one, but you know what you're doing, right? The CLI is not the flashiest one, but you know what you're doing. It is so cool that they don't really need to compete with others yet.And I think they're still dominantly the biggest cloud out there. I think you know more than me about that, but [unintelligible 00:14:57], like, I think they are the biggest one right now in terms of data volume. Like, how many customers are using them, and even in terms of profiles of people using them, it's very, so much. I know, like, a lot of the Microsoft Azure people who are using it, are using it because they come from enterprise that have been always Microsoft… very Microsoft friendly. And eventually, Microsoft also came in Europe in these all these different weird ways. But I feel sometimes ripped off by AWS because I see Cloudflare trying to reduce the prices and AWS just looking, like, “Yeah, you're not a threat to us so we'll keep our prices as they are.”Corey: I have it on good authority from folks who know that there are reasons behind the economic structures of both of those companies based—in terms of the primary direction the traffic flows and the rest. But across the board, they've done such a poor job of articulating this that, frankly, I think the confusion is on them to clear up, not us.Seif: True. True. And the reason I picked R2 and S3 to compare there and not look at Workers and Lambdas because I look at it as R2 is S3 compatible from an API perspective, right? So, they're giving me something that I already use. Everything else I'm using, I'm using inside Amazon, so it's in a VPC, but just the idea. Let me dream. Let me dream that S3 egress will be free at some point.Corey: I can dream.Seif: That's like Christmas. It's better than Christmas.Corey: What I'm surprised about is how reasonable your pricing is in turn. You wind up charging on the basis of ingest, which is basically the only thing that really makes sense for how your company is structured. But it's predictable in advance, the free tier is, what, 500 gigs a month of ingestion, and before people think, “Oh, that doesn't sound like a lot,” I encourage you to just go back and think how much data that really is in the context of logs for any toy project. Like, “Well, our production environment spits out way more than that.” Yes, and by the word production that you just used, you probably shouldn't be using a free trial of anything as your critical path observability tooling. Become a customer, not a user. I'm a big believer in that philosophy, personally. For all of my toy projects that are ridiculous, this is ample.Seif: People always tend to overestimate how much logs they're going to be sending. Like so, there's one thing. What you said it right: people who already have something going on, they already know how much logs they'll be sending around. But then eventually they're sending too much, and that's why we're back here and they're talking to us. Like, “We want to ttry your tool, but you know, we'll be sending more than that.” So, if you don't like our pricing, go find something else because I think we are the cheapest out there right now. We're the competitive the cheapest out there right now.Corey: If there is one that is less expensive, I'm unaware of it.Seif: [laugh].Corey: And I've been looking, let's be clear. That's not just me saying, “Well, nothing has skittered across my desk.” No, no, no, I pay attention to this space.Seif: Hey, where's—Corey, we're friends. Loyalty.Corey: Exactly.Seif: If you find something, you tell me.Corey: Oh, if I find something, I'll tell everyone.Seif: Nononon, you tell me first and you tell me in a nice way so I can reduce the prices on my site [laugh].Corey: This is how we start a price was, industry-wide, and I would love to see it.Seif: [laugh]. But there's enough channels that we share at this point across different Slacks and messaging apps that you should be able to ping me if you find one. Also, get me the name of the CEO and the CTO while you're at it.Corey: And where they live. Yes, yes, of course. The dire implications will be awesome.Seif: That was you, not me. That was your suggestion.Corey: Exactly.Seif: I will not—[laugh].Corey: Before we turn into a bit of an old thud and blunder, let's talk about something else that I'm curious about here. You've been working on Axiom for something like seven years now. You come from a world of databases and events and the like. Why start a company in the model of Axiom? Even back then, when I looked around, my big problem with the entire observability space could never have been described as, “You know what we need? More companies that do exactly this.” What was it that you saw that made you say, “Yeah, we're going to start a company because that sounds easy.”Seif: So, I'll be very clear. Like, I'm not going to, like, sugarcoat this. We kind of got in a position where it [forced counterweighted 00:19:10]. And [laugh] by that I mean, we came from a company where we were dealing with logs. Like, we actually wrote an event crash analytics tool for a company, but then we ended up wanting to use stuff like Datadog, but we didn't have the budget for that because Datadog was killing us.So, we ended up hosting our own Elasticsearch. And Elasticsearch, it costs us more to maintain our Elasticsearch cluster for the logs than to actually maintain our own little infrastructure for the crash events when we were getting, like, 1 billion crashes a month at this point. So eventually, we just—that was the first burn. And then you had alert fatigue and then you had consolidating events and timestamps and whatnot. The whole thing just seemed very messy.So, we started off after some company got sold, we started off by saying, “Okay, let's go work on a new self-hosted version of the [unintelligible 00:20:05] where we do metrics and logs.” And then that didn't go as well as we thought it would, but we ended up—because from day one, we were working on cloud na—because we d—we cloud ho—we were self-hosted, so we wanted to keep costs low, we were working on and making it stateless and work against object store. And this is kind of how we started. We realized, oh, our cost, we can host this and make it scale, and won't cost us that much.So, we did that. And that started gaining more attention. But the reason we started this was we wanted to start a self-hosted version of Datadog that is not costly, and we ended up doing a Software as a Service. I mean, you can still come and self-hosted, but you'll have to pay money for it, like, proper money for that. But we do as a SaaS version of this and instead of trying to be a self-hosted Datadog, we are now trying to compete—or we are competing with Datadog.Corey: Is the technology that you've built this on top of actually that different from everything else out there, or is this effectively what you see in a lot of places: “Oh, yeah, we're just going to manage Elasticsearch for you because that's annoying.” Do you have anything that distinguishes you from, I guess, the rest of the field?Seif: Yeah. So, very just bluntly, like, I think Scuba was the first thing that started standing out, and then Honeycomb came into the scene and they start building something based on Scuba, the [unintelligible 00:21:23] principles of Scuba. Then one of the authors of actual Scuba reached out to me when I told him I'm trying to build something, and he's gave me some ideas, and I start building that. And from day one, I said, “Okay, everything in S3. All queries have to be serverless.”So, all the queries run on functions. There's no real disks. It's just all on S3 right now. And the biggest issue—achievement we got to lower our cost was to get rid of Kafka, and have—let's say, in behind the scenes we have our own coordination-free mechanism, but the idea is not to actually have to use Kafka at all and thus reduce the costs incredibly. In terms of technology, no, we don't use Elasticsearch.We wrote everything from the ground up, from scratch, even the query language. Like, we have our own query language that's based—modeled after Kusto—KQL by Microsoft—so everything we have is built from absolutely from the ground up. And no Elastic. I'm not using Elastic anymore. Elastic is a horror for me. Absolutely horror.Corey: People love the API, but no, I've never met anyone who likes managing Elasticsearch or OpenSearch, or whatever we're calling your particular flavor of it. It is a colossal pain, it is subject to significant trade-offs, regardless of how you work with it, and Amazon's managed offering doesn't make it better; it makes it worse in a bunch of ways.Seif: And the green status of Elasticsearch is a myth. You'll only see it once: the first time you start that cluster, that's what the Elasticsearch cluster is green. After that, it's just orange, or red. And you know what? I'm happy when it's orange. Elasticsearch kept me up for so long. And we had actually a very interesting situation where we had Elasticsearch running on Azure, on Windows machines, and I would have server [unintelligible 00:23:10]. And I'd have to log in and every day—you remember, what's it called—RP… RP Something. What was it called?Corey: RDP? Remote Desktop Protocol, or something else?Seif: Yeah, yeah. Where you have to log in, like, you actually have visual thing, and you have to go in and—Corey: Yep.Seif: And visually go in and say, “Please don't restart.” Every day, I'd have to do that. Please don't restart, please don't restart. And also a lot of weird issues, and also at that point, Azure would decide to disconnect the pod, wanted to try to bring in a new pod, and all these weird things were happening back then. So, eventually, end up with a [unintelligible 00:23:39] decision. I'm talking 2013, '14, so it was back in the day when Elasticsearch was very young. And so, that was just a bad start for me.Corey: I will say that Azure is the most cost-effective cloud because their security is so clown shoes, you can just run whatever you want in someone else's account and it's free to you. Problem solved.Seif: Don't tell people how we save costs, okay?Corey: [laugh]. I love that.Seif: [laugh]. Don't tell people how we do this. Like, Corey, come on [laugh], you're exposing me here. Let me tell you one thing, though. Elasticsearch is the reason I literally use a shock collar or a shock bracelet on myself every time it went down—which was almost every day, instead of having PagerDuty, like, ring my phone.And, you know, I'd wake up and my partner back then would wake up. I bought a Bluetooth collar off of Alibaba that would tase me every time I'd get a notification, regardless of the notification. So, some things are false alarm, but I got tased for at least two, three weeks before I gave up. Every night I'd wake up, like, to a full discharge.Corey: I would never hook myself up to a shocker tied to outages, even if I owned a company. There are pleasant ways to wake up, unpleasant ways to wake up, and even worse. So, you're getting shocked for some—so someone else can wind up effectively driving the future of the business. You're, more or less, the monkey that gets shocked awake to go ahead and fix the thing that just broke.Seif: [laugh]. Well, the fix to that was moving from Azure to AWS without telling anybody. That got us in a lot of trouble. Again, that wasn't my company.Corey: They didn't notice that you did this, or it caused a lot of trouble because suddenly nothing worked where they thought it would work?Seif: They—no, no, everything worked fine on AWS. That's how my love story began. But they didn't notice for, like, six months.Corey: That's kind of amazing.Seif: [laugh]. That was specta—we rewrote everything from C# to Node.js and moved everything away from Elasticsearch, started using Redshift, Redis and a—you name it. We went AWS all the way and they didn't even notice. We took the budget from another department to start filling that in.But we cut the costs from $100,000 down to, like, 40, and then eventually down to $30,000 a month.Corey: More than a little wild.Seif: Oh, God, yeah. Good times, good times. Next time, just ask me to tell you the full story about this. I can't go into details on this podcast. I'll get in a lot—I think I'll get in trouble. I didn't sign anything though.Corey: Those are the best stories. But no, I hear you. I absolutely hear you. Seif, I really want to thank you for taking the time to speak with me. If people want to learn more, where should they go?Seif: So, axiom.co—not dot com. Dot C-O. That's where they learn more about Axiom. And other than that, I think I have a Twitter somewhere. And if you know how to write my name, you'll—it's just one word and find me on Twitter.Corey: We will put that all in the [show notes 00:26:33]. Thank you so much for taking the time to speak with me. I really appreciate it.Seif: Dude, that was awesome. Thank you, man.Corey: Seif Lotfy, co-founder and CTO of Axiom, who has brought this promoted guest episode our way. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that one of these days, I will get around to aggregating in some horrifying custom homebrew logging system, probably built on top of rsyslog.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
David Colebatch, CEO of Tidal, joins Corey on Screaming in the Cloud to discuss Tidal's recent shift to a product-led approach and why empathizing with customers is always their most important job. David describes what it was like to grow the company from scratch on a boot-strapped basis, and how customer feedback and challenges inform the company strategy. Corey and David discuss the cost-savings measures cloud customers are now embarking on, and David discusses how constant migrations are the new normal. Corey and David also discuss the impact that generative AI is having not just on tech, but also on creative content and interactions in our everyday lives. About David David is the CEO & Founder of Tidal. Tidal is empowering businesses to transform from traditional on-premises IT-run organizations to lean-agile-cloud powered machines.Links Referenced: Company website: https://tidal.cloud LinkedIn: https://www.linkedin.com/in/david-colebatch/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Returning guest today, David Colebatch is still the CEO at Tidal. David, how have you been? It's been a hot second.David: Thanks, Corey. Yeah, it's been a fantastic summer for me up here in Toronto.Corey: Yeah, last time I saw you, was it New York or was it DC? They all start to run together to me.David: I think it was DC. Yeah.Corey: That's right. Public Sector Summit where everything was just a little bit stranger than most of my conversations. It's, “Wait, you're telling me there's a whole bunch of people who use the cloud but don't really care about money? What—how does that work?” And I say that not from the position of harsh capitalism, but from the position of we're a government; saving costs is nowhere in our mandate. Or it is, but it's way above my pay grade and I run the cloud and call it good. It seems like that attitude is evolving, but slowly, which is kind of what you want to see. Titanic shifts in governing are usually not something you want to see done on a whim, overnight.David: No, absolutely. A lot of the excitement at the DC summit was around new capabilities. And I was actually really intrigued. It was my first time in the DC summit, and it was packed, from the very early stages of the morning, great attendance throughout the day. And I was just really impressed by some of the new capabilities that customers are leveraging now and the new use cases that they're bringing to market. So, that was a good time for me.Corey: Yeah. So originally, you folks were focused primarily on migrations and it seems like that's evolving a little bit. You have a product now for starters, and the company's name is simply Tidal, without a second word. So, brevity is very much the soul of wit, it would seem. What are you doing these days?David: Absolutely. Yeah, you can find us at tidal.cloud. Yeah, we're focused on migrations as a primary means to help a customer achieve new capabilities. We're about accelerating their journey to cloud and optimizing once they're in cloud as well. Yeah, we're focused on identifying the different personas in an enterprise that are trying to take that cloud journey on with people like project, program managers, developers, as well as network people, now.Corey: It seems, on some level, like you are falling victim to the classic trap that basically all of us do, where you have a services company—which is how I thought of you folks originally—now, on some level, trying to become a product or a platform company. And then you have on the other side of it—places that we're—“Oh, we're a SaaS company. This is hard. We're going to do services instead.” And it seems like no one's happy. We're all cats, perpetually on the wrong side of a given door. Is that an accurate assessment for where you are? Or am I misreading the tea leaves on this one?David: A little misread, but close—Corey: Excellent.David: You're right. We bootstrapped our product company with services. And from day one, we supported our customers, as well as channel partners, many of the [larger size 00:03:20] that you know, we supported them in helping their customers be successful. And that was necessary for us as we bootstrapped the company from zero. But lately, and certainly in the last 12 months, it's very much a product-led company. So, leading with what customers are using our software for first, and then supporting that with our customer success team.Corey: So, it's been an interesting year. We've seen simultaneously a market correction, which I think has been sorely needed for a while, but that's almost been overshadowed in a lot of conversations I've had by the meteoric rise and hype around generative AI. Have you folks started rebranding everything with a fresh coat of paint labeled generative AI yet as it seems like so many folks have? What's your take on it?David: We haven't. You won't see a tidal.ai from us. Look, our thoughts are leveraging the technology as we always had to provide better recommendations and suggestions to our users, so we'll continue to embrace generative AI as it applies to specific use cases within our product. We're not going to launch a brand new product just around the AI theme.Corey: Yeah, but even that seems preferable to what a lot of folks are doing, which is suddenly pivoting their entire market positioning and then act, “Oh, we've been working in generative AI for 5, 10, 15 years,” in some cases. Google and Amazon most notably have talked about how they've been doing this for decades. It's, “Cool. Then why did OpenAI beat you all to the punch on this?” And in many cases, also, “You've been working on this for decades? Huh. Then why is Alexa so terrible?” And they don't really have a good talking point for that yet, but it's the truth.David: Absolutely. Yeah. I will say that the world changed with the OpenAI launch, of course, and we had a new way to interact with this technology now that just sparked so much interest from everyday people, not just developers. And so, that got our juices flowing and creativity mode as well. And so, we started thinking about, well, how can we recommend more to other users of our system as opposed to just cloud architects?You know, how can we support project managers that are, you know, trying to summarize where they're at, by leveraging some of this technology? And I'm not going to say we have all the answers for this baked yet, but it's certainly very exciting to start thinking outside the box with a whole new bunch of capabilities that are available to us.Corey: I tried doing some architecture work with Chat-Gippity—yes, that is how I pronounce it—and it has led me down the primrose path a little bit because what it says is often right. Mostly. But there are some edge-case exceptions of, “Ohh, it doesn't quite work that way.” It reminds me at some level of a junior engineer who doesn't know the answer, so they bluff. And that's great, but it's also a disaster.Because if I can't trust the things you tell me and you to call it out when you aren't sure on something, then I've got to second guess everything you tell me. And it feels like when it comes to architecture and migrations in particular, the devil really is in the details. It doesn't take much to design a greenfield architecture on a whiteboard, whereas being able to migrate something from one place to another and not have to go down in the process? That's a lot of work.David: Absolutely. I have used AI successfully to do a lot of research very quickly across broad market terms and things like that, but I do also agree with you that we have to be careful using it as the carte blanche force multiplier for teams, especially in migration scenarios. Like, if you were to throw Chat-Gippity—as you say—a bunch of COBOL code and say, “Hey, translate this,” it can do a pretty good job, but the devil is in that detail and you need to have an experienced person actually vet that code to make sure it's suitable. Otherwise, you'll find yourself creating buggy things downstream. I've run into this myself, you know, “Produce some Terraform for me.” And when I generated some Terraform for an architecture I was working on, I thought, “This is pretty good.” But then I realized, it's actually two years old and that's about how old my skills were as well. So, I needed to engage someone else on my team to help me get that job done.Corey: So, migrations have been one of those things that people have been talking about for well, as long as we've had more than one data center on the planet. “How do we get our stuff from over here to over there?” And so, on and so forth. But the context and tenor of those conversations has changed dramatically. What have you seen this past year or so as far as emerging trends? What is the industry doing that might not be obvious from the outside?David: Well, cost optimization has been number one on people's minds, and migrating with financial responsibility in mind has been refreshing. So, working backwards from what their customer outcomes are is still number one in our book, and when we see increasingly customers say, “Hey, I want to migrate to cloud to close a data center or avoid some capital outlay,” that's the first thing we hear, but then we work backwards from what was their three-year plan. And then what we've seen so far is that customers have changed from a very IT-centric view of cloud and what they're trying to deliver to much more business-centric. Now, they'll say things like, “I want to be able to bring new capabilities to market more quickly. I want to be able to operate and leverage some of these new generative AI technologies.” So, they actually have that as a driving force for migrations, as opposed to an afterthought.Corey: What I have found is that, for whatever reason, not giving a shit about the AWS bill in my business was a zero-interest-rate phenomenon. Suddenly people care an awful lot. But they're caring is bounded. If there's a bunch of easy stuff to do that saves a giant pile of money, great, yeah, most folks are going to do that. But then it gets into the idea of opportunity cost and trade-offs. And there's been a shift there that I've seen where people are willing to invest more in that cost-cutting work than they were in previous years.It makes sense, but it's also nice to finally have a moment to validate what I've assumed for seven years now that, yeah, in a recession or a retraction of the broader industry, suddenly, this is going to be top-of-mind for a lot of folks. And it's nice to see that that approach was vindicated because the earlier approach that I saw when we saw something like this was at the start of Covid. And at that point, no one knew what was happening week-to-week and consulting leads basically stopped for six months. And that was oh, maybe we don't have a counter-cyclical business. But no, it turns out that when money means something again as interest rates rise, people care about it more.David: Yeah. It is nice to see that. And people are trying to do more with less and become more efficient in an advanced pace these days. I don't know about you, but I've seen the trends towards the low-hanging fruit being done at this point so people have already started using savings plans and capabilities like that, and now they're embarking in more re-architecture of applications. But I think one stumbling block that we've noticed is that customers are still struggling to know where to apply those transformations across their portfolio. They'll have one or two target apps that everybody knows because they're the big ones on the bill, but beneath that, the other 900 applications in their portfolio, which ones do I do next? And that's still a question that we're seeing come up, time and again.Corey: One thing that I'm starting to see people talking about from my perspective, has been suddenly they really care about networking in a way that they did not previously. And I mean, this in the TCP/IP sense, not the talking to interesting people and doing interesting things. That's been basically steady-state for a while. But from my perspective, the conversations I'm having are being driven by, “Wait a minute. AWS is going to start charging $3.50 a month per assigned IPV4 address. Oh, dear. We have been careless in our approach to this.” Is that something that you're seeing shaping the conversations you're having with folks?David: Oh yeah, absolutely. I mean right off the bat, our team went through very quickly and inventoried our IPV4, and certainly, customers are doing that as well. I found that, you know, in the last seven years, the migration conversations were having become broader across an enterprise customer. So, we've mapped out different personas now, and the networking teams playing a bigger role for migrations, but also optimizations in the cloud. And I'll give you an example.So, one large enterprise, their networking team approached us at the same time as their cloud architects who were trying to work on a migration approached us. And the networking team had a different use case. They wanted to inventory all the IP addresses on-premises, and some that they already had in the cloud. So, they actually leveraged—shameless plug here—but they leveraged out a LightMesh IPAM solution to do that. And what that brought to light for us was that the integration of these different teams working together now, as opposed to working around each other. And I do think that's a bit of a trend change for us.Corey: IPAM has always been one of those interesting things to me because originally, the gold standard in this space was—let's not kid ourselves—a Microsoft Excel spreadsheet. And then there are a bunch of other offerings that entered into the space. And for a while I thought most of these were ridiculous because the upgrade was, you know, Google Sheets so you can collaborate. But having this done in a way with particular permissions and mapping in a way that's intuitive and doesn't require everyone to not mess up when they're looking at it, especially as you get into areas of shared responsibility between different divisions or different team members who are in different time zones and whatnot, this becomes a more and more intractable problem. It's one of those areas where small, scrappy startups don't understand what the fuss is about, and big enterprises absolutely despair of finding something that works for them.AWS launched their VPC IPAM offering a while back and if you look at it from the perspective of competing with Google Sheets, its pricing is Looney Tunes. But I've met an awful lot of people who have sworn by it in the process, as they look at these things. Now, of course, the caveat is that like most AWS offerings, it's great in a pure AWS native environment, but as soon as you start getting into other providers and whatnot, it gets very tricky very quickly.David: No, absolutely. And usability of an IP address management solution is something to consider. So, you know, if you're trying to get on board with IPAM, do you want to do three easy steps or do you want to follow 150? And I think that's a really big barrier to entry for a lot of networking teams, especially those that are not too familiar with cloud already. But yeah, where we've seen the networking folks get more involved is around, like, identifying endpoints and devices that must be migrated to cloud, but also managing those subnets and planning their VPC designs upfront.You've probably seen this before yourself where customers have allocated a whole bunch of address space over time—an overlapping address space, I should say—only to then later want to [peer 00:13:47] those networks. And that's something that if you think you're going to be doing downstream, you should really plan for that ahead of time and make sure your address space is allocated correctly. Problems vary. Like, everyone's architecture is different, of course, but we've certainly noticed that being one of the top-button items. And then that leads into a migration itself. You're not migrating to cloud now; you're migrating within the cloud and trying to reorganize address spaces, which is a whole other planning activity to consider.Corey: When you take a look at, I guess the next step in these things, what's coming next in the world of migrations? I recently got to talk to someone who was helping their state migrate from, effectively, mainframes in many cases into a cloud environment. And it seems, on some level, like everyone on a mainframe, one, is very dependent on that workload; those things are important, so that's why they're worth the extortionate piles of money, but it also feels like they've been trying to leave the mainframe for decades in many cases. Now, there's a sense that for a lot of these folks, the end is nigh for their mainframe's lifespan, so they're definitely finally taking the steps to migrate. What's the next big frontier once the, I guess, either the last holdouts from that side of the world wind up getting into a cloud or decide they never will? It always felt to me like migrations are one of those things that's going to taper off and it's not going to be something that is going to be a growth industry because the number of legacy workloads is, at least theoretically, declining. Not so sure that's accurate, though.David: I don't think it is either. If we look back at past migrations, you know, 90, 95% of them are often lift-and-shift to EC2 or x86 on VMware in the cloud. And a lot of the work that we're seeing now is being described as optimization. Like, “Look at my EC2 workloads and come up with cloud-native or transformative processes for me.” But those are migrations as well because we run the same set of software, the same processes over those workloads to determine how we can re-platform and refactor them into more native services.So, I think, you know, the big shift for us is just recognizing that the term ‘migrations' needs to be well-defined and communicated with folks. Migrations are actually constant now and I would argue we're doing more migrations within customers now than we have in the past because the rate of change is just so much faster. And I should add, on the topic of mainframe and legacy systems, we have seen this pivot away from teams looking for emulation layers for those technologies, you know, where they want to forklift the functionality, but they don't want to really roll up their sleeves and do any coding work. So, they're previously looking to automatically translate code or emulate that compute layer in the cloud, and the big pivot we've seen in the last 12 months, I'd say, is that customers are more willing to actually understand how to rebuild their applications in the cloud. And that's a fantastic story because it means they're not kicking that technology debt can down the road any further. They're really trying to embrace cloud and leverage some of these new capabilities that have come to market.Corey: What do you see as, I guess, the reason that a number of holdouts have not yet done a migration? Like, historically, I've seen some that are pretty obvious: the technology wasn't there. Well, cloud has gotten to a point now where it is hard to identify a capability that isn't there in some form. And there's always been the sunk cost fallacy where, “Well, we've already bought all this stuff, and it's running here, so if we're not replacing it anytime soon, there's no cost benefit for us to replace it.” And that's actually correct. That's not a fallacy there. But there's also the, “Well, it would be too much work to move.” Sometimes true, sometimes not. Are you seeing a shift in the reasons that people are giving to not migrate?David: No, I haven't. It's been those points mostly. And I'd say one of the biggest inhibitors to people actually getting it done is this misconception that it costs a lot of money to transform and to adopt cloud tools. You've seen this through the technology keeps getting easier and easier to adopt and cheaper to use. When you can provision services for $0 a month and then scale with usage patterns, there's really no reason not to try today because the opportunity cost is so low.So, I think that one of the big inhibitors that comes up, though, is this cultural barrier within organizations where teams haven't been empowered to try new things. And that's the one thing that I think is improving nowadays, as more of this how-to-build-in-the-cloud capability becomes permeated throughout the organization. People are saying, “Well, why can't we do that?” As opposed to, “We can't do that.” You know what I mean? It's a subtle difference, but once leadership starts to say, “Why can't we do this modern thing in the cloud? Why can't we leverage AI?” Teams are given more rope to try and experiment, and fail, of course. And I think ultimately, that culture shift is starting to take root across enterprise and across public sector as well.Corey: One of the things that I find surprising is the enthusiasm with which different market segments jump onto different aspects of cloud. Lambda is a classic example, in that it might be one of the services that is more quickly adopted by enterprises than by startups and a lot of cases. But there's also the idea of, “Oh, we built this thing last night, and it's awesome.” And enterprises, like you know, including banks and insurance companies don't want to play those games, for obvious reasons.Generative AI seems to be a mixed bag around a lot of these things. Have you had conversations with a number of your clients around the generative AI stuff? Because I've seen Amazon, for example, talking about it, “Oh, all our customers are asking us about it.” And, mmm, I don't know. Because I definitely have questions about and I'm exploring it, but I don't know that I'm turning to Amazon, of all companies, to answer those questions, either.David: Yeah. We've certainly had customer conversations about it. And it depends, again, on those personas. On the IT side, the conversations are mostly around how can they do their jobs better. They're not thinking forwards about the business capabilities. So, IT comes to us and they want to know how can we use generative AI to create Lambda functions and create stateless applications more quickly as a part of a migration effort. And that's great. That's a really cool use case. We've used that generative AI approach to create code ourselves.But on the business side, they're looking forwards, they want to use generative AI in the, again, the sample size of my customer conversations, but they see that the barrier to entry is getting their data in a place that they can leverage it. And to them, to the business, that's what's driving the migration conversations they're having with us, is, “How do I exfil my data and get it into the cloud where I can start to leverage these great AI tools?”Corey: Yeah, I'm still looking at use cases that I think are a little less terrifying. Like, I want to wind up working on a story or something. Or I'll use it to write blog posts; I have a great approach. It's, “Write a blog post about this topic and here are some salient points and do it in the style of Corey Quinn.” I'll ask Chat-Gippity to do that and it spits out something that is, frankly, garbage.And I get angry at it and I basically copy it into a text editor and spent 20 minutes mansplain-correcting the robot. And by the time it's done, I have, like, a structure of an article that talks about the things I want to talk about correctly. And there may be three words in a sequence that were originally there. And frankly, I'm okay with plagiarizing from the thing that is plagiarizing from me. It's a beautiful circle of ripping things off that that's glorious for me.But that's also not something that I could see being useful at any kind of scale, where I see companies getting excited about a lot of this stuff, it all seems to be a thin veneer over, “And then we can fire our customer service people,” which from a labor perspective is not great, but ignoring that entirely, as a customer, I don't want that. Because by the time I have to reach out to a company's customer service apparatus, something has gone wrong and it isn't going to be solved by the standard list of frequently asked questions that I clicked on. It's something that is off the beaten path and anomalous and requires human judgment. Making it harder for me to get to people who can fix those things does not thrill and delight me.David: I agree. I'm with you there. Where I get excited about it, though, is how much of a force multiplier it can be on that human interaction. So, for example, in that customer's service case you mentioned, you know, if that customer service rep is empowered by an AI dashboard that's listening to my conversation and taking notes and automatically looking up in my knowledge base how to support that customer, then that customer success person can be more successful more quickly, I think they can be more responsive to customer needs and maybe improve the quality, not just the volume of work they do but improve the quality, too.Corey: That's part of the challenge, too. There have been a number of companies that have gotten basically rapped across the snout for just putting out articles as content, written by AI without any human oversight. And these don't just include, you know, small, scrappy content mills; they include Microsoft, and I believe CNN, if I'm not mistaken, had something similar with that going on. I'm not certain on that last one. I don't want to defame them, but I know for a fact Microsoft did.David: Yeah, and I think some of the email generators are plugging into AI now, too, because my spam count has gone through the roof lately.Corey: Oh, my God. I got one recently saying, “Hey, I noticed at The Duckbill Group that you fix AWS bills. Great. That's awesome and super valuable for your clients.” And then try to sell me bill optimization and process improvement stuff. And it was signed by the CEO of the company that was reaching out.And then there was like—I expand the signature view, and it's all just very light gray text make it harder to read, saying, “This is AI generated, yadda, yadda, yadda.” Called the company out on Twitter, and they're like, “Oh, we only have a 0.15% error rate.” That sounds suspiciously close to email marketing response rates. “Welp, that means 99% of it was perfect.” No, it means that you didn't get in front of most of those people. They just ignored it without reading it the way we do most email outreach. So, that bugs me a fair bit. Because my perspective on it is if you don't care enough to actually craft a message to send me, why should I care enough to read it?David: Completely agree. I think a lot of people are out there looking for that asymmetric, you know, leverage that you can get over the market, and generating content, to them, has been a blocker for so long and now they're just opening up the fire hose and drowning us all with it. So I'm, like, with you. I think that I personally don't expect to get value back from someone unless I put value into that relationship. That's my starting point coming into it, so I would maybe use AI to help assist forming a message to someone, but I'm not going to blast the internet with content. I just think that's a cheeky low-value way to go about it.Corey: I don't track the numbers anymore, but I know that at this point, through the size of my audience and the content that I put out, I have taken, collectively, millennia of human time focusing on—that has been spent consuming the content that I put out. And as a result of that, I have a guiding principle here, which is first and foremost, you've got to respect your audience. And I'm just going to have a robot phone it in is not respecting your audience. I have no problem with AI assistants, but it requires human oversight before it goes out. I would never in a million years send anything out to the audience that I hadn't at least read or validated first.But yeah, some of the signups that go out, the automatic things that you click a button and sign up for my newsletter at lastweekinaws.com, you get an auto message that comes out. Yeah, it comes out under my name and I either wrote it or reviewed it, depending on what generation of system we're on these days, because it has my name attached to it. That's the way that this works. Your credibility is important and having a robot spout off complete nonsense and you get the credit or blame for it? No thanks. I want to be doomed from my own sins, not the ones that a computer makes on my behalf.David: [laugh]. Yeah, I'm with you. It's unfortunate that so many people expect the emails from you are generated now. We have the same thing when people sign up for Tidal Accelerator or Tidal LightMesh, they get a personal email from me. They'll get the automated one as well, but I generally get in there through our CRM, and I send them a message, too. And sometimes they'll respond and say, “This isn't really David, is it?” No, no, it's me. You don't have to respond. I wanted to let you know that I'm thankful for you trialing our software.Corey: Oh, yeah. You can hit reply to any email I send out. It comes from corey@lastweekinaws.com and it goes to my inbox. The reason that works, frankly, at this scale is because no one does it. People don't believe that that'll actually work. So, on a busy week, I'll get maybe a dozen email replies to it or one or two misconfigured bounces from systems that aren't set up properly to do those things. And I weed those out because they drive me nuts.But it's a yeah, the only emails that I get to that address, honestly, are the test copies of those messages that go out, too, because I'm on my own newsletter list. Who knew? I have two at the moment. I have—yes, I have two specific addresses on that, so I guess technically, I'm inflating the count of subscribers by two, if advertisers ask. But you know, at 32,000 and change, I will take the statistical fudging.David: Absolutely. We all expect that.Corey: No, the depressing part, when I think about that is, there's a number of readers I have on the list that I know for a fact that I've been acquainted with who have passed away. They're never going to unsubscribe from these things until the email starts bouncing at some and undefinable point in the future. But it's also—it feels morbid, but on some level, if I continue doing this for the rest of my life, I'm going to have a decent proportion of the subscriber base who's died. At least when people leave their jobs, like, their email address gets turned off, things start bouncing and cool that gets turned off automatically because even when people leave voluntarily, no one bothers to go through an unsubscribe from all this stuff. So, automated systems have to do it. That's great. I'm not saying computers shouldn't make life better. I am saying that they can't replace a fundamental aspect of human caring.David: So, Corey Quinn, who has influence over the living and the dead. It's impressive.Corey: Oh, absolutely. Honestly, if I were to talk to whoever came up with IBM's marketing strategy, I feel like I'd need to conduct a seance because they're probably 300 years old if they're still alive.David: [laugh]. Absolutely.Corey: No, I get passionate about this stuff because so much of a lot of the hype now has been shifting away from letting people expand their reach further and doing things in intentional ways and instead toward absolute garbage, such as, “Cool, we want to get a whole bunch of clicks so we can show ads to them, so we're going to just generate all bunch of crap to your content and throw it out there.” Everything I write, even stuff that admittedly, from time to time, is aimed for SEO purposes for specific things that we're doing, but that's always done from a perspective of okay, my primary SEO strategy is write compelling, original content and then people presumably link to it. And it works. It's about respecting the audience and so many things get that wrong.David: Yeah, absolutely. It's kind of scary now because I always thought that podcasts and video were the last refuge of authentic content. And now people are generating that as well. You know, you're watching a video and you realize hey, that voice sounds exactly consistent, you know, all the way through. And then it turns out, it's generated. And there's a YouTube channel I follow because I'm an avid sailor, called World On Water. And recently, I've noticed that voice changed, and I'm pretty sure they're using AI to generate it now.Corey: Here's a story I don't think you probably know about yourself. So, for those who are unaware, David, I hang out from time to time in various places. There's a international boundary between us, but occasionally one of us will broach it, and good for us. And we have social conversations where somehow one of us doesn't have a microphone in front of our face. Imagine that. I don't know what that's like most weeks.And like, at some level, the public face comes off and people start acting like human beings. And something I've always noticed about you, David, is that you don't commit the cardinal sin, for an awful lot of people I meet, which is displaying contempt for your customers. When I have found people who do that, I think less of them in almost every case and I lose so much interest in whatever it is that they're doing. If you don't like the problem space that you're in and don't have respect for the people paying you to make these problems go away, you shouldn't be doing it. Like, I'll laugh at silly AWS misconfigurations, but my customers are there because they have a problem and they're bringing me in to fix it. And would I be making fun of? “Ha ha ha, you didn't spend eight months of your life learning the ins and outs of how exactly reserved instances apply in this particular context? What a fool is you.” That's not how it ever works. I wish I could say it wasn't quite as rare as it is but I'm tired of talking to people who have just nothing but contempt for their market. Good work on that.David: Thank you. Yeah, I appreciate that. You know, I had a penny-drop moment when I was doing a lot of consulting work as an independent contractor, working with different customers at different stages of their own journey and different levels of technology capabilities. You know, you work with management, with project people, with software engineers, and you start to realize everybody's coming from a different place. So, you have to empathize with where they're at.They're coming to you usually because you have a level of expertise, that you've got some specialization and they want to tap into that capability that you've created. And that's great. I love having people come to me and ask me questions. Sometimes they don't come to me nicely asking questions, they make some assumptions about me and might challenge me right off the bat, but you have to realize that that's just where they're coming from at that point in time. And once you connect with them, they'll open up a little bit more, too; they'll empathize with yourself. So yeah, I've always found that it's really important for myself personally, but also for our team to empathize with customers, meet them where they're at, understand that they're coming from a different level of experience, and then help them solve their problems. That's job number one.Corey: And I'm a firm believer that if you don't respect your customer's business, they shouldn't be your customer. It's happened remarkably few times in the however many years I've been doing this, but there have been a couple of folks that have reached out I always very politely decline to work with them when this happens. Because you don't want to make people feel obnoxious for reaching out and, like, “Can you help me with my problem?” “How dare you? Who do you think you are?”No, no, no, no, no, none of that. But if there's a value misalignment or I don't think that your product is going to benefit people who use it as directed, I will not let you sponsor what I do as an easy example. Because I can always find another sponsor and make more money, but once I start losing the audience's trust, I'll never get that back, and I know that. It's the entire reason I do things the way that I do them. And maybe, on some level, from purely capitalist perspective, I'm being an absolute fool, but you know, if you have to pick a way to fail and assume you're going to get it wrong, how do you want to be wrong? I'll take this way.David: Yeah, I agree. Keep your ethics high, keep your morals high, and the rest will fall into place.Corey: I love how we started having ethical and morality discussions that started as, “So, cloud migrations. How are they going for you?”David: Yeah [laugh]. Certainly wandered into some uncharted territories on that one.Corey: Exactly. We started off in one place; wound up someplace completely removed from anything we could have reasonably expected at the start. Why? Because this entire episode has been a beautiful metaphor for cloud migrations. I really want to thank you for taking the time to chat with me on this stuff. If people want to learn more, where should they go to find you?David: tidal.cloud or LinkedIn, I'm very active on LinkedIn these days.Corey: And we will, of course, put links to both of those in the show notes. Thank you so much for going down this path with me. I didn't expect it to lead where it did, but I'm glad we went there.David: Like the tides ebbing and flowing. I'll be back soon, Corey.Corey: [laugh]. I will take you up on that and hold you to it.David: [laugh]. Sounds great.Corey: David Colebatch, CEO at Tidal. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, upset comment that doesn't actually make cohesive sense because you outsourced it to a robot.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Steve Tuck, Co-Founder & CEO of Oxide Computer Company, joins Corey on Screaming in the Cloud to discuss his work to make modern computers cloud-friendly. Steve describes what it was like going through early investment rounds, and the difficult but important decision he and his co-founder made to build their own switch. Corey and Steve discuss the demand for on-prem computers that are built for cloud capability, and Steve reveals how Oxide approaches their product builds to ensure the masses can adopt their technology wherever they are. About SteveSteve is the Co-founder & CEO of Oxide Computer Company. He previously was President & COO of Joyent, a cloud computing company acquired by Samsung. Before that, he spent 10 years at Dell in a number of different roles. Links Referenced: Oxide Computer Company: https://oxide.computer/ On The Metal Podcast: https://oxide.computer/podcasts/on-the-metal TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is brought to us in part by our friends at RedHat. As your organization grows, so does the complexity of your IT resources. You need a flexible solution that lets you deploy, manage, and scale workloads throughout your entire ecosystem. The Red Hat Ansible Automation Platform simplifies the management of applications and services across your hybrid infrastructure with one platform. Look for it on the AWS Marketplace.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. You know, I often say it—but not usually on the show—that Screaming in the Cloud is a podcast about the business of cloud, which is intentionally overbroad so that I can talk about basically whatever the hell I want to with whoever the hell I'd like. Today's guest is, in some ways of thinking, about as far in the opposite direction from Cloud as it's possible to go and still be involved in the digital world. Steve Tuck is the CEO at Oxide Computer Company. You know, computers, the things we all pretend aren't underpinning those clouds out there that we all use and pay by the hour, gigabyte, second-month-pound or whatever it works out to. Steve, thank you for agreeing to come back on the show after a couple years, and once again suffer my slings and arrows.Steve: Much appreciated. Great to be here. It has been a while. I was looking back, I think three years. This was like, pre-pandemic, pre-interest rates, pre… Twitter going totally sideways.Corey: And I have to ask to start with that, it feels, on some level, like toward the start of the pandemic, when everything was flying high and we'd had low interest rates for a decade, that there was a lot of… well, lunacy lurking around in the industry, my own business saw it, too. It turns out that not giving a shit about the AWS bill is in fact a zero interest rate phenomenon. And with all that money or concentrated capital sloshing around, people decided to do ridiculous things with it. I would have thought, on some level, that, “We're going to start a computer company in the Bay Area making computers,” would have been one of those, but given that we are a year into the correction, and things seem to be heading up into the right for you folks, that take was wrong. How'd I get it wrong?Steve: Well, I mean, first of all, you got part of it right, which is there were just a litany of ridiculous companies and projects and money being thrown in all directions at that time.Corey: An NFT of a computer. We're going to have one of those. That's what you're selling, right? Then you had to actually hard pivot to making the real thing.Steve: That's it. So, we might as well cut right to it, you know. This is—we went through the crypto phase. But you know, our—when we started the company, it was yes, a computer company. It's on the tin. It's definitely kind of the foundation of what we're building. But you know, we think about what a modern computer looks like through the lens of cloud.I was at a cloud computing company for ten years prior to us founding Oxide, so was Bryan Cantrill, CTO, co-founder. And, you know, we are huge, huge fans of cloud computing, which was an interesting kind of dichotomy. Instead of conversations when we were raising for Oxide—because of course, Sand Hill is terrified of hardware. And when we think about what modern computers need to look like, they need to be in support of the characteristics of cloud, and cloud computing being not that you're renting someone else's computers, but that you have fully programmable infrastructure that allows you to slice and dice, you know, compute and storage and networking however software needs. And so, what we set out to go build was a way for the companies that are running on-premises infrastructure—which, by the way, is almost everyone and will continue to be so for a very long time—access to the benefits of cloud computing. And to do that, you need to build a different kind of computing infrastructure and architecture, and you need to plumb the whole thing with software.Corey: There are a number of different ways to view cloud computing. And I think that a lot of the, shall we say, incumbent vendors over in the computer manufacturing world tend to sound kind of like dinosaurs, on some level, where they're always talking in terms of, you're a giant company and you already have a whole bunch of data centers out there. But one of the magical pieces of cloud is you can have a ridiculous idea at nine o'clock tonight and by morning, you'll have a prototype, if you're of that bent. And if it turns out it doesn't work, you're out, you know, 27 cents. And if it does work, you can keep going and not have to stop and rebuild on something enterprise-grade.So, for the small-scale stuff and rapid iteration, cloud providers are terrific. Conversely, when you wind up in the giant fleets of millions of computers, in some cases, there begin to be economic factors that weigh in, and for some on workloads—yes, I know it's true—going to a data center is the economical choice. But my question is, is starting a new company in the direction of building these things, is it purely about economics or is there a capability story tied in there somewhere, too?Steve: Yeah, it's actually economics ends up being a distant third, fourth, in the list of needs and priorities from the companies that we're working with. When we talk about—and just to be clear we're—our demographic, that kind of the part of the market that we are focused on are large enterprises, like, folks that are spending, you know, half a billion, billion dollars a year in IT infrastructure, they, over the last five years, have moved a lot of the use cases that are great for public cloud out to the public cloud, and who still have this very, very large need, be it for latency reasons or cost reasons, security reasons, regulatory reasons, where they need on-premises infrastructure in their own data centers and colo facilities, et cetera. And it is for those workloads in that part of their infrastructure that they are forced to live with enterprise technologies that are 10, 20, 30 years old, you know, that haven't evolved much since I left Dell in 2009. And, you know, when you think about, like, what are the capabilities that are so compelling about cloud computing, one of them is yes, what you mentioned, which is you have an idea at nine o'clock at night and swipe a credit card, and you're off and running. And that is not the case for an idea that someone has who is going to use the on-premises infrastructure of their company. And this is where you get shadow IT and 16 digits to freedom and all the like.Corey: Yeah, everyone with a corporate credit card winds up being a shadow IT source in many cases. If your processes as a company don't make it easier to proceed rather than doing it the wrong way, people are going to be fighting against you every step of the way. Sometimes the only stick you've got is that of regulation, which in some industries, great, but in other cases, no, you get to play Whack-a-Mole. I've talked to too many companies that have specific scanners built into their mail system every month looking for things that look like AWS invoices.Steve: [laugh]. Right, exactly. And so, you know, but if you flip it around, and you say, well, what if the experience for all of my infrastructure that I am running, or that I want to provide to my software development teams, be it rented through AWS, GCP, Azure, or owned for economic reasons or latency reasons, I had a similar set of characteristics where my development team could hit an API endpoint and provision instances in a matter of seconds when they had an idea and only pay for what they use, back to kind of corporate IT. And what if they were able to use the same kind of developer tools they've become accustomed to using, be it Terraform scripts and the kinds of access that they are accustomed to using? How do you make those developers just as productive across the business, instead of just through public cloud infrastructure?At that point, then you are in a much stronger position where you can say, you know, for a portion of things that are, as you pointed out, you know, more unpredictable, and where I want to leverage a bunch of additional services that a particular cloud provider has, I can rent that. And where I've got more persistent workloads or where I want a different economic profile or I need to have something in a very low latency manner to another set of services, I can own it. And that's where I think the real chasm is because today, you just don't—we take for granted the basic plumbing of cloud computing, you know? Elastic Compute, Elastic Storage, you know, networking and security services. And us in the cloud industry end up wanting to talk a lot more about exotic services and, sort of, higher-up stack capabilities. None of that basic plumbing is accessible on-prem.Corey: I also am curious as to where exactly Oxide lives in the stack because I used to build computers for myself in 2000, and it seems like having gone down that path a bit recently, yeah, that process hasn't really improved all that much. The same off-the-shelf components still exist and that's great. We always used to disparagingly call spinning hard drives as spinning rust in racks. You named the company Oxide; you're talking an awful lot about the Rust programming language in public a fair bit of the time, and I'm starting to wonder if maybe words don't mean what I thought they meant anymore. Where do you folks start and stop, exactly?Steve: Yeah, that's a good question. And when we started, we sort of thought the scope of what we were going to do and then what we were going to leverage was smaller than it has turned out to be. And by that I mean, man, over the last three years, we have hit a bunch of forks in the road where we had questions about do we take something off the shelf or do we build it ourselves. And we did not try to build everything ourselves. So, to give you a sense of kind of where the dotted line is, around the Oxide product, what we're delivering to customers is a rack-level computer. So, the minimum size comes in rack form. And I think your listeners are probably pretty familiar with this. But, you know, a rack is—Corey: You would be surprised. It's basically, what are they about seven feet tall?Steve: Yeah, about eight feet tall.Corey: Yeah, yeah. Seven, eight feet, weighs a couple 1000 pounds, you know, make an insulting joke about—Steve: Two feet wide.Corey: —NBA players here. Yeah, all kinds of these things.Steve: Yeah. And big hunk of metal. And in the cases of on-premises infrastructure, it's kind of a big hunk of metal hole, and then a bunch of 1U and 2U boxes crammed into it. What the hyperscalers have done is something very different. They started looking at, you know, at the rack level, how can you get much more dense, power-efficient designs, doing things like using a DC bus bar down the back, instead of having 64 power supplies with cables hanging all over the place in a rack, which I'm sure is what you're more familiar with.Corey: Tremendous amount of weight as well because you have the metal chassis for all of those 1U things, which in some cases, you wind up with, what, 46U in a rack, assuming you can even handle the cooling needs of all that.Steve: That's right.Corey: You have so much duplication, and so much of the weight is just metal separating one thing from the next thing down below it. And there are opportunities for massive improvement, but you need to be at a certain point of scale to get there.Steve: You do. You do. And you also have to be taking on the entire problem. You can't pick at parts of these things. And that's really what we found. So, we started at this sort of—the rack level as sort of the design principle for the product itself and found that that gave us the ability to get to the right geometry, to get as much CPU horsepower and storage and throughput and networking into that kind of chassis for the least amount of wattage required, kind of the most power-efficient design possible.So, it ships at the rack level and it ships complete with both our server sled systems in Oxide, a pair of Oxide switches. This is—when I talk about, like, design decisions, you know, do we build our own switch, it was a big, big, big question early on. We were fortunate even though we were leaning towards thinking we needed to go do that, we had this prospective early investor who was early at AWS and he had asked a very tough question that none of our other investors had asked to this point, which is, “What are you going to do about the switch?”And we knew that the right answer to an investor is like, “No. We're already taking on too much.” We're redesigning a server from scratch in, kind of, the mold of what some of the hyperscalers have learned, doing our own Root of Trust, we're doing our own operating system, hypervisor control plane, et cetera. Taking on the switch could be seen as too much, but we told them, you know, we think that to be able to pull through all of the value of the security benefits and the performance and observability benefits, we can't have then this [laugh], like, obscure third-party switch rammed into this rack.Corey: It's one of those things that people don't think about, but it's the magic of cloud with AWS's network, for example, it's magic. You can get line rate—or damn near it—between any two points, sustained.Steve: That's right.Corey: Try that in the data center, you wind into massive congestion with top-of-rack switches, where, okay, we're going to parallelize this stuff out over, you know, two dozen racks and we're all going to have them seamlessly transfer information between each other at line rate. It's like, “[laugh] no, you're not because those top-of-rack switches will melt and become side-of-rack switches, and then bottom-puddle-of-rack switches. It doesn't work that way.”Steve: That's right.Corey: And you have to put a lot of thought and planning into it. That is something that I've not heard a traditional networking vendor addressing because everyone loves to hand-wave over it.Steve: Well so, and this particular prospective investor, we told him, “We think we have to go build our own switch.” And he said, “Great.” And we said, “You know, we think we're going to lose you as an investor as a result, but this is what we're doing.” And he said, “If you're building your own switch, I want to invest.” And his comment really stuck with us, which is AWS did not stand on their own two feet until they threw out their proprietary switch vendor and built their own.And that really unlocked, like you've just mentioned, like, their ability, both in hardware and software to tune and optimize to deliver that kind of line rate capability. And that is one of the big findings for us as we got into it. Yes, it was really, really hard, but based on a couple of design decisions, P4 being the programming language that we are using as the surround for our silicon, tons of opportunities opened up for us to be able to do similar kinds of optimization and observability. And that has been a big, big win.But to your question of, like, where does it stop? So, we are delivering this complete with a baked-in operating system, hypervisor, control plane. And so, the endpoint of the system, where the customer meets is either hitting an API or a CLI or a console that delivers and kind of gives you the ability to spin up projects. And, you know, if one is familiar with EC2 and EBS and VPC, that VM level of abstraction is where we stop.Corey: That, I think, is a fair way of thinking about it. And a lot of cloud folks are going to pooh-pooh it as far as saying, “Oh well, just virtual machines. That's old cloud. That just treats the cloud like a data center.” And in many cases, yes, it does because there are ways to build modern architectures that are event-driven on top of things like Lambda, and API Gateway, and the rest, but you take a look at what my customers are doing and what drives the spend, it is invariably virtual machines that are largely persistent.Sometimes they scale up, sometimes they scale down, but there's always a baseline level of load that people like to hand-wave away the fact that what they're fundamentally doing in a lot of these cases, is paying the cloud provider to handle the care and feeding of those systems, which can be expensive, yes, but also delivers significant innovation beyond what almost any company is going to be able to deliver in-house. There is no way around it. AWS is better than you are—whoever you happen to—be at replacing failed hard drives. That is a simple fact. They have teams of people who are the best in the world of replacing failed hard drives. You generally do not. They are going to be better at that than you. But that's not the only axis. There's not one calculus that leads to, is cloud a scam or is cloud a great value proposition for us? The answer is always a deeply nuanced, “It depends.”Steve: Yeah, I mean, I think cloud is a great value proposition for most and a growing amount of software that's being developed and deployed and operated. And I think, you know, one of the myths that is out there is, hey, turn over your IT to AWS because we have or you know, a cloud provider—because we have such higher caliber personnel that are really good at swapping hard drives and dealing with networks and operationally keeping this thing running in a highly available manner that delivers good performance. That is certainly true, but a lot of the operational value in an AWS is been delivered via software, the automation, the observability, and not actual people putting hands on things. And it's an important point because that's been a big part of what we're building into the product. You know, just because you're running infrastructure in your own data center, it does not mean that you should have to spend, you know, 1000 hours a month across a big team to maintain and operate it. And so, part of that, kind of, cloud, hyperscaler innovation that we're baking into this product is so that it is easier to operate with much, much, much lower overhead in a highly available, resilient manner.Corey: So, I've worked in a number of data center facilities, but the companies I was working with, were always at a scale where these were co-locations, where they would, in some cases, rent out a rack or two, in other cases, they'd rent out a cage and fill it with their own racks. They didn't own the facilities themselves. Those were always handled by other companies. So, my question for you is, if I want to get a pile of Oxide racks into my environment in a data center, what has to change? What are the expectations?I mean, yes, there's obviously going to be power and requirements at the data center colocation is very conversant with, but Open Compute, for example, had very specific requirements—to my understanding—around things like the airflow construction of the environment that they're placed within. How prescriptive is what you've built, in terms of doing a building retrofit to start using you folks?Steve: Yeah, definitely not. And this was one of the tensions that we had to balance as we were designing the product. For all of the benefits of hyperscaler computing, some of the design center for you know, the kinds of racks that run in Google and Amazon and elsewhere are hyperscaler-focused, which is unlimited power, in some cases, data centers designed around the equipment itself. And where we were headed, which was basically making hyperscaler infrastructure available to, kind of, the masses, the rest of the market, these folks don't have unlimited power and they aren't going to go be able to go redesign data centers. And so no, the experience should be—with exceptions for folks maybe that have very, very limited access to power—that you roll this rack into your existing data center. It's on standard floor tile, that you give it power, and give it networking and go.And we've spent a lot of time thinking about how we can operate in the wide-ranging environmental characteristics that are commonplace in data centers that focus on themselves, colo facilities, and the like. So, that's really on us so that the customer is not having to go to much work at all to kind of prepare and be ready for it.Corey: One of the challenges I have is how to think about what you've done because you are rack-sized. But what that means is that my own experimentation at home recently with on-prem stuff for smart home stuff involves a bunch of Raspberries Pi and a [unintelligible 00:19:42], but I tend to more or less categorize you the same way that I do AWS Outposts, as well as mythical creatures, like unicorns or giraffes, where I don't believe that all these things actually exist because I haven't seen them. And in fact, to get them in my house, all four of those things would theoretically require a loading dock if they existed, and that's a hard thing to fake on a demo signup form, as it turns out. How vaporware is what you've built? Is this all on paper and you're telling amazing stories or do they exist in the wild?Steve: So, last time we were on, it was all vaporware. It was a couple of napkin drawings and a seed round of funding.Corey: I do recall you not using that description at the time, for what it's worth. Good job.Steve: [laugh]. Yeah, well, at least we were transparent where we were going through the race. We had some napkin drawings and we had some good ideas—we thought—and—Corey: You formalize those and that's called Microsoft PowerPoint.Steve: That's it. A hundred percent.Corey: The next generative AI play is take the scrunched-up, stained napkin drawing, take a picture of it, and convert it to a slide.Steve: Google Docs, you know, one of those. But no, it's got a lot of scars from the build and it is real. In fact, next week, we are going to be shipping our first commercial systems. So, we have got a line of racks out in our manufacturing facility in lovely Rochester, Minnesota. Fun fact: Rochester, Minnesota, is where the IBM AS/400s were built.Corey: I used to work in that market, of all things.Steve: Really?Corey: Selling tape drives in the AS/400. I mean, I still maintain there's no real mainframe migration to the cloud play because there's no AWS/400. A joke that tends to sail over an awful lot of people's heads because, you know, most people aren't as miserable in their career choices as I am.Steve: Okay, that reminds me. So, when we were originally pitching Oxide and we were fundraising, we [laugh]—in a particular investor meeting, they asked, you know, “What would be a good comp? Like how should we think about what you are doing?” And fortunately, we had about 20 investor meetings to go through, so burning one on this was probably okay, but we may have used the AS/400 as a comp, talking about how [laugh] mainframe systems did such a good job of building hardware and software together. And as you can imagine, there were some blank stares in that room.But you know, there are some good analogs to historically in the computing industry, when you know, the industry, the major players in the industry, were thinking about how to deliver holistic systems to support end customers. And, you know, we see this in the what Apple has done with the iPhone, and you're seeing this as a lot of stuff in the automotive industry is being pulled in-house. I was listening to a good podcast. Jim Farley from Ford was talking about how the automotive industry historically outsourced all of the software that controls cars, right? So, like, Bosch would write the software for the controls for your seats.And they had all these suppliers that were writing the software, and what it meant was that innovation was not possible because you'd have to go out to suppliers to get software changes for any little change you wanted to make. And in the computing industry, in the 80s, you saw this blow apart where, like, firmware got outsourced. In the IBM and the clones, kind of, race, everyone started outsourcing firmware and outsourcing software. Microsoft started taking over operating systems. And then VMware emerged and was doing a virtualization layer.And this, kind of, fragmented ecosystem is the landscape today that every single on-premises infrastructure operator has to struggle with. It's a kit car. And so, pulling it back together, designing things in a vertically integrated manner is what the hyperscalers have done. And so, you mentioned Outposts. And, like, it's a good example of—I mean, the most public cloud of public cloud companies created a way for folks to get their system on-prem.I mean, if you need anything to underscore the draw and the demand for cloud computing-like, infrastructure on-prem, just the fact that that emerged at all tells you that there is this big need. Because you've got, you know, I don't know, a trillion dollars worth of IT infrastructure out there and you have maybe 10% of it in the public cloud. And that's up from 5% when Jassy was on stage in '21, talking about 95% of stuff living outside of AWS, but there's going to be a giant market of customers that need to own and operate infrastructure. And again, things have not improved much in the last 10 or 20 years for them.Corey: They have taken a tone onstage about how, “Oh, those workloads that aren't in the cloud, yet, yeah, those people are legacy idiots.” And I don't buy that for a second because believe it or not—I know that this cuts against what people commonly believe in public—but company execs are generally not morons, and they make decisions with context and constraints that we don't see. Things are the way they are for a reason. And I promise that 90% of corporate IT workloads that still live on-prem are not being managed or run by people who've never heard of the cloud. There was a decision made when some other things were migrating of, do we move this thing to the cloud or don't we? And the answer at the time was no, we're going to keep this thing on-prem where it is now for a variety of reasons of varying validity. But I don't view that as a bug. I also, frankly, don't want to live in a world where all the computers are basically run by three different companies.Steve: You're spot on, which is, like, it does a total disservice to these smart and forward-thinking teams in every one of the Fortune 1000-plus companies who are taking the constraints that they have—and some of those constraints are not monetary or entirely workload-based. If you want to flip it around, we were talking to a large cloud SaaS company and their reason for wanting to extend it beyond the public cloud is because they want to improve latency for their e-commerce platform. And navigating their way through the complex layers of the networking stack at GCP to get to where the customer assets are that are in colo facilities, adds lag time on the platform that can cost them hundreds of millions of dollars. And so, we need to think behind this notion of, like, “Oh, well, the dark ages are for software that can't run in the cloud, and that's on-prem. And it's just a matter of time until everything moves to the cloud.”In the forward-thinking models of public cloud, it should be both. I mean, you should have a consistent experience, from a certain level of the stack down, everywhere. And then it's like, do I want to rent or do I want to own for this particular use case? In my vast set of infrastructure needs, do I want this to run in a data center that Amazon runs or do I want this to run in a facility that is close to this other provider of mine? And I think that's best for all. And then it's not this kind of false dichotomy of quality infrastructure or ownership.Corey: I find that there are also workloads where people will come to me and say, “Well, we don't think this is going to be economical in the cloud”—because again, I focus on AWS bills. That is the lens I view things through, and—“The AWS sales rep says it will be. What do you think?” And I look at what they're doing and especially if involves high volumes of data transfer, I laugh a good hearty laugh and say, “Yeah, keep that thing in the data center where it is right now. You will thank me for it later.”It's, “Well, can we run this in an economical way in AWS?” As long as you're okay with economical meaning six times what you're paying a year right now for the same thing, yeah, you can. I wouldn't recommend it. And the numbers sort of speak for themselves. But it's not just an economic play.There's also the story of, does this increase their capability? Does it let them move faster toward their business goals? And in a lot of cases, the answer is no, it doesn't. It's one of those business process things that has to exist for a variety of reasons. You don't get to reimagine it for funsies and even if you did, it doesn't advance the company in what they're trying to do any, so focus on something that differentiates as opposed to this thing that you're stuck on.Steve: That's right. And what we see today is, it is easy to be in that mindset of running things on-premises is kind of backwards-facing because the experience of it is today still very, very difficult. I mean, talking to folks and they're sharing with us that it takes a hundred days from the time all the different boxes land in their warehouse to actually having usable infrastructure that developers can use. And our goal and what we intend to go hit with Oxide as you can roll in this complete rack-level system, plug it in, within an hour, you have developers that are accessing cloud-like services out of the infrastructure. And that—God, countless stories of firmware bugs that would send all the fans in the data center nonlinear and soak up 100 kW of power.Corey: Oh, God. And the problems that you had with the out-of-band management systems. For a long time, I thought Drax stood for, “Dell, RMA Another Computer.” It was awful having to deal with those things. There was so much room for innovation in that space, which no one really grabbed onto.Steve: There was a really, really interesting talk at DEFCON that we just stumbled upon yesterday. The NVIDIA folks are giving a talk on BMC exploits… and like, a very, very serious BMC exploit. And again, it's what most people don't know is, like, first of all, the BMC, the Baseboard Management Controller, is like the brainstem of the computer. It has access to—it's a backdoor into all of your infrastructure. It's a computer inside a computer and it's got software and hardware that your server OEM didn't build and doesn't understand very well.And firmware is even worse because you know, firmware written by you know, an American Megatrends or other is a big blob of software that gets loaded into these systems that is very hard to audit and very hard to ascertain what's happening. And it's no surprise when, you know, back when we were running all the data centers at a cloud computing company, that you'd run into these issues, and you'd go to the server OEM and they'd kind of throw their hands up. Well, first they'd gaslight you and say, “We've never seen this problem before,” but when you thought you've root-caused something down to firmware, it was anyone's guess. And this is kind of the current condition today. And back to, like, the journey to get here, we kind of realized that you had to blow away that old extant firmware layer, and we rewrote our own firmware in Rust. Yes [laugh], I've done a lot in Rust.Corey: No, it was in Rust, but, on some level, that's what Nitro is, as best I can tell, on the AWS side. But it turns out that you don't tend to have the same resources as a one-and-a-quarter—at the moment—trillion-dollar company. That keeps [valuing 00:30:53]. At one point, they lost a comma and that was sad and broke all my logic for that and I haven't fixed it since. Unfortunate stuff.Steve: Totally. I think that was another, kind of, question early on from certainly a lot of investors was like, “Hey, how are you going to pull this off with a smaller team and there's a lot of surface area here?” Certainly a reasonable question. Definitely was hard. The one advantage—among others—is, when you are designing something kind of in a vertical holistic manner, those design integration points are narrowed down to just your equipment.And when someone's writing firmware, when AMI is writing firmware, they're trying to do it to cover hundreds and hundreds of components across dozens and dozens of vendors. And we have the advantage of having this, like, purpose-built system, kind of, end-to-end from the lowest level from first boot instruction, all the way up through the control plane and from rack to switch to server. That definitely helped narrow the scope.Corey: This episode has been fake sponsored by our friends at AWS with the following message: Graviton Graviton, Graviton, Graviton, Graviton, Graviton, Graviton, Graviton, Graviton. Thank you for your l-, lack of support for this show. Now, AWS has been talking about Graviton an awful lot, which is their custom in-house ARM processor. Apple moved over to ARM and instead of talking about benchmarks they won't publish and marketing campaigns with words that don't mean anything, they've let the results speak for themselves. In time, I found that almost all of my workloads have moved over to ARM architecture for a variety of reason, and my laptop now gets 15 hours of battery life when all is said and done. You're building these things on top of x86. What is the deal there? I do not accept that if that you hadn't heard of ARM until just now because, as mentioned, Graviton, Graviton, Graviton.Steve: That's right. Well, so why x86, to start? And I say to start because we have just launched our first generation products. And our first-generation or second-generation products that we are now underway working on are going to be x86 as well. We've built this system on AMD Milan silicon; we are going to be launching a Genoa sled.But when you're thinking about what silicon to use, obviously, there's a bunch of parts that go into the decision. You're looking at the kind of applicability to workload, performance, power management, for sure, and if you carve up what you are trying to achieve, x86 is still a terrific fit for the broadest set of workloads that our customers are trying to solve for. And choosing which x86 architecture was certainly an easier choice, come 2019. At this point, AMD had made a bunch of improvements in performance and energy efficiency in the chip itself. We've looked at other architectures and I think as we are incorporating those in the future roadmap, it's just going to be a question of what are you trying to solve for.You mentioned power management, and that is kind of commonly been a, you know, low power systems is where folks have gone beyond x86. Is we're looking forward to hardware acceleration products and future products, we'll certainly look beyond x86, but x86 has a long, long road to go. It still is kind of the foundation for what, again, is a general-purpose cloud infrastructure for being able to slice and dice for a variety of workloads.Corey: True. I have to look around my environment and realize that Intel is not going anywhere. And that's not just an insult to their lack of progress on committed roadmaps that they consistently miss. But—Steve: [sigh].Corey: Enough on that particular topic because we want to keep this, you know, polite.Steve: Intel has definitely had some struggles for sure. They're very public ones, I think. We were really excited and continue to be very excited about their Tofino silicon line. And this came by way of the Barefoot networks acquisition. I don't know how much you had paid attention to Tofino, but what was really, really compelling about Tofino is the focus on both hardware and software and programmability.So, great chip. And P4 is the programming language that surrounds that. And we have gotten very, very deep on P4, and that is some of the best tech to come out of Intel lately. But from a core silicon perspective for the rack, we went with AMD. And again, that was a pretty straightforward decision at the time. And we're planning on having this anchored around AMD silicon for a while now.Corey: One last question I have before we wind up calling it an episode, it seems—at least as of this recording, it's still embargoed, but we're not releasing this until that winds up changing—you folks have just raised another round, which means that your napkin doodles have apparently drawn more folks in, and now that you're shipping, you're also not just bringing in customers, but also additional investor money. Tell me about that.Steve: Yes, we just completed our Series A. So, when we last spoke three years ago, we had just raised our seed and had raised $20 million at the time, and we had expected that it was going to take about that to be able to build the team and build the product and be able to get to market, and [unintelligible 00:36:14] tons of technical risk along the way. I mean, there was technical risk up and down the stack around this [De Novo 00:36:21] server design, this the switch design. And software is still the kind of disproportionate majority of what this product is, from hypervisor up through kind of control plane, the cloud services, et cetera. So—Corey: We just view it as software with a really, really confusing hardware dongle.Steve: [laugh]. Yeah. Yes.Corey: Super heavy. We're talking enterprise and government-grade here.Steve: That's right. There's a lot of software to write. And so, we had a bunch of milestones that as we got through them, one of the big ones was getting Milan silicon booting on our firmware. It was funny it was—this was the thing that clearly, like, the industry was most suspicious of, us doing our own firmware, and you could see it when we demonstrated booting this, like, a year-and-a-half ago, and AMD all of a sudden just lit up, from kind of arm's length to, like, “How can we help? This is amazing.” You know? And they could start to see the benefits of when you can tie low-level silicon intelligence up through a hypervisor there's just—Corey: No I love the existing firmware I have. Looks like it was written in 1984 and winds up having terrible user ergonomics that hasn't been updated at all, and every time something comes through, it's a 50/50 shot as whether it fries the box or not. Yeah. No, I want that.Steve: That's right. And you look at these hyperscale data centers, and it's like, no. I mean, you've got intelligence from that first boot instruction through a Root of Trust, up through the software of the hyperscaler, and up to the user level. And so, as we were going through and kind of knocking down each one of these layers of the stack, doing our own firmware, doing our own hardware Root of Trust, getting that all the way plumbed up into the hypervisor and the control plane, number one on the customer side, folks moved from, “This is really interesting. We need to figure out how we can bring cloud capabilities to our data centers. Talk to us when you have something,” to, “Okay. We actually”—back to the earlier question on vaporware, you know, it was great having customers out here to Emeryville where they can put their hands on the rack and they can, you know, put your hands on software, but being able to, like, look at real running software and that end cloud experience.And that led to getting our first couple of commercial contracts. So, we've got some great first customers, including a large department of the government, of the federal government, and a leading firm on Wall Street that we're going to be shipping systems to in a matter of weeks. And as you can imagine, along with that, that drew a bunch of renewed interest from the investor community. Certainly, a different climate today than it was back in 2019, but what was great to see is, you still have great investors that understand the importance of making bets in the hard tech space and in companies that are looking to reinvent certain industries. And so, we added—our existing investors all participated. We added a bunch of terrific new investors, both strategic and institutional.And you know, this capital is going to be super important now that we are headed into market and we are beginning to scale up the business and make sure that we have a long road to go. And of course, maybe as importantly, this was a real confidence boost for our customers. They're excited to see that Oxide is going to be around for a long time and that they can invest in this technology as an important part of their infrastructure strategy.Corey: I really want to thank you for taking the time to speak with me about, well, how far you've come in a few years. If people want to learn more and have the requisite loading dock, where should they go to find you?Steve: So, we try to put everything up on the site. So, oxidecomputer.com or oxide.computer. We also, if you remember, we did [On the Metal 00:40:07]. So, we had a Tales from the Hardware-Software Interface podcast that we did when we started. We have shifted that to Oxide and Friends, which the shift there is we're spending a little bit more time talking about the guts of what we built and why. So, if folks are interested in, like, why the heck did you build a switch and what does it look like to build a switch, we actually go to depth on that. And you know, what does bring-up on a new server motherboard look like? And it's got some episodes out there that might be worth checking out.Corey: We will definitely include a link to that in the [show notes 00:40:36]. Thank you so much for your time. I really appreciate it.Steve: Yeah, Corey. Thanks for having me on.Corey: Steve Tuck, CEO at Oxide Computer Company. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice, along with an angry ranting comment because you are in fact a zoology major, and you're telling me that some animals do in fact exist. But I'm pretty sure of the two of them, it's the unicorn.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Welcome episode 227 of the Cloud Pod podcast - where the forecast is always cloudy! This week your hosts are Justin, Jonathan, Matthew and Ryan - and they're REALLY excited to tell you all about the 161 one things announced at Google Next. Literally, all the things. We're also saying farewell to EC2 Classic, Amazon SES, and Azure's Explicit Proxy - which probably isn't what you think it is. Titles we almost went with this week:
Welcome episode 226 of the Cloud Pod podcast - where the forecast is always cloudy! This week Justin, Matt and Ryan chat about all the news and announcements from Google Next, including - surprise surprise - the hot topic of AI, GKE Enterprise, Duet, Co-Pilot, Code Whisperer and more! There's even some non-Next news thrown into the episode. So whether you're interested in BART or Bard, we've got the news from SF just for you. Titles we almost went with this week:
Today's Day Two Cloud kicks off an occasional series on cloud essentials. For the first episode we discuss the Virtual Private Cloud (VPC). A VPC is an fundamental construct of a public cloud. It's essentially your slice of the shared cloud infrastructure, and you can launch and run other elements within a VPC to support your workload. Ned Bellavance walks through key VPC components including regions and AZs, networking and IP addressing, paid add-ons, data egress and associated charges, monitoring and troubleshooting, and basic security controls.
Today's Day Two Cloud kicks off an occasional series on cloud essentials. For the first episode we discuss the Virtual Private Cloud (VPC). A VPC is an fundamental construct of a public cloud. It's essentially your slice of the shared cloud infrastructure, and you can launch and run other elements within a VPC to support your workload. Ned Bellavance walks through key VPC components including regions and AZs, networking and IP addressing, paid add-ons, data egress and associated charges, monitoring and troubleshooting, and basic security controls. The post Day Two Cloud 209: Cloud Essentials – Virtual Private Clouds (VPCs) appeared first on Packet Pushers.
Today's Day Two Cloud kicks off an occasional series on cloud essentials. For the first episode we discuss the Virtual Private Cloud (VPC). A VPC is an fundamental construct of a public cloud. It's essentially your slice of the shared cloud infrastructure, and you can launch and run other elements within a VPC to support your workload. Ned Bellavance walks through key VPC components including regions and AZs, networking and IP addressing, paid add-ons, data egress and associated charges, monitoring and troubleshooting, and basic security controls.
Today's Day Two Cloud kicks off an occasional series on cloud essentials. For the first episode we discuss the Virtual Private Cloud (VPC). A VPC is an fundamental construct of a public cloud. It's essentially your slice of the shared cloud infrastructure, and you can launch and run other elements within a VPC to support your workload. Ned Bellavance walks through key VPC components including regions and AZs, networking and IP addressing, paid add-ons, data egress and associated charges, monitoring and troubleshooting, and basic security controls. The post Day Two Cloud 209: Cloud Essentials – Virtual Private Clouds (VPCs) appeared first on Packet Pushers.