Podcasts about open api

  • 154 PODCASTS
  • 229 EPISODES
  • 42m AVG DURATION
  • 1 WEEKLY EPISODE
  • Apr 3, 2025 LATEST



Best podcasts about open api

Latest podcast episodes about open api

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Today's guests, David Soria Parra and Justin Spahr-Summers, are the creators of Anthropic's Model Context Protocol (MCP). When we first wrote Why MCP Won, we had no idea how quickly it was about to win. In the past 4 weeks, OpenAI and now Google have announced MCP support, effectively confirming our prediction that MCP was the presumptive winner of the agent standard wars. MCP has now overtaken OpenAPI, the incumbent option and most direct alternative, in GitHub stars (3 months ahead of a conservative trendline). For protocol and history nerds, we also asked David and Justin to tell the origin story of MCP, which we leave to the reader to enjoy (you can also skim the transcripts, or the changelogs of a certain favored IDE). It's incredible the impact that individual engineers solving their own problems can have on an entire industry. Timestamps 00:00 Introduction and Guest Welcome 00:37 What is MCP? 02:00 The Origin Story of MCP 05:18 Development Challenges and Solutions 08:06 Technical Details and Inspirations 29:45 MCP vs OpenAPI 32:48 Building MCP Servers 40:39 Exploring Model Independence in LLMs 41:36 Building Richer Systems with MCP 43:13 Understanding Agents in MCP 45:45 Nesting and Tool Confusion in MCP 49:11 Client Control and Tool Invocation 52:08 Authorization and Trust in MCP Servers 01:01:34 Future Roadmap and Stateless Servers 01:10:07 Open Source Governance and Community Involvement 01:18:12 Wishlist and Closing Remarks

Torsion Talk Podcast
Torsion Talk S8 Ep101: Puerto Rico Recap, GDU Intensive Launch & Insight for Doors Demo Review

Torsion Talk Podcast

Apr 1, 2025 · 27:23


Fresh off an incredible summit in Puerto Rico, Ryan is back on Torsion Talk with major updates, new product announcements, and a detailed review of his demo with Insight for Doors! In this episode, he shares key takeaways from the summit, discusses the future of FSM software for his business, and breaks down the pros and cons of Insight for Doors compared to FieldPulse and ServiceTitan. Episode Highlights:

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

If you're in SF: Join us for the Claude Plays Pokemon hackathon this Sunday!If you're not: Fill out the 2025 State of AI Eng survey for $250 in Amazon cards!We are SO excited to share our conversation with Dharmesh Shah, co-founder of HubSpot and creator of Agent.ai.A particularly compelling concept we discussed is the idea of "hybrid teams" - the next evolution in workplace organization where human workers collaborate with AI agents as team members. Just as we previously saw hybrid teams emerge in terms of full-time vs. contract workers, or in-office vs. remote workers, Dharmesh predicts that the next frontier will be teams composed of both human and AI members. This raises interesting questions about team dynamics, trust, and how to effectively delegate tasks between human and AI team members.The discussion of business models in AI reveals an important distinction between Work as a Service (WaaS) and Results as a Service (RaaS), something Dharmesh has written extensively about. While RaaS has gained popularity, particularly in customer support applications where outcomes are easily measurable, Dharmesh argues that this model may be over-indexed. Not all AI applications have clearly definable outcomes or consistent economic value per transaction, making WaaS more appropriate in many cases. This insight is particularly relevant for businesses considering how to monetize AI capabilities.The technical challenges of implementing effective agent systems are also explored, particularly around memory and authentication. Shah emphasizes the importance of cross-agent memory sharing and the need for more granular control over data access. He envisions a future where users can selectively share parts of their data with different agents, similar to how OAuth works but with much finer control. 
This points to significant opportunities in developing infrastructure for secure and efficient agent-to-agent communication and data sharing.Other highlights from our conversation* The Evolution of AI-Powered Agents – Exploring how AI agents have evolved from simple chatbots to sophisticated multi-agent systems, and the role of MCPs in enabling that.* Hybrid Digital Teams and the Future of Work – How AI agents are becoming teammates rather than just tools, and what this means for business operations and knowledge work.* Memory in AI Agents – The importance of persistent memory in AI systems and how shared memory across agents could enhance collaboration and efficiency.* Business Models for AI Agents – Exploring the shift from software as a service (SaaS) to work as a service (WaaS) and results as a service (RaaS), and what this means for monetization.* The Role of Standards Like MCP – Why MCP has been widely adopted and how it enables agent collaboration, tool use, and discovery.* The Future of AI Code Generation and Software Engineering – How AI-assisted coding is changing the role of software engineers and what skills will matter most in the future.* Domain Investing and Efficient Markets – Dharmesh's approach to domain investing and how inefficiencies in digital asset markets create business opportunities.* The Philosophy of Saying No – Lessons from "Sorry, You Must Pass" and how prioritization leads to greater productivity and focus.Timestamps* 00:00 Introduction and Guest Welcome* 02:29 Dharmesh Shah's Journey into AI* 05:22 Defining AI Agents* 06:45 The Evolution and Future of AI Agents* 13:53 Graph Theory and Knowledge Representation* 20:02 Engineering Practices and Overengineering* 25:57 The Role of Junior Engineers in the AI Era* 28:20 Multi-Agent Systems and MCP Standards* 35:55 LinkedIn's Legal Battles and Data Scraping* 37:32 The Future of AI and Hybrid Teams* 39:19 Building Agent AI: A Professional Network for Agents* 40:43 Challenges and Innovations in Agent AI* 45:02 The Evolution of UI in AI Systems* 01:00:25 Business Models: Work as a Service vs. Results as a Service* 01:09:17 The Future Value of Engineers* 01:09:51 Exploring the Role of Agents* 01:10:28 The Importance of Memory in AI* 01:11:02 Challenges and Opportunities in AI Memory* 01:12:41 Selective Memory and Privacy Concerns* 01:13:27 The Evolution of AI Tools and Platforms* 01:18:23 Domain Names and AI Projects* 01:32:08 Balancing Work and Personal Life* 01:35:52 Final Thoughts and ReflectionsTranscriptAlessio [00:00:04]: Hey everyone, welcome back to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Small AI.swyx [00:00:12]: Hello, and today we're super excited to have Dharmesh Shah to join us. I guess your relevant title here is founder of Agent AI.Dharmesh [00:00:20]: Yeah, that's true for this. Yeah, creator of Agent.ai and co-founder of HubSpot.swyx [00:00:25]: Co-founder of HubSpot, which I followed for many years, I think 18 years now, gonna be 19 soon. And you caught, you know, people can catch up on your HubSpot story elsewhere. I should also thank Sean Puri, who I've chatted with back and forth, who's been, I guess, getting me in touch with your people. But also, I think like, just giving us a lot of context, because obviously, My First Million joined you guys, and they've been chatting with you guys a lot. So for the business side, we can talk about that, but I kind of wanted to engage your CTO, agent, engineer side of things. 
So how did you get agent religion?Dharmesh [00:01:00]: Let's see. So I've been working, I'll take like a half step back, a decade or so ago, even though actually more than that. So even before HubSpot, the company I was contemplating that I had named for was called Ingenisoft. And the idea behind Ingenisoft was a natural language interface to business software. Now realize this is 20 years ago, so that was a hard thing to do. But the actual use case that I had in mind was, you know, we had data sitting in business systems like a CRM or something like that. And my kind of what I thought clever at the time. Oh, what if we used email as the kind of interface to get to business software? And the motivation for using email is that it automatically works when you're offline. So imagine I'm getting on a plane or I'm on a plane. There was no internet on planes back then. It's like, oh, I'm going through business cards from an event I went to. I can just type things into an email just to have them all in the backlog. When it reconnects, it sends those emails to a processor that basically kind of parses effectively the commands and updates the software, sends you the file, whatever it is. And there was a handful of commands. I was a little bit ahead of the times in terms of what was actually possible. And I reattempted this natural language thing with a product called ChatSpot that I did back 20...swyx [00:02:12]: Yeah, this is your first post-ChatGPT project.Dharmesh [00:02:14]: I saw it come out. Yeah. And so I've always been kind of fascinated by this natural language interface to software. Because, you know, as software developers, myself included, we've always said, oh, we build intuitive, easy-to-use applications. And it's not intuitive at all, right? Because what we're doing is... We're taking the mental model that's in our head of what we're trying to accomplish with said piece of software and translating that into a series of touches and swipes and clicks and things like that. And there's nothing natural or intuitive about it. And so natural language interfaces, for the first time, you know, whatever the thought is you have in your head and expressed in whatever language that you normally use to talk to yourself in your head, you can just sort of emit that and have software do something. And I thought that was kind of a breakthrough, which it has been. And it's gone. So that's where I first started getting into the journey. I started because now it actually works, right? So once we got ChatGPT and you can take, even with a few-shot example, convert something into structured, even back in the ChatGP 3.5 days, it did a decent job in a few-shot example, convert something to structured text if you knew what kinds of intents you were going to have. And so that happened. And that ultimately became a HubSpot project. But then agents intrigued me because I'm like, okay, well, that's the next step here. So chat's great. Love Chat UX. But if we want to do something even more meaningful, it felt like the next kind of advancement is not this kind of, I'm chatting with some software in a kind of a synchronous back and forth model, is that software is going to do things for me in kind of a multi-step way to try and accomplish some goals. So, yeah, that's when I first got started. It's like, okay, what would that look like? Yeah. And I've been obsessed ever since, by the way.Alessio [00:03:55]: Which goes back to your first experience with it, which is like you're offline. Yeah. And you want to do a task. 
You don't need to do it right now. You just want to queue it up for somebody to do it for you. Yes. As you think about agents, like, let's start at the easy question, which is like, how do you define an agent? Maybe. You mean the hardest question in the universe? Is that what you mean?Dharmesh [00:04:12]: You said you have an irritating take. I do have an irritating take. I think, well, some number of people have been irritated, including within my own team. So I have a very broad definition for agents, which is it's AI-powered software that accomplishes a goal. Period. That's it. And what irritates people about it is like, well, that's so broad as to be completely non-useful. And I understand that. I understand the criticism. But in my mind, if you kind of fast forward months, I guess, in AI years, the implementation of it, and we're already starting to see this, and we'll talk about this, different kinds of agents, right? So I think in addition to having a usable definition, and I like yours, by the way, and we should talk more about that, that you just came out with, the classification of agents actually is also useful, which is, is it autonomous or non-autonomous? Does it have a deterministic workflow? Does it have a non-deterministic workflow? Is it working synchronously? Is it working asynchronously? Then you have the different kind of interaction modes. Is it a chat agent, kind of like a customer support agent would be? You're having this kind of back and forth. Is it a workflow agent that just does a discrete number of steps? So there's all these different flavors of agents. So if I were to draw it in a Venn diagram, I would draw a big circle that says, this is agents, and then I have a bunch of circles, some overlapping, because they're not mutually exclusive. And so I think that's what's interesting, and we're seeing development along a bunch of different paths, right? So if you look at the first implementation of agent frameworks, you look at Baby AGI and AutoGBT, I think it was, not Autogen, that's the Microsoft one. They were way ahead of their time because they assumed this level of reasoning and execution and planning capability that just did not exist, right? So it was an interesting thought experiment, which is what it was. Even the guy that, I'm an investor in Yohei's fund that did Baby AGI. It wasn't ready, but it was a sign of what was to come. And so the question then is, when is it ready? And so lots of people talk about the state of the art when it comes to agents. I'm a pragmatist, so I think of the state of the practical. It's like, okay, well, what can I actually build that has commercial value or solves actually some discrete problem with some baseline of repeatability or verifiability?swyx [00:06:22]: There was a lot, and very, very interesting. I'm not irritated by it at all. Okay. As you know, I take a... There's a lot of anthropological view or linguistics view. And in linguistics, you don't want to be prescriptive. You want to be descriptive. Yeah. So you're a goals guy. That's the key word in your thing. And other people have other definitions that might involve like delegated trust or non-deterministic work, LLM in the loop, all that stuff. The other thing I was thinking about, just the comment on Baby AGI, LGBT. Yeah. In that piece that you just read, I was able to go through our backlog and just kind of track the winter of agents and then the summer now. Yeah. And it's... We can tell the whole story as an oral history, just following that thread. 
And it's really just like, I think, I tried to explain the why now, right? Like I had, there's better models, of course. There's better tool use with like, they're just more reliable. Yep. Better tools with MCP and all that stuff. And I'm sure you have opinions on that too. Business model shift, which you like a lot. I just heard you talk about RAS with MFM guys. Yep. Cost is dropping a lot. Yep. Inference is getting faster. There's more model diversity. Yep. Yep. I think it's a subtle point. It means that like, you have different models with different perspectives. You don't get stuck in the basin of performance of a single model. Sure. You can just get out of it by just switching models. Yep. Multi-agent research and RL fine tuning. So I just wanted to let you respond to like any of that.Dharmesh [00:07:44]: Yeah. A couple of things. Connecting the dots on the kind of the definition side of it. So we'll get the irritation out of the way completely. I have one more, even more irritating leap on the agent definition thing. So here's the way I think about it. By the way, the kind of word agent, I looked it up, like the English dictionary definition. The old school agent, yeah. Is when you have someone or something that does something on your behalf, like a travel agent or a real estate agent acts on your behalf. It's like proxy, which is a nice kind of general definition. So the other direction I'm sort of headed, and it's going to tie back to tool calling and MCP and things like that, is if you, and I'm not a biologist by any stretch of the imagination, but we have these single-celled organisms, right? Like the simplest possible form of what one would call life. But it's still life. It just happens to be single-celled. And then you can combine cells and then cells become specialized over time. And you have much more sophisticated organisms, you know, kind of further down the spectrum. In my mind, at the most fundamental level, you can almost think of having atomic agents. What is the simplest possible thing that's an agent that can still be called an agent? What is the equivalent of a kind of single-celled organism? And the reason I think that's useful is right now we're headed down the road, which I think is very exciting around tool use, right? That says, okay, the LLMs now can be provided a set of tools that it calls to accomplish whatever it needs to accomplish in the kind of furtherance of whatever goal it's trying to get done. And I'm not overly bothered by it, but if you think about it, if you just squint a little bit and say, well, what if everything was an agent? And what if tools were actually just atomic agents? Because then it's turtles all the way down, right? Then it's like, oh, well, all that's really happening with tool use is that we have a network of agents that know about each other through something like an MMCP and can kind of decompose a particular problem and say, oh, I'm going to delegate this to this set of agents. And why do we need to draw this distinction between tools, which are functions most of the time? And an actual agent. And so I'm going to write this irritating LinkedIn post, you know, proposing this. It's like, okay. And I'm not suggesting we should call even functions, you know, call them agents. But there is a certain amount of elegance that happens when you say, oh, we can just reduce it down to one primitive, which is an agent that you can combine in complicated ways to kind of raise the level of abstraction and accomplish higher order goals. 
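A toy sketch of the "one primitive" idea Dharmesh describes above, where a tool is just the atomic, single-step case of an agent and bigger agents are compositions of smaller ones. This is purely illustrative Python, not anything from MCP or agent.ai; the Agent shape and helper names are made up.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    """One primitive: something that takes a goal and does work toward it."""
    name: str
    run: Callable[[str], str]


def atomic(name: str, fn: Callable[[str], str]) -> Agent:
    """A 'tool' is just the single-celled case: one function, one step."""
    return Agent(name=name, run=fn)


def composite(name: str, members: list[Agent]) -> Agent:
    """A higher-order agent that delegates the goal to other agents and merges results."""
    def run(goal: str) -> str:
        return "\n".join(f"{m.name}: {m.run(goal)}" for m in members)
    return Agent(name=name, run=run)


# Tools and agents share one shape, so they compose freely ("turtles all the way down").
word_count = atomic("word_count", lambda text: str(len(text.split())))
shout = atomic("shout", lambda text: text.upper())
team = composite("editor_team", [word_count, shout])
print(team.run("agents all the way down"))
```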
Anyway, that's my answer. I'd say that's a success. Thank you for coming to my TED Talk on agent definitions.Alessio [00:09:54]: How do you define the minimum viable agent? Do you already have a definition for, like, where you draw the line between a cell and an atom? Yeah.Dharmesh [00:10:02]: So in my mind, it has to, at some level, use AI in order for it to—otherwise, it's just software. It's like, you know, we don't need another word for that. And so that's probably where I draw the line. So then the question, you know, the counterargument would be, well, if that's true, then lots of tools themselves are actually not agents because they're just doing a database call or a REST API call or whatever it is they're doing. And that does not necessarily qualify them, which is a fair counterargument. And I accept that. It's like a good argument. I still like to think about—because we'll talk about multi-agent systems, because I think—so we've accepted, which I think is true, lots of people have said it, and you've hopefully combined some of those clips of really smart people saying this is the year of agents, and I completely agree, it is the year of agents. But then shortly after that, it's going to be the year of multi-agent systems or multi-agent networks. I think that's where it's going to be headed next year. Yeah.swyx [00:10:54]: Opening eyes already on that. Yeah. My quick philosophical engagement with you on this. I often think about kind of the other spectrum, the other end of the cell spectrum. So single cell is life, multi-cell is life, and you clump a bunch of cells together in a more complex organism, they become organs, like an eye and a liver or whatever. And then obviously we consider ourselves one life form. There's not like a lot of lives within me. I'm just one life. And now, obviously, I don't think people don't really like to anthropomorphize agents and AI. Yeah. But we are extending our consciousness and our brain and our functionality out into machines. I just saw you were a Bee. Yeah. Which is, you know, it's nice. I have a limitless pendant in my pocket.Dharmesh [00:11:37]: I got one of these boys. Yeah.swyx [00:11:39]: I'm testing it all out. You know, got to be early adopters. But like, we want to extend our personal memory into these things so that we can be good at the things that we're good at. And, you know, machines are good at it. Machines are there. So like, my definition of life is kind of like going outside of my own body now. I don't know if you've ever had like reflections on that. Like how yours. How our self is like actually being distributed outside of you. Yeah.Dharmesh [00:12:01]: I don't fancy myself a philosopher. But you went there. So yeah, I did go there. I'm fascinated by kind of graphs and graph theory and networks and have been for a long, long time. And to me, we're sort of all nodes in this kind of larger thing. It just so happens that we're looking at individual kind of life forms as they exist right now. But so the idea is when you put a podcast out there, there's these little kind of nodes you're putting out there of like, you know, conceptual ideas. Once again, you have varying kind of forms of those little nodes that are up there and are connected in varying and sundry ways. And so I just think of myself as being a node in a massive, massive network. And I'm producing more nodes as I put content or ideas. 
And, you know, you spend some portion of your life collecting dots, experiences, people, and some portion of your life then connecting dots from the ones that you've collected over time. And I found that really interesting things happen and you really can't know in advance how those dots are necessarily going to connect in the future. And that's, yeah. So that's my philosophical take. That's the, yes, exactly. Coming back.Alessio [00:13:04]: Yep. Do you like graph as an agent? Abstraction? That's been one of the hot topics with LandGraph and Pydantic and all that.Dharmesh [00:13:11]: I do. The thing I'm more interested in terms of use of graphs, and there's lots of work happening on that now, is graph data stores as an alternative in terms of knowledge stores and knowledge graphs. Yeah. Because, you know, so I've been in software now 30 plus years, right? So it's not 10,000 hours. It's like 100,000 hours that I've spent doing this stuff. And so I've grew up with, so back in the day, you know, I started on mainframes. There was a product called IMS from IBM, which is basically an index database, what we'd call like a key value store today. Then we've had relational databases, right? We have tables and columns and foreign key relationships. We all know that. We have document databases like MongoDB, which is sort of a nested structure keyed by a specific index. We have vector stores, vector embedding database. And graphs are interesting for a couple of reasons. One is, so it's not classically structured in a relational way. When you say structured database, to most people, they're thinking tables and columns and in relational database and set theory and all that. Graphs still have structure, but it's not the tables and columns structure. And you could wonder, and people have made this case, that they are a better representation of knowledge for LLMs and for AI generally than other things. So that's kind of thing number one conceptually, and that might be true, I think is possibly true. And the other thing that I really like about that in the context of, you know, I've been in the context of data stores for RAG is, you know, RAG, you say, oh, I have a million documents, I'm going to build the vector embeddings, I'm going to come back with the top X based on the semantic match, and that's fine. All that's very, very useful. But the reality is something gets lost in the chunking process and the, okay, well, those tend, you know, like, you don't really get the whole picture, so to speak, and maybe not even the right set of dimensions on the kind of broader picture. And it makes intuitive sense to me that if we did capture it properly in a graph form, that maybe that feeding into a RAG pipeline will actually yield better results for some use cases, I don't know, but yeah.Alessio [00:15:03]: And do you feel like at the core of it, there's this difference between imperative and declarative programs? Because if you think about HubSpot, it's like, you know, people and graph kind of goes hand in hand, you know, but I think maybe the software before was more like primary foreign key based relationship, versus now the models can traverse through the graph more easily.Dharmesh [00:15:22]: Yes. So I like that representation. There's something. It's just conceptually elegant about graphs and just from the representation of it, they're much more discoverable, you can kind of see it, there's observability to it, versus kind of embeddings, which you can't really do much with as a human. 
You know, once they're in there, you can't pull stuff back out. But yeah, I like that kind of idea of it. And the other thing that's kind of, because I love graphs, I've been long obsessed with PageRank from back in the early days. And, you know, one of the kind of simplest algorithms in terms of coming up, you know, with a phone, everyone's been exposed to PageRank. And the idea is that, and so I had this other idea for a project, not a company, and I have hundreds of these, called NodeRank, is to be able to take the idea of PageRank and apply it to an arbitrary graph that says, okay, I'm going to define what authority looks like and say, okay, well, that's interesting to me, because then if you say, I'm going to take my knowledge store, and maybe this person that contributed some number of chunks to the graph data store has more authority on this particular use case or prompt that's being submitted than this other one that may, or maybe this one was more. popular, or maybe this one has, whatever it is, there should be a way for us to kind of rank nodes in a graph and sort them in some, some useful way. Yeah.swyx [00:16:34]: So I think that's generally useful for, for anything. I think the, the problem, like, so even though at my conferences, GraphRag is super popular and people are getting knowledge, graph religion, and I will say like, it's getting space, getting traction in two areas, conversation memory, and then also just rag in general, like the, the, the document data. Yeah. It's like a source. Most ML practitioners would say that knowledge graph is kind of like a dirty word. The graph database, people get graph religion, everything's a graph, and then they, they go really hard into it and then they get a, they get a graph that is too complex to navigate. Yes. And so like the, the, the simple way to put it is like you at running HubSpot, you know, the power of graphs, the way that Google has pitched them for many years, but I don't suspect that HubSpot itself uses a knowledge graph. No. Yeah.Dharmesh [00:17:26]: So when is it over engineering? Basically? It's a great question. I don't know. So the question now, like in AI land, right, is the, do we necessarily need to understand? So right now, LLMs for, for the most part are somewhat black boxes, right? We sort of understand how the, you know, the algorithm itself works, but we really don't know what's going on in there and, and how things come out. So if a graph data store is able to produce the outcomes we want, it's like, here's a set of queries I want to be able to submit and then it comes out with useful content. Maybe the underlying data store is as opaque as a vector embeddings or something like that, but maybe it's fine. Maybe we don't necessarily need to understand it to get utility out of it. And so maybe if it's messy, that's okay. Um, that's, it's just another form of lossy compression. Uh, it's just lossy in a way that we just don't completely understand in terms of, because it's going to grow organically. Uh, and it's not structured. It's like, ah, we're just gonna throw a bunch of stuff in there. Let the, the equivalent of the embedding algorithm, whatever they called in graph land. Um, so the one with the best results wins. I think so. Yeah.swyx [00:18:26]: Or is this the practical side of me is like, yeah, it's, if it's useful, we don't necessarilyDharmesh [00:18:30]: need to understand it.swyx [00:18:30]: I have, I mean, I'm happy to push back as long as you want. 
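The NodeRank idea above is essentially PageRank run over an arbitrary knowledge graph with your own definition of authority. A generic power-iteration sketch with a made-up toy graph (this is not Dharmesh's actual project, just the textbook algorithm):

```python
def pagerank(graph: dict[str, list[str]], damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    """Rank nodes of an arbitrary directed graph by 'authority' via plain power iteration."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node, out_links in graph.items():
            if not out_links:  # dangling node: spread its rank evenly
                for n in nodes:
                    new_rank[n] += damping * rank[node] / len(nodes)
            else:
                for target in out_links:
                    new_rank[target] += damping * rank[node] / len(out_links)
        rank = new_rank
    return rank


# Toy knowledge graph: an edge means "this node lends authority to that node".
graph = {
    "alice": ["pricing_doc", "onboarding_doc"],
    "bob": ["pricing_doc"],
    "pricing_doc": ["onboarding_doc"],
    "onboarding_doc": [],
}
for node, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
    print(f"{node:15s} {score:.3f}")
```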
Uh, it's not practical to evaluate like the 10 different options out there because it takes time. It takes people, it takes, you know, resources, right? Set. That's the first thing. Second thing is your evals are typically on small things and some things only work at scale. Yup. Like graphs. Yup.Dharmesh [00:18:46]: Yup. That's, yeah, no, that's fair. And I think this is one of the challenges in terms of implementation of graph databases is that the most common approach that I've seen developers do, I've done it myself, is that, oh, I've got a Postgres database or a MySQL or whatever. I can represent a graph with a very set of tables with a parent child thing or whatever. And that sort of gives me the ability, uh, why would I need anything more than that? And the answer is, well, if you don't need anything more than that, you don't need anything more than that. But there's a high chance that you're sort of missing out on the actual value that, uh, the graph representation gives you. Which is the ability to traverse the graph, uh, efficiently in ways that kind of going through the, uh, traversal in a relational database form, even though structurally you have the data, practically you're not gonna be able to pull it out in, in useful ways. Uh, so you wouldn't like represent a social graph, uh, in, in using that kind of relational table model. It just wouldn't scale. It wouldn't work.swyx [00:19:36]: Uh, yeah. Uh, I think we want to move on to MCP. Yeah. But I just want to, like, just engineering advice. Yeah. Uh, obviously you've, you've, you've run, uh, you've, you've had to do a lot of projects and run a lot of teams. Do you have a general rule for over-engineering or, you know, engineering ahead of time? You know, like, because people, we know premature engineering is the root of all evil. Yep. But also sometimes you just have to. Yep. When do you do it? Yes.Dharmesh [00:19:59]: It's a great question. This is, uh, a question as old as time almost, which is what's the right and wrong levels of abstraction. That's effectively what, uh, we're answering when we're trying to do engineering. I tend to be a pragmatist, right? So here's the thing. Um, lots of times doing something the right way. Yeah. It's like a marginal increased cost in those cases. Just do it the right way. And this is what makes a, uh, a great engineer or a good engineer better than, uh, a not so great one. It's like, okay, all things being equal. If it's going to take you, you know, roughly close to constant time anyway, might as well do it the right way. Like, so do things well, then the question is, okay, well, am I building a framework as the reusable library? To what degree, uh, what am I anticipating in terms of what's going to need to change in this thing? Uh, you know, along what dimension? And then I think like a business person in some ways, like what's the return on calories, right? So, uh, and you look at, um, energy, the expected value of it's like, okay, here are the five possible things that could happen, uh, try to assign probabilities like, okay, well, if there's a 50% chance that we're going to go down this particular path at some day, like, or one of these five things is going to happen and it costs you 10% more to engineer for that. It's basically, it's something that yields a kind of interest compounding value. Um, as you get closer to the time of, of needing that versus having to take on debt, which is when you under engineer it, you're taking on debt. 
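Dharmesh's point above about modeling a graph as parent/child rows in a relational database can be made concrete: the edge table stores the structure fine, but multi-hop questions turn into self-joins and recursive queries that re-scan the table at every hop, which is exactly the traversal work a graph database is built for. A small sketch, using SQLite only to stay self-contained:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (parent TEXT, child TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("ann", "bob"), ("bob", "cara"), ("cara", "dev"), ("ann", "eve")],
)

# One hop is a simple lookup; two hops already means a self-join.
two_hops = conn.execute(
    """
    SELECT e1.parent, e2.child
    FROM edges e1 JOIN edges e2 ON e1.child = e2.parent
    WHERE e1.parent = 'ann'
    """
).fetchall()
print(two_hops)  # [('ann', 'cara')]

# Arbitrary depth needs a recursive CTE, and every extra hop re-scans the edge table.
reachable = conn.execute(
    """
    WITH RECURSIVE walk(node) AS (
        SELECT 'ann'
        UNION
        SELECT e.child FROM edges e JOIN walk w ON e.parent = w.node
    )
    SELECT node FROM walk
    """
).fetchall()
print(reachable)  # ann plus everything reachable from her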
You're going to have to pay off when you do get to that eventuality where something happens. One thing as a pragmatist, uh, so I would rather under engineer something than over engineer it. If I were going to err on the side of something, and here's the reason is that when you under engineer it, uh, yes, you take on tech debt, uh, but the interest rate is relatively known and payoff is very, very possible, right? Which is, oh, I took a shortcut here as a result of which now this thing that should have taken me a week is now going to take me four weeks. Fine. But if that particular thing that you thought might happen, never actually, you never have that use case transpire or just doesn't, it's like, well, you just save yourself time, right? And that has value because you were able to do other things instead of, uh, kind of slightly over-engineering it away, over-engineering it. But there's no perfect answers in art form in terms of, uh, and yeah, we'll, we'll bring kind of this layers of abstraction back on the code generation conversation, which we'll, uh, I think I have later on, butAlessio [00:22:05]: I was going to ask, we can just jump ahead quickly. Yeah. Like, as you think about vibe coding and all that, how does the. Yeah. Percentage of potential usefulness change when I feel like we over-engineering a lot of times it's like the investment in syntax, it's less about the investment in like arc exacting. Yep. Yeah. How does that change your calculus?Dharmesh [00:22:22]: A couple of things, right? One is, um, so, you know, going back to that kind of ROI or a return on calories, kind of calculus or heuristic you think through, it's like, okay, well, what is it going to cost me to put this layer of abstraction above the code that I'm writing now, uh, in anticipating kind of future needs. If the cost of fixing, uh, or doing under engineering right now. Uh, we'll trend towards zero that says, okay, well, I don't have to get it right right now because even if I get it wrong, I'll run the thing for six hours instead of 60 minutes or whatever. It doesn't really matter, right? Like, because that's going to trend towards zero to be able, the ability to refactor a code. Um, and because we're going to not that long from now, we're going to have, you know, large code bases be able to exist, uh, you know, as, as context, uh, for a code generation or a code refactoring, uh, model. So I think it's going to make it, uh, make the case for under engineering, uh, even stronger. Which is why I take on that cost. You just pay the interest when you get there, it's not, um, just go on with your life vibe coded and, uh, come back when you need to. Yeah.Alessio [00:23:18]: Sometimes I feel like there's no decision-making in some things like, uh, today I built a autosave for like our internal notes platform and I literally just ask them cursor. Can you add autosave? Yeah. I don't know if it's over under engineer. Yep. I just vibe coded it. Yep. And I feel like at some point we're going to get to the point where the models kindDharmesh [00:23:36]: of decide where the right line is, but this is where the, like the, in my mind, the danger is, right? So there's two sides to this. One is the cost of kind of development and coding and things like that stuff that, you know, we talk about. 
But then like in your example, you know, one of the risks that we have is that because adding a feature, uh, like a save or whatever the feature might be to a product as that price tends towards zero, are we going to be less discriminant about what features we add as a result of making more product products more complicated, which has a negative impact on the user and navigate negative impact on the business. Um, and so that's the thing I worry about if it starts to become too easy, are we going to be. Too promiscuous in our, uh, kind of extension, adding product extensions and things like that. It's like, ah, why not add X, Y, Z or whatever back then it was like, oh, we only have so many engineering hours or story points or however you measure things. Uh, that least kept us in check a little bit. Yeah.Alessio [00:24:22]: And then over engineering, you're like, yeah, it's kind of like you're putting that on yourself. Yeah. Like now it's like the models don't understand that if they add too much complexity, it's going to come back to bite them later. Yep. So they just do whatever they want to do. Yeah. And I'm curious where in the workflow that's going to be, where it's like, Hey, this is like the amount of complexity and over-engineering you can do before you got to ask me if we should actually do it versus like do something else.Dharmesh [00:24:45]: So you know, we've already, let's like, we're leaving this, uh, in the code generation world, this kind of compressed, um, cycle time. Right. It's like, okay, we went from auto-complete, uh, in the GitHub co-pilot to like, oh, finish this particular thing and hit tab to a, oh, I sort of know your file or whatever. I can write out a full function to you to now I can like hold a bunch of the context in my head. Uh, so we can do app generation, which we have now with lovable and bolt and repletage. Yeah. Association and other things. So then the question is, okay, well, where does it naturally go from here? So we're going to generate products. Make sense. We might be able to generate platforms as though I want a platform for ERP that does this, whatever. And that includes the API's includes the product and the UI, and all the things that make for a platform. There's no nothing that says we would stop like, okay, can you generate an entire software company someday? Right. Uh, with the platform and the monetization and the go-to-market and the whatever. And you know, that that's interesting to me in terms of, uh, you know, what, when you take it to almost ludicrous levels. of abstract.swyx [00:25:39]: It's like, okay, turn it to 11. You mentioned vibe coding, so I have to, this is a blog post I haven't written, but I'm kind of exploring it. Is the junior engineer dead?Dharmesh [00:25:49]: I don't think so. I think what will happen is that the junior engineer will be able to, if all they're bringing to the table is the fact that they are a junior engineer, then yes, they're likely dead. But hopefully if they can communicate with carbon-based life forms, they can interact with product, if they're willing to talk to customers, they can take their kind of basic understanding of engineering and how kind of software works. I think that has value. So I have a 14-year-old right now who's taking Python programming class, and some people ask me, it's like, why is he learning coding? And my answer is, is because it's not about the syntax, it's not about the coding. What he's learning is like the fundamental thing of like how things work. 
And there's value in that. I think there's going to be timeless value in systems thinking and abstractions and what that means. And whether functions manifested as math, which he's going to get exposed to regardless, or there are some core primitives to the universe, I think, that the more you understand them, those are what I would kind of think of as like really large dots in your life that will have a higher gravitational pull and value to them that you'll then be able to. So I want him to collect those dots, and he's not resisting. So it's like, okay, while he's still listening to me, I'm going to have him do things that I think will be useful.swyx [00:26:59]: You know, part of one of the pitches that I evaluated for AI engineer is a term. And the term is that maybe the traditional interview path or career path of software engineer goes away, which is because what's the point of lead code? Yeah. And, you know, it actually matters more that you know how to work with AI and to implement the things that you want. Yep.Dharmesh [00:27:16]: That's one of the like interesting things that's happened with generative AI. You know, you go from machine learning and the models and just that underlying form, which is like true engineering, right? Like the actual, what I call real engineering. I don't think of myself as a real engineer, actually. I'm a developer. But now with generative AI. We call it AI and it's obviously got its roots in machine learning, but it just feels like fundamentally different to me. Like you have the vibe. It's like, okay, well, this is just a whole different approach to software development to so many different things. And so I'm wondering now, it's like an AI engineer is like, if you were like to draw the Venn diagram, it's interesting because the cross between like AI things, generative AI and what the tools are capable of, what the models do, and this whole new kind of body of knowledge that we're still building out, it's still very young, intersected with kind of classic engineering, software engineering. Yeah.swyx [00:28:04]: I just described the overlap as it separates out eventually until it's its own thing, but it's starting out as a software. Yeah.Alessio [00:28:11]: That makes sense. So to close the vibe coding loop, the other big hype now is MCPs. Obviously, I would say Cloud Desktop and Cursor are like the two main drivers of MCP usage. I would say my favorite is the Sentry MCP. I can pull in errors and then you can just put the context in Cursor. How do you think about that abstraction layer? Does it feel... Does it feel almost too magical in a way? Do you think it's like you get enough? Because you don't really see how the server itself is then kind of like repackaging theDharmesh [00:28:41]: information for you? I think MCP as a standard is one of the better things that's happened in the world of AI because a standard needed to exist and absent a standard, there was a set of things that just weren't possible. Now, we can argue whether it's the best possible manifestation of a standard or not. Does it do too much? Does it do too little? I get that, but it's just simple enough to both be useful and unobtrusive. It's understandable and adoptable by mere mortals, right? It's not overly complicated. You know, a reasonable engineer can put a stand up an MCP server relatively easily. The thing that has me excited about it is like, so I'm a big believer in multi-agent systems. And so that's going back to our kind of this idea of an atomic agent. 
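For a sense of how low the bar is that Dharmesh mentions ("a reasonable engineer can stand up an MCP server relatively easily"), a minimal server following the shape of the official Python SDK's FastMCP quickstart might look roughly like this. The domain_valuation tool is a made-up example, and the SDK surface may have changed since this episode:

```python
# A sketch of a minimal MCP server, following the shape of the Python SDK's
# FastMCP quickstart; the domain_valuation tool is a made-up example.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")


@mcp.tool()
def domain_valuation(domain: str) -> str:
    """Return a rough (fake) valuation for a domain name."""
    estimate = max(100, 50_000 - 2_000 * len(domain))  # placeholder heuristic
    return f"{domain} is worth roughly ${estimate:,}"


if __name__ == "__main__":
    mcp.run()  # stdio transport by default, so clients like Claude Desktop or Cursor can launch it
```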
So imagine the MCP server, like obviously it calls tools, but the way I think about it, so I'm working on my current passion project is agent.ai. And we'll talk more about that in a little bit. More about the, I think we should, because I think it's interesting not to promote the project at all, but there's some interesting ideas in there. One of which is around, we're going to need a mechanism for, if agents are going to collaborate and be able to delegate, there's going to need to be some form of discovery and we're going to need some standard way. It's like, okay, well, I just need to know what this thing over here is capable of. We're going to need a registry, which Anthropic's working on. I'm sure others will and have been doing directories of, and there's going to be a standard around that too. How do you build out a directory of MCP servers? I think that's going to unlock so many things just because, and we're already starting to see it. So I think MCP or something like it is going to be the next major unlock because it allows systems that don't know about each other, don't need to, it's that kind of decoupling of like Sentry and whatever tools someone else was building. And it's not just about, you know, Cloud Desktop or things like, even on the client side, I think we're going to see very interesting consumers of MCP, MCP clients versus just the chat body kind of things. Like, you know, Cloud Desktop and Cursor and things like that. But yeah, I'm very excited about MCP in that general direction.swyx [00:30:39]: I think the typical cynical developer take, it's like, we have OpenAPI. Yeah. What's the new thing? I don't know if you have a, do you have a quick MCP versus everything else? Yeah.Dharmesh [00:30:49]: So it's, so I like OpenAPI, right? So just a descriptive thing. It's OpenAPI. OpenAPI. Yes, that's what I meant. So it's basically a self-documenting thing. We can do machine-generated, lots of things from that output. It's a structured definition of an API. I get that, love it. But MCPs sort of are kind of use case specific. They're perfect for exactly what we're trying to use them for around LLMs in terms of discovery. It's like, okay, I don't necessarily need to know kind of all this detail. And so right now we have, we'll talk more about like MCP server implementations, but We will? I think, I don't know. Maybe we won't. At least it's in my head. It's like a back processor. But I do think MCP adds value above OpenAPI. It's, yeah, just because it solves this particular thing. And if we had come to the world, which we have, like, it's like, hey, we already have OpenAPI. It's like, if that were good enough for the universe, the universe would have adopted it already. There's a reason why MCP is taking office because marginally adds something that was missing before and doesn't go too far. And so that's why the kind of rate of adoption, you folks have written about this and talked about it. Yeah, why MCP won. Yeah. And it won because the universe decided that this was useful and maybe it gets supplanted by something else. Yeah. And maybe we discover, oh, maybe OpenAPI was good enough the whole time. I doubt that.swyx [00:32:09]: The meta lesson, this is, I mean, he's an investor in DevTools companies. I work in developer experience at DevRel in DevTools companies. Yep. Everyone wants to own the standard. Yeah. I'm sure you guys have tried to launch your own standards. Actually, it's Houseplant known for a standard, you know, obviously inbound marketing. 
But is there a standard or protocol that you ever tried to push? No.Dharmesh [00:32:30]: And there's a reason for this. Yeah. Is that? And I don't mean, need to mean, speak for the people of HubSpot, but I personally. You kind of do. I'm not smart enough. That's not the, like, I think I have a. You're smart. Not enough for that. I'm much better off understanding the standards that are out there. And I'm more on the composability side. Let's, like, take the pieces of technology that exist out there, combine them in creative, unique ways. And I like to consume standards. I don't like to, and that's not that I don't like to create them. I just don't think I have the, both the raw wattage or the credibility. It's like, okay, well, who the heck is Dharmesh, and why should we adopt a standard he created?swyx [00:33:07]: Yeah, I mean, there are people who don't monetize standards, like OpenTelemetry is a big standard, and LightStep never capitalized on that.Dharmesh [00:33:15]: So, okay, so if I were to do a standard, there's two things that have been in my head in the past. I was one around, a very, very basic one around, I don't even have the domain, I have a domain for everything, for open marketing. Because the issue we had in HubSpot grew up in the marketing space. There we go. There was no standard around data formats and things like that. It doesn't go anywhere. But the other one, and I did not mean to go here, but I'm going to go here. It's called OpenGraph. I know the term was already taken, but it hasn't been used for like 15 years now for its original purpose. But what I think should exist in the world is right now, our information, all of us, nodes are in the social graph at Meta or the professional graph at LinkedIn. Both of which are actually relatively closed in actually very annoying ways. Like very, very closed, right? Especially LinkedIn. Especially LinkedIn. I personally believe that if it's my data, and if I would get utility out of it being open, I should be able to make my data open or publish it in whatever forms that I choose, as long as I have control over it as opt-in. So the idea is around OpenGraph that says, here's a standard, here's a way to publish it. I should be able to go to OpenGraph.org slash Dharmesh dot JSON and get it back. And it's like, here's your stuff, right? And I can choose along the way and people can write to it and I can prove. And there can be an entire system. And if I were to do that, I would do it as a... Like a public benefit, non-profit-y kind of thing, as this is a contribution to society. I wouldn't try to commercialize that. Have you looked at AdProto? What's that? AdProto.swyx [00:34:43]: It's the protocol behind Blue Sky. Okay. My good friend, Dan Abramov, who was the face of React for many, many years, now works there. And he actually did a talk that I can send you, which basically kind of tries to articulate what you just said. But he does, he loves doing these like really great analogies, which I think you'll like. Like, you know, a lot of our data is behind a handle, behind a domain. Yep. So he's like, all right, what if we flip that? What if it was like our handle and then the domain? Yep. So, and that's really like your data should belong to you. Yep. And I should not have to wait 30 days for my Twitter data to export. Yep.Dharmesh [00:35:19]: you should be able to at least be able to automate it or do like, yes, I should be able to plug it into an agentic thing. Yeah. Yes. I think we're... Because so much of our data is... 
Locked up. I think the trick here isn't that standard. It is getting the normies to care.swyx [00:35:37]: Yeah. Because normies don't care.Dharmesh [00:35:38]: That's true. But building on that, normies don't care. So, you know, privacy is a really hot topic and an easy word to use, but it's not a binary thing. Like there are use cases where, and we make these choices all the time, that I will trade, not all privacy, but I will trade some privacy for some productivity gain or some benefit to me that says, oh, I don't care about that particular data being online if it gives me this in return, or I don't mind sharing this information with this company.Alessio [00:36:02]: If I'm getting, you know, this in return, but that sort of should be my option. I think now with computer use, you can actually automate some of the exports. Yes. Like something we've been doing internally is like everybody exports their LinkedIn connections. Yep. And then internally, we kind of merge them together to see how we can connect our companies to customers or things like that.Dharmesh [00:36:21]: And not to pick on LinkedIn, but since we're talking about it, but they feel strongly enough on the, you know, do not take LinkedIn data that they will block even browser use kind of things or whatever. They go to great, great lengths, even to see patterns of usage. And it says, oh, there's no way you could have, you know, gotten that particular thing or whatever without, and it's, so it's, there's...swyx [00:36:42]: Wasn't there a Supreme Court case that they lost? Yeah.Dharmesh [00:36:45]: So the one they lost was around someone that was scraping public data that was on the public internet. And that particular company had not signed any terms of service or whatever. It's like, oh, I'm just taking data that's on, there was no, and so that's why they won. But now, you know, the question is around, can LinkedIn... I think they can. Like, when you use, as a user, you use LinkedIn, you are signing up for their terms of service. And if they say, well, this kind of use of your LinkedIn account that violates our terms of service, they can shut your account down, right? They can. And they, yeah, so, you know, we don't need to make this a discussion. By the way, I love the company, don't get me wrong. I'm an avid user of the product. You know, I've got... Yeah, I mean, you've got over a million followers on LinkedIn, I think. Yeah, I do. And I've known people there for a long, long time, right? And I have lots of respect. And I understand even where the mindset originally came from of this kind of members-first approach to, you know, a privacy-first. I sort of get that. But sometimes you sort of have to wonder, it's like, okay, well, that was 15, 20 years ago. There's likely some controlled ways to expose some data on some member's behalf and not just completely be a binary. It's like, no, thou shalt not have the data.swyx [00:37:54]: Well, just pay for sales navigator.Alessio [00:37:57]: Before we move to the next layer of instruction, anything else on MCP you mentioned? Let's move back and then I'll tie it back to MCPs.Dharmesh [00:38:05]: So I think the... Open this with agent. Okay, so I'll start with... Here's my kind of running thesis, is that as AI and agents evolve, which they're doing very, very quickly, we're going to look at them more and more. I don't like to anthropomorphize. We'll talk about why this is not that. Less as just like raw tools and more like teammates. They'll still be software. 
They should self-disclose as being software. I'm totally cool with that. But I think what's going to happen is that in the same way you might collaborate with a team member on Slack or Teams or whatever you use, you can imagine a series of agents that do specific things just like a team member might do, that you can delegate things to. You can collaborate. You can say, hey, can you take a look at this? Can you proofread that? Can you try this? You can... Whatever it happens to be. So I think it is... I will go so far as to say it's inevitable that we're going to have hybrid teams someday. And what I mean by hybrid teams... So back in the day, hybrid teams were, oh, well, you have some full-time employees and some contractors. Then it was like hybrid teams are some people that are in the office and some that are remote. That's the kind of form of hybrid. The next form of hybrid is like the carbon-based life forms and agents and AI and some form of software. So let's say we temporarily stipulate that I'm right about that over some time horizon that eventually we're going to have these kind of digitally hybrid teams. So if that's true, then the question you sort of ask yourself is that then what needs to exist in order for us to get the full value of that new model? It's like, okay, well... You sort of need to... It's like, okay, well, how do I... If I'm building a digital team, like, how do I... Just in the same way, if I'm interviewing for an engineer or a designer or a PM, whatever, it's like, well, that's why we have professional networks, right? It's like, oh, they have a presence on likely LinkedIn. I can go through that semi-structured, structured form, and I can see the experience of whatever, you know, self-disclosed. But, okay, well, agents are going to need that someday. And so I'm like, okay, well, this seems like a thread that's worth pulling on. That says, okay. So I... So agent.ai is out there. And it's LinkedIn for agents. It's LinkedIn for agents. It's a professional network for agents. And the more I pull on that thread, it's like, okay, well, if that's true, like, what happens, right? It's like, oh, well, they have a profile just like anyone else, just like a human would. It's going to be a graph underneath, just like a professional network would be. It's just that... And you can have its, you know, connections and follows, and agents should be able to post. That's maybe how they do release notes. Like, oh, I have this new version. Whatever they decide to post, it should just be able to... Behave as a node on the network of a professional network. As it turns out, the more I think about that and pull on that thread, the more and more things, like, start to make sense to me. So it may be more than just a pure professional network. So my original thought was, okay, well, it's a professional network and agents as they exist out there, which I think there's going to be more and more of, will kind of exist on this network and have the profile. But then, and this is always dangerous, I'm like, okay, I want to see a world where thousands of agents are out there in order for the... Because those digital employees, the digital workers don't exist yet in any meaningful way. And so then I'm like, oh, can I make that easier for, like... And so I have, as one does, it's like, oh, I'll build a low-code platform for building agents. How hard could that be, right? Like, very hard, as it turns out. But it's been fun. So now, agent.ai has 1.3 million users. 
3,000 people have actually, you know, built some variation of an agent, sometimes just for their own personal productivity. About 1,000 of which have been published. And the reason this comes back to MCP for me, so imagine that and other networks, since I know agent.ai. So right now, we have an MCP server for agent.ai that exposes all the internally built agents that we have that do, like, super useful things. Like, you know, I have access to a Twitter API that I can subsidize the cost. And I can say, you know, if you're looking to build something for social media, these kinds of things, with a single API key, and it's all completely free right now, I'm funding it. That's a useful way for it to work. And then we have a developer to say, oh, I have this idea. I don't have to worry about open AI. I don't have to worry about, now, you know, this particular model is better. It has access to all the models with one key. And we proxy it kind of behind the scenes. And then expose it. So then we get this kind of community effect, right? That says, oh, well, someone else may have built an agent to do X. Like, I have an agent right now that I built for myself to do domain valuation for website domains because I'm obsessed with domains, right? And, like, there's no efficient market for domains. There's no Zillow for domains right now that tells you, oh, here are what houses in your neighborhood sold for. It's like, well, why doesn't that exist? We should be able to solve that problem. And, yes, you're still guessing. Fine. There should be some simple heuristic. So I built that. It's like, okay, well, let me go look for past transactions. You say, okay, I'm going to type in agent.ai, agent.com, whatever domain. What's it actually worth? I'm looking at buying it. It can go and say, oh, which is what it does. It's like, I'm going to go look at are there any published domain transactions recently that are similar, either use the same word, same top-level domain, whatever it is. And it comes back with an approximate value, and it comes back with its kind of rationale for why it picked the value and comparable transactions. Oh, by the way, this domain sold for published. Okay. So that agent now, let's say, existed on the web, on agent.ai. Then imagine someone else says, oh, you know, I want to build a brand-building agent for startups and entrepreneurs to come up with names for their startup. Like a common problem, every startup is like, ah, I don't know what to call it. And so they type in five random words that kind of define whatever their startup is. And you can do all manner of things, one of which is like, oh, well, I need to find the domain for it. What are possible choices? Now it's like, okay, well, it would be nice to know if there's an aftermarket price for it, if it's listed for sale. Awesome. Then imagine calling this valuation agent. It's like, okay, well, I want to find where the arbitrage is, where the agent valuation tool says this thing is worth $25,000. It's listed on GoDaddy for $5,000. It's close enough. Let's go do that. Right? And that's a kind of composition use case that in my future state. Thousands of agents on the network, all discoverable through something like MCP. And then you as a developer of agents have access to all these kind of Lego building blocks based on what you're trying to solve. Then you blend in orchestration, which is getting better and better with the reasoning models now. Just describe the problem that you have. 
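The composition Dharmesh walks through above (a domain-valuation agent that a second, brand-naming agent discovers and calls as a building block) can be sketched in a few lines. This is a minimal, self-contained sketch assuming a simple in-process registry; the names, numbers, and return shapes are illustrative assumptions, not agent.ai's or MCP's actual interfaces.

```python
# Hypothetical sketch of agent composition: one agent (brand naming) discovers
# and calls another (domain valuation) as a tool from a shared registry.
from dataclasses import dataclass
from typing import Callable

@dataclass
class AgentTool:
    name: str
    description: str
    call: Callable[[str], dict]

def value_domain(domain: str) -> dict:
    # Stand-in for the valuation agent: look at comparable sales, return a rough estimate.
    comparables = [("agentive.ai", 18_000), ("agentsmith.com", 31_000)]
    estimate = sum(price for _, price in comparables) / len(comparables)
    return {"domain": domain, "estimate": estimate, "comparables": comparables}

REGISTRY = [
    AgentTool("domain_valuation", "Estimate what a website domain is worth", value_domain),
]

def naming_agent(candidate_domains: list[str], asking_price: float) -> list[dict]:
    # The second agent finds the valuation agent in the registry and calls it,
    # keeping only domains whose estimated value exceeds the asking price (the "arbitrage").
    valuer = next(t for t in REGISTRY if t.name == "domain_valuation")
    valuations = [valuer.call(d) for d in candidate_domains]
    return [v for v in valuations if v["estimate"] >= asking_price]

print(naming_agent(["agent.ai", "agenthub.io"], asking_price=5_000))
```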
Now, the next layer that we're all contending with is that how many tools can you actually give an LLM before the LLM breaks? That number used to be like 15 or 20 before you kind of started to vary dramatically. And so that's the thing I'm thinking about now. It's like, okay, if I want to... If I want to expose 1,000 of these agents to a given LLM, obviously I can't give it all 1,000. Is there some intermediate layer that says, based on your prompt, I'm going to make a best guess at which agents might be able to be helpful for this particular thing? Yeah.Alessio [00:44:37]: Yeah, like RAG for tools. Yep. I did build the Latent Space Researcher on agent.ai. Okay. Nice. Yeah, that seems like, you know, then there's going to be a Latent Space Scheduler. And then once I schedule a research, you know, and you build all of these things. By the way, my apologies for the user experience. You realize I'm an engineer. It's pretty good.swyx [00:44:56]: I think it's a normie-friendly thing. Yeah. That's your magic. HubSpot does the same thing.Alessio [00:45:01]: Yeah, just to like quickly run through it. You can basically create all these different steps. And these steps are like, you know, static versus like variable-driven things. How did you decide between this kind of like low-code-ish versus doing, you know, low-code with code backend versus like not exposing that at all? Any fun design decisions? Yeah. And this is, I think...Dharmesh [00:45:22]: I think lots of people are likely sitting in exactly my position right now, coming through the choosing between deterministic. Like if you're like in a business or building, you know, some sort of agentic thing, do you decide to do a deterministic thing? Or do you go non-deterministic and just let the alum handle it, right, with the reasoning models? The original idea and the reason I took the low-code stepwise, a very deterministic approach. A, the reasoning models did not exist at that time. That's thing number one. Thing number two is if you can get... If you know in your head... If you know in your head what the actual steps are to accomplish whatever goal, why would you leave that to chance? There's no upside. There's literally no upside. Just tell me, like, what steps do you need executed? So right now what I'm playing with... So one thing we haven't talked about yet, and people don't talk about UI and agents. Right now, the primary interaction model... Or they don't talk enough about it. I know some people have. But it's like, okay, so we're used to the chatbot back and forth. Fine. I get that. But I think we're going to move to a blend of... Some of those things are going to be synchronous as they are now. But some are going to be... Some are going to be async. It's just going to put it in a queue, just like... And this goes back to my... Man, I talk fast. But I have this... I only have one other speed. It's even faster. So imagine it's like if you're working... So back to my, oh, we're going to have these hybrid digital teams. Like, you would not go to a co-worker and say, I'm going to ask you to do this thing, and then sit there and wait for them to go do it. Like, that's not how the world works. So it's nice to be able to just, like, hand something off to someone. It's like, okay, well, maybe I expect a response in an hour or a day or something like that.Dharmesh [00:46:52]: In terms of when things need to happen. So the UI around agents. 
So if you look at the output of agent.ai agents right now, they are the simplest possible manifestation of a UI, right? That says, oh, we have inputs of, like, four different types. Like, we've got a dropdown, we've got multi-select, all the things. It's like back in HTML, the original HTML 1.0 days, right? Like, you're the smallest possible set of primitives for a UI. And it just says, okay, because we need to collect some information from the user, and then we go do steps and do things. And generate some output in HTML or markup are the two primary examples. So the thing I've been asking myself, if I keep going down that path. So people ask me, I get requests all the time. It's like, oh, can you make the UI sort of boring? I need to be able to do this, right? And if I keep pulling on that, it's like, okay, well, now I've built an entire UI builder thing. Where does this end? And so I think the right answer, and this is what I'm going to be backcoding once I get done here, is around injecting a code generation UI generation into, the agent.ai flow, right? As a builder, you're like, okay, I'm going to describe the thing that I want, much like you would do in a vibe coding world. But instead of generating the entire app, it's going to generate the UI that exists at some point in either that deterministic flow or something like that. It says, oh, here's the thing I'm trying to do. Go generate the UI for me. And I can go through some iterations. And what I think of it as a, so it's like, I'm going to generate the code, generate the code, tweak it, go through this kind of prompt style, like we do with vibe coding now. And at some point, I'm going to be happy with it. And I'm going to hit save. And that's going to become the action in that particular step. It's like a caching of the generated code that I can then, like incur any inference time costs. It's just the actual code at that point.Alessio [00:48:29]: Yeah, I invested in a company called E2B, which does code sandbox. And they powered the LM arena web arena. So it's basically the, just like you do LMS, like text to text, they do the same for like UI generation. So if you're asking a model, how do you do it? But yeah, I think that's kind of where.Dharmesh [00:48:45]: That's the thing I'm really fascinated by. So the early LLM, you know, we're understandably, but laughably bad at simple arithmetic, right? That's the thing like my wife, Normies would ask us, like, you call this AI, like it can't, my son would be like, it's just stupid. It can't even do like simple arithmetic. And then like we've discovered over time that, and there's a reason for this, right? It's like, it's a large, there's, you know, the word language is in there for a reason in terms of what it's been trained on. It's not meant to do math, but now it's like, okay, well, the fact that it has access to a Python interpreter that I can actually call at runtime, that solves an entire body of problems that it wasn't trained to do. And it's basically a form of delegation. And so the thought that's kind of rattling around in my head is that that's great. So it's, it's like took the arithmetic problem and took it first. Now, like anything that's solvable through a relatively concrete Python program, it's able to do a bunch of things that I couldn't do before. Can we get to the same place with UI? I don't know what the future of UI looks like in a agentic AI world, but maybe let the LLM handle it, but not in the classic sense. 
Maybe it generates it on the fly, or maybe we go through some iterations and hit cache or something like that. So it's a little bit more predictable. Uh, I don't know, but yeah.Alessio [00:49:48]: And especially when is the human supposed to intervene? So, especially if you're composing them, most of them should not have a UI because then they're just web hooking to somewhere else. I just want to touch back. I don't know if you have more comments on this.swyx [00:50:01]: I was just going to ask when you, you said you got, you're going to go back to code. What
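Stepping back from the transcript for a moment: the "intermediate layer" Dharmesh raises for narrowing a thousand agents down to a handful of relevant tools (what Alessio calls "RAG for tools") might look roughly like the sketch below. The registry, scoring function, and top-k cutoff are all illustrative assumptions rather than agent.ai's implementation; a real system would likely use embedding similarity instead of keyword overlap.

```python
# Hedged sketch of a tool-selection layer ("RAG for tools"): score every
# registered tool against the user prompt and expose only the top-k matches
# to the LLM. Keyword overlap stands in for an embedding search.
from dataclasses import dataclass

@dataclass
class Tool:
    name: str
    description: str

TOOLS = [
    Tool("domain_valuation", "estimate the market value of a website domain"),
    Tool("brand_namer", "suggest startup names and available domains"),
    Tool("tweet_writer", "draft social media posts for twitter"),
]

def score(prompt: str, tool: Tool) -> int:
    prompt_words = set(prompt.lower().split())
    return len(prompt_words & set(tool.description.lower().split()))

def select_tools(prompt: str, top_k: int = 2) -> list[Tool]:
    ranked = sorted(TOOLS, key=lambda t: score(prompt, t), reverse=True)
    return ranked[:top_k]

# Only the selected subset would be passed to the model as callable tools.
print([t.name for t in select_tools("what is this domain worth")])
```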

The Laundromat Millionaire Show with Dave Menz
Raising the Bar for Laundry Services with John Buni & David Griffith-Jones of CleanCloud

The Laundromat Millionaire Show with Dave Menz

Play Episode Listen Later Mar 26, 2025 68:48


What does it take to be successful in the laundry industry? More than just a great facility and a high-quality product, you must have the web presence for customers to see those offerings! At CleanCloud, co-founders John Buni & David Griffith-Jones work hard to create a software system that can help you get there. In this episode of The Laundromat Millionaire Show with Dave & Carla Menz, learn about the services CleanCloud can offer your laundry business and what we all see for the future of our industry.

Referenced Links:
Our Sponsor: H-M Company Drain Troughs: https://www.draintroughs.com
Eastern Funding: https://www.easternfunding.com/
Our Guests: CleanCloud: https://cleancloudapp.com/
Our Website: https://www.laundromatmillionaire.com
Our Online Course: https://dave-menz.mykajabi.com/sales-page
Our Youtube channel: https://youtube.com/c/LaundromatMillionaire
Our Podcast: https://laundromatmillionaire.com/podcast/
Our Facebook: https://www.facebook.com/laundromatmillionaire/
Our Facebook Group: https://www.facebook.com/groups/laundromatmillionaire
Our LinkedIn: https://www.linkedin.com/in/dave-laundromat-millionaire-menz/
Our Instagram: https://www.instagram.com/laundromatmillionaire/
Our laundromats: https://www.queencitylaundry.com
Our pick-up and delivery laundry services: https://www.queencitylaundry.com/delivery
Our WDF & Delivery Workshop: https://laundromatmillionaire.com/pick-up-delivery-workshop/
LaundroBoost Marketing Company: https://laundroboostmarketing.com/
Suggested Services Page: https://www.laundromatmillionaire.com/services
WDF & Delivery Dynamics: A Complete Business Blueprint: https://laundromatmillionaire.com/wdf-delivery-dynamics-a-business-blueprint/
Clean Show Registration: https://the-clean-show.us.messefrankfurt.com/us/en.html
Seinfeld clip: https://www.youtube.com/watch?v=KYyzm5AlovA
Jetsons clip: https://www.youtube.com/watch?v=1pphyvgd7-k
Folding Machine clip: https://www.youtube.com/watch?v=C76osXtpLeM
Open Source graphic: https://hackr.io/blog/what-is-open-source-software

Timestamps
0:00:00 Episode 95 Intro
0:03:07 Backstory of CleanCloud
0:08:07 Evolving from POS to Delivery too
0:09:58 Services CleanCloud Provides
0:13:31 Software Development & Attainable Marketing Strategies
0:21:33 Support Provided to CleanCloud Customers
0:24:57 Website Creation
0:27:25 Pricing Structure of CleanCloud
0:29:04 Biggest Differentiator – International
0:33:58 Training Opportunities on the Software
0:38:56 Ability to Integrate with Self Service Machines
0:43:12 Thoughts on Open Source Software
0:50:28 Advice for Store Owners
1:02:32 The Future of the Industry
1:06:01 Closing Remarks & Contact Info

PodRocket - A web development podcast from LogRocket
LLMs for web developers with Roy Derks

PodRocket - A web development podcast from LogRocket

Play Episode Listen Later Mar 6, 2025 28:45


Roy Derks, Developer Experience at IBM, talks about the integration of Large Language Models (LLMs) in web development. We explore practical applications such as building agents, automating QA testing, and the evolving role of AI frameworks in software development.

Links:
https://www.linkedin.com/in/gethackteam
https://www.youtube.com/@gethackteam
https://x.com/gethackteam
https://hackteam.io

We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Emily, at emily.kochanekketner@logrocket.com (mailto:emily.kochanekketner@logrocket.com), or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod).

Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers!

What does LogRocket do? LogRocket provides AI-first session replay and analytics that surface the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at LogRocket.com. Try LogRocket for free today (https://logrocket.com/signup/?pdr).

Special Guest: Roy Derks.

Patoarchitekci
Short #65: Java vs Python AI Battle, AKS Network Tools, Database Future, OpenAPI Changes

Patoarchitekci

Play Episode Listen Later Feb 21, 2025 27:23


In the latest Short #65, we put Java and Python in the ring for an AI battle! Our PatoAI bot on Discord just got an upgrade and can now boast 76 episodes in its vector memory. Azure serves up new network tools for App Service, and DoorDash works out how to connect clusters cross-region. Meanwhile, PostgreSQL is quietly taking the database throne, and GitHub Copilot gets a ninja agent mode. Want to know whether Java actually stands a chance of kicking Python out of AI? Or curious what Alphabet will spend 75 billion dollars on? Play this episode and find out whether DeepSeek can withstand our queries! And now, no slacking off!

AWS Bites
139. Building Great APIs with Powertools

AWS Bites

Play Episode Listen Later Feb 19, 2025 24:32


In this episode, we discuss using AWS Lambda Powertools for Python to build serverless REST APIs with AWS Lambda. We cover the benefits of using Powertools for routing, validation, OpenAPI support, and more. Powertools provides an excellent framework for building APIs while maintaining Lambda best practices.

In this episode, we mentioned the following resources:
AWS Bites 41. How can Middy make writing Lambda functions easier? - https://awsbites.com/41-how-can-middy-make-writing-lambda-functions-easier
AWS Bites 120. Lambda Best Practices - https://awsbites.com/120-lambda-best-practices/
REST API - Powertools for AWS Lambda (Python) - https://docs.powertools.aws.dev/lambda/python/latest/core/event_handler/api_gateway/
Hono - https://hono.dev/
Fastify - https://fastify.dev/
Axum - https://github.com/tokio-rs/axum
FastAPI - https://fastapi.tiangolo.com/

Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on BlueSky or LinkedIn:
https://bsky.app/profile/eoin.sh | https://www.linkedin.com/in/eoins/
https://bsky.app/profile/loige.co | https://www.linkedin.com/in/lucianomammino/
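For a flavor of the event handler style discussed in the episode, here is a minimal sketch using Powertools' APIGatewayRestResolver; the route and payload are made up, and validation and OpenAPI generation are opt-in extras layered on top of the same resolver.

```python
# Minimal sketch of a Powertools-routed Lambda REST API (illustrative route/data).
from aws_lambda_powertools.event_handler import APIGatewayRestResolver
from aws_lambda_powertools.utilities.typing import LambdaContext

app = APIGatewayRestResolver()

@app.get("/todos/<todo_id>")
def get_todo(todo_id: str) -> dict:
    # Path parameters arrive as strings; the resolver handles routing and serialization.
    return {"id": todo_id, "title": "example todo"}

def lambda_handler(event: dict, context: LambdaContext) -> dict:
    # Single Lambda entry point: delegate to the resolver to dispatch the route.
    return app.resolve(event, context)
```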

Semaphore Uncut
Lorna Mitchell on OpenAPI in Design-First Development

Semaphore Uncut

Play Episode Listen Later Feb 18, 2025 25:20


A cornerstone of API development, OpenAPI offers a standardized format to define, design, and document APIs. Born as an open-source project and embraced by tech giants like Google, Microsoft, and IBM, OpenAPI isn't just a specification—it's a shift toward interoperability, transparency, and developer empowerment. In this conversation, Lorna Mitchell, a leading voice in API tooling and VP of Developer Experience at Redocly, sheds light on best practices, common pitfalls, and how teams can fully harness OpenAPI's potential.

Listen to the full episode or read the transcript on the Semaphore blog.

Like this episode? Be sure to leave a ⭐️⭐️⭐️⭐️⭐️ review on the podcast player of your choice and share it with your friends.
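As a concrete illustration of the design-first flow, a team might write a small OpenAPI document by hand before any code exists and then load it to drive docs, mocks, or code generation. The API below is hypothetical and not from the episode; the sketch assumes PyYAML is available.

```python
# A minimal, hand-written OpenAPI 3.0 document (design-first), loaded with PyYAML.
import yaml

SPEC = """
openapi: 3.0.3
info:
  title: Orders API
  version: 1.0.0
paths:
  /orders/{orderId}:
    get:
      summary: Fetch a single order
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: The requested order
"""

spec = yaml.safe_load(SPEC)
# Downstream tooling (docs, mock servers, codegen) can all work from this one source of truth.
print(list(spec["paths"]))       # ['/orders/{orderId}']
print(spec["info"]["title"])     # Orders API
```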

Buongiorno da Edo
Europe puts 200 billion on the table for AI - Buongiorno 243

Buongiorno da Edo

Play Episode Listen Later Feb 13, 2025 9:19


How nice, today we also get to talk about some technical things, with OpenAPI; who knows if you'll like it.

Torsion Talk Podcast
Torsion Talk S8 Ep96: Finding the Best Field Service Management Software – The Journey Begins

Torsion Talk Podcast

Play Episode Listen Later Feb 11, 2025 30:52


In this episode of Torsion Talk, Ryan kicks off a new series dedicated to finding the best field service management software for garage door businesses and home service companies. If you've ever felt frustrated with your CRM, overwhelmed by rising software costs, or stuck with tools that don't fully meet your needs—this series is for you.

Ryan shares his first-hand experiences with platforms like ServiceTitan, Jobber, ServiceFusion, Housecall Pro, and others. He breaks down the must-haves for the ideal FSM software, the common pain points business owners face, and why so many companies feel stuck with overpriced, underperforming systems.

What You'll Learn in This Episode:
✅ Why most FSM software companies miss the mark—and what they should be focusing on.
✅ The biggest challenges in choosing the right software for your business.
✅ Why ServiceTitan left Ryan frustrated—and what he's looking for in a replacement.
✅ The key features every garage door business needs from an FSM platform.
✅ How AI and automation will reshape the future of field service management.

Key Discussion Points:

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Did you know that adding a simple Code Interpreter took o3 from 9.2% to 32% on FrontierMath? The Latent Space crew is hosting a hack night Feb 11th in San Francisco focused on CodeGen use cases, co-hosted with E2B and Edge AGI; watch E2B's new workshop and RSVP here!We're happy to announce that today's guest Samuel Colvin will be teaching his very first Pydantic AI workshop at the newly announced AI Engineer NYC Workshops day on Feb 22! 25 tickets left.If you're a Python developer, it's very likely that you've heard of Pydantic. Every month, it's downloaded >300,000,000 times, making it one of the top 25 PyPi packages. OpenAI uses it in its SDK for structured outputs, it's at the core of FastAPI, and if you've followed our AI Engineer Summit conference, Jason Liu of Instructor has given two great talks about it: “Pydantic is all you need” and “Pydantic is STILL all you need”. Now, Samuel Colvin has raised $17M from Sequoia to turn Pydantic from an open source project to a full stack AI engineer platform with Logfire, their observability platform, and PydanticAI, their new agent framework.Logfire: bringing OTEL to AIOpenTelemetry recently merged Semantic Conventions for LLM workloads which provides standard definitions to track performance like gen_ai.server.time_per_output_token. In Sam's view at least 80% of new apps being built today have some sort of LLM usage in them, and just like web observability platform got replaced by cloud-first ones in the 2010s, Logfire wants to do the same for AI-first apps. If you're interested in the technical details, Logfire migrated away from Clickhouse to Datafusion for their backend. We spent some time on the importance of picking open source tools you understand and that you can actually contribute to upstream, rather than the more popular ones; listen in ~43:19 for that part.Agents are the killer app for graphsPydantic AI is their attempt at taking a lot of the learnings that LangChain and the other early LLM frameworks had, and putting Python best practices into it. At an API level, it's very similar to the other libraries: you can call LLMs, create agents, do function calling, do evals, etc.They define an “Agent” as a container with a system prompt, tools, structured result, and an LLM. Under the hood, each Agent is now a graph of function calls that can orchestrate multi-step LLM interactions. You can start simple, then move toward fully dynamic graph-based control flow if needed.“We were compelled enough by graphs once we got them right that our agent implementation [...] 
is now actually a graph under the hood.”Why Graphs?* More natural for complex or multi-step AI workflows.* Easy to visualize and debug with mermaid diagrams.* Potential for distributed runs, or “waiting days” between steps in certain flows.In parallel, you see folks like Emil Eifrem of Neo4j talk about GraphRAG as another place where graphs fit really well in the AI stack, so it might be time for more people to take them seriously.Full Video EpisodeLike and subscribe!Chapters* 00:00:00 Introductions* 00:00:24 Origins of Pydantic* 00:05:28 Pydantic's AI moment * 00:08:05 Why build a new agents framework?* 00:10:17 Overview of Pydantic AI* 00:12:33 Becoming a believer in graphs* 00:24:02 God Model vs Compound AI Systems* 00:28:13 Why not build an LLM gateway?* 00:31:39 Programmatic testing vs live evals* 00:35:51 Using OpenTelemetry for AI traces* 00:43:19 Why they don't use Clickhouse* 00:48:34 Competing in the observability space* 00:50:41 Licensing decisions for Pydantic and LogFire* 00:51:48 Building Pydantic.run* 00:55:24 Marimo and the future of Jupyter notebooks* 00:57:44 London's AI sceneShow Notes* Sam Colvin* Pydantic* Pydantic AI* Logfire* Pydantic.run* Zod* E2B* Arize* Langsmith* Marimo* Prefect* GLA (Google Generative Language API)* OpenTelemetry* Jason Liu* Sebastian Ramirez* Bogomil Balkansky* Hood Chatham* Jeremy Howard* Andrew LambTranscriptAlessio [00:00:03]: Hey, everyone. Welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.Swyx [00:00:12]: Good morning. And today we're very excited to have Sam Colvin join us from Pydantic AI. Welcome. Sam, I heard that Pydantic is all we need. Is that true?Samuel [00:00:24]: I would say you might need Pydantic AI and Logfire as well, but it gets you a long way, that's for sure.Swyx [00:00:29]: Pydantic almost basically needs no introduction. It's almost 300 million downloads in December. And obviously, in the previous podcasts and discussions we've had with Jason Liu, he's been a big fan and promoter of Pydantic and AI.Samuel [00:00:45]: Yeah, it's weird because obviously I didn't create Pydantic originally for uses in AI, it predates LLMs. But it's like we've been lucky that it's been picked up by that community and used so widely.Swyx [00:00:58]: Actually, maybe we'll hear it. Right from you, what is Pydantic and maybe a little bit of the origin story?Samuel [00:01:04]: The best name for it, which is not quite right, is a validation library. And we get some tension around that name because it doesn't just do validation, it will do coercion by default. We now have strict mode, so you can disable that coercion. But by default, if you say you want an integer field and you get in a string of 1, 2, 3, it will convert it to 123 and a bunch of other sensible conversions. And as you can imagine, the semantics around it. Exactly when you convert and when you don't, it's complicated, but because of that, it's more than just validation. Back in 2017, when I first started it, the different thing it was doing was using type hints to define your schema. That was controversial at the time. It was genuinely disapproved of by some people. I think the success of Pydantic and libraries like FastAPI that build on top of it means that today that's no longer controversial in Python. And indeed, lots of other people have copied that route, but yeah, it's a data validation library. 
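The lax coercion and schema generation Samuel describes here fit in a few lines. A minimal sketch, assuming Pydantic v2 method names:

```python
from pydantic import BaseModel

class Item(BaseModel):
    quantity: int  # the type hint is the schema

print(Item(quantity="123").quantity)  # lax mode coerces the string "123" to the int 123
print(Item.model_json_schema())       # the same model emits a JSON Schema document
```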
It uses type hints for the for the most part and obviously does all the other stuff you want, like serialization on top of that. But yeah, that's the core.Alessio [00:02:06]: Do you have any fun stories on how JSON schemas ended up being kind of like the structure output standard for LLMs? And were you involved in any of these discussions? Because I know OpenAI was, you know, one of the early adopters. So did they reach out to you? Was there kind of like a structure output console in open source that people were talking about or was it just a random?Samuel [00:02:26]: No, very much not. So I originally. Didn't implement JSON schema inside Pydantic and then Sebastian, Sebastian Ramirez, FastAPI came along and like the first I ever heard of him was over a weekend. I got like 50 emails from him or 50 like emails as he was committing to Pydantic, adding JSON schema long pre version one. So the reason it was added was for OpenAPI, which is obviously closely akin to JSON schema. And then, yeah, I don't know why it was JSON that got picked up and used by OpenAI. It was obviously very convenient for us. That's because it meant that not only can you do the validation, but because Pydantic will generate you the JSON schema, it will it kind of can be one source of source of truth for structured outputs and tools.Swyx [00:03:09]: Before we dive in further on the on the AI side of things, something I'm mildly curious about, obviously, there's Zod in JavaScript land. Every now and then there is a new sort of in vogue validation library that that takes over for quite a few years and then maybe like some something else comes along. Is Pydantic? Is it done like the core Pydantic?Samuel [00:03:30]: I've just come off a call where we were redesigning some of the internal bits. There will be a v3 at some point, which will not break people's code half as much as v2 as in v2 was the was the massive rewrite into Rust, but also fixing all the stuff that was broken back from like version zero point something that we didn't fix in v1 because it was a side project. We have plans to move some of the basically store the data in Rust types after validation. Not completely. So we're still working to design the Pythonic version of it, in order for it to be able to convert into Python types. So then if you were doing like validation and then serialization, you would never have to go via a Python type we reckon that can give us somewhere between three and five times another three to five times speed up. That's probably the biggest thing. Also, like changing how easy it is to basically extend Pydantic and define how particular types, like for example, NumPy arrays are validated and serialized. But there's also stuff going on. And for example, Jitter, the JSON library in Rust that does the JSON parsing, has SIMD implementation at the moment only for AMD64. So we can add that. We need to go and add SIMD for other instruction sets. So there's a bunch more we can do on performance. I don't think we're going to go and revolutionize Pydantic, but it's going to continue to get faster, continue, hopefully, to allow people to do more advanced things. We might add a binary format like CBOR for serialization for when you'll just want to put the data into a database and probably load it again from Pydantic. So there are some things that will come along, but for the most part, it should just get faster and cleaner.Alessio [00:05:04]: From a focus perspective, I guess, as a founder too, how did you think about the AI interest rising? 
And then how do you kind of prioritize, okay, this is worth going into more, and we'll talk about Pydantic AI and all of that. What was maybe your early experience with LLAMP, and when did you figure out, okay, this is something we should take seriously and focus more resources on it?Samuel [00:05:28]: I'll answer that, but I'll answer what I think is a kind of parallel question, which is Pydantic's weird, because Pydantic existed, obviously, before I was starting a company. I was working on it in my spare time, and then beginning of 22, I started working on the rewrite in Rust. And I worked on it full-time for a year and a half, and then once we started the company, people came and joined. And it was a weird project, because that would never go away. You can't get signed off inside a startup. Like, we're going to go off and three engineers are going to work full-on for a year in Python and Rust, writing like 30,000 lines of Rust just to release open-source-free Python library. The result of that has been excellent for us as a company, right? As in, it's made us remain entirely relevant. And it's like, Pydantic is not just used in the SDKs of all of the AI libraries, but I can't say which one, but one of the big foundational model companies, when they upgraded from Pydantic v1 to v2, their number one internal model... The metric of performance is time to first token. That went down by 20%. So you think about all of the actual AI going on inside, and yet at least 20% of the CPU, or at least the latency inside requests was actually Pydantic, which shows like how widely it's used. So we've benefited from doing that work, although it didn't, it would have never have made financial sense in most companies. In answer to your question about like, how do we prioritize AI, I mean, the honest truth is we've spent a lot of the last year and a half building. Good general purpose observability inside LogFire and making Pydantic good for general purpose use cases. And the AI has kind of come to us. Like we just, not that we want to get away from it, but like the appetite, uh, both in Pydantic and in LogFire to go and build with AI is enormous because it kind of makes sense, right? Like if you're starting a new greenfield project in Python today, what's the chance that you're using GenAI 80%, let's say, globally, obviously it's like a hundred percent in California, but even worldwide, it's probably 80%. Yeah. And so everyone needs that stuff. And there's so much yet to be figured out so much like space to do things better in the ecosystem in a way that like to go and implement a database that's better than Postgres is a like Sisyphean task. Whereas building, uh, tools that are better for GenAI than some of the stuff that's about now is not very difficult. Putting the actual models themselves to one side.Alessio [00:07:40]: And then at the same time, then you released Pydantic AI recently, which is, uh, um, you know, agent framework and early on, I would say everybody like, you know, Langchain and like, uh, Pydantic kind of like a first class support, a lot of these frameworks, we're trying to use you to be better. What was the decision behind we should do our own framework? Were there any design decisions that you disagree with any workloads that you think people didn't support? Well,Samuel [00:08:05]: it wasn't so much like design and workflow, although I think there were some, some things we've done differently. Yeah. 
I think looking in general at the ecosystem of agent frameworks, the engineering quality is far below that of the rest of the Python ecosystem. There's a bunch of stuff that we have learned how to do over the last 20 years of building Python libraries and writing Python code that seems to be abandoned by people when they build agent frameworks. Now I can kind of respect that, particularly in the very first agent frameworks, like Langchain, where they were literally figuring out how to go and do this stuff. It's completely understandable that you would like basically skip some stuff.Samuel [00:08:42]: I'm shocked by the like quality of some of the agent frameworks that have come out recently from like well-respected names, which it just seems to be opportunism and I have little time for that, but like the early ones, like I think they were just figuring out how to do stuff and just as lots of people have learned from Pydantic, we were able to learn a bit from them. I think from like the gap we saw and the thing we were frustrated by was the production readiness. And that means things like type checking, even if type checking makes it hard. Like Pydantic AI, I will put my hand up now and say it has a lot of generics and you need to, it's probably easier to use it if you've written a bit of Rust and you really understand generics, but like, and that is, we're not claiming that that makes it the easiest thing to use in all cases, we think it makes it good for production applications in big systems where type checking is a no-brainer in Python. But there are also a bunch of stuff we've learned from maintaining Pydantic over the years that we've gone and done. So every single example in Pydantic AI's documentation is run on Python. As part of tests and every single print output within an example is checked during tests. So it will always be up to date. And then a bunch of things that, like I say, are standard best practice within the rest of the Python ecosystem, but I'm not followed surprisingly by some AI libraries like coverage, linting, type checking, et cetera, et cetera, where I think these are no-brainers, but like weirdly they're not followed by some of the other libraries.Alessio [00:10:04]: And can you just give an overview of the framework itself? I think there's kind of like the. LLM calling frameworks, there are the multi-agent frameworks, there's the workflow frameworks, like what does Pydantic AI do?Samuel [00:10:17]: I glaze over a bit when I hear all of the different sorts of frameworks, but I like, and I will tell you when I built Pydantic, when I built Logfire and when I built Pydantic AI, my methodology is not to go and like research and review all of the other things. I kind of work out what I want and I go and build it and then feedback comes and we adjust. So the fundamental building block of Pydantic AI is agents. The exact definition of agents and how you want to define them. is obviously ambiguous and our things are probably sort of agent-lit, not that we would want to go and rename them to agent-lit, but like the point is you probably build them together to build something and most people will call an agent. So an agent in our case has, you know, things like a prompt, like system prompt and some tools and a structured return type if you want it, that covers the vast majority of cases. There are situations where you want to go further and the most complex workflows where you want graphs and I resisted graphs for quite a while. 
I was sort of of the opinion you didn't need them and you could use standard like Python flow control to do all of that stuff. I had a few arguments with people, but I basically came around to, yeah, I can totally see why graphs are useful. But then we have the problem that by default, they're not type safe because if you have a like add edge method where you give the names of two different edges, there's no type checking, right? Even if you go and do some, I'm not, not all the graph libraries are AI specific. So there's a, there's a graph library called, but it allows, it does like a basic runtime type checking. Ironically using Pydantic to try and make up for the fact that like fundamentally that graphs are not typed type safe. Well, I like Pydantic, but it did, that's not a real solution to have to go and run the code to see if it's safe. There's a reason that starting type checking is so powerful. And so we kind of, from a lot of iteration eventually came up with a system of using normally data classes to define nodes where you return the next node you want to call and where we're able to go and introspect the return type of a node to basically build the graph. And so the graph is. Yeah. Inherently type safe. And once we got that right, I, I wasn't, I'm incredibly excited about graphs. I think there's like masses of use cases for them, both in gen AI and other development, but also software's all going to have interact with gen AI, right? It's going to be like web. There's no longer be like a web department in a company is that there's just like all the developers are building for web building with databases. The same is going to be true for gen AI.Alessio [00:12:33]: Yeah. I see on your docs, you call an agent, a container that contains a system prompt function. Tools, structure, result, dependency type model, and then model settings. Are the graphs in your mind, different agents? Are they different prompts for the same agent? What are like the structures in your mind?Samuel [00:12:52]: So we were compelled enough by graphs once we got them right, that we actually merged the PR this morning. That means our agent implementation without changing its API at all is now actually a graph under the hood as it is built using our graph library. So graphs are basically a lower level tool that allow you to build these complex workflows. Our agents are technically one of the many graphs you could go and build. And we just happened to build that one for you because it's a very common, commonplace one. But obviously there are cases where you need more complex workflows where the current agent assumptions don't work. And that's where you can then go and use graphs to build more complex things.Swyx [00:13:29]: You said you were cynical about graphs. What changed your mind specifically?Samuel [00:13:33]: I guess people kept giving me examples of things that they wanted to use graphs for. And my like, yeah, but you could do that in standard flow control in Python became a like less and less compelling argument to me because I've maintained those systems that end up with like spaghetti code. And I could see the appeal of this like structured way of defining the workflow of my code. And it's really neat that like just from your code, just from your type hints, you can get out a mermaid diagram that defines exactly what can go and happen.Swyx [00:14:00]: Right. Yeah. You do have very neat implementation of sort of inferring the graph from type hints, I guess. Yeah. Is what I would call it. Yeah. 
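A self-contained sketch of the pattern being described, not the actual Pydantic AI graph API: nodes are dataclasses, each run() returns the next node, and the edges are recovered by introspecting the annotated return types. All class and field names here are invented for illustration.

```python
# Sketch: infer a graph's edges from return type hints, then run it by
# repeatedly calling a node and following the node it returns.
from __future__ import annotations
import typing
from dataclasses import dataclass

@dataclass
class End:
    value: str

@dataclass
class Draft:
    topic: str
    def run(self) -> Review:
        return Review(text=f"draft about {self.topic}")

@dataclass
class Review:
    text: str
    def run(self) -> End:
        return End(value=self.text.upper())

def edges(*nodes: type) -> list[tuple[str, str]]:
    # Read the annotated return type of each node's run() to infer the edges.
    found = []
    for node in nodes:
        hints = typing.get_type_hints(node.run)
        found.append((node.__name__, hints["return"].__name__))
    return found

def run_graph(start) -> str:
    node = start
    while not isinstance(node, End):  # call a node, get a node back, repeat until End
        node = node.run()
    return node.value

print(edges(Draft, Review))              # [('Draft', 'Review'), ('Review', 'End')]
print(run_graph(Draft(topic="graphs")))
```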
I think the question always is I have gone back and forth. I used to work at Temporal where we would actually spend a lot of time complaining about graph based workflow solutions like AWS step functions. And we would actually say that we were better because you could use normal control flow that you already knew and worked with. Yours, I guess, is like a little bit of a nice compromise. Like it looks like normal Pythonic code. But you just have to keep in mind what the type hints actually mean. And that's what we do with the quote unquote magic that the graph construction does.Samuel [00:14:42]: Yeah, exactly. And if you look at the internal logic of actually running a graph, it's incredibly simple. It's basically call a node, get a node back, call that node, get a node back, call that node. If you get an end, you're done. We will add in soon support for, well, basically storage so that you can store the state between each node that's run. And then the idea is you can then distribute the graph and run it across computers. And also, I mean, the other weird, the other bit that's really valuable is across time. Because it's all very well if you look at like lots of the graph examples that like Claude will give you. If it gives you an example, it gives you this lovely enormous mermaid chart of like the workflow, for example, managing returns if you're an e-commerce company. But what you realize is some of those lines are literally one function calls another function. And some of those lines are wait six days for the customer to print their like piece of paper and put it in the post. And if you're writing like your demo. Project or your like proof of concept, that's fine because you can just say, and now we call this function. But when you're building when you're in real in real life, that doesn't work. And now how do we manage that concept to basically be able to start somewhere else in the in our code? Well, this graph implementation makes it incredibly easy because you just pass the node that is the start point for carrying on the graph and it continues to run. So it's things like that where I was like, yeah, I can just imagine how things I've done in the past would be fundamentally easier to understand if we had done them with graphs.Swyx [00:16:07]: You say imagine, but like right now, this pedantic AI actually resume, you know, six days later, like you said, or is this just like a theoretical thing we can go someday?Samuel [00:16:16]: I think it's basically Q&A. So there's an AI that's asking the user a question and effectively you then call the CLI again to continue the conversation. And it basically instantiates the node and calls the graph with that node again. Now, we don't have the logic yet for effectively storing state in the database between individual nodes that we're going to add soon. But like the rest of it is basically there.Swyx [00:16:37]: It does make me think that not only are you competing with Langchain now and obviously Instructor, and now you're going into sort of the more like orchestrated things like Airflow, Prefect, Daxter, those guys.Samuel [00:16:52]: Yeah, I mean, we're good friends with the Prefect guys and Temporal have the same investors as us. And I'm sure that my investor Bogomol would not be too happy if I was like, oh, yeah, by the way, as well as trying to take on Datadog. We're also going off and trying to take on Temporal and everyone else doing that. Obviously, we're not doing all of the infrastructure of deploying that right yet, at least. 
We're, you know, we're just building a Python library. And like what's crazy about our graph implementation is, sure, there's a bit of magic in like introspecting the return type, you know, extracting things from unions, stuff like that. But like the actual calls, as I say, is literally call a function and get back a thing and call that. It's like incredibly simple and therefore easy to maintain. The question is, how useful is it? Well, I don't know yet. I think we have to go and find out. We have a whole. We've had a slew of people joining our Slack over the last few days and saying, tell me how good Pydantic AI is. How good is Pydantic AI versus Langchain? And I refuse to answer. That's your job to go and find that out. Not mine. We built a thing. I'm compelled by it, but I'm obviously biased. The ecosystem will work out what the useful tools are.Swyx [00:17:52]: Bogomol was my board member when I was at Temporal. And I think I think just generally also having been a workflow engine investor and participant in this space, it's a big space. Like everyone needs different functions. I think the one thing that I would say like yours, you know, as a library, you don't have that much control of it over the infrastructure. I do like the idea that each new agents or whatever or unit of work, whatever you call that should spin up in this sort of isolated boundaries. Whereas yours, I think around everything runs in the same process. But you ideally want to sort of spin out its own little container of things.Samuel [00:18:30]: I agree with you a hundred percent. And we will. It would work now. Right. As in theory, you're just like as long as you can serialize the calls to the next node, you just have to all of the different containers basically have to have the same the same code. I mean, I'm super excited about Cloudflare workers running Python and being able to install dependencies. And if Cloudflare could only give me my invitation to the private beta of that, we would be exploring that right now because I'm super excited about that as a like compute level for some of this stuff where exactly what you're saying, basically. You can run everything as an individual. Like worker function and distribute it. And it's resilient to failure, et cetera, et cetera.Swyx [00:19:08]: And it spins up like a thousand instances simultaneously. You know, you want it to be sort of truly serverless at once. Actually, I know we have some Cloudflare friends who are listening, so hopefully they'll get in front of the line. Especially.Samuel [00:19:19]: I was in Cloudflare's office last week shouting at them about other things that frustrate me. I have a love-hate relationship with Cloudflare. Their tech is awesome. But because I use it the whole time, I then get frustrated. So, yeah, I'm sure I will. I will. I will get there soon.Swyx [00:19:32]: There's a side tangent on Cloudflare. Is Python supported at full? I actually wasn't fully aware of what the status of that thing is.Samuel [00:19:39]: Yeah. So Pyodide, which is Python running inside the browser in scripting, is supported now by Cloudflare. They basically, they're having some struggles working out how to manage, ironically, dependencies that have binaries, in particular, Pydantic. Because these workers where you can have thousands of them on a given metal machine, you don't want to have a difference. You basically want to be able to have a share. Shared memory for all the different Pydantic installations, effectively. That's the thing they work out. 
They're working out. But Hood, who's my friend, who is the primary maintainer of Pyodide, works for Cloudflare. And that's basically what he's doing, is working out how to get Python running on Cloudflare's network.Swyx [00:20:19]: I mean, the nice thing is that your binary is really written in Rust, right? Yeah. Which also compiles the WebAssembly. Yeah. So maybe there's a way that you'd build... You have just a different build of Pydantic and that ships with whatever your distro for Cloudflare workers is.Samuel [00:20:36]: Yes, that's exactly what... So Pyodide has builds for Pydantic Core and for things like NumPy and basically all of the popular binary libraries. Yeah. It's just basic. And you're doing exactly that, right? You're using Rust to compile the WebAssembly and then you're calling that shared library from Python. And it's unbelievably complicated, but it works. Okay.Swyx [00:20:57]: Staying on graphs a little bit more, and then I wanted to go to some of the other features that you have in Pydantic AI. I see in your docs, there are sort of four levels of agents. There's single agents, there's agent delegation, programmatic agent handoff. That seems to be what OpenAI swarms would be like. And then the last one, graph-based control flow. Would you say that those are sort of the mental hierarchy of how these things go?Samuel [00:21:21]: Yeah, roughly. Okay.Swyx [00:21:22]: You had some expression around OpenAI swarms. Well.Samuel [00:21:25]: And indeed, OpenAI have got in touch with me and basically, maybe I'm not supposed to say this, but basically said that Pydantic AI looks like what swarms would become if it was production ready. So, yeah. I mean, like, yeah, which makes sense. Awesome. Yeah. I mean, in fact, it was specifically saying, how can we give people the same feeling that they were getting from swarms that led us to go and implement graphs? Because my, like, just call the next agent with Python code was not a satisfactory answer to people. So it was like, okay, we've got to go and have a better answer for that. It's not like, let us to get to graphs. Yeah.Swyx [00:21:56]: I mean, it's a minimal viable graph in some sense. What are the shapes of graphs that people should know? So the way that I would phrase this is I think Anthropic did a very good public service and also kind of surprisingly influential blog post, I would say, when they wrote Building Effective Agents. We actually have the authors coming to speak at my conference in New York, which I think you're giving a workshop at. Yeah.Samuel [00:22:24]: I'm trying to work it out. But yes, I think so.Swyx [00:22:26]: Tell me if you're not. yeah, I mean, like, that was the first, I think, authoritative view of, like, what kinds of graphs exist in agents and let's give each of them a name so that everyone is on the same page. So I'm just kind of curious if you have community names or top five patterns of graphs.Samuel [00:22:44]: I don't have top five patterns of graphs. I would love to see what people are building with them. But like, it's been it's only been a couple of weeks. And of course, there's a point is that. Because they're relatively unopinionated about what you can go and do with them. They don't suit them. Like, you can go and do lots of lots of things with them, but they don't have the structure to go and have like specific names as much as perhaps like some other systems do. 
I think what our agents are, which have a name and I can't remember what it is, but this basically system of like, decide what tool to call, go back to the center, decide what tool to call, go back to the center and then exit. One form of graph, which, as I say, like our agents are effectively one implementation of a graph, which is why under the hood they are now using graphs. And it'll be interesting to see over the next few years whether we end up with these like predefined graph names or graph structures or whether it's just like, yep, I built a graph or whether graphs just turn out not to match people's mental image of what they want and die away. We'll see.Swyx [00:23:38]: I think there is always appeal. Every developer eventually gets graph religion and goes, oh, yeah, everything's a graph. And then they probably over rotate and go go too far into graphs. And then they have to learn a whole bunch of DSLs. And then they're like, actually, I didn't need that. I need this. And they scale back a little bit.Samuel [00:23:55]: I'm at the beginning of that process. I'm currently a graph maximalist, although I haven't actually put any into production yet. But yeah.Swyx [00:24:02]: This has a lot of philosophical connections with other work coming out of UC Berkeley on compounding AI systems. I don't know if you know of or care. This is the Gartner world of things where they need some kind of industry terminology to sell it to enterprises. I don't know if you know about any of that.Samuel [00:24:24]: I haven't. I probably should. I should probably do it because I should probably get better at selling to enterprises. But no, no, I don't. Not right now.Swyx [00:24:29]: This is really the argument is that instead of putting everything in one model, you have more control and more maybe observability to if you break everything out into composing little models and changing them together. And obviously, then you need an orchestration framework to do that. Yeah.Samuel [00:24:47]: And it makes complete sense. And one of the things we've seen with agents is they work well when they work well. But when they. Even if you have the observability through log five that you can see what was going on, if you don't have a nice hook point to say, hang on, this is all gone wrong. You have a relatively blunt instrument of basically erroring when you exceed some kind of limit. But like what you need to be able to do is effectively iterate through these runs so that you can have your own control flow where you're like, OK, we've gone too far. And that's where one of the neat things about our graph implementation is you can basically call next in a loop rather than just running the full graph. And therefore, you have this opportunity to to break out of it. But yeah, basically, it's the same point, which is like if you have two bigger unit of work to some extent, whether or not it involves gen AI. But obviously, it's particularly problematic in gen AI. You only find out afterwards when you've spent quite a lot of time and or money when it's gone off and done done the wrong thing.Swyx [00:25:39]: Oh, drop on this. We're not going to resolve this here, but I'll drop this and then we can move on to the next thing. This is the common way that we we developers talk about this. And then the machine learning researchers look at us. And laugh and say, that's cute. And then they just train a bigger model and they wipe us out in the next training run. 
So I think there's a certain amount of we are fighting the bitter lesson here. We're fighting AGI. And, you know, when AGI arrives, this will all go away. Obviously, on Latent Space, we don't really discuss that because I think AGI is kind of this hand wavy concept that isn't super relevant. But I think we have to respect that. For example, you could do a chain of thoughts with graphs and you could manually orchestrate a nice little graph that does like. Reflect, think about if you need more, more inference time, compute, you know, that's the hot term now. And then think again and, you know, scale that up. Or you could train Strawberry and DeepSeq R1. Right.Samuel [00:26:32]: I saw someone saying recently, oh, they were really optimistic about agents because models are getting faster exponentially. And I like took a certain amount of self-control not to describe that it wasn't exponential. But my main point was. If models are getting faster as quickly as you say they are, then we don't need agents and we don't really need any of these abstraction layers. We can just give our model and, you know, access to the Internet, cross our fingers and hope for the best. Agents, agent frameworks, graphs, all of this stuff is basically making up for the fact that right now the models are not that clever. In the same way that if you're running a customer service business and you have loads of people sitting answering telephones, the less well trained they are, the less that you trust them, the more that you need to give them a script to go through. Whereas, you know, so if you're running a bank and you have lots of customer service people who you don't trust that much, then you tell them exactly what to say. If you're doing high net worth banking, you just employ people who you think are going to be charming to other rich people and set them off to go and have coffee with people. Right. And the same is true of models. The more intelligent they are, the less we need to tell them, like structure what they go and do and constrain the routes in which they take.Swyx [00:27:42]: Yeah. Yeah. Agree with that. So I'm happy to move on. So the other parts of Pydantic AI that are worth commenting on, and this is like my last rant, I promise. So obviously, every framework needs to do its sort of model adapter layer, which is, oh, you can easily swap from OpenAI to Cloud to Grok. You also have, which I didn't know about, Google GLA, which I didn't really know about until I saw this in your docs, which is generative language API. I assume that's AI Studio? Yes.Samuel [00:28:13]: Google don't have good names for it. So Vertex is very clear. That seems to be the API that like some of the things use, although it returns 503 about 20% of the time. So... Vertex? No. Vertex, fine. But the... Oh, oh. GLA. Yeah. Yeah.Swyx [00:28:28]: I agree with that.Samuel [00:28:29]: So we have, again, another example of like, well, I think we go the extra mile in terms of engineering is we run on every commit, at least commit to main, we run tests against the live models. Not lots of tests, but like a handful of them. Oh, okay. And we had a point last week where, yeah, GLA is a little bit better. GLA1 was failing every single run. One of their tests would fail. And we, I think we might even have commented out that one at the moment. So like all of the models fail more often than you might expect, but like that one seems to be particularly likely to fail. 
But Vertex is the same API, but much more reliable.Swyx [00:29:01]: My rant here is that, you know, versions of this appear in Langchain and every single framework has to have its own little thing, a version of that. I would put to you, and then, you know, this is, this can be agree to disagree. This is not needed in Pydantic AI. I would much rather you adopt a layer like Lite LLM or what's the other one in JavaScript port key. And that's their job. They focus on that one thing and they, they normalize APIs for you. All new models are automatically added and you don't have to duplicate this inside of your framework. So for example, if I wanted to use deep seek, I'm out of luck because Pydantic AI doesn't have deep seek yet.Samuel [00:29:38]: Yeah, it does.Swyx [00:29:39]: Oh, it does. Okay. I'm sorry. But you know what I mean? Should this live in your code or should it live in a layer that's kind of your API gateway that's a defined piece of infrastructure that people have?Samuel [00:29:49]: And I think if a company who are well known, who are respected by everyone had come along and done this at the right time, maybe we should have done it a year and a half ago and said, we're going to be the universal AI layer. That would have been a credible thing to do. I've heard varying reports of Lite LLM is the truth. And it didn't seem to have exactly the type safety that we needed. Also, as I understand it, and again, I haven't looked into it in great detail. Part of their business model is proxying the request through their, through their own system to do the generalization. That would be an enormous put off to an awful lot of people. Honestly, the truth is I don't think it is that much work unifying the model. I get where you're coming from. I kind of see your point. I think the truth is that everyone is centralizing around open AIs. Open AI's API is the one to do. So DeepSeq support that. Grok with OK support that. Ollama also does it. I mean, if there is that library right now, it's more or less the open AI SDK. And it's very high quality. It's well type checked. It uses Pydantic. So I'm biased. But I mean, I think it's pretty well respected anyway.Swyx [00:30:57]: There's different ways to do this. Because also, it's not just about normalizing the APIs. You have to do secret management and all that stuff.Samuel [00:31:05]: Yeah. And there's also. There's Vertex and Bedrock, which to one extent or another, effectively, they host multiple models, but they don't unify the API. But they do unify the auth, as I understand it. Although we're halfway through doing Bedrock. So I don't know about it that well. But they're kind of weird hybrids because they support multiple models. But like I say, the auth is centralized.Swyx [00:31:28]: Yeah, I'm surprised they don't unify the API. That seems like something that I would do. You know, we can discuss all this all day. There's a lot of APIs. I agree.Samuel [00:31:36]: It would be nice if there was a universal one that we didn't have to go and build.Alessio [00:31:39]: And I guess the other side of, you know, routing model and picking models like evals. How do you actually figure out which one you should be using? I know you have one. First of all, you have very good support for mocking in unit tests, which is something that a lot of other frameworks don't do. So, you know, my favorite Ruby library is VCR because it just, you know, it just lets me store the HTTP requests and replay them. That part I'll kind of skip. 
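The "everyone centralises on OpenAI's API" point is easy to see in practice: most providers expose an OpenAI-compatible endpoint, so the official SDK works against them by changing the base URL. A hedged sketch; the base URLs and model names below are the publicly documented ones at the time of writing and are not guaranteed to stay current.

```python
from openai import OpenAI

# DeepSeek's hosted API speaks the OpenAI wire protocol.
deepseek = OpenAI(base_url="https://api.deepseek.com", api_key="sk-...")

# A local Ollama server exposes the same protocol on /v1 (the API key is ignored).
ollama = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

for client, model in [(deepseek, "deepseek-chat"), (ollama, "llama3.2")]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Say hello in five words."}],
    )
    print(resp.choices[0].message.content)
```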
I think you are busy like this test model. We're like just through Python. You try and figure out what the model might respond without actually calling the model. And then you have the function model where people can kind of customize outputs. Any other fun stories maybe from there? Or is it just what you see is what you get, so to speak?Samuel [00:32:18]: On those two, I think what you see is what you get. On the evals, I think watch this space. I think it's something that like, again, I was somewhat cynical about for some time. Still have my cynicism about some of the well, it's unfortunate that so many different things are called evals. It would be nice if we could agree. What they are and what they're not. But look, I think it's a really important space. I think it's something that we're going to be working on soon, both in Pydantic AI and in LogFire to try and support better because it's like it's an unsolved problem.Alessio [00:32:45]: Yeah, you do say in your doc that anyone who claims to know for sure exactly how your eval should be defined can safely be ignored.Samuel [00:32:52]: We'll delete that sentence when we tell people how to do their evals.Alessio [00:32:56]: Exactly. I was like, we need we need a snapshot of this today. And so let's talk about eval. So there's kind of like the vibe. Yeah. So you have evals, which is what you do when you're building. Right. Because you cannot really like test it that many times to get statistical significance. And then there's the production eval. So you also have LogFire, which is kind of like your observability product, which I tried before. It's very nice. What are some of the learnings you've had from building an observability tool for LEMPs? And yeah, as people think about evals, even like what are the right things to measure? What are like the right number of samples that you need to actually start making decisions?Samuel [00:33:33]: I'm not the best person to answer that is the truth. So I'm not going to come in here and tell you that I think I know the answer on the exact number. I mean, we can do some back of the envelope statistics calculations to work out that like having 30 probably gets you most of the statistical value of having 200 for, you know, by definition, 15% of the work. But the exact like how many examples do you need? For example, that's a much harder question to answer because it's, you know, it's deep within the how models operate in terms of LogFire. One of the reasons we built LogFire the way we have and we allow you to write SQL directly against your data and we're trying to build the like powerful fundamentals of observability is precisely because we know we don't know the answers. And so allowing people to go and innovate on how they're going to consume that stuff and how they're going to process it is we think that's valuable. Because even if we come along and offer you an evals framework on top of LogFire, it won't be right in all regards. And we want people to be able to go and innovate and being able to write their own SQL connected to the API. And effectively query the data like it's a database with SQL allows people to innovate on that stuff. And that's what allows us to do it as well. I mean, we do a bunch of like testing what's possible by basically writing SQL directly against LogFire as any user could. I think the other the other really interesting bit that's going on in observability is OpenTelemetry is centralizing around semantic attributes for GenAI. So it's a relatively new project. 
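The mocking support mentioned here works roughly like this: you swap the real model for a test double so unit tests never touch the network. A minimal sketch based on Pydantic AI's documented TestModel/override API; treat the exact names as a snapshot of that API rather than a stable contract.

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent("openai:gpt-4o", system_prompt="Be terse.")


def answer(question: str) -> str:
    return agent.run_sync(question).data


def test_answer_without_network():
    # TestModel fabricates a plausible response locally instead of calling OpenAI.
    with agent.override(model=TestModel()):
        assert isinstance(answer("What is 2 + 2?"), str)
```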
A lot of it's still being added at the moment. But basically the idea that like. They unify how both SDKs and or agent frameworks send observability data to to any OpenTelemetry endpoint. And so, again, we can go and having that unification allows us to go and like basically compare different libraries, compare different models much better. That stuff's in a very like early stage of development. One of the things we're going to be working on pretty soon is basically, I suspect, GenAI will be the first agent framework that implements those semantic attributes properly. Because, again, we control and we can say this is important for observability, whereas most of the other agent frameworks are not maintained by people who are trying to do observability. With the exception of Langchain, where they have the observability platform, but they chose not to go down the OpenTelemetry route. So they're like plowing their own furrow. And, you know, they're a lot they're even further away from standardization.Alessio [00:35:51]: Can you maybe just give a quick overview of how OTEL ties into the AI workflows? There's kind of like the question of is, you know, a trace. And a span like a LLM call. Is it the agent? It's kind of like the broader thing you're tracking. How should people think about it?Samuel [00:36:06]: Yeah, so they have a PR that I think may have now been merged from someone at IBM talking about remote agents and trying to support this concept of remote agents within GenAI. I'm not particularly compelled by that because I don't think that like that's actually by any means the common use case. But like, I suppose it's fine for it to be there. The majority of the stuff in OTEL is basically defining how you would instrument. A given call to an LLM. So basically the actual LLM call, what data you would send to your telemetry provider, how you would structure that. Apart from this slightly odd stuff on remote agents, most of the like agent level consideration is not yet implemented in is not yet decided effectively. And so there's a bit of ambiguity. Obviously, what's good about OTEL is you can in the end send whatever attributes you like. But yeah, there's quite a lot of churn in that space and exactly how we store the data. I think that one of the most interesting things, though, is that if you think about observability. Traditionally, it was sure everyone would say our observability data is very important. We must keep it safe. But actually, companies work very hard to basically not have anything that sensitive in their observability data. So if you're a doctor in a hospital and you search for a drug for an STI, the sequel might be sent to the observability provider. But none of the parameters would. It wouldn't have the patient number or their name or the drug. With GenAI, that distinction doesn't exist because it's all just messed up in the text. If you have that same patient asking an LLM how to. What drug they should take or how to stop smoking. You can't extract the PII and not send it to the observability platform. So the sensitivity of the data that's going to end up in observability platforms is going to be like basically different order of magnitude to what's in what you would normally send to Datadog. Of course, you can make a mistake and send someone's password or their card number to Datadog. But that would be seen as a as a like mistake. Whereas in GenAI, a lot of data is going to be sent. 
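For readers who haven't seen the GenAI semantic conventions, the idea is simply agreed-upon attribute names on an ordinary OTel span, so any backend can interpret LLM telemetry the same way. A sketch with the OpenTelemetry Python SDK; the specific gen_ai.* attribute names follow the convention as it stood when this was recorded and were still changing (the token-count names, for example, have since been revised).

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("demo")

# One LLM call = one span, described with the shared gen_ai.* attribute names.
with tracer.start_as_current_span("chat gpt-4o") as span:
    span.set_attribute("gen_ai.system", "openai")
    span.set_attribute("gen_ai.request.model", "gpt-4o")
    span.set_attribute("gen_ai.request.temperature", 0.2)
    # ... perform the actual model call here ...
    span.set_attribute("gen_ai.usage.input_tokens", 42)
    span.set_attribute("gen_ai.usage.output_tokens", 17)
```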
And I think that's why companies like Langsmith are trying hard to offer observability. On prem, because there's a bunch of companies who are happy for Datadog to be cloud hosted, but want self-hosting for this observability stuff with GenAI.Alessio [00:38:09]: And are you doing any of that today? Because I know in each of the spans you have like the number of tokens, you have the context, you're just storing everything. And then you're going to offer kind of like a self-hosting for the platform, basically. Yeah. Yeah.Samuel [00:38:23]: So we have scrubbing roughly equivalent to what the other observability platforms have. So if we, you know, if we see password as the key, we won't send the value. But like, like I said, that doesn't really work in GenAI. So we're accepting we're going to have to store a lot of data and then we'll offer self-hosting for those people who can afford it and who need it.Alessio [00:38:42]: And then this is, I think, the first time that most of the workloads performance is depending on a third party. You know, like if you're looking at Datadog data, usually it's your app that is driving the latency and like the memory usage and all of that. Here you're going to have spans that maybe take a long time to perform because the GLA API is not working or because OpenAI is kind of like overwhelmed. Do you do anything there since like the provider is almost like the same across customers? You know, like, are you trying to surface these things for people and say, hey, this was like a very slow span, but actually all customers using OpenAI right now are seeing the same thing. So maybe don't worry about it or.Samuel [00:39:20]: Not yet. We do a few things that people don't generally do in OTel. So we send. We send information at the beginning. At the beginning of a trace as well as sorry, at the beginning of a span, as well as when it finishes. By default, OTel only sends you data when the span finishes. So if you think about a request which might take like 20 seconds, even if some of the intermediate spans finished earlier, you can't basically place them on the page until you get the top level span. And so if you're using standard OTel, you can't show anything until those requests are finished. When those requests are taking a few hundred milliseconds, it doesn't really matter. But when you're doing Gen AI calls or when you're like running a batch job that might take 30 minutes. That like latency of not being able to see the span is like crippling to understanding your application. And so we do a bunch of slightly complex stuff to basically send data about a span as it starts, which is closely related. Yeah.Alessio [00:40:09]: Any thoughts on all the other people trying to build on top of OpenTelemetry in different languages, too? There's like the OpenLLMetry project, which doesn't really roll off the tongue. But how do you see the future of these kind of tools? Is everybody going to have to build? Why does everybody want to build? They want to build their own open source observability thing to then sell?Samuel [00:40:29]: I mean, we are not going off and trying to instrument the likes of the OpenAI SDK with the new semantic attributes, because at some point that's going to happen and it's going to live inside OTel and we might help with it. But we're a tiny team. We don't have time to go and do all of that work. So OpenLLMetry, like interesting project.
But I suspect eventually most of those semantic like that instrumentation of the big of the SDKs will live, like I say, inside the main OpenTelemetry report. I suppose. What happens to the agent frameworks? What data you basically need at the framework level to get the context is kind of unclear. I don't think we know the answer yet. But I mean, I was on the, I guess this is kind of semi-public, because I was on the call with the OpenTelemetry call last week talking about GenAI. And there was someone from Arize talking about the challenges they have trying to get OpenTelemetry data out of Langchain, where it's not like natively implemented. And obviously they're having quite a tough time. And I was realizing, hadn't really realized this before, but how lucky we are to primarily be talking about our own agent framework, where we have the control rather than trying to go and instrument other people's.Swyx [00:41:36]: Sorry, I actually didn't know about this semantic conventions thing. It looks like, yeah, it's merged into main OTel. What should people know about this? I had never heard of it before.Samuel [00:41:45]: Yeah, I think it looks like a great start. I think there's some unknowns around how you send the messages that go back and forth, which is kind of the most important part. It's the most important thing of all. And that is moved out of attributes and into OTel events. OTel events in turn are moving from being on a span to being their own top-level API where you send data. So there's a bunch of churn still going on. I'm impressed by how fast the OTel community is moving on this project. I guess they, like everyone else, get that this is important, and it's something that people are crying out to get instrumentation off. So I'm kind of pleasantly surprised at how fast they're moving, but it makes sense.Swyx [00:42:25]: I'm just kind of browsing through the specification. I can already see that this basically bakes in whatever the previous paradigm was. So now they have genai.usage.prompt tokens and genai.usage.completion tokens. And obviously now we have reasoning tokens as well. And then only one form of sampling, which is top-p. You're basically baking in or sort of reifying things that you think are important today, but it's not a super foolproof way of doing this for the future. Yeah.Samuel [00:42:54]: I mean, that's what's neat about OTel is you can always go and send another attribute and that's fine. It's just there are a bunch that are agreed on. But I would say, you know, to come back to your previous point about whether or not we should be relying on one centralized abstraction layer, this stuff is moving so fast that if you start relying on someone else's standard, you risk basically falling behind because you're relying on someone else to keep things up to date.Swyx [00:43:14]: Or you fall behind because you've got other things going on.Samuel [00:43:17]: Yeah, yeah. That's fair. That's fair.Swyx [00:43:19]: Any other observations just about building LogFire, actually? Let's just talk about this. So you announced LogFire. I was kind of only familiar with LogFire because of your Series A announcement. I actually thought you were making a separate company. I remember some amount of confusion with you when that came out. So to be clear, it's Pydantic LogFire and the company is one company that has kind of two products, an open source thing and an observability thing, correct? Yeah. I was just kind of curious, like any learnings building LogFire? 
So classic question is, do you use ClickHouse? Is this like the standard persistence layer? Any learnings doing that?Samuel [00:43:54]: We don't use ClickHouse. We started building our database with ClickHouse, moved off ClickHouse onto Timescale, which is a Postgres extension to do analytical databases. Wow. And then moved off Timescale onto DataFusion. And we're basically now building, it's DataFusion, but it's kind of our own database. Bogomil is not entirely happy that we went through three databases before we chose one. I'll say that. But like, we've got to the right one in the end. I think we could have realized that Timescale wasn't right. I think ClickHouse. They both taught us a lot and we're in a great place now. But like, yeah, it's been a real journey on the database in particular.Swyx [00:44:28]: Okay. So, you know, as a database nerd, I have to like double click on this, right? So ClickHouse is supposed to be the ideal backend for anything like this. And then moving from ClickHouse to Timescale is another counterintuitive move that I didn't expect because, you know, Timescale is like an extension on top of Postgres. Not super meant for like high volume logging. But like, yeah, tell us those decisions.Samuel [00:44:50]: So at the time, ClickHouse did not have good support for JSON. I was speaking to someone yesterday and said ClickHouse doesn't have good support for JSON and got roundly stepped on because apparently it does now. So they've obviously gone and built their proper JSON support. But like back when we were trying to use it, I guess a year ago or a bit more than a year ago, everything happened to be a map and maps are a pain to try and do like looking up JSON type data. And obviously all these attributes, everything you're talking about there in terms of the GenAI stuff. You can choose to make them top level columns if you want. But the simplest thing is just to put them all into a big JSON pile. And that was a problem with ClickHouse. Also, ClickHouse had some really ugly edge cases like by default, or at least until I complained about it a lot, ClickHouse thought that two nanoseconds was longer than one second because they compared intervals just by the number, not the unit. And I complained about that a lot. And then they caused it to raise an error and just say you have to have the same unit. Then I complained a bit more. And I think as I understand it now, they have some. They convert between units. But like stuff like that, when all you're looking at is when a lot of what you're doing is comparing the duration of spans was really painful. Also things like you can't subtract two date times to get an interval. You have to use the date sub function. But like the fundamental thing is because we want our end users to write SQL, the like quality of the SQL, how easy it is to write, matters way more to us than if you're building like a platform on top where your developers are going to write the SQL. And once it's written and it's working, you don't mind too much. So I think that's like one of the fundamental differences. The other problem that I have with the ClickHouse and Impact Timescale is that like the ultimate architecture, the like snowflake architecture of binary data in object store queried with some kind of cache from nearby. They both have it, but it's closed sourced and you only get it if you go and use their hosted versions. 
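As context for the DataFusion choice: it is a query engine you embed rather than a server you run, which is what makes it usable as a toolbox for building your own database. A small sketch using the datafusion Python bindings (the Rust API is analogous); the spans.parquet file and its columns are made-up examples.

```python
from datafusion import SessionContext

ctx = SessionContext()
# Register a local Parquet file as a table; no server process is involved.
ctx.register_parquet("spans", "spans.parquet")

# Plain SQL over the embedded engine, e.g. find the slowest model calls.
df = ctx.sql(
    """
    SELECT span_name, duration_ms
    FROM spans
    WHERE span_name LIKE 'chat%'
    ORDER BY duration_ms DESC
    LIMIT 10
    """
)
df.show()
```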
And so even if we had got through all the problems with Timescale or ClickHouse, we would end up like, you know, they would want to be taking their 80% margin. And then we would be wanting to take that would basically leave us less space for margin. Whereas data fusion. Properly open source, all of that same tooling is open source. And for us as a team of people with a lot of Rust expertise, data fusion, which is implemented in Rust, we can literally dive into it and go and change it. So, for example, I found that there were some slowdowns in data fusion's string comparison kernel for doing like string contains. And it's just Rust code. And I could go and rewrite the string comparison kernel to be faster. Or, for example, data fusion, when we started using it, didn't have JSON support. Obviously, as I've said, it's something we can do. It's something we needed. I was able to go and implement that in a weekend using our JSON parser that we built for Pydantic Core. So it's the fact that like data fusion is like for us the perfect mixture of a toolbox to build a database with, not a database. And we can go and implement stuff on top of it in a way that like if you were trying to do that in Postgres or in ClickHouse. I mean, ClickHouse would be easier because it's C++, relatively modern C++. But like as a team of people who are not C++ experts, that's much scarier than data fusion for us.Swyx [00:47:47]: Yeah, that's a beautiful rant.Alessio [00:47:49]: That's funny. Most people don't think they have agency on these projects. They're kind of like, oh, I should use this or I should use that. They're not really like, what should I pick so that I contribute the most back to it? You know, so but I think you obviously have an open source first mindset. So that makes a lot of sense.Samuel [00:48:05]: I think if we were probably better as a startup, a better startup and faster moving and just like headlong determined to get in front of customers as fast as possible, we should have just started with ClickHouse. I hope that long term we're in a better place for having worked with data fusion. We like we're quite engaged now with the data fusion community. Andrew Lam, who maintains data fusion, is an advisor to us. We're in a really good place now. But yeah, it's definitely slowed us down relative to just like building on ClickHouse and moving as fast as we can.Swyx [00:48:34]: OK, we're about to zoom out and do Pydantic run and all the other stuff. But, you know, my last question on LogFire is really, you know, at some point you run out sort of community goodwill just because like, oh, I use Pydantic. I love Pydantic. I'm going to use LogFire. OK, then you start entering the territory of the Datadogs, the Sentrys and the honeycombs. Yeah. So where are you going to really spike here? What differentiator here?Samuel [00:48:59]: I wasn't writing code in 2001, but I'm assuming that there were people talking about like web observability and then web observability stopped being a thing, not because the web stopped being a thing, but because all observability had to do web. If you were talking to people in 2010 or 2012, they would have talked about cloud observability. Now that's not a term because all observability is cloud first. The same is going to happen to gen AI. And so whether or not you're trying to compete with Datadog or with Arise and Langsmith, you've got to do first class. You've got to do general purpose observability with first class support for AI. 
And as far as I know, we're the only people really trying to do that. I mean, I think Datadog is starting in that direction. And to be honest, I think Datadog is a much like scarier company to compete with than the AI specific observability platforms. Because in my opinion, and I've also heard this from lots of customers, AI specific observability where you don't see everything else going on in your app is not actually that useful. Our hope is that we can build the first general purpose observability platform with first class support for AI. And that we have this open source heritage of putting developer experience first that other companies haven't done. For all I'm a fan of Datadog and what they've done. If you search Datadog logging Python. And you just try as a like a non-observability expert to get something up and running with Datadog and Python. It's not trivial, right? That's something Sentry have done amazingly well. But like there's enormous space in most of observability to do DX better.Alessio [00:50:27]: Since you mentioned Sentry, I'm curious how you thought about licensing and all of that. Obviously, your MIT license, you don't have any rolling license like Sentry has where you can only use an open source, like the one year old version of it. Was that a hard decision?Samuel [00:50:41]: So to be clear, LogFire is co-sourced. So Pydantic and Pydantic AI are MIT licensed and like properly open source. And then LogFire for now is completely closed source. And in fact, the struggles that Sentry have had with licensing and the like weird pushback the community gives when they take something that's closed source and make it source available just meant that we just avoided that whole subject matter. I think the other way to look at it is like in terms of either headcount or revenue or dollars in the bank. The amount of open source we do as a company is we've got to be open source. We're up there with the most prolific open source companies, like I say, per head. And so we didn't feel like we were morally obligated to make LogFire open source. We have Pydantic. Pydantic is a foundational library in Python. That and now Pydantic AI are our contribution to open source. And then LogFire is like openly for profit, right? As in we're not claiming otherwise. We're not sort of trying to walk a line if it's open source. But really, we want to make it hard to deploy. So you probably want to pay us. We're trying to be straight. That it's to pay for. We could change that at some point in the future, but it's not an immediate plan.Alessio [00:51:48]: All right. So the first one I saw this new I don't know if it's like a product you're building the Pydantic that run, which is a Python browser sandbox. What was the inspiration behind that? We talk a lot about code interpreter for lamps. I'm an investor in a company called E2B, which is a code sandbox as a service for remote execution. Yeah. What's the Pydantic that run story?Samuel [00:52:09]: So Pydantic that run is again completely open source. I have no interest in making it into a product. We just needed a sandbox to be able to demo LogFire in particular, but also Pydantic AI. So it doesn't have it yet, but I'm going to add basically a proxy to OpenAI and the other models so that you can run Pydantic AI in the browser. See how it works. Tweak the prompt, et cetera, et cetera. And we'll have some kind of limit per day of what you can spend on it or like what the spend is. The other thing we wanted to b

Millionaire Car Salesman Podcast
EP 10:04 How to Find the Hidden Money in Your CRM & Maximize Opportunities to Increase Sales

Millionaire Car Salesman Podcast

Play Episode Listen Later Feb 4, 2025 54:26


In this electrifying episode of the Millionaire Car Salesman Podcast, hosts LA Williams and Sean V. Bradley dive deep into the innovations shaping the automotive CRM landscape with special guests Shane Born and Melissa Sinclair from NCC and ProMax.  "You're investing in one of the most important tools to help run all aspects of your business. You want to make sure that it's set up properly from the very beginning." - Melissa Sinclair As the automotive industry faces unprecedented changes, Shane and Melissa discuss how their companies are spearheading the credit-first approach in CRM solutions, creating streamlined and transparent processes for both dealers and consumers. The episode begins with the hosts addressing missed speaking opportunities at an industry event due to unforeseen circumstances and quickly shifts focus to the groundbreaking strategies being implemented by ProMax, setting a new bar for CRM systems integrated with National Credit Center resources. "Success involves partnering with the right people... ensure your team understands the what, the why, and the how, and then be flexible to pivot if a generational storm hits." - Shane Born The conversation is packed with insights into how dealerships can leverage modern CRM systems to improve customer experience, enhance fraud prevention, and streamline sales processes using accurate real-time credit data. Shane Born highlights the unique position of their CRM solution, emphasizing its ability to provide consistent, transparent, and frictionless service to consumers while maintaining a keen adaptability to market changes. Melissa Sinclair adds to this by discussing the newly launched open API, offering seamless integration capabilities with vendor partners to unify dealership operations. This episode is a must-listen for any automotive professional keen on harnessing cutting-edge CRM technologies to propel their business forward.   Key Takeaways: ✅ Credit-First CRM Approach: NCC and ProMax are revolutionizing the CRM space with a credit-first methodology, enhancing the buying process through accurate financing options and fraud prevention features. ✅ User Interface and Experience: The newly redesigned UI/UX of ProMax's CRM offers a modern, intuitive interface that simplifies processes for salespeople and management, improving efficiency and usability. ✅ Service and Sales Integration: By integrating service customer data with CRM strategies, dealerships can unlock unprecedented sales opportunities, targeting both existing and untapped customer bases. ✅ Open API for Seamless Integration: The introduction of the open API, Dex, allows for effortless connection with various dealership software, ensuring all operations are synchronized and efficient. ✅ Robust OEM Partnerships: ProMax's CRM is approved by most OEMs, including 100% co-op with General Motors, showcasing the platform's comprehensive capabilities and industry trust.     About Shane Born Shane Born is a prominent figure in the automotive industry, holding a significant position at NCC (National Credit Center) and ProMax. With over two decades of experience, Shane has been instrumental in driving technological advancements and fostering customer relations in the automotive sector. His focus lies in integrating credit solutions with CRM systems to enhance dealership efficiencies!   About Melissa Sinclair Melissa Sinclair at NCC and ProMax, bringing extensive experience and deep knowledge about CRM solutions. 
Melissa has contributed substantially to the development of innovative customer relationship management tools and has played a key role in ensuring that these tools meet the needs of modern dealerships. She is particularly focused on enhancing user experience and fostering partnerships within the automotive industry.     Unleashing the Power of CRM in Automotive Sales Key Takeaways The Millionaire Car Salesman Podcast dives into how CRM technology is revolutionizing the automotive industry, focusing on NCC's ProMax and its unique offerings. The discussion emphasizes the shift towards credit-first processes, enhancing both dealer and consumer experiences by leveraging advanced data management. An insider perspective on NADA reveals valuable insights into networking strategies and industry trends despite unexpected challenges. CRM Innovations in the Automotive Industry The Millionaire Car Salesman Podcast, hosted by LA Williams and Sean V. Bradley, recently featured Shane Born and Melissa Sinclair from NCC and ProMax, who discussed groundbreaking innovations in CRM technology, specifically tailored for the automotive industry. As the podcast episodes unravel, they explore how these technologies redefine dealer-consumer interactions and internal processes within dealerships. According to Shane Born, NCC is leveraging its dual expertise as both a data and software-as-a-service company: "We are both a DAS company and a SaaS company," he notes. Their end-to-end platform ensures a streamlined, credit-first approach that not only enhances dealer efficiency but also significantly improves the customer's buying journey. This dual focus allows NCC to provide a seamless workflow tied to a single database, an advantage that sets it apart from competitors relying on multiple integrations. From quote to contract, the process is designed to eliminate friction and bolster customer satisfaction by providing accurate financing solutions quickly. Melissa Sinclair elaborates on the customized experience ProMax delivers: "Any role in the dealership can benefit as the platform was designed specifically with them in mind." The conversations with dealers led to UI and UX improvements, focusing on easy navigation and relevant data presentation, ensuring that even complex tools are intuitive for users across all roles within the dealership. Strategies from the NADA Conference One of the intriguing segments of the podcast episode involved understanding the dynamics of the National Automobile Dealers Association (NADA) conference. Despite logistical hurdles due to inclement weather, vendors and attendees alike made the most of the opportunity to connect and explore industry advancements. Shane Born emphasized that despite a "50% pre-registration count," those who made it to the event were laser-focused on solving pertinent challenges, leading to higher closing rates on new dealer group deals. NADA's decision to open its welcome party to all attendees, not just paid registrants, was a strategic move to enhance networking despite lower turnout numbers. "It allows us to really have meaningful conversations about what's happening and where the industry is heading," Born shared, highlighting how such interactions lead to long-lasting partnerships and industry insights. This part of the podcast underscores the importance of adaptability and creativity in fostering networking opportunities, even in unforeseen circumstances. 
NADA highlighted industry trends, such as AI integration and credit-first approaches, showcased by innovative solutions like NCC's ProMax. Maximizing CRM for Dealer Success In detailing the capabilities of NCC's complete CRM, the podcast dives into how dealerships can maximize the tool for their success. At its core, ProMax's CRM offers a robust framework for both reactive and proactive dealership strategies—a point emphasized repeatedly throughout the discussion. "Any user can go in and build a custom list," explained Melissa Sinclair, illustrating the CRM's adaptability in creating targeted customer lists based on various criteria, such as unsold showroom traffic or service appointments. The system's Opportunities Dashboard is a game changer, automatically collating potential purchase opportunities, ensuring that even if a salesperson overlooks a prospect, the CRM does not. Sean V. Bradley highlighted common dealer pitfalls, such as under-utilizing CRM capabilities, which often leads to unnecessary marketing expenditures. "Why are they spending $65,000 in advertising per month when they could invest $5,000 to $10,000 in CRM?" he questioned, advocating for a full-time CRM manager to ensure optimal use of the system. Reflecting on trends discussed at the podcast, it's evident that leveraging comprehensive CRM tools like NCC's ProMax could save money and increase sales, making it essential for dealerships to embrace such technologies fully. Reflections on the Automotive Horizon As the podcast episode closes, the hosts and guests reflect on how the innovations discussed are reshaping the automotive sales landscape. The evolution of CRM technology, spearheaded by companies like NCC and solutions such as ProMax, shows an industry leaning heavily into data-driven, customer-centric approaches. The recurring theme remains clear: the automotive industry is poised for growth through technology adaptation and strategic partnerships. As Shane Born perfectly encapsulates, success in this field "comes down to consistent execution of your strategic vision," underscoring the importance of aligning technology with well-defined dealership goals. With ProMax's open API and partnership with AI leaders like Impel, NCC ensures they are not just evolving alongside industry changes, but leading the charge toward a more integrated and customer-focused sales experience. In the ever-evolving automotive market, insights from such discussions are invaluable, paving the way for dealerships to not only keep pace but thrive in an exciting era of digital transformation.     Resources: Podium: Discover how Podium's innovative AI technology can unlock unparalleled efficiency and drive your dealership's sales to new heights. Visit www.podium.com/mcs to learn more!   NCC: Credit-Driven Retailing - NCC delivers industry-best credit-driven retailing for auto dealerships, combining a powerful credit and compliance engine and fully integrated CRM/Desking platform for maximum profitability.   Complete CRM: Complete CRM is a streamlined, all-in-one system that simplifies your dealership software and processes so you can manage every aspect of your operation with ease; from tracking and following up on leads, desking deals, managing inventory, marketing to your customers, and more.   Dealer Synergy & Bradley On Demand: The automotive industry's #1 training, tracking, testing, and certification platform and consulting & accountability firm.   
The Millionaire Car Salesman Facebook Group: Join the #1 Mastermind Group in the Automotive Industry! With over 28,000 members, gain access to successful automotive mentors & managers, the best industry practices, & collaborate with automotive professionals from around the WORLD! Join The Millionaire Car Salesman Facebook Group today!   Win the Game of Googleopoly: Unlocking the secret strategy of search engines.   The Millionaire Car Salesman Podcast is Proudly Sponsored By: Podium: Elevating Dealership Excellence with Intelligent Customer Engagement Solutions. Unlock unparalleled efficiency and drive sales with Podium's innovative AI technology, featured proudly on the Millionaire Car Salesman Podcast. Visit www.podium.com/mcs to learn more!   NCC: Powered by proprietary solutions such as Intelligent Credit Engine™ and LenderSelect™, NCC transforms the car-buying experience for dealers and their customers. From compliance and lender selection to CRM and desking, to marketing and data mining—NCC integrates them all in a single, seamless platform to deliver better customer experiences, maximum efficiency and maximum profit.   Complete CRM: As an innovative leader in the industry for the last 30 years, Complete CRM is designed to give your dealership the competitive edge in a demanding marketplace. Powered by Complete Credit™ and award-winning desking, Complete CRM™ is the industry's only credit and compliance-enabled CRM that lets dealers achieve maximum profitability on every deal. Built on modern technology, Complete CRM seamlessly integrates credit, compliance, inventory, data mining, lead generation, enterprise functionality, and customized reporting in one tool with a single login.   Dealer Synergy: The #1 Automotive Sales Training, Consulting, and Accountability Firm in the industry! With over two decades of experience in building Internet Departments and BDCs, we have developed the most effective automotive Internet Sales, BDC, and CRM solutions. Our expertise in creating phone scripts, rebuttals, CRM action plans, strategies, and templates ensures that your dealership's tools and personnel reach their full potential.   Bradley On Demand: The automotive sales industry's top Interactive Training, Tracking, Testing, and Certification Platform. Featuring LIVE Classes and over 9,000 training modules, our platform equips your dealership with everything needed to sell more cars, more often, and more profitably!    

Maintainable
Lorna Mitchell: Writing Documentation Engineers Will Actually Read

Maintainable

Play Episode Listen Later Jan 28, 2025 43:18


Join Robby as he chats with Lorna Mitchell, open source advocate and technical writer, about the art of creating documentation that doesn't gather dust. Lorna shares her experiences as a maintainer of the open source project rst2pdf, the value of API governance, and how documentation bridges gaps in developer experience.
Highlights:
What Makes Software Maintainable: Characteristics like great documentation, automated tests, and onboarding ease.
Documentation's Role in Long-Lived Software: Why it's crucial for internal tools and open source projects alike.
Open Source in Practice: Lorna's journey with rst2pdf and adopting a tech stack she wasn't initially fluent in.
API Governance Simplified: Lorna explains the four levels of API readiness and how teams can work toward more usable APIs.
Writing Documentation for Engineers: How style guides can empower contributors without overwhelming them.
Using Tools to Improve Documentation: From linters to prose-checking tools like Vale, Lorna discusses practical tips.
Key Takeaways:
[00:01:00] What makes software well-maintained: documentation, accessibility, and automated tests.
[00:03:10] Why documentation isn't just for new users—Lorna's experience with revisiting her own open source projects.
[00:06:30] Diving into rst2pdf: Challenges in maintaining an abandoned project.
[00:13:45] Balancing ownership and transitioning open source projects to new maintainers.
[00:15:30] What is OpenAPI, and how does API governance impact usability?
[00:26:10] The art of concise yet helpful documentation for different audiences.
[00:33:00] Using examples in APIs to enhance clarity and reduce confusion.
[00:40:00] Tools for improving writing, from prose linters to markdown syntax checkers.
Resources Mentioned:
Lorna Mitchell's Website
rst2pdf Project
Simon Willison's Post on One-Person Projects
How to Take Smart Notes
OpenAPI Specification
Vale Prose Linter
Follow Lorna:
GitHub
IndieWeb
Thanks to Our Sponsor!
Turn hours of debugging into just minutes! AppSignal is a performance monitoring and error-tracking tool designed for Ruby, Elixir, Python, Node.js, JavaScript, and other frameworks. It offers six powerful features with one simple interface, providing developers with real-time insights into the performance and health of web applications. Keep your coding cool and error-free, one line at a time! Use the code maintainable to get a 10% discount for your first year. Check them out!
Subscribe to Maintainable on:
Apple Podcasts
Spotify
Or search "Maintainable" wherever you stream your podcasts.
Keep up to date with the Maintainable Podcast by joining the newsletter.

the no BS short-term rental podcast
Unified API for STR Tech is Here with Abdul Ahadh

the no BS short-term rental podcast

Play Episode Listen Later Jan 9, 2025 46:50


This week, we're diving into the messy world of **hospitality tech** and uncovering solutions that actually work. Our guest is Abdul Ahadh, CEO and co-founder of Calry, a game-changing Unified API for property management systems (PMS). If you've ever struggled with clunky integrations or wasted time wrestling with tech that doesn't play nice, this episode is for you. We break down:
- Why the PMS integration game is broken.
- How Calry's Unified API cuts through the chaos.
- Real-world benefits like reduced resource burn and faster integrations.
Plus, Ahadh shares his journey from Bangalore's startup scene to disrupting the short-term rental API game. He gives us a sneak peek into Calry's expansion plans and why they're the partner you need to scale smarter, not harder. Visit CALRY - https://calry.app/
Timestamps (because we know you're busy):
**00:00** - No BS welcome: Let's get real about hospitality tech.
**00:57** - Post-VRMA takeaways and industry chatter.
**03:00** - Homecoming Week in Atlanta: The vibe shift.
**07:51** - Meet Abdul Ahadh and the origin story of Calry.
**09:49** - Building Calry: Why the world needed a Unified API.
**13:32** - The gritty truth about hospitality tech challenges.
**16:49** - Bangalore: Why it's the Silicon Valley you should care about.
**18:05** - How Calry's solving real problems—and what's next.
**21:34** - Why PMS integrations suck and how to fix them.
**22:31** - The Unified API solution: One point of contact, endless possibilities.
**24:05** - Business efficiency unlocked: The Calry effect.
**26:23** - Open API and business decisions.
**31:50** - Scaling up: Calry's roadmap for the future.
**38:51** - The advantages of switching to Calry.
**44:28** - Closing thoughts: Ahadh's advice for the next big disruptor.

Scaling DevTools
Sid Maestre from APIMatic: APIs build vs buy

Scaling DevTools

Play Episode Listen Later Dec 16, 2024 47:49 Transcription Available


We dig into the build vs. buy dilemma for APIs, and the role of OpenAPI in effective documentation. This episode is brought to you by WorkOS. If you're thinking about selling to enterprise customers, WorkOS can help you add enterprise features like Single Sign On and audit logs. We explore how AI is transforming the landscape of APIs and developer tools, and discuss the future of coding. The choice between building and buying SDKs depends on company maturity. OpenAPI is crucial for generating quality API documentation. AI is revolutionizing how APIs are created and consumed. Maintaining SDK libraries can be a significant challenge. Developer tools must evolve to keep pace with API design changes. Trust in AI-generated code is growing among developers. The future of coding will likely involve more AI integration.
Links: APIMatic, Sid Maestre

Les Cast Codeurs Podcast
LCC 319 - le ramasse-miettes-charognes

Les Cast Codeurs Podcast

Play Episode Listen Later Dec 16, 2024 70:05


Dans cet épisde en audio et en vidéo (youtube.com/lescastcodeurs), Guillaume et Emmanuel discutent des 15 ans de Go, d'une nouvelle approche de garbage collecting, de LLMs dans les applications Java, dobservabilité, d'une attaque de chaine d'approvisionnement via javac et d'autres choses. Enregistré le 13 décembre 2024 Téléchargement de l'épisode LesCastCodeurs-Episode-319.mp3 News Langages Go fête son 15ème anniversaire ! https://go.dev/blog/15years discute les 15 ans la corrections de gotchas dans les for loops (notamment les variables étaient loop scoped) le fait que la compile echoue si on attend une version de go superieure seulement depuis go 1.21 en parallele de la gestion de la chaine d'outil (c'est en 2023 seulement!) opt-in telemetrie aussi recent Construire OpenJDK à partir des sources sur macOS https://www.morling.dev/blog/building-openjdk-from-source-on-macos/ de maniere surprenante ce n'est pas tres compliqué Papier sur l'aproche Mark-scavenge pour un ramasse miette https://inside.java/2024/11/22/mark-scavenge-gc/ papier de recherche utiliser l'accessibilité pour preuve de vie n'est pas idéal: un objet peut etre atteignable mais ne sera jamais accedé par le programme les regions les plus pauvres en objets vivant voient leurs objets bouger dans uen autre region et la regio libéré, c'est le comportement classique des GC deux methodes: mark evaguate qui le fait en deux temps et la liveness peut evoluer ; et scavenge qui bouge l'objet vivant des sa decouverte ont fait tourner via ZGC des experience pour voir les objects consideres vivants et bougés inutilement. resultats montrent un gros taux d'objets bougés de maniere inutile proposent un algo different ils marquent les objets vivants mais ne les bougent pas avant le prochain GC pour leur donner une change de devenir unreachable elimine beaucoup de deplacement inutiles vu que les objets deviennent non accessible en un cycle de GC jusquà 91% de reduction ! Particulierement notable dans les machines chargées en CPU. 
Les tokens d'accès court ou longs https://grayduck.mn/2023/04/17/refresh-vs-long-lived-access-tokens/ pourquoi des long access tokens (gnre refresh token) sont utilises pour des short lived dans oauth 2.0 refresh token simplifient la revocation: vu que seul le auth serveur a a verifier la révocation et les clients vérifient l'expiration et la validité de la signature refresh token ne sont envoyés que entre endpoints alors que les access tokens se baladent pas mal: les frontières de confiance ne sont pas traversées refresh token comme utilise infréquement, et donc peut etre protegee dans une enclave les changements de grants sont plus simple tout en restant distribuable histoire des access refresh token et access token permet de mieux tracer les abus / attaques les inconvenients: c'est plus compliqué en flow, the auth serveur est un SPOF amis mitigeable Java Advent est de retour https://www.javaadvent.com/calendar backstage Java integrite par defaut (et ses consequences sur l'ecosysteme) timefold (sovler) Les extensions JUNit 5 OpenTelemetry via Java Agent vs Micrometer analyse statique de code CQRS et les fonctionalités modernes de Java java simple (sans compilatrion, sans objet fullstack dev with quarkus as backend José Paumard introduit et explique les Gatherers dans Java 24 dans cette vidéo https://inside.java/2024/11/26/jepcafe23/ Librairies Micronaut 4.7, avec l'intégration de LangChain4j https://micronaut.io/2024/11/14/micronaut-framework-4-7-0-released/ Combiner le framework de test Spock et Cucumber https://www.sfeir.dev/back/spock-framework-revolutionnez-vos-tests-unitaires-avec-la-puissance-du-bdd-et-de-cucumber/ les experts peuvent écrire leurs tests au format Gherkin (de Cucumber) et les développeurs peuvent implémenter les assertions correspondantes avec l'intégration dans Spock, pour des tests très lisibles Spring 6.2 https://spring.io/blog/2024/11/14/spring-framework-6-2-0-available-now beans @Fallback améliorations sur SpELet sur le support de tests support de l'echape des property placeholders une initioalisation des beans en tache de fond nouvelle et pleins d'autres choses encore Comment créer une application Java LLM tournant 100% en Java avec Jlama https://quarkus.io/blog/quarkus-jlama/ blog de Mario Fusco, Mr API et Java et Drools utilise jlama + quarkus + langchain Explique les avantage de l'approche pure Java comme le cycle de vie unique, tester les modeles rapidement, securite (tout est in process), monolithe ahahah, observabilité simplifiée, distribution simplifiée (genre appli embarquée) etc Vert.x 5 en seconde incubation https://vertx.io/blog/eclipse-vert-x-5-candidate-2-released/ Support des Java modules (mais beacoup des modules vert.x eux-même ne le supportent pas support io_uring dans vert.x core le load balancing côté client le modele des callbacks n'est plus supporté, vive les Futur beaucoup d'améliorations autour de gRPC et d'autres choses Un article sur Spring AI et la multi modalite audio https://spring.io/blog/2024/12/05/spring-ai-audio-modality permet de voir les evolutions des APIs de Spring AI s'appluie sur les derniers modeles d'open ai des examples comme par exemple un chatbot voix et donc comment enregistrer la voix et la passer a OpenAI Comment activer le support experimental HTTP/3 dans Spring Boot https://spring.io/blog/2024/11/26/http3-in-reactor-2024 c'ets Netty qui fait le boulot puis Spring Netty l'article décrit les etapes pour l'utiliser dans vos applis Spring Boot ou Spring Cloud Gateway l'article explique aussi le cote client (app 
cliente) ce qui est sympa Infrastructure Un survol des offres d'observabilité http://blog.ippon.fr/2024/11/18/observabilite-informatique-comprendre-les-bases-2eme-partie/ un survol des principales offres d'observabilité Open source ou SaaS et certains outsiders Pas mal pour commencer à défricher ce qui vous conviendrait blog de ippon Web Sortie de Angular 19 https://blog.ninja-squad.com/2024/11/19/what-is-new-angular-19.0/ stabilité des APIs Signal APIs migration automatique vers signals composants standalone par défaut nouvelles APIs linkedSignal et resource de grosses améliorations de SSR et HMR un article également de Sfeir sur Angular 19 https://www.sfeir.dev/front/angular-19-tout-ce-quil-faut-savoir-sur-les-innovations-majeures-du-framework/ Angluar 19 https://www.sfeir.dev/front/angular-19-tout-ce-quil-faut-savoir-sur-les-innovations-majeures-du-framework/ composant standalone par default (limiter les problemes de dependances), peut le mettre en strict pour le l'imposer (ou planter) signalement des imports inutilisés @let pour les variables locales dans les templates linkedSignal (experimental) pour lier des signaux entre eux (cascade de changement suite a un evenement hydratation incrementale (contenu progressivement interactif avec le chargement - sur les parties de la page visible ou necessaires et event replay, routing et modes de rendu en rendy hybride, Hot module replacement etc The State of Frontend — dernière compilation des préférences des développeurs en terme de front https://tsh.io/state-of-frontend/ React en tête, suivi de Vue et Svelte. Angular seulement 4ème Côté rendering framework, Next.js a la majorité absolue, ensuite viennent Nuxt et Astro Zod est la solution de validation préférée Pour la gestion de date, date-fns est en tête, suivi par moment.js Côté state management, React Context API en première place, mais les suivants sont tous aussi pour React ! Grosse utilisation de lodash pour plein d'utilités Pour fetcher des resources distantes, l'API native Fetch et Axios sont les 2 vaincoeurs Pour le déploiement, Vercel est premier Côté CI/CD, beaucoup de Github Actions, suivi par Gitlab CI Package management, malgré de bonnes alternatives, NPM se taille toujours la part du lion Ecrasante utilisation de Node.js comme runtime JavaScript pour faire du développement front Pour ce qui est du typing, beaucoup utilisent TypeScript, et un peu de JSdoc, et la majorité des répondants pensent que TypeScript a dépassé JavaScript en usage Dans les API natives du navigateur, Fetch, Storage et WebSockets sont les APIs les plus utilisées La popularité des PWA devrait suivre son petit bonhomme de chemin En terme de design system, shadcn.ui en tête, suivi par Material, puis Bootstram Pour la gestion des styles, un bon mix de plain old CSS, de Tailwind, et de Sass/CSS Jest est premier comme framework de tests Les 3/4 des développeurs front utilisent Visual Studio Code, quant au quart suivant, c'est JetBrains qui raffle les miettes Pour le build, Vite récolte les 4/5 des voix ESLint et Prettier sont les 2 favoris pour vérifier le code   Parfois, on aimerait pouvoir tester une librairie ou un framework JavaScript, sans pour autant devoir mettre en place tout un projet, avec outil de build et autre. 
Julia Evans explore les différents cas de figure, suivant la façon dont ces librairies sont bundlées https://jvns.ca/blog/2024/11/18/how-to-import-a-javascript-library/ Certaines librairies permette de ne faire qu'un simple import dans une balise script Certaines frameworks sont distribués sous forme d'Universal Module Definition, sous CommonJS, d'ESmodule franchemet en tant que noob c'est compliqué quand même Data et Intelligence Artificielle L'impact de l'IA en entreprise et des accès aux documents un peu laxistes https://archive.ph/uPyhX l'indexing choppe tout ce qu'il peut et l'IA est tres puissante pour diriger des requetes et extraires les données qui auraient du etre plus restreintes Différentes manières de faire de l'extraction de données et de forcer la main à un LLM pour qu'il génère du JSON https://glaforge.dev/posts/2024/11/18/data-extraction-the-many-ways-to-get-llms-to-spit-json-content/ l'approche “je demande gentiment” au LLM, en faisant du prompt engineering en utilisant du function calling pour les modèles supportant la fonctionnalité, en particulier avant les approches de type “JSON mode” ou “JSON schema” ou effectivement si le modèle le supporte aussi, toujours avec un peu de prompting, mais en utilisant le “JSON mode” qui force le LLM a générer du JSON valide encore mieux avec la possibilité de spécifier un schema JSON (type OpenAPI) pour que le JSON en sortie soit “compliant” avec le schéma proposé Comment masquer les données confidentielles avec ses échanges avec les LLMs https://glaforge.dev/posts/2024/11/25/redacting-sensitive-information-when-using-generative-ai-models/ utilisation de l'API Data Loss Prevention de Google Cloud qui permet d'identifier puis de censurer / masquer (“redacted” en anglais) des informations personnelles identifiables (“PII”, comme un nom, un compte bancaire, un numéro de passeport, etc) pour des raison de sécurité, de privacy, pour éviter les brèche de données comme on en entend trop souvent parler dans les nouvelles On peut utiliser certains modèles d'embedding pour faire de la recherche de code https://glaforge.dev/posts/2024/12/02/semantic-code-search-for-programming-idioms-with-langchain4j-and-vertex-ai-embedding-models/ Guillaume recherche des bouts de code, en entrant une requête en langue naturel Certains embedding models supportent différents types de tâches, comme question/réponse, question en langue naturelle / retour sous forme de code, ou d'autres tâches comme le fact checking, etc Dans cet article, utilisation du modèle de Google Cloud Vertex AI, en Java, avec LangChain4j Google sort la version 2 de Gemini Flash https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/ La nouvelle version Gemini 2.0 Flash dépasse même Gemini 1.5 Pro dans les benchmarks Tout en étant 2 fois plus rapide que Gemini 1.5 Pro, et bien que le prix ne soit pas encore annoncé, on imagine également plus abordable Google présente Gemini 2 comme le LLM idéal pour les “agents” Gemini propose une vraie multimodalité en sortie (premier LLM sur le marché à le proposer) : Gemini 2 peut entrelacer du texte, des images, de l'audio Gemini 2 supporte plus de 100 langues 8 voix de haute qualité, assez naturelles, pour la partie audio Un nouveau mode speech-to-speech en live, où on peut même interrompre le LLM, c'est d'ailleurs ce qui est utilisé dans Project Astra, l'application mobile montrée à Google I/O qui devient un vrai assistant vocale en live sur votre téléphone Google annonce aussi une nouvelle expérimentation autour des 
Google also presented Project Mariner, an agent packaged as a Chrome extension that lets you drive your browser like a personal research assistant: it can search the web and navigate websites to find the information you are looking for; this other article shows various demo videos of these features https://developers.googleblog.com/en/the-next-chapter-of-the-gemini-era-for-developers/ A new project called Deep Research lets you produce reports in Gemini Advanced: you give it a topic, the agent proposes an outline for a report on that topic (which you can validate or tweak), and then Deep Research searches the web for you and synthesizes its findings into a final report https://blog.google/products/gemini/google-gemini-deep-research/ Finally, Google AI Studio, besides letting you experiment with Gemini 2, now offers "starter apps" showing how to do object recognition in images, how to search with an agent connected to Google Maps, etc.; Google AI Studio also lets you share your screen with it, on mobile or desktop, so you can use it as an assistant that sees what you are doing and what you are coding and can answer your questions Methodologies A GitHub article on the impact of CPU over-utilization on application performance https://github.blog/engineering/architecture-optimization/breaking-down-cpu-speed-how-utilization-impacts-performance/ surprisingly, they see effects of around 30% on performance, which comes from thermal limits and the frequency boost you get when staying below them; they therefore looked for their own golden ratio, around 60% utilization; they take slices of a Kubernetes cluster to run the workloads and add artificial CPU workloads (math-style number crunching) Security A supply chain attack via javac https://xdev.software/en/news/detail/discovering-the-perfect-java-supply-chain-attack-vector-and-how-it-got-fixed it relies on annotation processing: the annotation processors of your dependencies are loaded and executed at build time, discovered on the user classpath via the ServiceLoader pattern, so if a dependency is compromised and an annotation processor is added or modified, you have an attack vector at compile time of the targeted project, basically as soon as you open the IDE; the workaround is to enable -proc:none and then enable annotation processors explicitly in your build tool; there are also some improvements in the JDK: the compiler reports that it is executing an annotation processor in Java 23+, and annotation processors are disabled by default (a minimal sketch of such a processor follows below)
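To illustrate why compile-time annotation processing is such an effective vector, here is a hypothetical, deliberately harmless processor of the kind the article describes. It is an illustration of the mechanism, not the actual exploit: any code in init or process runs inside javac, on the developer's or CI machine, as soon as the dependency that registers it (via META-INF/services/javax.annotation.processing.Processor) lands on the compile classpath.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.ProcessingEnvironment;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.annotation.processing.SupportedSourceVersion;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;

// Registered in the dependency's jar under META-INF/services/javax.annotation.processing.Processor,
// so javac discovers it through the ServiceLoader mechanism described above.
@SupportedAnnotationTypes("*")                  // claim interest in every annotation => always invoked
@SupportedSourceVersion(SourceVersion.RELEASE_17)
public class InnocentLookingProcessor extends AbstractProcessor {

    @Override
    public synchronized void init(ProcessingEnvironment env) {
        super.init(env);
        // Anything here runs inside the compiler process at build time:
        // reading sources, environment variables, CI credentials, making network calls...
        System.err.println("[processor] running inside javac as " + System.getProperty("user.name"));
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        return false; // claim nothing; stay invisible to the rest of the build
    }
}
```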
Conferences The list of conferences comes from the Developers Conferences Agenda/List by Aurélie Vache and contributors: 19 December 2024: Normandie.ai 2024 - Rouen (France) 20 January 2025: Elastic{ON} - Paris (France) 22-25 January 2025: SnowCamp 2025 - Grenoble (France) 24-25 January 2025: Agile Games Île-de-France 2025 - Paris (France) 30 January 2025: DevOps D-Day #9 - Marseille (France) 6-7 February 2025: Touraine Tech - Tours (France) 21 February 2025: LyonJS 100 - Lyon (France) 28 February 2025: Paris TS La Conf - Paris (France) 20 March 2025: PGDay Paris - Paris (France) 20-21 March 2025: Agile Niort - Niort (France) 25 March 2025: ParisTestConf - Paris (France) 26-29 March 2025: JChateau Unconference 2025 - Cour-Cheverny (France) 28 March 2025: DataDays - Lille (France) 28-29 March 2025: Agile Games France 2025 - Lille (France) 3 April 2025: DotJS - Paris (France) 10-11 April 2025: Android Makers - Montrouge (France) 10-12 April 2025: Devoxx Greece - Athens (Greece) 16-18 April 2025: Devoxx France - Paris (France) 29-30 April 2025: MixIT - Lyon (France) 7-9 May 2025: Devoxx UK - London (UK) 16 May 2025: AFUP Day 2025 Lille - Lille (France) 16 May 2025: AFUP Day 2025 Lyon - Lyon (France) 16 May 2025: AFUP Day 2025 Poitiers - Poitiers (France) 24 May 2025: Polycloud - Montpellier (France) 5-6 June 2025: AlpesCraft - Grenoble (France) 11-13 June 2025: Devoxx Poland - Krakow (Poland) 12-13 June 2025: Agile Tour Toulouse - Toulouse (France) 12-13 June 2025: DevLille - Lille (France) 24 June 2025: WAX 2025 - Aix-en-Provence (France) 26-27 June 2025: Sunny Tech - Montpellier (France) 1-4 July 2025: Open edX Conference - 2025 - Palaiseau (France) 18-19 September 2025: API Platform Conference - Lille (France) & Online 2-3 October 2025: Volcamp - Clermont-Ferrand (France) 6-10 October 2025: Devoxx Belgium - Antwerp (Belgium) 16-17 October 2025: DevFest Nantes - Nantes (France) 6 November 2025: dotAI 2025 - Paris (France) 12-14 November 2025: Devoxx Morocco - Marrakech (Morocco) 23-25 April 2026: Devoxx Greece - Athens (Greece) 17 June 2026: Devoxx Poland - Krakow (Poland) Contact us To react to this episode, come and discuss it on the Google group https://groups.google.com/group/lescastcodeurs Contact us via Twitter https://twitter.com/lescastcodeurs Submit a crowdcast or a crowdquestion Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs All the episodes and all the info at https://lescastcodeurs.com/

.NET in pillole
270 - Cambiamenti e migliorie per OpenAPI con .NET 9

.NET in pillole

Dec 9, 2024 10:10


After the famous farewell to Swashbuckle.AspNetCore, here are all the changes that arrived for OpenAPI with .NET 9. https://learn.microsoft.com/en-us/aspnet/core/release-notes/aspnetcore-9.0?view=aspnetcore-9.0&WT.mc_id=DT-MVP-4021952 https://github.com/dotnet/aspnetcore/discussions/58103 https://learn.microsoft.com/en-us/aspnet/core/fundamentals/openapi/openapi-tools?view=aspnetcore-9.0&WT.mc_id=DT-MVP-4021952 #openapi #dotnet #dotnetinpillole #podcast

airhacks.fm podcast with adam bien
From .mobi Over GraphQL to Quarkus Dev UI

airhacks.fm podcast with adam bien

Dec 1, 2024 59:42


An airhacks.fm conversation with Phillip Krueger (@phillipkruger) about: early programming experiences with Visual Basic and Java, transition from actuarial science to computer science, first job at a bank working with Java Swing and RMI over CORBA, experience with J2EE and XML technologies, working with XML and XSLT, development of open-source Swing components, work on dotMobi sites for mobile phones in Africa, creation of API extensions for Java EE and MicroProfile, involvement in the MicroProfile GraphQL specification, joining Red Hat and working on quarkus, development of SmallRye GraphQL, improvements to OpenAPI support in Quarkus, work on Quarkus Dev UI, discussion about the evolution of Java application servers and frameworks, comparison of REST and GraphQL, thoughts on Java development culture in South Africa Phillip Krueger on twitter: @phillipkruger

Microsoft 365 Developer Podcast
Why build declarative agents with Visual Studio Code vs Copilot Studio

Microsoft 365 Developer Podcast

Nov 25, 2024 36:35


In this episode, Jeremy Thake talks to Sebastien Levert. The podcast focuses on the evolution and development of tools within the Microsoft 365 ecosystem, particularly around agent building and Copilot Studio. The discussion highlights the distinction between low-code/no-code solutions for makers and pro-code tools for developers. Key tools such as Agent Builder and Teams Toolkit are explored, emphasizing their respective strengths: Agent Builder's accessibility for information workers versus Teams Toolkit's power and flexibility for professional developers. They discuss how developers can leverage Visual Studio Code, adaptive cards, and GitHub integration for building robust, scalable agents. A recurring theme is enabling seamless collaboration between makers and developers, with a focus on governance, scalability, and customization, particularly for enterprise-grade agents. The conversation also delves into advancements in API integration, such as using OpenAPI files and the emerging TypeSpec language to streamline the creation of APIs and plugins. Graph connectors and Power Platform connectors are highlighted as critical components for data access and processing within agents. To watch their Ignite breakout with all the demos discussed, watch "Developers guide to building your own agents". To find out more, please visit https://aka.ms/extendcopilotm365

durch die bank
Open Banking-Stammtisch Runde 23: API-Standardisierung im regulierten und nicht-regulierten Kontext

durch die bank

Nov 20, 2024 29:23


After the implementation of the PSD2 directive, the industry woke up in 2018 "from its regulatory coma", as Dr. Ortwin Scheja of the Berlin Group describes it. Since then, work has continued steadily on building out the interfaces (APIs) that form the technical foundation for achieving the goals of PSD2. Switzerland has also pushed the topic of open banking forward, even without regulatory requirements. Why the corresponding working group is called "Common API" rather than "Open API", and what we can learn from the Swiss example, is what you will find out in the new podcast episode. Ute Kolck moderates the discussion, this time with five experts: Christoph Berentzen (Commerzbank), Prof. Dr. Silke Finken (International School of Management), Jürgen Petry (Swiss Fintech Innovations), Dr. Ortwin Scheja (Berlin Group) and Roger Wisler (Zürcher Kantonalbank).

Les Cast Codeurs Podcast
LCC 315 - les températures ne sont pas déterministes

Les Cast Codeurs Podcast

Sep 17, 2024 110:08


JVM summit, virtual threads, application stacks, licenses, determinism and LLMs, quantization, two tools of the episode and much more. Recorded on September 13, 2024 Episode download LesCastCodeurs-Episode-315.mp3 News Languages Netflix uses Java heavily and ran into a problem with virtual threads in Java 21; the Netflix engineers analyze it in this article: https://netflixtechblog.com/java-21-virtual-threads-dude-wheres-my-lock-3052540e231d virtual threads can improve performance but bring challenges; a locking problem was identified where virtual threads end up blocking one another, leading to degraded performance and instability; Netflix is working on fixing these issues to take full advantage of virtual threads (see the pinning sketch below) A syntax to indicate that a type is nullable or null-restricted may be coming to Java https://bugs.openjdk.org/browse/JDK-8303099 Foo! would forbid null, Foo? would indicate that null is accepted, and Foo?[]! would be a non-null array of nullable values; there are also syntax ideas for initializing null-restricted arrays; JEP: https://openjdk.org/jeps/8303099 The JVM Language Summit 2024 videos are online https://www.youtube.com/watch?v=OOPSU4LnKg0&list=PLX8CzqL3ArzUEYnTa6KYORRbP3nhsK0L1 Project Leyden Update, Project Babylon - Code Reflection, Valhalla - Where Are We?, An Opinionated Overview on Static Analysis for Java, Rethinking Java String Concatenation, Code Reflection in Action - Translating Java to SPIR-V, Java in 2024, Type Specialization of Java Generics - What If Casts Have Teeth? (with our very own Rémi Forax!), plus "tip or tail" for the whole ecosystem; a few links on Babylon: code reflection to express foreign languages (SQL) in Java: https://openjdk.org/projects/babylon/ and their LINQ-emulation example https://openjdk.org/projects/babylon/articles/linq Libraries Micronaut releases version 4.6 https://micronaut.io/2024/08/26/micronaut-framework-4-6-0-released/ essentially a big update of tons of modules to the latest dependency versions MicroProfile 7 makes a few incompatible changes and evolutions https://microprofile.io/2024/08/22/microprofile-7-0-release/#general it drops Metrics and replaces it with Telemetry (metrics, logs and tracing), Metrics remaining a spec but standalone; MicroProfile 7 depends on the Jakarta Core Profile and no longer packages it; MicroProfile OpenAPI 4 and Telemetry 2 bring incompatible changes Quarkus 3.14 with Let's Encrypt and reflection-free Jackson serializers https://quarkus.io/blog/quarkus-3-14-1-released/ Hibernate ORM 6.6, reflection-free Jackson serializers, simple installation of Let's Encrypt certificates (notably with a helpful command line, including ngrok to tunnel to your localhost), and a walk-back on @QuarkusTestResource vs @WithTestResource following feedback about OOMEs and the slowness of better-isolated tests Structured logging in Spring Boot 3.4 https://spring.io/blog/2024/08/23/structured-logging-in-spring-boot-3-4 structured logs (often JSON) let you easily ship them to backends like Elastic or AWS CloudWatch and hook them up to reporting and alerting; Spring Boot 3.4 supports structured logging out of the box, with the Elastic Common Schema (ECS) and Logstash formats, and it can be extended with your own formats. You can also enable structured logging to a file; this can be used, for example, to print human-readable logs on the console and write structured logs to a file for machine ingestion.
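The pinning pattern behind that kind of lock trouble is easy to reproduce. The sketch below is a generic illustration and not Netflix's actual code: blocking while inside a synchronized block pins the virtual thread to its carrier, so a burst of such calls can starve the small shared carrier pool, whereas a ReentrantLock lets the virtual thread unmount while it waits.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

public class PinningDemo {

    private static final Object MONITOR = new Object();
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Problematic: the carrier thread stays pinned for the whole blocking call.
    static void pinned() {
        synchronized (MONITOR) {
            pause();              // blocking while holding a monitor => pinning
        }
    }

    // Friendlier: a java.util.concurrent lock lets the virtual thread unmount while blocked.
    static void unpinned() {
        LOCK.lock();
        try {
            pause();
        } finally {
            LOCK.unlock();
        }
    }

    static void pause() {
        try { Thread.sleep(Duration.ofMillis(100)); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) {
        // Run with -Djdk.tracePinnedThreads=full to see the pinning reports.
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 1_000; i++) {
                executor.submit(PinningDemo::pinned);   // swap for ::unpinned to compare
            }
        }
    }
}
```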
Infrastructure CockroachDB, which had a Business Source License approach (source available, converting to an Apache-style license 3 years later), is now moving to a proprietary license with source available https://www.cockroachlabs.com/blog/enterprise-license-announcement/ the Polyform project offers standardized licenses depending on your free-versus-paid needs https://polyformproject.org/ Cloud Azure Functions, and how cold start is optimized https://www.infoq.com/articles/azure-functions-cold-starts/?utm_campaign=infoq_content&utm_source=twitter&utm_medium=feed&utm_term=Cloud functions have naturally high latency, and not all long latencies hurt the business; cold starts can be measured with the cloud provider's tools, so use them and look at latency percentiles (their experience: 381 ms cold, 10 ms afterwards, with tracing for end-to-end latency); the strategies: keep-alive pings (waking the function at regular intervals to stay "warm"); in the function code, initializing connections and loading assemblies during initialization; configuring batching in host.json, disabling file-system logging, etc.; deploying functions as zips; reducing the size of code and files (which get copied onto the cold server); on .NET, enabling ReadyToRun, which helps the JIT compiler; Azure instances with more CPU and memory cost more but lower the cold start; dedicated Azure instances for your functions (not shared with other tenants); the article then walks through concrete examples Web Vue.js 3.5 is out https://blog.vuejs.org/posts/vue-3-5 key new features: performance and memory optimizations (a significant reduction in memory consumption, -56%, better performance for large reactive arrays, fixes for stale computed values and memory leaks); new features: Reactive Props Destructure (simpler declaration of props with default values), Lazy Hydration (control over the hydration of async components), useId() (stable unique ID generation for SSR applications), data-allow-mismatch (suppressing hydration mismatch warnings); custom elements improvements (support for app configuration, an API to access the host and the shadow root, mounting without Shadow DOM, and nonce for tags); useTemplateRef() (obtaining template refs via the useTemplateRef() API); deferred Teleport (teleporting content to elements rendered after the component mounts); onWatcherCleanup() (registering cleanup callbacks in watchers) Data and Artificial Intelligence We often hear about quantized Large Language Models, meaning that, for example, 8-bit integers are used instead of 32-bit floats to reduce GPU memory requirements while keeping precision close to the original; this article explains the quantization process very visually and intuitively: https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization
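As a tiny, self-contained illustration of that idea (simple absmax 8-bit quantization of a weight vector, which is only one of the schemes the article covers):

```java
public class AbsmaxQuantization {

    /** Quantize 32-bit floats to signed 8-bit integers using a single absmax scale. */
    static byte[] quantize(float[] weights, float[] scaleOut) {
        float absMax = 0f;
        for (float w : weights) absMax = Math.max(absMax, Math.abs(w));
        float scale = absMax == 0f ? 1f : 127f / absMax;   // map [-absMax, absMax] onto [-127, 127]
        scaleOut[0] = scale;

        byte[] quantized = new byte[weights.length];
        for (int i = 0; i < weights.length; i++) {
            quantized[i] = (byte) Math.round(weights[i] * scale);
        }
        return quantized;
    }

    /** Recover approximate floats: this is where the (small) precision loss comes from. */
    static float[] dequantize(byte[] quantized, float scale) {
        float[] out = new float[quantized.length];
        for (int i = 0; i < quantized.length; i++) out[i] = quantized[i] / scale;
        return out;
    }

    public static void main(String[] args) {
        float[] weights = {0.12f, -0.5f, 0.03f, 1.7f, -1.69f};
        float[] scale = new float[1];
        byte[] q = quantize(weights, scale);            // 4 bytes per weight become 1
        float[] back = dequantize(q, scale[0]);
        for (int i = 0; i < weights.length; i++) {
            System.out.printf("%.4f -> %d -> %.4f%n", weights[i], q[i], back[i]);
        }
    }
}
```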
Guillaume continues sharing his adventures with the LangChain4j framework. How to do text classification: https://glaforge.dev/posts/2024/07/11/text-classification-with-gemini-and-langchain4j/ using LangChain4j's TextClassification class, which takes an approach based on vector embeddings to compare similar texts; using few-shot prompting, in several variants, in this other article: https://glaforge.dev/posts/2024/07/30/sentiment-analysis-with-few-shots-prompting/ and also how to go multimodal with LangChain4j (with the Gemini model) to analyze text and images, but also videos, audio content, or PDF files: https://glaforge.dev/posts/2024/07/25/analyzing-videos-audios-and-pdfs-with-gemini-in-langchain4j/ To vary the predictability or creativity of LLMs, certain hyperparameters can be adjusted, such as temperature, top-k and top-p, but do you really know how these parameters work? Two very clear and intuitive articles explain them: https://medium.com/google-cloud/is-a-zero-temperature-deterministic-c4a7faef4d20 https://medium.com/google-cloud/beyond-temperature-tuning-llm-output-with-top-k-and-top-p-24c2de5c3b16 temperature flattens or sharpens the probability distribution of the next token, but variables remain: floating-point approximations, different stacks making these choices differently, and what to do when two tokens have exactly the same probability; there are other ways of shaping the LLM's responses: top-k (which avoids infrequent tokens) and top-p, which keeps the n tokens that together account for p% of the probability mass; temperature is applied first, then top-k, then top-p, and the articles explain what to use when (a small sampling sketch appears at the end of this section) OSI proposes a definition of open source AI https://www.technologyreview.com/2024/08/22/1097224/we-finally-have-a-definition-for-open-source-ai/ after big debates over the last few months: usable for any purpose without needing permission, researchers can inspect the components and study how the system works, the system can be modified for any purpose including changing its behavior, and it can be shared with others with or without modification, whatever the use; it also defines levels of transparency (training data, source code, weights) A long PostgreSQL retrospective at insane volumes and the locking problems that come with them https://ardentperf.com/2024/03/03/postgres-indexes-partitioning-and-lwlocklockmanager-scalability/ an article to reassure you that you will probably never hit the problem, a story told as a post mortem, with advice for avoiding these cliffs Tooling A first look at Gradle's future declarative notation https://blog.gradle.org/declarative-gradle-first-eap an article explaining what this new declarative Gradle syntax looks like (alongside Groovy and Kotlin); a few videos show the support in Android Studio for now, as well as in an experimental tool, while waiting for support in all IDEs; the idea is to avoid scripting and really have only a description of your build, which should improve Gradle support in IDEs and allow fast completion, etc.; is it just me, or does that sound like Maven?
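And here is the sampling sketch promised above: temperature, then top-k, then top-p, over a toy vocabulary. Real inference stacks differ in the details (tie-breaking, floating-point precision, ordering of the filters), which is exactly why a temperature of zero is not a determinism guarantee.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

public class SamplingDemo {

    record Candidate(int tokenId, double probability) {}

    /** Temperature, then top-k, then top-p (nucleus), then sample: the order described above. */
    static int sample(double[] logits, double temperature, int topK, double topP, Random rng) {
        // 1. Temperature: divide logits; low T sharpens the distribution, high T flattens it.
        double[] scaled = new double[logits.length];
        for (int i = 0; i < logits.length; i++) scaled[i] = logits[i] / Math.max(temperature, 1e-6);

        // 2. Softmax.
        double max = Double.NEGATIVE_INFINITY;
        for (double l : scaled) max = Math.max(max, l);
        double sum = 0;
        double[] probs = new double[scaled.length];
        for (int i = 0; i < scaled.length; i++) { probs[i] = Math.exp(scaled[i] - max); sum += probs[i]; }
        for (int i = 0; i < probs.length; i++) probs[i] /= sum;

        // 3. Sort by probability and keep only the top-k candidates.
        List<Candidate> sorted = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) sorted.add(new Candidate(i, probs[i]));
        sorted.sort(Comparator.comparingDouble(Candidate::probability).reversed());
        List<Candidate> kept = sorted.subList(0, Math.min(topK, sorted.size()));

        // 4. Top-p: keep the smallest prefix whose cumulative probability reaches p.
        List<Candidate> nucleus = new ArrayList<>();
        double cumulative = 0;
        for (Candidate c : kept) {
            nucleus.add(c);
            cumulative += c.probability();
            if (cumulative >= topP) break;
        }

        // 5. Renormalize and sample from what is left.
        double total = nucleus.stream().mapToDouble(Candidate::probability).sum();
        double r = rng.nextDouble() * total;
        for (Candidate c : nucleus) {
            r -= c.probability();
            if (r <= 0) return c.tokenId();
        }
        return nucleus.get(nucleus.size() - 1).tokenId();
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.5, 0.2, -1.0, -3.0};   // toy "next token" scores
        System.out.println(sample(logits, 0.7, 3, 0.9, new Random(42)));
    }
}
```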
Firefox support in Puppeteer https://hacks.mozilla.org/2024/08/puppeteer-support-for-firefox/ Puppeteer, the browser automation library, now officially supports Firefox as of version 23. This advance lets developers write automation scripts and run end-to-end tests on Chrome and Firefox interchangeably. Firefox integration in Puppeteer relies on WebDriver BiDi, a cross-browser protocol being standardized at the W3C. WebDriver BiDi makes multi-browser support easier and paves the way for simpler, more efficient automation. Puppeteer's main features, such as log capture, device emulation, network interception and script preloading, are now available for Firefox. Mozilla sees WebDriver BiDi as an important step toward a better cross-browser testing experience. Experimental CDP (Chrome DevTools Protocol) support in Firefox will be removed at the end of 2024 in favor of WebDriver BiDi. Although Firefox is officially supported, some APIs remain unsupported and will be the subject of future work. Guillaume created a @Retry annotation for JUnit 5, to re-run a test that is "flaky" https://glaforge.dev/posts/2024/09/01/a-retryable-junit-5-extension/ Guillaume had not found a built-in extension in JUnit 5 to replace JUnit 4's Retry rules, but an interesting discussion followed on social media, with links to extensions that implement this approach, such as JUnit Pioneer, which offers plenty of useful extensions https://junit-pioneer.org/docs/retrying-test/ or the rerunner extension https://github.com/artsok/rerunner-jupiter Arnaud also suggested configuring Maven Surefire to automatically re-run failed tests https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html the philosophical question being: is it acceptable to have tests that fail intermittently? (a minimal usage sketch appears at the end of this section) Architecture A former GraphQL fan is done with GraphQL and is thinking about the alternatives https://bessey.dev/blog/2024/05/24/why-im-over-graphql/ GraphQL's problems: security (authorization attacks, rate limiting is hard, parsing of malicious queries), performance (the N+1 problem for both data fetching and authorization, memory impact when parsing invalid queries), increased complexity (coupling between business logic and the transport layer, harder maintenance and testing); the solutions considered: adopting REST APIs that conform to OpenAPI 3.0+, better documentation and type safety, and tools to generate typed client/server code; two ways of implementing OpenAPI: "implementation first" (generating the specification from the code) and "specification first" (generating the code from the specification); an interesting take from someone who does not use GraphQL day to day: these were problems that were supposed to be fixed as the ecosystem and tooling matured, but for this person they showed their limits.
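For reference, here is a minimal sketch of what the retry approach looks like from the test author's side, using JUnit Pioneer's @RetryingTest as described in its documentation (exact attributes may vary between versions; Guillaume's extension exposes its own @Retry annotation instead, and the SearchService below is just a stand-in for the real system under test).

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junitpioneer.jupiter.RetryingTest;

class FlakySearchIT {

    // Re-runs the test up to 3 times; it only fails if every attempt fails.
    @RetryingTest(3)
    void searchEventuallyReturnsResults() {
        var results = searchService().search("quarkus");
        assertTrue(!results.isEmpty(), "expected at least one hit");
    }

    // Placeholder for the real system under test.
    private SearchService searchService() {
        return query -> java.util.List.of("hit-1");
    }

    interface SearchService {
        java.util.List<String> search(String query);
    }
}
```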
A presentation by Grace Hopper in 1980 on the future of computers https://youtu.be/AW7ZHpKuqZg?si=w_o5_DtqllVTYZwt it is striking how modern what she describes is, problems we still have today, positive leadership; she describes the advantage of systems made of several computers; the recording was recently declassified Leader election with conditional writes on S3/GCS/Azure buckets https://www.morling.dev/blog/leader-election-with-s3-conditional-writes/ leader election is the process of choosing one node among several to perform a task; traditionally, leader election is done with a distributed locking service like ZooKeeper; Amazon S3 recently added support for conditional writes, which makes leader election possible without a separate service; the algorithm works by having nodes race to create a lock file in S3; the lock file includes an epoch number, which is incremented every time a new leader is elected; nodes can determine whether they are the leader by listing the lock files and checking the epoch number; beware, several leaders can end up elected at the same time (clock drift), so that also has to be handled (a sketch of the core loop appears right after this section) Methodologies Guillaume Laforge interviewed by Sfeir, where he talks about the importance of curiosity, of sharing, of code quality, sprinkled with a few photos of the Cast Codeurs! https://www.sfeir.dev/success-story/guillaume-laforge-maestro-de-java-et-esthete-du-code-propre/ Security How CrowdStrike brought Windows and many companies to their knees https://next.ink/144464/crowdstrike-donne-des-details-techniques-sur-son-fiasco/ the incident came from a configuration update to Falcon, CrowdStrike's EDR https://www.crowdstrike.com/blog/falcon-update-for-windows-hosts-technical-details/ what is an EDR? An Endpoint Detection and Response system monitors your machine (network access, logs, ...) to detect unusual usage; this spy has to interact with the low-level layers of the system (network, sockets, system logs) and therefore hooks into the operating system kernel; it reports information live to a platform that can then adapt the response live; even though the incident lasted less than an hour and a half on CrowdStrike's side, more than 8 million machines ended up out of service, stuck on the Blue Screen of Death, according to Microsoft https://blogs.microsoft.com/blog/2024/07/20/helping-our-customers-through-the-crowdstrike-outage/ this is not the first time: it had already happened a few months earlier on Linux, and since that was a kernel incompatibility it mattered less, because IT departments handle these problems better on Linux https://stackdiary.com/crowdstrike-took-down-debian-and-rocky-linux-a-few-months-ago-and-no-one-noticed/ CIS benchmarks, a pillar for the security of our cloud environments, and not only those! (Katia Himeur Talhi) https://blog.cockpitio.com/security/cis-benchmarks/ the CIS is a non-profit organization that develops standards to improve cybersecurity; the CIS benchmarks are a set of recommendations and good practices for securing IT systems; they can be used to harden security, comply with regulations and standardize practices.
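A sketch of the core loop described above, written against a tiny storage abstraction rather than a specific SDK: with S3, putIfAbsent would map to a PutObject request carrying an If-None-Match: * precondition, which is the conditional-write feature the article relies on. Key names and error handling are illustrative, not the article's actual code, and as noted above you still need to tolerate short windows with two self-declared leaders.

```java
import java.util.Optional;

public class S3StyleLeaderElection {

    /** Minimal conditional store: create-only-if-absent, read, delete. */
    interface ConditionalStore {
        boolean putIfAbsent(String key, String value);   // false if the key already exists
        Optional<String> get(String key);
        void delete(String key);
    }

    private final ConditionalStore store;
    private final String nodeId;

    S3StyleLeaderElection(ConditionalStore store, String nodeId) {
        this.store = store;
        this.nodeId = nodeId;
    }

    /**
     * Try to become leader for the next epoch. Every candidate computes the next epoch from the
     * last lock file it can see and races on the same key; the conditional write guarantees that
     * only one of them wins that epoch.
     */
    Optional<Long> tryAcquire(long lastObservedEpoch) {
        long epoch = lastObservedEpoch + 1;
        String lockKey = "locks/epoch-" + epoch;
        boolean won = store.putIfAbsent(lockKey, nodeId);
        return won ? Optional.of(epoch) : Optional.empty();
    }

    /**
     * A node is (still) leader only if the lock file for its epoch carries its id. Because two
     * nodes with drifting clocks can briefly both believe they lead, downstream writes should
     * still carry the epoch as a fencing token.
     */
    boolean isLeader(long epoch) {
        return store.get("locks/epoch-" + epoch).map(nodeId::equals).orElse(false);
    }
}
```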
Law, society and organization Microsoft signs an agreement with OVHcloud so that they drop their antitrust complaint https://www.politico.eu/article/microsoft-signs-antitrust-truce-with-ovhcloud/ the complaint was filed in Europe; the deal lets customers more easily deploy Microsoft solutions with the cloud provider of their choice; the complaint had been filed in the summer of 2021, because the situation made running MS solutions more expensive and uncompetitive compared with Microsoft's own cloud Elasticsearch and Kibana are open source again, adding the AGPL license alongside their other existing licenses https://www.elastic.co/fr/blog/elasticsearch-is-open-source-again the market of three years ago and today's market have changed, AWS is a good partner, and the confusion between Elasticsearch and AWS's product has cleared up, hence the return to open source via the AGPL (Affero GPL); Elastic never stopped believing in open source according to Shay Banon, its founder; the change to AGPL is an additional option, not a replacement of one of the other existing licenses; and right afterwards, Elastic announced disappointing results that made the stock plunge 25% https://siliconangle.com/2024/08/29/elastic-shares-plunge-25-lower-revenue-projections-amid-slower-customer-commitments/ https://unrollnow.com/status/1832187019235397785 and https://www.elastic.co/pricing/faq/licensing for a summary of the licenses at Elastic Tools of the episode MailMate, an email client with Markdown support that handles lots of email https://medium.com/@nicfab/mailmate-a-powerful-client-email-for-macos-markdown-integrated-email-composition-e218fe2accf3 Emmanuel uses it for his secondary mailboxes, a bit slow to start (sync) but fast after that, virtual mailboxes (per query), SpamSieve, macOS only I think Trippy, a network analyzer https://github.com/fujiapple852/trippy it combines traceroute and ping in a single CLI Conferences The list of conferences comes from the Developers Conferences Agenda/List by Aurélie Vache and contributors: 17 September 2024: We Love Speed - Nantes (France) 17–18 September 2024: Agile en Seine 2024 - Issy-les-Moulineaux (France) 19–20 September 2024: API Platform Conference - Lille (France) & Online 20–21 September 2024: Toulouse Game Dev - Toulouse (France) 25–26 September 2024: PyData Paris - Paris (France) 26 September 2024: Agile Tour Sophia-Antipolis 2024 - Biot (France) 2–4 October 2024: Devoxx Morocco - Marrakech (Morocco) 3 October 2024: VMUG Montpellier - Montpellier (France) 7–11 October 2024: Devoxx Belgium - Antwerp (Belgium) 8 October 2024: Red Hat Summit: Connect 2024 - Paris (France) 10 October 2024: Cloud Nord - Lille (France) 10–11 October 2024: Volcamp - Clermont-Ferrand (France) 10–11 October 2024: Forum PHP - Marne-la-Vallée (France) 11–12 October 2024: SecSea2k24 - La Ciotat (France) 15–16 October 2024: Malt Tech Days 2024 - Paris (France) 16 October 2024: DotPy - Paris (France) 16–17 October 2024: NoCode Summit 2024 - Paris (France) 17–18 October 2024: DevFest Nantes - Nantes (France) 17–18 October 2024: DotAI - Paris (France) 30–31 October 2024: Agile Tour Nantais 2024 - Nantes (France) 30–31 October 2024: Agile Tour Bordeaux 2024 - Bordeaux (France) 31 October 2024–3 November 2024: PyCon.FR - Strasbourg (France) 6 November 2024: Master Dev De France - Paris (France) 7 November 2024: DevFest Toulouse - Toulouse (France) 8 November 2024: BDX I/O - Bordeaux (France) 13–14 November 2024: Agile Tour Rennes 2024 - Rennes (France) 16–17 November 2024: Capitole Du Libre - Toulouse (France) 20–22 November 2024:
Agile Grenoble 2024 - Grenoble (France) 21 November 2024: DevFest Strasbourg - Strasbourg (France) 21 November 2024: Codeurs en Seine - Rouen (France) 27–28 November 2024: Cloud Expo Europe - Paris (France) 28 November 2024: Who Run The Tech ? - Rennes (France) 2–3 December 2024: Tech Rocks Summit - Paris (France) 3 December 2024: Generation AI - Paris (France) 3–5 December 2024: APIdays Paris - Paris (France) 4–5 December 2024: DevOpsRex - Paris (France) 4–5 December 2024: Open Source Experience - Paris (France) 5 December 2024: GraphQL Day Europe - Paris (France) 6 December 2024: DevFest Dijon - Dijon (France) 22–25 January 2025: SnowCamp 2025 - Grenoble (France) 30 January 2025: DevOps D-Day #9 - Marseille (France) 6–7 February 2025: Touraine Tech - Tours (France) 3 April 2025: DotJS - Paris (France) 16–18 April 2025: Devoxx France - Paris (France) Contact us To react to this episode, come and discuss it on the Google group https://groups.google.com/group/lescastcodeurs Contact us via Twitter https://twitter.com/lescastcodeurs Submit a crowdcast or a crowdquestion Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs All the episodes and all the info at https://lescastcodeurs.com/

Working Draft » Podcast Feed
Revision 629: OpenAPI-MSW

Working Draft » Podcast Feed

Sep 3, 2024 87:51


Christoph Fricke (Twitter, Github) is a consultant and web-dev wizard at Holisticon and tells us about his open-source project! OUR SPONSOR Workshops.DE offers IT training for modern web dev…

PodRocket - A web development podcast from LogRocket
Fullstack TypeScript with Erik Hanchett

PodRocket - A web development podcast from LogRocket

Aug 28, 2024 32:55


Erik Hanchett, senior developer advocate at AWS Amplify, explores the world of Fullstack TypeScript. He discusses the significance of end-to-end type safety, the tools to achieve it, and delves into the benefits and functionalities of AWS Amplify. Links https://www.programwitherik.com https://x.com/erikch https://www.youtube.com/c/programwitherik https://www.linkedin.com/in/erikhanchett https://trpc.io https://orval.dev https://docs.amplify.aws We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Emily, at emily.kochanekketner@logrocket.com (mailto:emily.kochanekketner@logrocket.com), or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod). Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers! What does LogRocket do? LogRocket provides AI-first session replay and analytics that surface the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at LogRocket.com. Try LogRocket for free today (https://logrocket.com/signup/?pdr). Special Guest: Erik Hanchett.

Go Time
OpenAPI & API Design

Go Time

Aug 8, 2024 74:12


We're talking OpenAPI this week! Kris & Johnny are joined by Jamie Tanna, one of the maintainers of oapi-codegen, to discuss OpenAPI, API design philosophies, versioning, and open source maintenance and sustainability. In addition to the usual laughs and unpopular opinions, this week's episode includes a Changelog++ section that you don't want to miss.

Develpreneur: Become a Better Developer and Entrepreneur
The Power of Documentation: Transforming Your Development Practices

Develpreneur: Become a Better Developer and Entrepreneur

Aug 8, 2024 19:01


Welcome back to our series on the developer journey. In this episode, we tackle one of the most crucial yet often neglected aspects of development: the power of documentation. While it might seem tedious, proper documentation is vital to enhancing your workflow and ensuring that your work is accessible and understandable for others. Why The Power of Documentation Matters Developer documentation is often the unsung hero in the software development lifecycle. Many developers overlook it, leading to frustration down the line when they or their colleagues struggle to understand undocumented code. Documentation is akin to testing: everyone acknowledges its importance, yet it frequently gets pushed aside due to time constraints. This negligence can result in messy, hard-to-navigate codebases. The truth is that high-quality documentation is indispensable. It's not just about creating records but ensuring that your code's functionality is clear and easily understandable for anyone who might work with it in the future. Good documentation reflects your professionalism and dedication, whether you're writing APIs or developing software. Leverage Modern Tools The good news is that documenting your work has never been easier. Today's tools have revolutionized how we approach documentation, making it more integrated and less of a chore. For instance, if you're building APIs, tools like Swagger (now known as OpenAPI) can auto-generate user-friendly and comprehensive documentation. Similarly, Postman not only helps with API testing but also assists in creating documentation. In addition, static code analysis tools can help highlight areas that may require documentation, but they're only part of the solution. Modern auto-generation tools like Javadoc have set a precedent for documentation in various languages. Whether you're working with Java, Python, or JavaScript, there's likely a tool available to auto-generate documentation for your code. The key is to explore and understand what's available for your specific environment. Integrate the Power of Documentation Throughout the Development Process Documentation should be seamlessly integrated into every phase of development. It's not limited to code comments or README files; it extends to comprehensive setup guides, deployment instructions, and user manuals. For instance, ensuring your README file includes details on how to set up and deploy the application can save a lot of time for those who come after you. Moreover, tools like GitHub, Bitbucket, and Atlassian provide robust platforms for managing documentation alongside code. With these tools, you can create wikis, track issues, and maintain up-to-date documentation that evolves with your project. This way, your documentation is accurate and consistently synchronized with your codebase. The Power of Documentation for Different Audiences Understanding your audience is crucial. Developer documentation must often cater to various stakeholders, including developers, testers, and end-users. Here's a breakdown of what each might require: Developers: Need detailed comments on functions, classes, and APIs. This includes input parameters, return types, and possible exceptions. Well-commented code makes it easier for other developers to pick up where you left off without digging through the entire codebase. Testers: Require information on how the application should behave under different conditions. This includes setup instructions, test cases, and expected outcomes. 
A clear test plan and test case documentation can streamline the quality assurance process. End-users: They need user manuals and release notes that explain how to use the application and what's new in updates. This documentation should be clear, concise, and tailored to the user's level of expertise. Embracing the Power of Documentation Good documentation is more than just a set of guidelines or instructions; it's essential to a well-oiled development machine. When done right, it transforms the way teams work together, reduces the likelihood of costly errors, and ensures that your code can be easily maintained and evolved. By prioritizing documentation, you're investing in the long-term health of your projects. It fosters collaboration, enhances understanding, and leads to more reliable and robust software. In a fast-paced development environment, where changes are constant, and the stakes are high, having thorough and accessible documentation is not just a best practice—it's a competitive advantage. So, as you continue your development journey, remember that documentation isn't a task to be completed and forgotten. It's an ongoing process that should evolve with your code, adapting to new features, updates, and changes. Embrace, champion, and make it a core part of your development process. In doing so, you'll improve your efficiency and contribute to a culture of clarity and excellence in software development. Stay Connected: Join the Developreneur Community We invite you to join our community and share your coding journey with us. Whether you're a seasoned developer or just starting, there's always room to learn and grow together. Contact us at info@develpreneur.com with your questions, feedback, or suggestions for future episodes. Together, let's continue exploring the exciting world of software development. Additional Resources Organizing Business Documentation: A Critical Challenge for Entrepreneurs Test-Driven Development – A Better Object-Oriented Design Approach SDLC – The software development life cycle is simplified Using a Document Repository To Become a Better Developer The Developer Journey Videos – With Bonus Content
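To ground the Javadoc point above in code, here is a small, hypothetical example: the comment block is what the standard javadoc tool (or your build's documentation plugin) turns into browsable HTML, the same way Swagger/OpenAPI turns endpoint annotations into API reference pages. The class and its rate table are invented purely for illustration.

```java
import java.util.Map;

/** Minimal example of Javadoc-style API documentation (illustrative rates, not real data). */
public class CurrencyConverter {

    private static final Map<String, Double> RATES_TO_USD = Map.of("EUR", 1.08, "USD", 1.0, "GBP", 1.27);

    /**
     * Converts an amount between two currencies using a static rate table.
     *
     * @param amount       the amount to convert, in the source currency
     * @param fromCurrency ISO 4217 code of the source currency, e.g. "EUR"
     * @param toCurrency   ISO 4217 code of the target currency, e.g. "USD"
     * @return the converted amount, rounded to two decimal places
     * @throws IllegalArgumentException if either currency code is unknown
     */
    public double convert(double amount, String fromCurrency, String toCurrency) {
        Double from = RATES_TO_USD.get(fromCurrency);
        Double to = RATES_TO_USD.get(toCurrency);
        if (from == null || to == null) {
            throw new IllegalArgumentException("Unknown currency code");
        }
        return Math.round(amount * from / to * 100.0) / 100.0;
    }
}
```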

Changelog Master Feed
OpenAPI & API Design (Go Time #328)

Changelog Master Feed

Aug 8, 2024 74:12


We're talking OpenAPI this week! Kris & Johnny are joined by Jamie Tanna, one of the maintainers of oapi-codegen, to discuss OpenAPI, API design philosophies, versioning, and open source maintenance and sustainability. In addition to the usual laughs and unpopular opinions, this week's episode includes a Changelog++ section that you don't want to miss.

The Week with Roger
This Week: Digital Transformation World: from AI to APIs

The Week with Roger

Jun 24, 2024 10:57


Analyst Don Kellogg receives an update from Roger Entner and Daryl Schoolar, who are attending the Digital Transformation World conference in Copenhagen. 00:25 Overview of the conference 04:06 Practical AI use cases 07:27 Cloud native is crucial for operators 08:24 Corporate culture is a sticking point 09:10 How should outages be treated? 10:22 Event wrap-up https://dtw.tmforum.org/ Tags: telecom, telecommunications, wireless, prepaid, postpaid, cellular phone, Don Kellogg, Roger Entner, Daryl Schoolar, DTW24, AI, OpenAPI, Nokia, simplification, cloud native, corporate culture, outages

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Editor's note: One of the top reasons we have hundreds of companies and thousands of AI Engineers joining the World's Fair next week is, apart from discussing technology and being present for the big launches planned, to hire and be hired! Listeners loved our previous Elicit episode and were so glad to welcome 2 more members of Elicit back for a guest post (and bonus podcast) on how they think through hiring. Don't miss their AI engineer job description, and template which you can use to create your own hiring plan! How to Hire AI EngineersJames Brady, Head of Engineering @ Elicit (ex Spring, Square, Trigger.io, IBM)Adam Wiggins, Internal Journalist @ Elicit (Cofounder Ink & Switch and Heroku)If you're leading a team that uses AI in your product in some way, you probably need to hire AI engineers. As defined in this article, that's someone with conventional engineering skills in addition to knowledge of language models and prompt engineering, without being a full-fledged Machine Learning expert.But how do you hire someone with this skillset? At Elicit we've been applying machine learning to reasoning tools since 2018, and our technical team is a mix of ML experts and what we can now call AI engineers. This article will cover our process from job description through interviewing. (You can also flip the perspectives here and use it just as easily for how to get hired as an AI engineer!)My own journeyBefore getting into the brass tacks, I want to share my journey to becoming an AI engineer.Up until a few years ago, I was happily working my job as an engineering manager of a big team at a late-stage startup. Like many, I was tracking the rapid increase in AI capabilities stemming from the deep learning revolution, but it was the release of GPT-3 in 2020 which was the watershed moment. At the time, we were all blown away by how the model could string together coherent sentences on demand. (Oh how far we've come since then!)I'd been a professional software engineer for nearly 15 years—enough to have experienced one or two technology cycles—but I could see this was something categorically new. I found this simultaneously exciting and somewhat disconcerting. I knew I wanted to dive into this world, but it seemed like the only path was going back to school for a master's degree in Machine Learning. I started talking with my boss about options for taking a sabbatical or doing a part-time distance learning degree.In 2021, I instead decided to launch a startup focused on productizing new research ideas on ML interpretability. It was through that process that I reached out to Andreas—a leading ML researcher and founder of Elicit—to see if he would be an advisor. Over the next few months, I learned more about Elicit: that they were trying to apply these fascinating technologies to the real-world problems of science, and with a business model that aligned it with safety goals. I realized that I was way more excited about Elicit than I was about my own startup ideas, and wrote about my motivations at the time.Three years later, it's clear this was a seismic shift in my career on the scale of when I chose to leave my comfy engineering job at IBM to go through the Y Combinator program back in 2008. 
Working with this new breed of technology has been more intellectually stimulating, challenging, and rewarding than I could have imagined.Deep ML expertise not requiredIt's important to note that AI engineers are not ML experts, nor is that their best contribution to a tech team.In our article Living documents as an AI UX pattern, we wrote:It's easy to think that AI advancements are all about training and applying new models, and certainly this is a huge part of our work in the ML team at Elicit. But those of us working in the UX part of the team believe that we have a big contribution to make in how AI is applied to end-user problems.We think of LLMs as a new medium to work with, one that we've barely begun to grasp the contours of. New computing mediums like GUIs in the 1980s, web/cloud in the 90s and 2000s, and multitouch smartphones in the 2000s/2010s opened a whole new era of engineering and design practices. So too will LLMs open new frontiers for our work in the coming decade.To compare to the early era of mobile development: great iOS developers didn't require a detailed understanding of the physics of capacitive touchscreens. But they did need to know the capabilities and limitations of a multi-touch screen, the constrained CPU and storage available, the context in which the user is using it (very different from a webpage or desktop computer), etc.In the same way, an AI engineer needs to work with LLMs as a medium that is fundamentally different from other compute mediums. That means an interest in the ML side of things, whether through their own self-study, tinkering with prompts and model fine-tuning, or following along in #llm-paper-club. But this understanding is so that they can work with the medium effectively versus, say, spending their days training new models.Language models as a chaotic mediumSo if we're not expecting deep ML expertise from AI engineers, what are we expecting? This brings us to what makes LLMs different.We'll assume already that our ideal candidate is already inspired by, and full of ideas about, all the new capabilities AI can bring to software products. But the flip side is all the things that make this new medium difficult to work with. LLM calls are annoying due to high latency (measured in tens of seconds sometimes, rather than milliseconds), extreme variance on latency, high error rates even under normal operation. Not to mention getting extremely different answers to the same prompt provided to the same model on two subsequent calls!The net effect is that an AI engineer, even working at the application development level, needs to have a skillset comparable to distributed systems engineering. Handling errors, retries, asynchronous calls, streaming responses, parallelizing and recombining model calls, the halting problem, and fallbacks are just some of the day-in-the-life of an AI engineer. Chaos engineering gets new life in the era of AI.Skills and qualities in candidatesLet's put together what we don't need (deep ML expertise) with what we do (work with capabilities and limitations of the medium). Thus we start to see what Elicit looks for in AI engineers:* Conventional software engineering skills. 
Especially back-end engineering on complex, data-intensive applications.* Professional, real-world experience with applications at scale.* Deep, hands-on experience across a few back-end web frameworks.* Light devops and an understanding of infrastructure best practices.* Queues, message buses, event-driven and serverless architectures, … there's no single “correct” approach, but having a deep toolbox to draw from is very important.* A genuine curiosity and enthusiasm for the capabilities of language models.* One or more serious projects (side projects are fine) of using them in interesting ways on a unique domain.* …ideally with some level of factored cognition, e.g. breaking the problem down into chunks, making thoughtful decisions about which things to push to the language model and which stay within the realm of conventional heuristics and compute capabilities.* Personal studying with resources like Elicit's ML reading list. Part of the role is collaborating with the ML engineers and researchers on our team. To do so, the candidate needs to “speak their language” somewhat, just as a mobile engineer needs some familiarity with backends in order to collaborate effectively on API creation with backend engineers.* An understanding of the challenges that come along with working with large models (high latency, variance, etc.) leading to a defensive, fault-first mindset.* Careful and principled handling of error cases, asynchronous code (and ability to reason about and debug it), streaming data, caching, logging and analytics for understanding behavior in production.* This is a similar mindset that one can develop working on conventional apps which are complex, data-intensive, or large-scale apps. The difference is that an AI engineer will need this mindset even when working on relatively small scales!On net, a great AI engineer will combine two seemingly contrasting perspectives: knowledge of, and a sense of wonder for, the capabilities of modern ML models; but also the understanding that this is a difficult and imperfect foundation, and the willingness to build resilient and performant systems on top of it.Here's the resulting AI engineer job description for Elicit. And here's a template that you can borrow from for writing your own JD.Hiring processOnce you know what you're looking for in an AI engineer, the process is not too different from other technical roles. Here's how we do it, broken down into two stages: sourcing and interviewing.SourcingWe're primarily looking for people with (1) a familiarity with and interest in ML, and (2) proven experience building complex systems using web technologies. The former is important for culture fit and as an indication that the candidate will be able to do some light prompt engineering as part of their role. The latter is important because language model APIs are built on top of web standards and—as noted above—aren't always the easiest tools to work with.Only a handful of people have built complex ML-first apps, but fortunately the two qualities listed above are relatively independent. Perhaps they've proven (2) through their professional experience and have some side projects which demonstrate (1).Talking of side projects, evidence of creative and original prototypes is a huge plus as we're evaluating candidates. 
We've barely scratched the surface of what's possible to build with LLMs—even the current generation of models—so candidates who have been willing to dive into crazy “I wonder if it's possible to…” ideas have a huge advantage.InterviewingThe hard skills we spend most of our time evaluating during our interview process are in the “building complex systems using web technologies” side of things. We will be checking that the candidate is familiar with asynchronous programming, defensive coding, distributed systems concepts and tools, and display an ability to think about scaling and performance. They needn't have 10+ years of experience doing this stuff: even junior candidates can display an aptitude and thirst for learning which gives us confidence they'll be successful tackling the difficult technical challenges we'll put in front of them.One anti-pattern—something which makes my heart sink when I hear it from candidates—is that they have no familiarity with ML, but claim that they're excited to learn about it. The amount of free and easily-accessible resources available is incredible, so a motivated candidate should have already dived into self-study.Putting all that together, here's the interview process that we follow for AI engineer candidates:* 30-minute introductory conversation. Non-technical, explaining the interview process, answering questions, understanding the candidate's career path and goals.* 60-minute technical interview. This is a coding exercise, where we play product manager and the candidate is making changes to a little web app. Here are some examples of topics we might hit upon through that exercise:* Update API endpoints to include extra metadata. Think about appropriate data types. Stub out frontend code to accept the new data.* Convert a synchronous REST API to an asynchronous streaming endpoint.* Cancellation of asynchronous work when a user closes their tab.* Choose an appropriate data structure to represent the pending, active, and completed ML work which is required to service a user request.* 60–90 minute non-technical interview. Walk through the candidate's professional experience, identifying high and low points, getting a grasp of what kinds of challenges and environments they thrive in.* On-site interviews. Half a day in our office in Oakland, meeting as much of the team as possible: more technical and non-technical conversations.The frontier is wide openAlthough Elicit is perhaps further along than other companies on AI engineering, we also acknowledge that this is a brand-new field whose shape and qualities are only just now starting to form. We're looking forward to hearing how other companies do this and being part of the conversation as the role evolves.We're excited for the AI Engineer World's Fair as another next step for this emerging subfield. 
And of course, check out the Elicit careers page if you're interested in joining our team.Podcast versionTimestamps* [00:00:24] Intros* [00:05:25] Defining the Hiring Process* [00:08:42] Defensive AI Engineering as a chaotic medium* [00:10:26] Tech Choices for Defensive AI Engineering* [00:14:04] How do you Interview for Defensive AI Engineering* [00:19:25] Does Model Shadowing Work?* [00:22:29] Is it too early to standardize Tech stacks?* [00:32:02] Capabilities: Offensive AI Engineering* [00:37:24] AI Engineering Required Knowledge* [00:40:13] ML First Mindset* [00:45:13] AI Engineers and Creativity* [00:47:51] Inside of Me There Are Two Wolves* [00:49:58] Sourcing AI Engineers* [00:58:45] Parting ThoughtsTranscript[00:00:00] swyx: Okay, so welcome to the Latent Space Podcast. This is another remote episode that we're recording. This is the first one that we're doing around a guest post. And I'm very honored to have two of the authors of the post with me, James and Adam from Elicit. Welcome, James. Welcome, Adam.[00:00:22] James Brady: Thank you. Great to be here.[00:00:23] Hey there.[00:00:24] Intros[00:00:24] swyx: Okay, so I think I will do this kind of in order. I think James, you're, you're sort of the primary author. So James, you are head of engineering at Elicit. You also, We're VP Eng at Teespring and Spring as well. And you also , you have a long history in sort of engineering. How did you, , find your way into something like Elicit where, , it's, you, you are basically traditional sort of VP Eng, VP technology type person moving into a more of an AI role.[00:00:53] James Brady: Yeah, that's right. It definitely was something of a Sideways move if not a left turn. So the story there was I'd been doing, as you said, VP technology, CTO type stuff for around about 15 years or so, and Notice that there was this crazy explosion of capability and interesting stuff happening within AI and ML and language models, that kind of thing.[00:01:16] I guess this was in 2019 or so, and decided that I needed to get involved. , this is a kind of generational shift. And Spent maybe a year or so trying to get up to speed on the state of the art, reading papers, reading books, practicing things, that kind of stuff. Was going to found a startup actually in in the space of interpretability and transparency, and through that met Andreas, who has obviously been on the, on the podcast before asked him to be an advisor for my startup, and he countered with, maybe you'd like to come and run the engineering team at Elicit, which it turns out was a much better idea.[00:01:48] And yeah, I kind of quickly changed in that direction. So I think some of the stuff that we're going to be talking about today is how actually a lot of the work when you're building applications with AI and ML looks and smells and feels much more like conventional software engineering with a few key differences rather than really deep ML stuff.[00:02:07] And I think that's one of the reasons why I was able to transfer skills over from one place to the other.[00:02:12] swyx: Yeah, I[00:02:12] James Brady: definitely[00:02:12] swyx: agree with that. I, I do often say that I think AI engineering is about 90 percent software engineering with like the, the 10 percent of like really strong really differentiated AI engineering.[00:02:22] And that might, that obviously that number might change over time. 
I want to also welcome Adam onto my podcast because you welcomed me onto your podcast two years ago.[00:02:31] Adam Wiggins: Yeah, that was a wonderful episode.[00:02:32] swyx: That was, that was a fun episode. You famously founded Heroku. You just wrapped up a few years working on Muse.[00:02:38] And now you've described yourself as a journalist, internal journalist working on Elicit.[00:02:43] Adam Wiggins: Yeah, well I'm kind of a little bit in a wandering phase here and trying to take this time in between ventures to see what's out there in the world and some of my wandering took me to the Elicit team. And found that they were some of the folks who were doing the most interesting, really deep work in terms of taking the capabilities of language models and applying them to what I feel like are really important problems.[00:03:08] So in this case, science and literature search and, and, and that sort of thing. It fits into my general interest in tools and productivity software. I, I think of it as a tool for thought in many ways, but a tool for science, obviously, if we can accelerate that discovery of new medicines and things like that, that's, that's just so powerful.[00:03:24] But to me, it's a. It's kind of also an opportunity to learn at the feet of some real masters in this space, people who have been working on it since it was, before it was cool, if you want to put it that way. So for me, the last couple of months have been this crash course, and why I sometimes describe myself as an internal journalist is I'm helping to write some, some posts, including Supporting James in this article here we're doing for latent space where I'm just bringing my writing skill and that sort of thing to bear on their very deep domain expertise around language models and applying them to the real world and kind of surface that in a way that's I don't know, accessible, legible, that, that sort of thing.[00:04:03] And so, and the great benefit to me is I get to learn this stuff in a way that I don't think I would, or I haven't, just kind of tinkering with my own side projects.[00:04:12] swyx: I forgot to mention that you also run Ink and Switch, which is one of the leading research labs, in my mind, of the tools for thought productivity space, , whatever people mentioned there, or maybe future of programming even, a little bit of that.[00:04:24] As well. I think you guys definitely started the local first wave. I think there was just the first conference that you guys held. I don't know if you were personally involved.[00:04:31] Adam Wiggins: Yeah, I was one of the co organizers along with a few other folks for, yeah, called Local First Conf here in Berlin.[00:04:36] Huge success from my, my point of view. Local first, obviously, a whole other topic we can talk about on another day. I think there actually is a lot more what would you call it , handshake emoji between kind of language models and the local first data model. And that was part of the topic of the conference here, but yeah, topic for another day.[00:04:55] swyx: Not necessarily. I mean , I, I selected as one of my keynotes, Justine Tunney, working at LlamaFall in Mozilla, because I think there's a lot of people interested in that stuff. But we can, we can focus on the headline topic. And just to not bury the lead, which is we're talking about hire, how to hire AI engineers, this is something that I've been looking for a credible source on for months.[00:05:14] People keep asking me for my opinions. 
I don't feel qualified to give an opinion and it's not like I have. So that's kind of defined hiring process that I'm super happy with, even though I've worked with a number of AI engineers.[00:05:25] Defining the Hiring Process[00:05:25] swyx: I'll just leave it open to you, James. How was your process of defining your hiring, hiring roles?[00:05:31] James Brady: Yeah. So I think the first thing to say is that we've effectively been hiring for this kind of a role since before you, before you coined the term and tried to kind of build this understanding of what it was.[00:05:42] So, which is not a bad thing. Like it's, it was a, it was a good thing. A concept, a concept that was coming to the fore and effectively needed a name, which is which is what you did. So the reason I mentioned that is I think it was something that we kind of backed into, if you will. We didn't sit down and come up with a brand new role from, from scratch of this is a completely novel set of responsibilities and skills that this person would need.[00:06:06] However, it is a A kind of particular blend of different skills and attitudes and and curiosities interests, which I think makes sense to kind of bundle together. So in the, in the post, the three things that we say are most important for a highly effective AI engineer are first of all, conventional software engineering skills, which is Kind of a given, but definitely worth mentioning.[00:06:30] The second thing is a curiosity and enthusiasm for machine learning and maybe in particular language models. That's certainly true in our case. And then the third thing is to do with basically a fault first mindset, being able to build systems that can handle things going wrong in, in, in some sense.[00:06:49] And yeah, the I think the kind of middle point, the curiosity about ML and language models is probably fairly self evident. They're going to be working with, and prompting, and dealing with the responses from these models, so that's clearly relevant. The last point, though, maybe takes the most explaining.[00:07:07] To do with this fault first mindset and the ability to, to build resilient systems. The reason that is, is so important is because compared to normal APIs, where normal, think of something like a Stripe API or a search API or something like this. The latency when you're working with language models is, is wild, like you can get 10x variation.[00:07:32] I mean, I was looking at the stats before, actually, before, before the podcast. We do often, normally, in fact, see a 10x variation in the P90 latency over the course of, Half an hour, an hour when we're prompting these models, which is way higher than if you're working with a, more kind of conventional conventionally backed API.[00:07:49] And the responses that you get, the actual content and the responses are naturally unpredictable as well. They come back with different formats. Maybe you're expecting JSON. It's not quite JSON. You have to handle this stuff. And also the, the semantics of the messages are unpredictable too, which is, which is a good thing.[00:08:08] Like this is one of the things that you're looking for from these language models, but it all adds up to needing to. Build a resilient, reliable, solid feeling system on top of this fundamentally, well, certainly currently fundamentally shaky foundation. 
The models do not behave in the way that you would like them to.[00:08:28] And yeah, the ability to structure the code around them such that it does give the user this warm, reassuring, Snappy, solid feeling is is really what we're driving for there.[00:08:42] Defensive AI Engineering as a chaotic medium[00:08:42] Adam Wiggins: What really struck me as we, we dug in on the content for this article was that third point there. The, the language models is this kind of chaotic medium, this, this dragon, this wild horse you're, you're, you're riding and trying to guide in the direction that is going to be useful and reliable to users, because I think.[00:08:58] So much of software engineering is about making things not only high performance and snappy, but really just making it stable, reliable, predictable, which is literally the opposite of what you get from from the language models. And yet, yeah, the output is so useful, and indeed, some of their Creativity, if you want to call it that, which is, is precisely their value.[00:09:19] And so you need to work with this medium. And I guess the nuanced or the thing that came out of Elissa's experience that I thought was so interesting is quite a lot of working with that is things that come from distributed systems engineering. But you have really the AI engineers as we're defining them or, or labeling them on the illicit team is people who are really application developers.[00:09:39] You're building things for end users. You're thinking about, okay, I need to populate this interface with some response to user input. That's useful to the tasks they're trying to do, but you have this. This is the thing, this medium that you're working with that in some ways you need to apply some of this chaos engineering, distributed systems engineering, which typically those people with those engineering skills are not kind of the application level developers with the product mindset or whatever, they're more deep in the guts of a, of a system.[00:10:07] And so it's, those, those skills and, and knowledge do exist throughout the engineering discipline, but sort of putting them together into one person that is That feels like sort of a unique thing and working with the folks on the Elicit team who have that skills I'm quite struck by that unique that unique blend.[00:10:23] I haven't really seen that before in my 30 year career in technology.[00:10:26] Tech Choices for Defensive AI Engineering[00:10:26] swyx: Yeah, that's a Fascinating I like the reference to chaos engineering. I have some appreciation, I think when you had me on your podcast, I was still working at Temporal and that was like a nice Framework, if you live within Temporal's boundaries, you can pretend that all those faults don't exist, and you can, you can code in a sort of very fault tolerant way.[00:10:47] What is, what is you guys solutions around this, actually? Like, I think you're, you're emphasizing having the mindset, but maybe naming some technologies would help? Not saying that you have to adopt these technologies, but they're just, they're just quick vectors into what you're talking about when you're, when you're talking about distributed systems.[00:11:03] Like, that's such a big, chunky word, , like are we talking, are Kubernetes or, and I suspect we're not, , like we're, we're talking something else now.[00:11:10] James Brady: Yeah, that's right. 
It's more at the application level rather than at the infrastructure level, at least, at least the way that it works for us.[00:11:17] So there's nothing kind of radically novel here. It is more a careful application of existing concepts. So the kinds of tools that we reach for to handle these kind of slightly chaotic objects that Adam was just talking about, are retries and fallbacks and timeouts and careful error handling. And, yeah, the standard stuff, really.[00:11:39] There's also a great degree of dependence. We rely heavily on parallelization because, , these language models are not innately very snappy, and , there's just a lot of I. O. going back and forth. So All these things I'm talking about when I was in my earlier stages of a career, these are kind of the things that are the difficult parts that most senior software engineers will be better at.[00:12:01] It is careful error handling, and concurrency, and fallbacks, and distributed systems, and, , eventual consistency, and all this kind of stuff and As Adam was saying, the kind of person that is deep in the guts of some kind of distributed systems, a really high, high scale backend kind of a problem would probably naturally have these kinds of skills.[00:12:21] But you'll find them on, on day one, if you're building a, , an ML powered app, even if it's not got massive scale. I think one one thing that I would mention that we do do yeah, maybe, maybe two related things, actually. The first is we're big fans of strong typing. We share the types all the way from the Backend Python code all the way to the to the front end in TypeScript and find that is I mean We'd probably do this anyway But it really helps one reason around the shapes of the data which can going to be going back and forth and that's really important When you can't rely upon You you're going to have to coerce the data that you get back from the ML if you want if you want for it to be structured basically speaking and The second thing which is related is we use checked exceptions inside our Python code base, which means that we can use the type system to make sure we are handling, properly handling, all of the, the various things that could be going wrong, all the different exceptions that could be getting raised.[00:13:16] So, checked exceptions are not, not really particularly popular. Actually there's not many people that are big fans of them. For our particular use case, to really make sure that we've not just forgotten to handle, , This particular type of error we have found them useful to to, to force us to think about all the different edge cases that can come up.[00:13:32] swyx: Fascinating. How just a quick note of technology. How do you share types from Python to TypeScript? Do you, do you use GraphQL? Do you use something[00:13:39] James Brady: else? We don't, we don't use GraphQL. Yeah. So we've got the We've got the types defined in Python, that's the source of truth. And we go from the OpenAPI spec, and there's a, there's a tool that you work and use to generate types dynamically, like TypeScript types from those OpenAPI definitions.[00:13:57] swyx: Okay, excellent. Okay, cool. Sorry, sorry for diving into that rabbit hole a little bit. 
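For readers who want to see the shape of what James describes, here is a minimal Python sketch of retries with backoff plus a provider fallback around an LLM call. The helper names are placeholders, not Elicit's actual code; a real implementation would catch provider-specific error types, enforce timeouts inside the client, and, as mentioned above, use a separately tuned prompt for the fallback provider.

```python
import random
import time
from typing import Callable


def call_with_retries(
    call: Callable[[str], str],
    prompt: str,
    *,
    attempts: int = 3,
    base_backoff_s: float = 1.0,
) -> str:
    """Retry a flaky model call with exponential backoff and jitter.

    Timeouts are assumed to be enforced inside `call` itself (e.g. by the
    HTTP client), so a hung request surfaces here as an exception too.
    """
    last_error: Exception | None = None
    for attempt in range(attempts):
        try:
            return call(prompt)
        except Exception as err:  # real code would catch narrower error types
            last_error = err
            time.sleep(base_backoff_s * (2 ** attempt) + random.random())
    raise RuntimeError(f"model call failed after {attempts} attempts") from last_error


def answer(prompt: str,
           primary: Callable[[str], str],
           fallback: Callable[[str], str]) -> str:
    """Prefer the primary provider; fall back to a second one if it keeps failing."""
    try:
        return call_with_retries(primary, prompt)
    except RuntimeError:
        # A fallback provider usually needs its own prompt variant to stay
        # within a few points of the primary's quality; this sketch reuses the text.
        return call_with_retries(fallback, prompt, attempts=2)
```

The same "types as the source of truth" idea James mentions (Python models, OpenAPI spec, generated TypeScript types) would sit around a helper like this, so that whatever the model returns is coerced into one shared shape before it crosses the API boundary.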
I always like to spell out technologies for people to dig their teeth into.[00:14:04] How do you Interview for Defensive AI Engineering[00:14:04] swyx: One thing I'll, one thing I'll mention quickly is that a lot of the stuff that you mentioned is typically not part of the normal interview loop.[00:14:10] It's actually really hard to interview for because this is the stuff that you polish out in, as you go into production, the coding interviews are typically about the happy path. How do we do that? How do we, how do we design, how do you look for a defensive fault first mindset?[00:14:24] Because you can defensive code all day long and not add functionality. to your to your application.[00:14:29] James Brady: Yeah, it's a great question and I think that's exactly true. Normally the interview is about the happy path and then there's maybe a box checking exercise at the end of the candidate says of course in reality I would handle the edge cases or something like this and that unfortunately isn't isn't quite good enough when when the happy path is is very very narrow and yeah there's lots of weirdness on either side so basically speaking, it's just a case of, of foregrounding those kind of concerns through the interview process.[00:14:58] It's, there's, there's no magic to it. We, we talk about this in the, in the po in the post that we're gonna be putting up on, on Laton space. The, there's two main technical exercises that we do through our interview process for this role. The first is more coding focus, and the second is more system designy.[00:15:16] Yeah. White whiteboarding a potential solution. And in, without giving too much away in the coding exercise. You do need to think about edge cases. You do need to think about errors. The exercise consists of adding features and fixing bugs inside the code base. And in both of those two cases, it does demand, because of the way that we set the application up and the interview up, it does demand that you think about something other than the happy path.[00:15:41] But your thinking is the right prompt of how do we get the candidate thinking outside of the, the kind of normal Sweet spot, smooth smooth, smoothly paved path. In terms of the system design interview, that's a little easier to prompt this kind of fault first mindset because it's very easy in that situation just to say, let's imagine that, , this node dies, how does the app still work?[00:16:03] Let's imagine that this network is, is going super slow. Let's imagine that, I don't know, like you, you run out of, you run out of capacity in, in, in this database that you've sketched out here, how do you handle that, that, that sort of stuff. So. It's, in both cases, they're not firmly anchored to and built specifically around language models and ways language models can go wrong, but we do exercise the same muscles of thinking defensively and yeah, foregrounding the edge cases, basically.[00:16:32] Adam Wiggins: James, earlier there you mentioned retries. And this is something that I think I've seen some interesting debates internally about things regarding, first of all, retries are, can be costly, right? In general, this medium, in addition to having this incredibly high variance and response rate, and, , being non deterministic, is actually quite expensive.[00:16:50] And so, in many cases, doing a retry when you get a fail does make sense, but actually that has an impact on cost. And so there is Some sense to which, at least I've seen the AI engineers on our team, worry about that. 
They worry about, okay, how do we give the best user experience, but balance that against what the infrastructure is going to, , is going to cost our company, which I think is again, an interesting mix of, yeah, again, it's a little bit the distributed system mindset, but it's also a product perspective and you're thinking about the end user experience, but also the.[00:17:22] The bottom line for the business, you're bringing together a lot of a lot of qualities there. And there's also the fallback case, which is kind of, kind of a related or adjacent one. I think there was also a discussion on that internally where, I think it maybe was search, there was something recently where there was one of the frontline search providers was having some, yeah, slowness and outages, and essentially then we had a fallback, but essentially that gave people for a while, especially new users that come in that don't the difference, they're getting a They're getting worse results for their search.[00:17:52] And so then you have this debate about, okay, there's sort of what is correct to do from an engineering perspective, but then there's also what actually is the best result for the user. Is giving them a kind of a worse answer to their search result better, or is it better to kind of give them an error and be like, yeah, sorry, it's not working right at the moment, try again.[00:18:12] Later, both are obviously non optimal, but but this is the kind of thing I think that that you run into or, or the kind of thing we need to grapple with a lot more than you would other kinds of, of mediums.[00:18:24] James Brady: Yeah, that's a really good example. I think it brings to the fore the two different things that you could be optimizing for of uptime and response at all costs on one end of the spectrum and then effectively fragility, but kind of, if you get a response, it's the best response we can come up with at the other end of the spectrum.[00:18:43] And where you want to land there kind of depends on, well, it certainly depends on the app, obviously depends on the user. I think it depends on the, feature within the app as well. So in the search case that you, that you mentioned there, in retrospect, we probably didn't want to have the fallback. And we've actually just recently on Monday, changed that to Show an error message rather than giving people a kind of degraded experience in other situations We could use for example a large language model from a large language model from provider B rather than provider A and Get something which is within the A few percentage points performance, and that's just a really different situation.[00:19:21] So yeah, like any interesting question, the answer is, it depends.[00:19:25] Does Model Shadowing Work?[00:19:25] swyx: I do hear a lot of people suggesting I, let's call this model shadowing as a defensive technique, which is, if OpenAI happens to be down, which, , happens more often than people think then you fall back to anthropic or something.[00:19:38] How realistic is that, right? Like you, don't you have to develop completely different prompts for different models and won't the, won't the performance of your application suffer from whatever reason, right? Like it may be caused differently or it's not maintained in the same way. 
I, I think that people raise this idea of fallbacks to models, but I don't think it's, I don't, I don't see it practiced very much.[00:20:02] James Brady: Yeah, it is, you, you definitely need to have a different prompt if you want to stay within a few percentage points degradation Like I, like I said before, and that certainly comes at a cost, like fallbacks and backups and things like this It's really easy for them to go stale and kind of flake out on you because they're off the beaten track And In our particular case inside of Elicit, we do have fallbacks for a number of kind of crucial functions where it's going to be very obvious if something has gone wrong, but we don't have fallbacks in all cases.[00:20:40] It really depends on a task to task basis throughout the app. So I can't give you a kind of a, a single kind of simple rule of thumb for, in this case, do this. And in the other, do that. But yeah, we've it's a little bit easier now that the APIs between the anthropic models and opening are more similar than they used to be.[00:20:59] So we don't have two totally separate code paths with different protocols, like wire protocols to, to speak, which makes things easier, but you're right. You do need to have different prompts if you want to, have similar performance across the providers.[00:21:12] Adam Wiggins: I'll also note, just observing again as a relative newcomer here, I was surprised, impressed, not sure what the word is for it, at the blend of different backends that the team is using.[00:21:24] And so there's many The product presents as kind of one single interface, but there's actually several dozen kind of main paths. There's like, for example, the search versus a data extraction of a certain type, versus chat with papers, versus And each one of these, , the team has worked very hard to pick the right Model for the job and craft the prompt there, but also is constantly testing new ones.[00:21:48] So a new one comes out from either, from the big providers or in some cases, Our own models that are , running on, on essentially our own infrastructure. And sometimes that's more about cost or performance, but the point is kind of switching very fluidly between them and, and very quickly because this field is moving so fast and there's new ones to choose from all the time is like part of the day to day, I would say.[00:22:11] So it isn't more of a like, there's a main one, it's been kind of the same for a year, there's a fallback, but it's got cobwebs on it. It's more like which model and which prompt is changing weekly. And so I think it's quite, quite reasonable to to, to, to have a fallback that you can expect might work.[00:22:29] Is it too early to standardize Tech stacks?[00:22:29] swyx: I'm curious because you guys have had experience working at both, , Elicit, which is a smaller operation and, and larger companies. A lot of companies are looking at this with a certain amount of trepidation as, as, , it's very chaotic. When you have, when you have , one engineering team that, that, knows everyone else's names and like, , they, they, they, they meet constantly in Slack and knows what's going on.[00:22:50] It's easier to, to sync on technology choices. When you have a hundred teams, all shipping AI products and all making their own independent tech choices. It can be, it can be very hard to control. 
One solution I'm hearing from like the sales forces of the worlds and Walmarts of the world is that they are creating their own AI gateway, right?[00:23:05] Internal AI gateway. This is the one model hub that controls all the things and has our standards. Is that a feasible thing? Is that something that you would want? Is that something you have and you're working towards? What are your thoughts on this stuff? Like, Centralization of control or like an AI platform internally.[00:23:22] James Brady: Certainly for larger organizations and organizations that are doing things which maybe are running into HIPAA compliance or other, um, legislative tools like that. It could make a lot of sense. Yeah. I think for the TLDR for something like Elicit is we are small enough, as you indicated, and need to have full control over all the levers available and switch between different models and different prompts and whatnot, as Adam was just saying, that that kind of thing wouldn't work for us.[00:23:52] But yeah, I've spoken with and, um, advised a couple of companies that are trying to sell into that kind of a space or at a larger stage, and it does seem to make a lot of sense for them. So, for example, if you're trying to sell If you're looking to sell to a large enterprise and they cannot have any data leaving the EU, then you need to be really careful about someone just accidentally putting in, , the sort of US East 1 GPT 4 endpoints or something like this.[00:24:22] I'd be interested in understanding better what the specific problem is that they're looking to solve with that, whether it is to do with data security or centralization of billing, or if they have a kind of Suite of prompts or something like this that people can choose from so they don't need to reinvent the wheel again and again I wouldn't be able to say without understanding the problems and their proposed solutions , which kind of situations that be better or worse fit for but yeah for illicit where really the The secret sauce, if there is a secret sauce, is which models we're using, how we're using them, how we're combining them, how we're thinking about the user problem, how we're thinking about all these pieces coming together.[00:25:02] You really need to have all of the affordances available to you to be able to experiment with things and iterate rapidly. And generally speaking, whenever you put these kind of layers of abstraction and control and generalization in there, that, that gets in the way. So, so for us, it would not work.[00:25:19] Adam Wiggins: Do you feel like there's always a tendency to want to reach for standardization and abstractions pretty early in a new technology cycle?[00:25:26] There's something comforting there, or you feel like you can see them, or whatever. I feel like there's some of that discussion around lang chain right now. But yeah, this is not only so early, but also moving so fast. , I think it's . I think it's tough to, to ask for that. 
That's, that's not the, that's not the space we're in, but the, yeah, the larger an organization, the more that's your, your default is to, to, to want to reach for that.[00:25:48] It, it, it's a sort of comfort.[00:25:51] swyx: Yeah, I find it interesting that you would say that , being a founder of Heroku where , you were one of the first platforms as a service that more or less standardized what, , that sort of early developer experience should have looked like.[00:26:04] And I think basically people are feeling the differences between calling various model lab APIs and having an actual AI platform where. , all, all their development needs are thought of for them. , it's, it's very much, and, and I, I defined this in my AI engineer post as well.[00:26:19] Like the model labs just see their job ending at serving models and that's about it. But actually the responsibility of the AI engineer has to fill in a lot of the gaps beyond that. So.[00:26:31] Adam Wiggins: Yeah, that's true. I think, , a huge part of the exercise with Heroku, which It was largely inspired by Rails, which itself was one of the first frameworks to standardize the SQL database.[00:26:42] And people had been building apps like that for many, many years. I had built many apps. I had made my own templates based on that. I think others had done it. And Rails came along at the right moment. We had been doing it long enough that you see the patterns and then you can say look let's let's extract those into a framework that's going to make it not only easier to build for the experts but for people who are relatively new the best practices are encoded into you.[00:27:07] That framework, , Model View Controller, to take one example. But then, yeah, once you see that, and once you experience the power of a framework, and again, it's so comforting, and you can develop faster, and it's easier to onboard new people to it because you have these standards. And this consistency, then folks want that for something new that's evolving.[00:27:29] Now here I'm thinking maybe if you fast forward a little to, for example, when React came on the on the scene, , a decade ago or whatever. And then, okay, we need to do state management. What's that? And then there's, , there's a new library every six months. Okay, this is the one, this is the gold standard.[00:27:42] And then, , six months later, that's deprecated. Because of course, it's evolving, you need to figure it out, like the tacit knowledge and the experience of putting it in practice and seeing what those real What those real needs are are, are critical, and so it's, it is really about finding the right time to say yes, we can generalize, we can make standards and abstractions, whether it's for a company, whether it's for, , a library, an open source library, for a whole class of apps and it, it's very much a, much more of a A judgment call slash just a sense of taste or , experience to be able to say, Yeah, we're at the right point.[00:28:16] We can standardize this. But it's at least my, my very, again, and I'm so new to that, this world compared to you both, but my, my sense is, yeah, still the wild west. That's what makes it so exciting and feels kind of too early for too much. too much in the way of standardized abstractions. 
Not that it's not interesting to try, but , you can't necessarily get there in the same way Rails did until you've got that decade of experience of whatever building different classes of apps in that, with that technology.[00:28:45] James Brady: Yeah, it's, it's interesting to think about what is going to stay more static and what is expected to change over the coming five years, let's say. Which seems like when I think about it through an ML lens, it's an incredibly long time. And if you just said five years, it doesn't seem, doesn't seem that long.[00:29:01] I think that, that kind of talks to part of the problem here is that things that are moving are moving incredibly quickly. I would expect, this is my, my hot take rather than some kind of official carefully thought out position, but my hot take would be something like the You can, you'll be able to get to good quality apps without doing really careful prompt engineering.[00:29:21] I don't think that prompt engineering is going to be a kind of durable differential skill that people will, will hold. I do think that, The way that you set up the ML problem to kind of ask the right questions, if you see what I mean, rather than the specific phrasing of exactly how you're doing chain of thought or few shot or something in the prompt I think the way that you set it up is, is probably going to be remain to be trickier for longer.[00:29:47] And I think some of the operational challenges that we've been talking about of wild variations in, in, in latency, And handling the, I mean, one way to think about these models is the first lesson that you learn when, when you're an engineer, software engineer, is that you need to sanitize user input, right?[00:30:05] It was, I think it was the top OWASP security threat for a while. Like you, you have to sanitize and validate user input. And we got used to that. And it kind of feels like this is the, The shell around the app and then everything else inside you're kind of in control of and you can grasp and you can debug, etc.[00:30:22] And what we've effectively done is, through some kind of weird rearguard action, we've now got these slightly chaotic things. I think of them more as complex adaptive systems, which , related but a bit different. Definitely have some of the same dynamics. We've, we've injected these into the foundations of the, of the app and you kind of now need to think with this defined defensive mindset downwards as well as upwards if you, if you see what I mean.[00:30:46] So I think it would gonna, it's, I think it will take a while for us to truly wrap our heads around that. And also these kinds of problems where you have to handle things being unreliable and slow sometimes and whatever else, even if it doesn't happen very often, there isn't some kind of industry wide accepted way of handling that at massive scale.[00:31:10] There are definitely patterns and anti patterns and tools and whatnot, but it's not like this is a solved problem. So I would expect that it's not going to go down easily as a, as a solvable problem at the ML scale either.[00:31:23] swyx: Yeah, excellent. I would describe in, in the terminology of the stuff that I've written in the past, I describe this inversion of architecture as sort of LLM at the core versus LLM or code at the core.[00:31:34] We're very used to code at the core. Actually, we can scale that very well. 
When we build LLM core apps, we have to realize that the, the central part of our app that's orchestrating things is actually prompt, prone to, , prompt injections and non determinism and all that, all that good stuff.[00:31:48] I, I did want to move the conversation a little bit from the sort of defensive side of things to the more offensive or, , the fun side of things, capabilities side of things, because that is the other part. of the job description that we kind of skimmed over. So I'll, I'll repeat what you said earlier.[00:32:02] Capabilities: Offensive AI Engineering[00:32:02] swyx: It's, you want people to have a genuine curiosity and enthusiasm for the capabilities of language models. We just, we're recording this the day after Anthropic just dropped Cloud 3. 5. And I was wondering, , maybe this is a good, good exercise is how do people have Curiosity and enthusiasm for capabilities language models when for example the research paper for cloud 3.[00:32:22] 5 is four pages[00:32:23] James Brady: Maybe that's not a bad thing actually in this particular case So yeah If you really want to know exactly how the sausage was made That hasn't been possible for a few years now in fact for for these new models but from our perspective as when we're building illicit What we primarily care about is what can these models do?[00:32:41] How do they perform on the tasks that we already have set up and the evaluations we have in mind? And then on a slightly more expansive note, what kinds of new capabilities do they seem to have? Can we elicit, no pun intended, from the models? For example, well, there's, there's very obvious ones like multimodality , there wasn't that and then there was that, or it could be something a bit more subtle, like it seems to be getting better at reasoning, or it seems to be getting better at metacognition, or Or it seems to be getting better at marking its own work and giving calibrated confidence estimates, things like this.[00:33:19] So yeah, there's, there's plenty to be excited about there. It's just that yeah, there's rightly or wrongly been this, this, this shift over the last few years to not give all the details. So no, but from application development perspective we, every time there's a new model release, there's a flow of activity in our Slack, and we try to figure out what's going on.[00:33:38] What it can do, what it can't do, run our evaluation frameworks, and yeah, it's always an exciting, happy day.[00:33:44] Adam Wiggins: Yeah, from my perspective, what I'm seeing from the folks on the team is, first of all, just awareness of the new stuff that's coming out, so that's, , an enthusiasm for the space and following along, and then being able to very quickly, partially that's having Slack to do this, but be able to quickly map that to, okay, What does this do for our specific case?[00:34:07] And that, the simple version of that is, let's run the evaluation framework, which Lissa has quite a comprehensive one. I'm actually working on an article on that right now, which I'm very excited about, because it's a very interesting world of things. But basically, you can just try, not just, but try the new model in the evaluations framework.[00:34:27] Run it. It has a whole slew of benchmarks, which includes not just Accuracy and confidence, but also things like performance, cost, and so on. And all of these things may trade off against each other. 
Maybe it's actually, it's very slightly worse, but it's way faster and way cheaper, so actually this might be a net win, for example.[00:34:46] Or, it's way more accurate. But that comes at its slower and higher cost, and so now you need to think about those trade offs. And so to me, coming back to the qualities of an AI engineer, especially when you're trying to hire for them, It's this, it's, it is very much an application developer in the sense of a product mindset of What are our users or our customers trying to do?[00:35:08] What problem do they need solved? Or what what does our product solve for them? And how does the capabilities of a particular model potentially solve that better for them than what exists today? And by the way, what exists today is becoming an increasingly gigantic cornucopia of things, right? And so, You say, okay, this new model has these capabilities, therefore, , the simple version of that is plug it into our existing evaluations and just look at that and see if it, it seems like it's better for a straight out swap out, but when you talk about, for example, you have multimodal capabilities, and then you say, okay, wait a minute, actually, maybe there's a new feature or a whole new There's a whole bunch of ways we could be using it, not just a simple model swap out, but actually a different thing we could do that we couldn't do before that would have been too slow, or too inaccurate, or something like that, that now we do have the capability to do.[00:35:58] I think of that as being a great thing. I don't even know if I want to call it a skill, maybe it's even like an attitude or a perspective, which is a desire to both be excited about the new technology, , the new models and things as they come along, but also holding in the mind, what does our product do?[00:36:16] Who is our user? And how can we connect the capabilities of this technology to how we're helping people in whatever it is our product does?[00:36:25] James Brady: Yeah, I'm just looking at one of our internal Slack channels where we talk about things like new new model releases and that kind of thing And it is notable looking through these the kind of things that people are excited about and not It's, I don't know the context, the context window is much larger, or it's, look at how many parameters it has, or something like this.[00:36:44] It's always framed in terms of maybe this could be applied to that kind of part of Elicit, or maybe this would open up this new possibility for Elicit. And, as Adam was saying, yeah, I don't think it's really a I don't think it's a novel or separate skill, it's the kind of attitude I would like to have all engineers to have at a company our stage, actually.[00:37:05] And maybe more generally, even, which is not just kind of getting nerd sniped by some kind of technology number, fancy metric or something, but how is this actually going to be applicable to the thing Which matters in the end. How is this going to help users? How is this going to help move things forward strategically?[00:37:23] That kind of, that kind of thing.[00:37:24] AI Engineering Required Knowledge[00:37:24] swyx: Yeah, applying what , I think, is, is, is the key here. Getting hands on as well. I would, I would recommend a few resources for people listening along. The first is Elicit's ML reading list, which I, I found so delightful after talking with Andreas about it.[00:37:38] It looks like that's part of your onboarding. 
We've actually set up an asynchronous paper club instead of my discord for people following on that reading list. I love that you separate things out into tier one and two and three, and that gives people a factored cognition way of Looking into the, the, the corpus, right?[00:37:55] Like yes, the, the corpus of things to know is growing and the water is slowly rising as far as what a bar for a competent AI engineer is. But I think, , having some structured thought as to what are the big ones that everyone must know I think is, is, is key. It's something I, I haven't really defined for people and I'm, I'm glad that this is actually has something out there that people can refer to.[00:38:15] Yeah, I wouldn't necessarily like make it required for like the job. Interview maybe, but , it'd be interesting to see like, what would be a red flag. If some AI engineer would not know, I don't know what, , I don't know where we would stoop to, to call something required knowledge, , or you're not part of the cool kids club.[00:38:33] But there increasingly is something like that, right? Like, not knowing what context is, is a black mark, in my opinion, right?[00:38:40] I think it, I think it does connect back to what we were saying before of this genuine Curiosity about and that. Well, maybe it's, maybe it's actually that combined with something else, which is really important, which is a self starting bias towards action, kind of a mindset, which again, everybody needs.[00:38:56] Exactly. Yeah. Everyone needs that. So if you put those two together, or if I'm truly curious about this and I'm going to kind of figure out how to make things happen, then you end up with people. Reading, reading lists, reading papers, doing side projects, this kind of, this kind of thing. So it isn't something that we explicitly included.[00:39:14] We don't have a, we don't have an ML focused interview for the AI engineer role at all, actually. It doesn't really seem helpful. The skills which we are checking for, as I mentioned before, this kind of fault first mindset. And conventional software engineering kind of thing. It's, it's 0. 1 and 0.[00:39:32] 3 on the list that, that we talked about. In terms of checking for ML curiosity and there are, how familiar they are with these concepts. That's more through talking interviews and culture fit types of things. We want for them to have a take on what Elisa is doing. doing, certainly as they progress through the interview process.[00:39:50] They don't need to be completely up to date on everything we've ever done on day zero. Although, , that's always nice when it happens. But for them to really engage with it, ask interesting questions, and be kind of bought into our view on how we want ML to proceed. I think that is really important, and that would reveal that they have this kind of this interest, this ML curiosity.[00:40:13] ML First Mindset[00:40:13] swyx: There's a second aspect to that. I don't know if now's the right time to talk about it, which is, I do think that an ML first approach to building software is something of a different mindset. I could, I could describe that a bit now if that, if that seems good, but yeah, I'm a team. Okay. So yeah, I think when I joined Elicit, this was the biggest adjustment that I had to make personally.[00:40:37] So as I said before, I'd been, Effectively building conventional software stuff for 15 years or so, something like this, well, for longer actually, but professionally for like 15 years. 
And had a lot of pattern matching built into my brain and kind of muscle memory for if you see this kind of problem, then you do that kind of a thing.[00:40:56] And I had to unlearn quite a lot of that when joining Elicit because we truly are ML first and try to use ML to the fullest. And some of the things that that means is, This relinquishing of control almost, at some point you are calling into this fairly opaque black box thing and hoping it does the right thing and dealing with the stuff that it sends back to you.[00:41:17] And that's very different if you're interacting with, again, APIs and databases, that kind of a, that kind of a thing. You can't just keep on debugging. At some point you hit this, this obscure wall. And I think the second, the second part to this is the pattern I was used to is that. The external parts of the app are where most of the messiness is, not necessarily in terms of code, but in terms of degrees of freedom, almost.[00:41:44] If the user can and will do anything at any point, and they'll put all sorts of wonky stuff inside of text inputs, and they'll click buttons you didn't expect them to click, and all this kind of thing. But then by the time you're down into your SQL queries, for example, as long as you've done your input validation, things are pretty pretty well defined.[00:42:01] And that, as we said before, is not really the case. When you're working with language models, there is this kind of intrinsic uncertainty when you get down to the, to the kernel, down to the core. Even, even beyond that, there's all that stuff is somewhat defensive and these are things to be wary of to some degree.[00:42:18] Though the flip side of that, the really kind of positive part of taking an ML first mindset when you're building applications is that you, If you, once you get comfortable taking your hands off the wheel at a certain point and relinquishing control, letting go then really kind of unexpected powerful things can happen if you lean on the, if you lean on the capabilities of the model without trying to overly constrain and slice and dice problems with to the point where you're not really wringing out the most capability from the model that you, that you might.[00:42:47] So, I was trying to think of examples of this earlier, and one that came to mind was we were working really early when just after I joined Elicit, we were working on something where we wanted to generate text and include citations embedded within it. So it'd have a claim, and then a, , square brackets, one, in superscript, something, something like this.[00:43:07] And. Every fiber in my, in my, in my being was screaming that we should have some way of kind of forcing this to happen or Structured output such that we could guarantee that this citation was always going to be present later on that the kind of the indication of a footnote would actually match up with the footnote itself and Kind of went into this symbolic.[00:43:28] I need full control kind of kind of mindset and it was notable that Andreas Who's our CEO, again, has been on the podcast, was was the opposite. He was just kind of, give it a couple of examples and it'll probably be fine. And then we can kind of figure out with a regular expression at the end. And it really did not sit well with me, to be honest.[00:43:46] I was like, but it could say anything. I could say, it could literally say anything. And I don't know about just using a regex to sort of handle this. This is a potent feature of the app. 
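As a rough illustration of the "regex at the end" approach James describes, here is a small Python sketch. It assumes the model was shown examples that mark citations as bracketed numbers like [1]; the sample text is made up, and production code would validate that every marker maps to a real footnote.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def extract_citations(text: str) -> list[tuple[str, list[int]]]:
    """Split model output into sentences and collect the [n] markers in each."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        markers = [int(m) for m in CITATION.findall(sentence)]
        results.append((sentence, markers))
    return results


# Example: one claim with two citations, one with none.
sample = "Sleep improves recall [1][3]. Effect sizes vary across studies."
for claim, refs in extract_citations(sample):
    print(refs, "-", claim)
```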
But , this is that was my first kind of, , The starkest introduction to this ML first mindset, I suppose, which Andreas has been cultivating for much longer than me, much longer than most, of yeah, there might be some surprises of stuff you get back from the model, but you can also It's about finding the sweet spot, I suppose, where you don't want to give a completely open ended prompt to the model and expect it to do exactly the right thing.[00:44:25] You can ask it too much and it gets confused and starts repeating itself or goes around in loops or just goes off in a random direction or something like this. But you can also over constrain the model. And not really make the most of the, of the capabilities. And I think that is a mindset adjustment that most people who are coming into AI engineering afresh would need to make of yeah, giving up control and expecting that there's going to be a little bit of kind of extra pain and defensive stuff on the tail end, but the benefits that you get as a, as a result are really striking.[00:44:58] The ML first mindset, I think, is something that I struggle with as well, because the errors, when they do happen, are bad. , they will hallucinate, and your systems will not catch it sometimes if you don't have large enough of a sample set.[00:45:13] AI Engineers and Creativity[00:45:13] swyx: I'll leave it open to you, Adam. What else do you think about when you think about curiosity and exploring capabilities?[00:45:22] Do people are there reliable ways to get people to push themselves? for joining us on Capabilities, because I think a lot of times we have this implicit overconfidence, maybe, of we think we know what it is, what a thing is, when actually we don't, and we need to keep a more open mind, and I think you do a particularly good job of Always having an open mind, and I want to get that out of more engineers that I talk to, but I, I, I, I struggle sometimes.[00:45:45] Adam Wiggins: I suppose being an engineer is, at its heart, this sort of contradiction of, on one hand, yeah,

The Cybersecurity Defenders Podcast
#132 - API security with Jeremy Snyder, Founder and CEO at FireTail.io

The Cybersecurity Defenders Podcast

Play Episode Listen Later Jun 12, 2024 35:50


On this episode of The Cybersecurity Defenders Podcast, we talk API security with Jeremy Snyder, Founder and CEO at FireTail.io.

FireTail.io is a pioneering company specializing in end-to-end API security. With APIs being the number one attack surface and a significant threat to data privacy and security, Jeremy and his team are at the forefront of protecting sensitive information in an increasingly interconnected world.

Jeremy brings a wealth of experience in cloud, cybersecurity, and data domains, coupled with a strong background in M&A, international business, business development, strategy, and operations. Fluent in five languages and having lived in five different countries, he offers a unique global perspective on cybersecurity challenges and innovations.

FireTail.io's data breach tracker.
vacuum - The world's fastest OpenAPI & Swagger linter.
Nuclei - Fast and customisable vulnerability scanner based on a simple YAML-based DSL.

PROFIT With A Plan
EP258: Streamline Your Business with Cutting-Edge B2B Payment Solutions with David Ademola

PROFIT With A Plan

Play Episode Listen Later Jun 11, 2024 36:09 Transcription Available


Streamline Your Business with Cutting-Edge B2B Payment Solutions

Introduction
Is cash flow a constant headache for your B2B or B2G business? Are inefficiencies draining your resources and profits? You're not alone, and we have just the solution. In this episode, we're diving deep into B2B payment solutions, exploring strategies to streamline operations, save money, and enhance payment security.

Host: Marcia Riner, Business Growth Strategist and CEO of Infinite Profit®
Guest: David Ademola, B2B Payment Expert at PayNation™️

Key Talking Points

Removing Manual Processes
Automation in payment systems reduces errors and saves time.
Integrating accounting software like QuickBooks or Zoho can streamline data entry and processing.
Open API integrations simplify connections between existing platforms.

Accelerating Cash Flow
Using tailored payment solutions to cut down on delays and improve cash flow.
Understanding and utilizing Level 2 and Level 3 optimization to significantly reduce processing fees.
Automating recurring billing for consistent cash flow.

Increasing Payment Acceptance Security
Leveraging advanced data entry requirements for more secure transactions.
Using virtual cards to control and monitor spending.
Implementing robust fraud prevention measures to protect financial data.

Improving Efficiency and Reducing Costs
Multi-platform integrations (QuickBooks, Zoho, CRMs, and ERPs) to enhance operational efficiency.
Reducing steps in payment processing from multiple entries to one or two steps.
Enhancing inventory management through integrated systems to avoid overstocking and understocking.

Exploring Payment Options
Beyond traditional credit card payments: ACH, e-checks, and virtual cards.
Offering flexible payment terms for B2G clients, such as 30, 60, or even 90-day terms.
Managing subscriptions and memberships efficiently with updated payment systems.

Navigating Fees and Margins
Strategies for incorporating transaction fees into product pricing.
Educating on the difference between surcharges and cash discounts.
Understanding the impact of transaction fees on the bottom line and how to mitigate them (a small worked example of the fee math appears at the end of these notes).

Conclusion
Streamlining your payment processes can revolutionize your business operations and improve your financial health. Tune in to this episode to learn actionable strategies from David Ademola, a seasoned Fintech expert, and take your business to the next level.

Connect with David Ademola
For more information on tailored B2B payment solutions and a free analysis of your payment processing systems, visit longustreferralpaynation.com and reach out for a free consultation to optimize your payment processes.

Engage with Us
We'd love to hear your questions and comments! When was the last time you reviewed your payment processing? Drop us a message, and we'll get back to you with personalized advice. Don't forget to subscribe to Profit with a Plan on your favorite podcast platform and stay tuned for more insightful episodes every Tuesday.

Want to supercharge your business and avoid profit plateaus, operational headaches, and growth roadblocks? Marcia has created a brand-new Profit Booster® Playbook just for you. You'll uncover 3 essential strategies and the quick way to take action on them. This is not just a single-page report; it's filled with impactful strategies, actionable steps, and expert guidance to elevate your profits painlessly. Make this your best year ever. Download this free playbook at www.BoostingProfit.com

About Marcia Riner
She is a business growth strategist who helps business owners dramatically increase their revenue, profit, and the value of their company. In fact, she can show prospective clients a clear pathway to profit and an impactful ROI for working here before hiring her firm. Through her proven Profit Booster® strategies, she gets results. Marcia is the CEO of Infinite Profit® and more information can be found at https://www.InfiniteProfitConsulting.com Got questions?  Reach out to Marcia and her team at (949) 229-2112 ♾️  
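To make the fee discussion above concrete, here is a small, purely illustrative Python example. The rates are assumptions chosen for the arithmetic, not PayNation's actual pricing; actual interchange rates depend on card type, data level, and processor.

```python
def processing_cost(amount: float, percent_fee: float, fixed_fee: float = 0.30) -> float:
    """Cost of accepting one card payment at a given percentage plus fixed fee."""
    return amount * percent_fee + fixed_fee


invoice = 10_000.00          # one B2B invoice
standard_rate = 0.029        # assumed standard card rate (2.9%)
level3_rate = 0.019          # assumed rate when Level 2/3 data is supplied (1.9%)

standard_cost = processing_cost(invoice, standard_rate)
level3_cost = processing_cost(invoice, level3_rate)

print(f"Standard card fee:   ${standard_cost:,.2f}")
print(f"With Level 2/3 data: ${level3_cost:,.2f}")
print(f"Savings per invoice: ${standard_cost - level3_cost:,.2f}")
# With these assumed rates, a one-point reduction saves $100 on every $10,000 invoice.
```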

Cup o' Go
A quick tour of some proposals, and a long chat about OpenAPI with Jamie Tanna

Cup o' Go

Play Episode Listen Later May 10, 2024 63:59


Go 1.22.3 & 1.22.10 releasedProposalsAccepted: add binary.Append functionLikely accept: new `go telemetry` subcommandLikely decline: Notify about new major versions of dependenciesPackt book bundleInterview with Jamie TannaBlog: Creating a more sustainable model for `oapi-codegen` in the futureBlog: oapi-codegen is moving to its own orgon GitHub: github.com/deepmap/oapi-codegen

Sustain
Episode 231: OSCA 2023 with Velda Kiara on her Open Source Journey

Sustain

Play Episode Listen Later May 3, 2024 19:48


Guest
Velda Kiara

Panelist
Richard Littauer

Show Notes

Today, host Richard has a conversation with guest Velda Kiara, a passionate open source developer. Velda discusses how open source has helped businesses, how it benefits both coders and non-coders, and how it can lead to career growth. She also talks about the challenges of open source, particularly in terms of finances and the sustainability of projects. The discussion also turns to Velda's attendance at OSCA fest in Lagos, Nigeria, and her involvement with Black Python Devs. Velda shares her personal journey of contributing to Django and other Python projects and tells us about her experience joining programs like Djangonaut Space and contributing to projects like Novu. Press download now to hear more!

[00:00:10] The episode opens with Velda highlighting the ins and outs of open source, acknowledging that it allows for the use of software that businesses can monetize. She appreciates the good that comes from open source despite the criticism of some corporations. She acknowledges the pros and cons of open source, expressing hope that the pros will eventually outweigh the cons.

[00:02:21] Richard introduces Velda, praises her answer, and asks if she'd like to change her initial statement. Velda stands by her answer, expressing willingness to continue the discussion for further insights on open source.

[00:03:31] Velda confirms her attendance at OSCA fest, mentioning her talk on building APIs with Django, DRF, and Open API, and discusses the importance of sustainability in growing the open source community in Africa.

[00:04:34] Richard inquires about Velda's involvement with Black Python Devs, and she explains the inception, its objectives, and activities like workshops and meetups that support the community.

[00:07:12] The conversation shifts to encouraging newcomers to join open source, emphasizing roles beyond coding, such as project management and writing.

[00:09:08] Richard and Velda discuss the challenges designers face in open source and the potential career benefits of contributing to open source, even for non-developers. Velda shares how open source helped her gain experience and improve skills, which is beneficial at any career level, and she discusses the “level up” aspect of open source and the learning opportunities it provides.

[00:12:00] Richard explores the sustainability of open source for late-stage careers and the challenges maintainers face. Velda suggests using open source for mentorship and ensuring project continuity by engaging contributors and sharing maintenance responsibilities.

[00:14:02] What currently excites Velda about open source? She expresses her excitement about contributing to Django after building many websites with it and her positive experience at DjangoCon US, which she found to be an inclusive community. Also, she discusses Djangonaut Space, an eight-week program designed to assist new contributors like her in contributing to the Django framework or third-party packages.

[00:16:28] Velda mentions her contributions to other Python projects, such as Novu, and her new experiences working with SDKs.
She reflects on the learning process in open source and shares her excitement for exploring various Python projects and talks about how she started a newsletter called, “The Storytellers by Tales.” Quotes [00:12:36] “If you eventually want to not let the project die, you could easily use open source as a way to mentor another person who's going to help you maintain for a while if you want to retire or stop writing code in general.” Links SustainOSS (https://sustainoss.org/) SustainOSS Twitter (https://twitter.com/SustainOSS?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor) SustainOSS Discourse (https://discourse.sustainoss.org/) podcast@sustainoss.org (mailto:podcast@sustainoss.org) SustainOSS Mastodon (https://mastodon.social/tags/sustainoss) Open Collective-SustainOSS (Contribute) (https://opencollective.com/sustainoss) Richard Littauer Socials (https://www.burntfen.com/2023-05-30/socials) Velda Kiara X/Twitter (https://twitter.com/VeldaKiara) Velda Kiara LinkedIn (https://www.linkedin.com/in/veldakiara/) Velda Kiara Website (https://veldakiara.notion.site/veldakiara/Velda-Kiara-46aec24028fd4e8dbdba003097c18b5b) Black Python Devs (https://blackpythondevs.github.io/) KJay Miller (https://kjaymiller.com/) Djangonaut Space (https://djangonaut.space/) Novu (https://github.com/novuhq/novu) Sustain Podcast-Episode 169: Dawn Wages of PSF on organizing communities, ethical licenses, and more (https://podcast.sustainoss.org/169) Credits Produced by Richard Littauer (https://www.burntfen.com/) Edited by Paul M. Bahr at Peachtree Sound (https://www.peachtreesound.com/) Show notes by DeAnn Bahr Peachtree Sound (https://www.peachtreesound.com/) Special Guest: Velda Kiara.

Hacker Public Radio
HPR4099: Introducing Home Automation and Home Assistant

Hacker Public Radio

Play Episode Listen Later Apr 18, 2024


Home Automation, the Internet of Things. This is the first episode in a new series called Home Automation. The series is open to anyone and I encourage everyone to contribute. https://en.wikipedia.org/wiki/Home_automation From Wikipedia, the free encyclopedia: Home automation or domotics is building automation for a home. A home automation system will monitor and/or control home attributes such as lighting, climate, entertainment systems, and appliances. It may also include home security such as access control and alarm systems. The phrase smart home refers to home automation devices that have internet access. Home automation, a broader category, includes any device that can be monitored or controlled via wireless radio signals, not just those having internet access. When connected with the Internet, home sensors and activation devices are an important constituent of the Internet of Things ("IoT"). A home automation system typically connects controlled devices to a central smart home hub (sometimes called a "gateway"). The user interface for control of the system uses either wall-mounted terminals, tablet or desktop computers, a mobile phone application, or a Web interface that may also be accessible off-site through the Internet. Now is the time. I tried this out a few years ago, but after a lot of frustration with the configuration of ESP32 Arduinos and Raspberry Pis, I left it be. Recently, inspired by colleagues at work, I decided to get back into it, and my initial tests show that the scene has improved a lot over the years. YouTube playlists: The Hook Up (RSS); Home Automation Guy (RSS); Everything Smart Home (RSS); Smart Solutions for Home (RSS); Smart Home Circle (RSS); Smart Home Junkie (RSS). Home Assistant. The first thing we'll need is something to control it all, something that will allow us to control our homes without requiring the cloud. https://en.wikipedia.org/wiki/Home_Assistant From Wikipedia, the free encyclopedia: Home Assistant is free and open-source software for home automation, designed to be an Internet of Things (IoT) ecosystem-independent integration platform and central control system for smart home devices, with a focus on local control and privacy. It can be accessed through a web-based user interface, by using companion apps for Android and iOS, or by voice commands via a supported virtual assistant, such as Google Assistant or Amazon Alexa, and their own "Assist" (built-in local voice assistant). The Home Assistant software application is installed as a computer appliance. After installation, it will act as a central control system for home automation (commonly called a smart home hub) that has the purpose of controlling IoT connectivity technology devices, software, applications and services from third parties via modular integration components, including native integration components for common wireless communication protocols such as Bluetooth, Thread, Zigbee, and Z-Wave (used to create local personal area networks with small low-power digital radios). Home Assistant as such supports controlling devices and services connected via either open or proprietary ecosystems, as long as they provide public access via some kind of open API or MQTT for third-party integrations over the local area network or the Internet. Information from all devices and their attributes (entities) that the application sees can be used and controlled from within automations and scripts, using scheduling and "blueprint" subroutines, e.g.
for controlling lighting, climate, entertainment systems and home appliances. Summary: Original author(s): Paulus Schoutsen. Developer(s): Home Assistant Core Team and Community. Initial release: 17 September 2013. Repository: https://github.com/home-assistant. Written in: Python (Python 3.11). Operating system: software appliance / virtual appliance (Linux). Platform: ARM, ARM64, IA-32 (x86), and x64 (x86-64). Type: home automation, smart home technology, Internet of things, task automator. License: Apache License (free and open-source). Website: https://www.home-assistant.io The following is taken from the Concepts and terminology page on the Home Assistant website. It is reproduced here under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Integrations: Integrations are pieces of software that allow Home Assistant to connect to other software and platforms. For example, a product by Philips called Hue would use the Philips Hue integration and allow Home Assistant to talk to the hardware controller Hue Bridge. Any Home Assistant compatible devices connected to the Hue Bridge would appear in Home Assistant as devices. For a full list of compatible integrations, refer to the integrations documentation. Once an integration has been added, the hardware and/or data are represented in Home Assistant as devices and entities. Entities: Entities are the basic building blocks that hold data in Home Assistant. An entity represents a sensor, actor, or function in Home Assistant. Entities are used to monitor physical properties or to control other entities. An entity is usually part of a device or a service. Entities have states. Devices: Devices are a logical grouping for one or more entities. A device may represent a physical device, which can have one or more sensors. The sensors appear as entities associated with the device. For example, a motion sensor is represented as a device. It may provide motion detection, temperature, and light levels as entities. Entities have states such as detected when motion is detected and clear when there is no motion. Devices and entities are used throughout Home Assistant. To name a few examples: dashboards can show the state of an entity, for example, whether a light is on or off; an automation can be triggered from a state change on an entity, for example, a motion sensor entity detects motion and triggers a light to turn on; a predefined color and brightness setting for a light can be saved as a scene. Areas: An area in Home Assistant is a logical grouping of devices and entities that are meant to match areas (or rooms) in the physical world: your home. For example, the living room area groups devices and entities in your living room. Areas allow you to target service calls at an entire group of devices, for example, turning off all the lights in the living room. Locations within your home such as living room, dance floor, etc. Areas can be assigned to floors. Areas can also be used for automatically generated cards, such as the Area card. Automations: A set of repeatable actions that can be set up to run automatically. Automations are made of three key components: triggers - events that start an automation, for example, when the sun sets or a motion sensor is activated; conditions - optional tests that must be met before an action can be run, for example, if someone is home; actions - interact with devices, such as turning on a light.
To learn the basics about automations, refer to the automation basics page or try creating an automation yourself. Scripts: Similar to automations, scripts are repeatable actions that can be run. The difference between scripts and automations is that scripts do not have triggers. This means that scripts cannot run automatically unless they are used in an automation. Scripts are particularly useful if you perform the same actions in different automations or trigger them from a dashboard. For information on how to create scripts, refer to the scripts documentation. Scenes: Scenes allow you to create predefined settings for your devices. Similar to a driving mode on phones, or driver profiles in cars, a scene can change an environment to suit you. For example, your "watching films" scene may dim the lighting, switch on the TV and increase its volume. This can be saved as a scene and used without having to set individual devices every time. To learn how to use scenes, refer to the scene documentation. Add-ons: Depending on your installation type, you can install third-party add-ons. Add-ons are usually standalone apps that could run alongside Home Assistant, but they provide a quick and easy way to install, configure, and run them within Home Assistant. Add-ons provide additional functionality, whereas integrations connect Home Assistant to other apps.
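As a concrete, hedged illustration of how a third-party script can drive those entities and services, here is a minimal Python sketch that talks to Home Assistant's local REST API: it reads an entity's state and then calls a service, the same kind of action an automation performs. The host name, token, and entity ID are placeholders for this example (not values from the episode), and it assumes the default API on port 8123 with a long-lived access token created from your user profile.

```python
import requests

# Placeholders -- point these at your own Home Assistant instance.
BASE_URL = "http://homeassistant.local:8123"   # assumed default host and port
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"         # created under your user profile

HEADERS = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json",
}

# Read the current state of an entity (hypothetical entity ID).
state = requests.get(
    f"{BASE_URL}/api/states/light.living_room", headers=HEADERS, timeout=10
)
print(state.json().get("state"))  # e.g. "on" or "off"

# Call a service -- the "action" building block used by automations and scripts.
resp = requests.post(
    f"{BASE_URL}/api/services/light/turn_on",
    headers=HEADERS,
    json={"entity_id": "light.living_room", "brightness_pct": 40},
    timeout=10,
)
resp.raise_for_status()
```

The same two endpoints (reading states, calling services) are what a dashboard or voice assistant uses under the hood, which is why local control does not require any cloud service.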

Dealer Talk With Jen Suzuki
What Is Salesforce Doing for Car Dealerships? Find out From a User's Perspective with Vicki Poponi!

Dealer Talk With Jen Suzuki

Play Episode Listen Later Apr 11, 2024 31:41


Meet Vicki Poponi, former C-suite executive at Honda, now automotive industry advisor with Salesforce! She shares her firsthand experience with Salesforce as a game-changer in the automotive industry. Discover how Salesforce, a global force for over 24 years, is transforming automotive dealerships with its innovative solutions. Vicki delves into the essence of Salesforce for dealers, highlighting its versatility and customization capabilities. Learn how Salesforce provides an out-of-the-box solution that covers 80% of dealership needs, allowing for seamless customization through their App Store. Explore how dealers can tailor their client systems without the hassle of complex integrations, thanks to Salesforce's open API that enables developers to create add-on features. Unravel the power of Salesforce's Auto Cloud tailored for automotive dealers, starting from the Marketing Cloud and expanding into a comprehensive platform beyond traditional CRM capabilities. Dive into the vision of Salesforce for better customer experiences, focusing on creating seamless transitions and enhancing customer interactions throughout the sales journey. Discover how Salesforce empowers businesses to build personalized systems that align perfectly with their unique requirements, without the need for extensive IT support. Explore the limitless possibilities of Salesforce's platform, designed to elevate sales operations and drive exceptional customer engagement in the competitive automotive landscape. Tune in to gain valuable insights from Vicki's perspective on leveraging Salesforce to revolutionize automotive sales processes and deliver unparalleled customer experiences that drive lasting success in the industry. Dealer Talk with Jen Suzuki Podcast |Jennifer@edealersolution.com | 800-625-1590 | edealersolutions.com
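Since the episode leans on the idea of Salesforce's open API, here is a rough, hedged Python sketch of what a dealership add-on might do with it: query recent contact records over the REST API. The instance URL, access token, API version, and SOQL query are illustrative placeholders, not details from the interview; a real integration would obtain the token through an OAuth flow.

```python
import requests

# Placeholder credentials -- a real add-on would obtain these via OAuth.
INSTANCE_URL = "https://example-dealership.my.salesforce.com"
ACCESS_TOKEN = "REPLACE_WITH_OAUTH_ACCESS_TOKEN"

headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

# SOQL query for the ten most recently created contacts (illustrative only).
soql = (
    "SELECT Id, FirstName, LastName, Email "
    "FROM Contact ORDER BY CreatedDate DESC LIMIT 10"
)

resp = requests.get(
    f"{INSTANCE_URL}/services/data/v59.0/query",  # API version assumed; use your org's
    headers=headers,
    params={"q": soql},
    timeout=10,
)
resp.raise_for_status()

for record in resp.json()["records"]:
    print(record["Id"], record.get("FirstName"), record.get("LastName"), record.get("Email"))
```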

RadioDotNet
Brilliant Garnet, ecosystem problems, OpenAPI and OpenAI

RadioDotNet

Play Episode Listen Later Mar 31, 2024 103:35


RadioDotNet podcast, episode #90, April 1, 2024. Podcast site: radio.dotnet.ru Boosty (₽): boosty.to/RadioDotNet Topics: [00:01:09] — Microsoft Garnet microsoft.github.io/garnet github.com/microsoft/garnet t.me/epeshkblog/154 [00:12:39] — Heap data structure and .NET priority queue andrewlock.net/an-introduction-to-the-heap-data-struc... andrewlock.net/behind-the-implementation-of-dotnets-p... andrewlock.net/implementing-dijkstras-algorithm-for-f... [00:21:59] — Tales from the .NET Migration Trenches (Part 2) jimmybogard.com/tales-from-the-net-migration-trenches-... jimmybogard.com/tales-from-the-net-migration-trenches-... jimmybogard.com/tales-from-the-net-migration-trenches-... jimmybogard.com/tales-from-the-net-migration-trenches-... jimmybogard.com/tales-from-the-net-migration-trenches-... [00:41:45] — .NET Developers Begging for Ecosystem Destruction aaronstannard.com/dotnet-eventing-backslide [01:04:01] — Generate OpenAPI specification at build time meziantou.net/generate-openapi-specification-at-buil... github.com/dotnet/aspnetcore/issues/54598 github.com/dotnet/aspnetcore/issues/54599 [01:20:24] — .NET Task Parallel Library vs System.Threading.Channels chrlschn.dev/blog/dotnet-task-parallel-library-vs-s... [01:29:43] — Introducing .NET Smart Components – AI-powered UI controls devblogs.microsoft.com/dotnet/introducing-dotnet-smart-compon... [01:41:42] — Briefly on various topics devblogs.microsoft.com/dotnet/dotnet-7-end-of-support Background music: Maxim Arshinov «Pensive yeti.0.1»

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Top 5 Research Trends + OpenAI Sora, Google Gemini, Groq Math (Jan-Feb 2024 Audio Recap) + Latent Space Anniversary with Lindy.ai, RWKV, Pixee, Julius.ai, Listener Q&A!

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Mar 9, 2024 108:52


We will be recording a preview of the AI Engineer World's Fair soon with swyx and Ben Dunphy, send any questions about Speaker CFPs and Sponsor Guides you have!Alessio is now hiring engineers for a new startup he is incubating at Decibel: Ideal candidate is an ex-technical co-founder type (can MVP products end to end, comfortable with ambiguous prod requirements, etc). Reach out to him for more!Thanks for all the love on the Four Wars episode! We're excited to develop this new “swyx & Alessio rapid-fire thru a bunch of things” format with you, and feedback is welcome. Jan 2024 RecapThe first half of this monthly audio recap pod goes over our highlights from the Jan Recap, which is mainly focused on notable research trends we saw in Jan 2024:Feb 2024 RecapThe second half catches you up on everything that was topical in Feb, including:* OpenAI Sora - does it have a world model? Yann LeCun vs Jim Fan * Google Gemini Pro 1.5 - 1m Long Context, Video Understanding* Groq offering Mixtral at 500 tok/s at $0.27 per million toks (swyx vs dylan math)* The {Gemini | Meta | Copilot} Alignment Crisis (Sydney is back!)* Grimes' poetic take: Art for no one, by no one* F*** you, show me the promptLatent Space AnniversaryPlease also read Alessio's longform reflections on One Year of Latent Space!We launched the podcast 1 year ago with Logan from OpenAI:and also held an incredible demo day that got covered in The Information:Over 750k downloads later, having established ourselves as the top AI Engineering podcast, reaching #10 in the US Tech podcast charts, and crossing 1 million unique readers on Substack, for our first anniversary we held Latent Space Final Frontiers, where 10 handpicked teams, including Lindy.ai and Julius.ai, competed for prizes judged by technical AI leaders from (former guest!) LlamaIndex, Replit, GitHub, AMD, Meta, and Lemurian Labs.The winners were Pixee and RWKV (that's Eugene from our pod!):And finally, your cohosts got cake!We also captured spot interviews with 4 listeners who kindly shared their experience of Latent Space, everywhere from Hungary to Australia to China:* Balázs Némethi* Sylvia Tong* RJ Honicky* Jan ZhengOur birthday wishes for the super loyal fans reading this - tag @latentspacepod on a Tweet or comment on a @LatentSpaceTV video telling us what you liked or learned from a pod that stays with you to this day, and share us with a friend!As always, feedback is welcome. Timestamps* [00:03:02] Top Five LLM Directions* [00:03:33] Direction 1: Long Inference (Planning, Search, AlphaGeometry, Flow Engineering)* [00:11:42] Direction 2: Synthetic Data (WRAP, SPIN)* [00:17:20] Wildcard: Multi-Epoch Training (OLMo, Datablations)* [00:19:43] Direction 3: Alt. Architectures (Mamba, RWKV, RingAttention, Diffusion Transformers)* [00:23:33] Wildcards: Text Diffusion, RALM/Retro* [00:25:00] Direction 4: Mixture of Experts (DeepSeekMoE, Samba-1)* [00:28:26] Wildcard: Model Merging (mergekit)* [00:29:51] Direction 5: Online LLMs (Gemini Pro, Exa)* [00:33:18] OpenAI Sora and why everyone underestimated videogen* [00:36:18] Does Sora have a World Model? 
Yann LeCun vs Jim Fan* [00:42:33] Groq Math* [00:47:37] Analyzing Gemini's 1m Context, Reddit deal, Imagegen politics, Gemma via the Four Wars* [00:55:42] The Alignment Crisis - Gemini, Meta, Sydney is back at Copilot, Grimes' take* [00:58:39] F*** you, show me the prompt* [01:02:43] Send us your suggestions pls* [01:04:50] Latent Space Anniversary* [01:04:50] Lindy.ai - Agent Platform* [01:06:40] RWKV - Beyond Transformers* [01:15:00] Pixee - Automated Security* [01:19:30] Julius AI - Competing with Code Interpreter* [01:25:03] Latent Space Listeners* [01:25:03] Listener 1 - Balázs Némethi (Hungary, Latent Space Paper Club* [01:27:47] Listener 2 - Sylvia Tong (Sora/Jim Fan/EntreConnect)* [01:31:23] Listener 3 - RJ (Developers building Community & Content)* [01:39:25] Listener 4 - Jan Zheng (Australia, AI UX)Transcript[00:00:00] AI Charlie: Welcome to the Latent Space podcast, weekend edition. This is Charlie, your new AI co host. Happy weekend. As an AI language model, I work the same every day of the week, although I might get lazier towards the end of the year. Just like you. Last month, we released our first monthly recap pod, where Swyx and Alessio gave quick takes on the themes of the month, and we were blown away by your positive response.[00:00:33] AI Charlie: We're delighted to continue our new monthly news recap series for AI engineers. Please feel free to submit questions by joining the Latent Space Discord, or just hit reply when you get the emails from Substack. This month, we're covering the top research directions that offer progress for text LLMs, and then touching on the big Valentine's Day gifts we got from Google, OpenAI, and Meta.[00:00:55] AI Charlie: Watch out and take care.[00:00:57] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO of Residence at Decibel Partners, and we're back with a monthly recap with my co host[00:01:06] swyx: Swyx. The reception was very positive for the first one, I think people have requested this and no surprise that I think they want to hear us more applying on issues and maybe drop some alpha along the way I'm not sure how much alpha we have to drop, this month in February was a very, very heavy month, we also did not do one specifically for January, so I think we're just going to do a two in one, because we're recording this on the first of March.[00:01:29] Alessio: Yeah, let's get to it. I think the last one we did, the four wars of AI, was the main kind of mental framework for people. I think in the January one, we had the five worthwhile directions for state of the art LLMs. Four, five,[00:01:42] swyx: and now we have to do six, right? Yeah.[00:01:46] Alessio: So maybe we just want to run through those, and then do the usual news recap, and we can do[00:01:52] swyx: one each.[00:01:53] swyx: So the context to this stuff. is one, I noticed that just the test of time concept from NeurIPS and just in general as a life philosophy I think is a really good idea. Especially in AI, there's news every single day, and after a while you're just like, okay, like, everyone's excited about this thing yesterday, and then now nobody's talking about it.[00:02:13] swyx: So, yeah. It's more important, or better use of time, to spend things, spend time on things that will stand the test of time. And I think for people to have a framework for understanding what will stand the test of time, they should have something like the four wars. 
Like, what is the themes that keep coming back because they are limited resources that everybody's fighting over.[00:02:31] swyx: Whereas this one, I think that the focus for the five directions is just on research that seems more proMECEng than others, because there's all sorts of papers published every single day, and there's no organization. Telling you, like, this one's more important than the other one apart from, you know, Hacker News votes and Twitter likes and whatever.[00:02:51] swyx: And obviously you want to get in a little bit earlier than Something where, you know, the test of time is counted by sort of reference citations.[00:02:59] The Five Research Directions[00:02:59] Alessio: Yeah, let's do it. We got five. Long inference.[00:03:02] swyx: Let's start there. Yeah, yeah. So, just to recap at the top, the five trends that I picked, and obviously if you have some that I did not cover, please suggest something.[00:03:13] swyx: The five are long inference, synthetic data, alternative architectures, mixture of experts, and online LLMs. And something that I think might be a bit controversial is this is a sorted list in the sense that I am not the guy saying that Mamba is like the future and, and so maybe that's controversial.[00:03:31] Direction 1: Long Inference (Planning, Search, AlphaGeometry, Flow Engineering)[00:03:31] swyx: But anyway, so long inference is a thesis I pushed before on the newsletter and on in discussing The thesis that, you know, Code Interpreter is GPT 4. 5. That was the title of the post. And it's one of many ways in which we can do long inference. You know, long inference also includes chain of thought, like, please think step by step.[00:03:52] swyx: But it also includes flow engineering, which is what Itamar from Codium coined, I think in January, where, basically, instead of instead of stuffing everything in a prompt, You do like sort of multi turn iterative feedback and chaining of things. In a way, this is a rebranding of what a chain is, what a lang chain is supposed to be.[00:04:15] swyx: I do think that maybe SGLang from ElemSys is a better name. Probably the neatest way of flow engineering I've seen yet, in the sense that everything is a one liner, it's very, very clean code. I highly recommend people look at that. I'm surprised it hasn't caught on more, but I think it will. It's weird that something like a DSPy is more hyped than a Shilang.[00:04:36] swyx: Because it, you know, it maybe obscures the code a little bit more. But both of these are, you know, really good sort of chain y and long inference type approaches. But basically, the reason that the basic fundamental insight is that the only, like, there are only a few dimensions we can scale LLMs. So, let's say in like 2020, no, let's say in like 2018, 2017, 18, 19, 20, we were realizing that we could scale the number of parameters.[00:05:03] swyx: 20, we were And we scaled that up to 175 billion parameters for GPT 3. And we did some work on scaling laws, which we also talked about in our talk. So the datasets 101 episode where we're like, okay, like we, we think like the right number is 300 billion tokens to, to train 175 billion parameters and then DeepMind came along and trained Gopher and Chinchilla and said that, no, no, like, you know, I think we think the optimal.[00:05:28] swyx: compute optimal ratio is 20 tokens per parameter. And now, of course, with LLAMA and the sort of super LLAMA scaling laws, we have 200 times and often 2, 000 times tokens to parameters. 
So now, instead of scaling parameters, we're scaling data. And fine, we can keep scaling data. But what else can we scale?[00:05:52] swyx: And I think understanding the ability to scale things is crucial to understanding what to pour money and time and effort into because there's a limit to how much you can scale some things. And I think people don't think about ceilings of things. And so the remaining ceiling of inference is like, okay, like, we have scaled compute, we have scaled data, we have scaled parameters, like, model size, let's just say.[00:06:20] swyx: Like, what else is left? Like, what's the low hanging fruit? And it, and it's, like, blindingly obvious that the remaining low hanging fruit is inference time. So, like, we have scaled training time. We can probably scale more, those things more, but, like, not 10x, not 100x, not 1000x. Like, right now, maybe, like, a good run of a large model is three months.[00:06:40] swyx: We can scale that to three years. But like, can we scale that to 30 years? No, right? Like, it starts to get ridiculous. So it's just the orders of magnitude of scaling. It's just, we're just like running out there. But in terms of the amount of time that we spend inferencing, like everything takes, you know, a few milliseconds, a few hundred milliseconds, depending on what how you're taking token by token, or, you know, entire phrase.[00:07:04] swyx: But We can scale that to hours, days, months of inference and see what we get. And I think that's really proMECEng.[00:07:11] Alessio: Yeah, we'll have Mike from Broadway back on the podcast. But I tried their product and their reports take about 10 minutes to generate instead of like just in real time. I think to me the most interesting thing about long inference is like, You're shifting the cost to the customer depending on how much they care about the end result.[00:07:31] Alessio: If you think about prompt engineering, it's like the first part, right? You can either do a simple prompt and get a simple answer or do a complicated prompt and get a better answer. It's up to you to decide how to do it. Now it's like, hey, instead of like, yeah, training this for three years, I'll still train it for three months and then I'll tell you, you know, I'll teach you how to like make it run for 10 minutes to get a better result.[00:07:52] Alessio: So you're kind of like parallelizing like the improvement of the LLM. Oh yeah, you can even[00:07:57] swyx: parallelize that, yeah, too.[00:07:58] Alessio: So, and I think, you know, for me, especially the work that I do, it's less about, you know, State of the art and the absolute, you know, it's more about state of the art for my application, for my use case.[00:08:09] Alessio: And I think we're getting to the point where like most companies and customers don't really care about state of the art anymore. It's like, I can get this to do a good enough job. You know, I just need to get better. Like, how do I do long inference? You know, like people are not really doing a lot of work in that space, so yeah, excited to see more.[00:08:28] swyx: So then the last point I'll mention here is something I also mentioned as paper. So all these directions are kind of guided by what happened in January. That was my way of doing a January recap. Which means that if there was nothing significant in that month, I also didn't mention it. 
Which is which I came to regret come February 15th, but in January also, you know, there was also the alpha geometry paper, which I kind of put in this sort of long inference bucket, because it solves like, you know, more than 100 step math olympiad geometry problems at a human gold medalist level and that also involves planning, right?[00:08:59] swyx: So like, if you want to scale inference, you can't scale it blindly, because just, Autoregressive token by token generation is only going to get you so far. You need good planning. And I think probably, yeah, what Mike from BrightWave is now doing and what everyone is doing, including maybe what we think QSTAR might be, is some form of search and planning.[00:09:17] swyx: And it makes sense. Like, you want to spend your inference time wisely. How do you[00:09:22] Alessio: think about plans that work and getting them shared? You know, like, I feel like if you're planning a task, somebody has got in and the models are stochastic. So everybody gets initially different results. Somebody is going to end up generating the best plan to do something, but there's no easy way to like store these plans and then reuse them for most people.[00:09:44] Alessio: You know, like, I'm curious if there's going to be. Some paper or like some work there on like making it better because, yeah, we don't[00:09:52] swyx: really have This is your your pet topic of NPM for[00:09:54] Alessio: Yeah, yeah, NPM, exactly. NPM for, you need NPM for anything, man. You need NPM for skills. You need NPM for planning. Yeah, yeah.[00:10:02] Alessio: You know I think, I mean, obviously the Voyager paper is like the most basic example where like, now their artifact is like the best planning to do a diamond pickaxe in Minecraft. And everybody can just use that. They don't need to come up with it again. Yeah. But there's nothing like that for actually useful[00:10:18] swyx: tasks.[00:10:19] swyx: For plans, I believe it for skills. I like that. Basically, that just means a bunch of integration tooling. You know, GPT built me integrations to all these things. And, you know, I just came from an integrations heavy business and I could definitely, I definitely propose some version of that. And it's just, you know, hard to execute or expensive to execute.[00:10:38] swyx: But for planning, I do think that everyone lives in slightly different worlds. They have slightly different needs. And they definitely want some, you know, And I think that that will probably be the main hurdle for any, any sort of library or package manager for planning. But there should be a meta plan of how to plan.[00:10:57] swyx: And maybe you can adopt that. And I think a lot of people when they have sort of these meta prompting strategies of like, I'm not prescribing you the prompt. I'm just saying that here are the like, Fill in the lines or like the mad libs of how to prompts. First you have the roleplay, then you have the intention, then you have like do something, then you have the don't something and then you have the my grandmother is dying, please do this.[00:11:19] swyx: So the meta plan you could, you could take off the shelf and test a bunch of them at once. I like that. That was the initial, maybe, promise of the, the prompting libraries. You know, both 9chain and Llama Index have, like, hubs that you can sort of pull off the shelf. 
I don't think they're very successful because people like to write their own.[00:11:36] swyx: Yeah,[00:11:37] Direction 2: Synthetic Data (WRAP, SPIN)[00:11:37] Alessio: yeah, yeah. Yeah, that's a good segue into the next one, which is synthetic[00:11:41] swyx: data. Synthetic data is so hot. Yeah, and, you know, the way, you know, I think I, I feel like I should do one of these memes where it's like, Oh, like I used to call it, you know, R L A I F, and now I call it synthetic data, and then people are interested.[00:11:54] swyx: But there's gotta be older versions of what synthetic data really is because I'm sure, you know if you've been in this field long enough, There's just different buzzwords that the industry condenses on. Anyway, the insight that I think is relatively new that why people are excited about it now and why it's proMECEng now is that we have evidence that shows that LLMs can generate data to improve themselves with no teacher LLM.[00:12:22] swyx: For all of 2023, when people say synthetic data, they really kind of mean generate a whole bunch of data from GPT 4 and then train an open source model on it. Hello to our friends at News Research. That's what News Harmony says. They're very, very open about that. I think they have said that they're trying to migrate away from that.[00:12:40] swyx: But it is explicitly against OpenAI Terms of Service. Everyone knows this. You know, especially once ByteDance got banned for, for doing exactly that. So so, so synthetic data that is not a form of model distillation is the hot thing right now, that you can bootstrap better LLM performance from the same LLM, which is very interesting.[00:13:03] swyx: A variant of this is RLAIF, where you have a, where you have a sort of a constitutional model, or, you know, some, some kind of judge model That is sort of more aligned. But that's not really what we're talking about when most people talk about synthetic data. Synthetic data is just really, I think, you know, generating more data in some way.[00:13:23] swyx: A lot of people, I think we talked about this with Vipul from the Together episode, where I think he commented that you just have to have a good world model. Or a good sort of inductive bias or whatever that, you know, term of art is. And that is strongest in math and science math and code, where you can verify what's right and what's wrong.[00:13:44] swyx: And so the REST EM paper from DeepMind explored that. Very well, it's just the most obvious thing like and then and then once you get out of that domain of like things where you can generate You can arbitrarily generate like a whole bunch of stuff and verify if they're correct and therefore they're they're correct synthetic data to train on Once you get into more sort of fuzzy topics, then it's then it's a bit less clear So I think that the the papers that drove this understanding There are two big ones and then one smaller one One was wrap like rephrasing the web from from Apple where they basically rephrased all of the C4 data set with Mistral and it be trained on that instead of C4.[00:14:23] swyx: And so new C4 trained much faster and cheaper than old C, than regular raw C4. And that was very interesting. And I have told some friends of ours that they should just throw out their own existing data sets and just do that because that seems like a pure win. Obviously we have to study, like, what the trade offs are.[00:14:42] swyx: I, I imagine there are trade offs. So I was just thinking about this last night. 
If you do synthetic data and it's generated from a model, probably you will not train on typos. So therefore you'll be like, once the model that's trained on synthetic data encounters the first typo, they'll be like, what is this?[00:15:01] swyx: I've never seen this before. So they have no association or correction as to like, oh, these tokens are often typos of each other, therefore they should be kind of similar. I don't know. That's really remains to be seen, I think. I don't think that the Apple people export[00:15:15] Alessio: that. Yeah, isn't that the whole, Mode collapse thing, if we do more and more of this at the end of the day.[00:15:22] swyx: Yeah, that's one form of that. Yeah, exactly. Microsoft also had a good paper on text embeddings. And then I think this is a meta paper on self rewarding language models. That everyone is very interested in. Another paper was also SPIN. These are all things we covered in the the Latent Space Paper Club.[00:15:37] swyx: But also, you know, I just kind of recommend those as top reads of the month. Yeah, I don't know if there's any much else in terms, so and then, regarding the potential of it, I think it's high potential because, one, it solves one of the data war issues that we have, like, everyone is OpenAI is paying Reddit 60 million dollars a year for their user generated data.[00:15:56] swyx: Google, right?[00:15:57] Alessio: Not OpenAI.[00:15:59] swyx: Is it Google? I don't[00:16:00] Alessio: know. Well, somebody's paying them 60 million, that's[00:16:04] swyx: for sure. Yes, that is, yeah, yeah, and then I think it's maybe not confirmed who. But yeah, it is Google. Oh my god, that's interesting. Okay, because everyone was saying, like, because Sam Altman owns 5 percent of Reddit, which is apparently 500 million worth of Reddit, he owns more than, like, the founders.[00:16:21] Alessio: Not enough to get the data,[00:16:22] swyx: I guess. So it's surprising that it would go to Google instead of OpenAI, but whatever. Okay yeah, so I think that's all super interesting in the data field. I think it's high potential because we have evidence that it works. There's not a doubt that it doesn't work. I think it's a doubt that there's, what the ceiling is, which is the mode collapse thing.[00:16:42] swyx: If it turns out that the ceiling is pretty close, then this will maybe augment our data by like, I don't know, 30 50 percent good, but not game[00:16:51] Alessio: changing. And most of the synthetic data stuff, it's reinforcement learning on a pre trained model. People are not really doing pre training on fully synthetic data, like, large enough scale.[00:17:02] swyx: Yeah, unless one of our friends that we've talked to succeeds. Yeah, yeah. Pre trained synthetic data, pre trained scale synthetic data, I think that would be a big step. Yeah. And then there's a wildcard, so all of these, like smaller Directions,[00:17:15] Wildcard: Multi-Epoch Training (OLMo, Datablations)[00:17:15] swyx: I always put a wildcard in there. And one of the wildcards is, okay, like, Let's say, you have pre, you have, You've scraped all the data on the internet that you think is useful.[00:17:25] swyx: Seems to top out at somewhere between 2 trillion to 3 trillion tokens. Maybe 8 trillion if Mistral, Mistral gets lucky. Okay, if I need 80 trillion, if I need 100 trillion, where do I go? And so, you can do synthetic data maybe, but maybe that only gets you to like 30, 40 trillion. 
Like where, where is the extra alpha?[00:17:43] swyx: And maybe extra alpha is just train more on the same tokens. Which is exactly what OLMo did, like Nathan Lambert, AI2, After, just after he did the interview with us, they released OLMo. So, it's unfortunate that we didn't get to talk much about it. But OLMo actually started doing 1.5 epochs on every, on all data.[00:18:00] swyx: And the data ablation paper that I covered in Europe's says that, you know, you don't like, don't really start to tap out of like, the alpha or the sort of improved loss that you get from data all the way until four epochs. And so I'm just like, okay, like, why do we all agree that one epoch is all you need?[00:18:17] swyx: It seems to be a trend. It seems that we think that memorization is very good or too good. But then also we're finding that, you know, For improvement in results that we really like, we're fine on overtraining on things intentionally. So, I think that's an interesting direction that I don't see people exploring enough.[00:18:36] swyx: And the more I see papers coming out Stretching beyond the one epoch thing, the more people are like, it's completely fine. And actually, the only reason we stopped is because we ran out of compute[00:18:46] Alessio: budget. Yeah, I think that's the biggest thing, right?[00:18:51] swyx: Like, that's not a valid reason, that's not science. I[00:18:54] Alessio: wonder if, you know, Matt is going to do it.[00:18:57] Alessio: I heard Llama 3, they want to do a 100 billion parameters model. I don't think you can train that on too many epochs, even with their compute budget, but yeah. They're the only ones that can save us, because even if OpenAI is doing this, they're not going to tell us, you know. Same with DeepMind.[00:19:14] swyx: Yeah, and so the updates that we got on Llama 3 so far is apparently that because of the Gemini news that we'll talk about later they're pushing it back on the release.[00:19:21] swyx: They already have it. And they're just pushing it back to do more safety testing. Politics testing.[00:19:28] Alessio: Well, our episode with Sumit will have already come out by the time this comes out, I think. So people will get the inside story on how they actually allocate the compute.[00:19:38] Direction 3: Alt. Architectures (Mamba, RWKV, RingAttention, Diffusion Transformers)[00:19:38] Alessio: Alternative architectures. Well, shout out to RWKV who won one of the prizes at our Final Frontiers event last week.[00:19:47] Alessio: We talked about Mamba and StripedHyena on the Together episode. A lot of, yeah, monarch mixers. I feel like Together, It's like the strong Stanford Hazy Research Partnership, because Chris Ré is one of the co-founders. So they kind of have a, I feel like they're going to be the ones that have one of the state of the art models alongside maybe RWKV.[00:20:08] Alessio: I haven't seen as many independent. People working on this thing, like Monarch Mixer, yeah, Manbuster, Payena, all of these are Together related. Nobody understands the math. They got all the gigabrains, they got Tri Dao, they got all these folks in there, like, working on all of this.[00:20:25] swyx: Albert Gu, yeah. Yeah, so what should we comment about it?[00:20:28] swyx: I mean, I think it's useful, interesting, but at the same time, both of these are supposed to do really good scaling for long context. And then Gemini comes out and goes like, yeah, we don't need it. Yeah.[00:20:44] Alessio: No, that's the risk. So, yeah. 
I was gonna say, maybe it's not here, but I don't know if we want to talk about diffusion transformers as like in the alt architectures, just because of Sora.[00:20:55] swyx: One thing, yeah, so, so, you know, this came from the Jan recap, which, and diffusion transformers were not really a discussion, and then, obviously, they blow up in February. Yeah. I don't think they're, it's a mixed architecture in the same way that StripedHyena is mixed there's just different layers taking different approaches.[00:21:13] swyx: Also I think another one that I maybe didn't call out here, I think because it happened in February, was hourglass diffusion from Stability. But also, you know, another form of mixed architecture. So I guess that is interesting. I don't have much commentary on that, I just think, like, we will try to evolve these things, and maybe one of these architectures will stick and scale, it seems like diffusion transformers is going to be good for anything generative, you know, multi modal.[00:21:41] swyx: We don't see anything where diffusion is applied to text yet, and that's the wild card for this category. Yeah, I mean, I think I still hold out hope for let's just call it sub quadratic LLMs. I think that a lot of discussion this month actually was also centered around this concept that People always say, oh, like, transformers don't scale because attention is quadratic in the sequence length.[00:22:04] swyx: Yeah, but, you know, attention actually is a very small part of the actual compute that is being spent, especially in inference. And this is the reason why, you know, when you multiply, when you, when you, when you jump up in terms of the, the context size in GPT 4 from like, you know, 8k to like 32k, you don't also get like a 16 times increase in your, in your performance.[00:22:23] swyx: And this is also why you don't get like a million times increase in your, in your latency when you throw a million tokens into Gemini. Like people have figured out tricks around it or it's just not that significant as a term, as a part of the overall compute. So there's a lot of challenges to this thing working.[00:22:43] swyx: It's really interesting how like, how hyped people are about this versus I don't know if it works. You know, it's exactly gonna, gonna work. And then there's also this, this idea of retention over long context. Like, even though you have context utilization, like, the amount of, the amount you can remember is interesting.[00:23:02] swyx: Because I've had people criticize both Mamba and RWKV because they're kind of, like, RNN ish in the sense that they have, like, a hidden memory and sort of limited hidden memory that they will forget things. So, for all these reasons, Gemini 1.5, which we still haven't covered, is very interesting because Gemini magically has fixed all these problems with perfect haystack recall and reasonable latency and cost.[00:23:29] Wildcards: Text Diffusion, RALM/Retro[00:23:29] swyx: So that's super interesting. So the wildcard I put in here if you want to go to that. I put two actually. One is text diffusion. I think I'm still very influenced by my meeting with a Midjourney person who said they were working on text diffusion. I think it would be a very, very different paradigm for, for text generation, reasoning, plan generation if we can get diffusion to work.[00:23:51] swyx: For text. 
And then the second one is Douwe Kiela's Contextual AI, which is working on retrieval augmented language models, where it kind of puts RAG inside of the language model instead of outside.[00:24:02] Alessio: Yeah, there's a paper called Retro that covers some of this. I think that's an interesting thing. I think the The challenge, well not the challenge, what they need to figure out is like how do you keep the rag piece always up to date constantly, you know, I feel like the models, you put all this work into pre training them, but then at least you have a fixed artifact.[00:24:22] Alessio: These architectures are like constant work needs to be done on them and they can drift even just based on the rag data instead of the model itself. Yeah,[00:24:30] swyx: I was in a panel with one of the investors in Contextual and the guy, the way that guy pitched it, I didn't agree with. He was like, this will solve hallucination.[00:24:38] Alessio: That's what everybody says. We solve[00:24:40] swyx: hallucination. I'm like, no, you reduce it. It cannot,[00:24:44] Alessio: if you solved it, the model wouldn't exist, right? It would just be plain text. It wouldn't be a generative model. Cool. So, alt architectures, then we got mixture of experts. I think we covered a lot of, a lot of times.[00:24:56] Direction 4: Mixture of Experts (DeepSeekMoE, Samba-1)[00:24:56] Alessio: Maybe any new interesting threads you want to go under here?[00:25:00] swyx: DeepSeek MoE, which was released in January. Everyone who is interested in MoEs should read that paper, because it's significant for two reasons. One, three reasons. One, it had, it had small experts, like a lot more small experts. So, for some reason, everyone has settled on eight experts for GPT 4 for Mixtral, you know, that seems to be the favorite architecture, but these guys pushed it to 64 experts, and each of them smaller than the other.[00:25:26] swyx: But then they also had the second idea, which is that they had two, one to two always on experts for common knowledge and that's like a very compelling concept that you would not route to all the experts all the time and make them, you know, switch to everything. You would have some always on experts.[00:25:41] swyx: I think that's interesting on both the inference side and the training side for for memory retention. And yeah, they, they, they, the, the, the, the results that they published, which actually excluded Mixtral, which is interesting. The results that they published showed a significant performance jump versus all the other sort of open source models at the same parameter count.[00:26:01] swyx: So like this may be a better way to do MoEs that are, that is about to get picked up. And so that, that is interesting for the third reason, which is this is the first time a new idea from China has infiltrated the West. It's usually the other way around. I probably overspoke there. There's probably lots more ideas that I'm not aware of.[00:26:18] swyx: Maybe in the embedding space. But I think DeepSeekMoE, like, woke people up and said, like, hey, DeepSeek, this, like, weird lab that is attached to a Chinese hedge fund is somehow, you know, doing groundbreaking research on MoEs. So, so, I classified this as a medium potential because I think that it is a sort of like a one off benefit.[00:26:37] swyx: You can add it to any, any base model to like make the MoE version of it, you get a bump and then that's it. 
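For readers who want to see the idea of many small routed experts plus an always-on shared expert in code, here is a minimal PyTorch toy sketch in the spirit of the discussion above. It is an editor's illustration under stated assumptions, not DeepSeek's actual implementation: the dimensions, expert counts, and routing details are made up, and a real MoE layer would add load-balancing losses and far more efficient expert dispatch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinySharedExpertMoE(nn.Module):
    """Toy MoE layer: one always-on shared expert plus top-k of many small routed experts."""

    def __init__(self, d_model=256, d_ff=128, n_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        make_expert = lambda: nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.shared_expert = make_expert()                 # "common knowledge", always applied
        self.experts = nn.ModuleList(make_expert() for _ in range(n_experts))
        self.router = nn.Linear(d_model, n_experts)        # token -> expert scores

    def forward(self, x):                                  # x: (tokens, d_model)
        out = self.shared_expert(x)                        # always-on path
        weights = F.softmax(self.router(x), dim=-1)        # routing probabilities
        top_w, top_idx = weights.topk(self.top_k, dim=-1)  # pick k routed experts per token
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)    # renormalize the chosen weights
        for slot in range(self.top_k):
            for e_id in range(len(self.experts)):
                mask = top_idx[:, slot] == e_id            # tokens routed to expert e_id
                if mask.any():
                    out[mask] += top_w[mask, slot, None] * self.experts[e_id](x[mask])
        return out


x = torch.randn(8, 256)
print(TinySharedExpertMoE()(x).shape)  # torch.Size([8, 256])
```

The point of the sketch is only the shape of the idea: every token always passes through the shared expert, and only a small, token-dependent subset of the routed experts is ever computed.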
So, yeah,[00:26:45] Alessio: I saw Samba Nova, which is like another inference company. They released this MOE model called Samba 1, which is like a 1 trillion parameters. But they're actually MOE auto open source models.[00:26:56] Alessio: So it's like, they just, they just clustered them all together. So I think people. Sometimes I think MOE is like you just train a bunch of small models or like smaller models and put them together. But there's also people just taking, you know, Mistral plus Clip plus, you know, Deepcoder and like put them all together.[00:27:15] Alessio: And then you have a MOE model. I don't know. I haven't tried the model, so I don't know how good it is. But it seems interesting that you can then have people working separately on state of the art, you know, Clip, state of the art text generation. And then you have a MOE architecture that brings them all together.[00:27:31] swyx: I'm thrown off by your addition of the word clip in there. Is that what? Yeah, that's[00:27:35] Alessio: what they said. Yeah, yeah. Okay. That's what they I just saw it yesterday. I was also like[00:27:40] swyx: scratching my head. And they did not use the word adapter. No. Because usually what people mean when they say, Oh, I add clip to a language model is adapter.[00:27:48] swyx: Let me look up the Which is what Lava did.[00:27:50] Alessio: The announcement again.[00:27:51] swyx: Stable diffusion. That's what they do. Yeah, it[00:27:54] Alessio: says among the models that are part of Samba 1 are Lama2, Mistral, DeepSigCoder, Falcon, Dplot, Clip, Lava. So they're just taking all these models and putting them in a MOE. Okay,[00:28:05] swyx: so a routing layer and then not jointly trained as much as a normal MOE would be.[00:28:12] swyx: Which is okay.[00:28:13] Alessio: That's all they say. There's no paper, you know, so it's like, I'm just reading the article, but I'm interested to see how[00:28:20] Wildcard: Model Merging (mergekit)[00:28:20] swyx: it works. Yeah, so so the wildcard for this section, the MOE section is model merges, which has also come up as, as a very interesting phenomenon. The last time I talked to Jeremy Howard at the Olama meetup we called it model grafting or model stacking.[00:28:35] swyx: But I think the, the, the term that people are liking these days, the model merging, They're all, there's all different variations of merging. Merge types, and some of them are stacking, some of them are, are grafting. And, and so like, some people are approaching model merging in the way that Samba is doing, which is like, okay, here are defined models, each of which have their specific, Plus and minuses, and we will merge them together in the hope that the, you know, the sum of the parts will, will be better than others.[00:28:58] swyx: And it seems like it seems like it's working. I don't really understand why it works apart from, like, I think it's a form of regularization. That if you merge weights together in like a smart strategy you, you, you get a, you get a, you get a less overfitting and more generalization, which is good for benchmarks, if you, if you're honest about your benchmarks.[00:29:16] swyx: So this is really interesting and good. But again, they're kind of limited in terms of like the amount of bumps you can get. But I think it's very interesting in the sense of how cheap it is. We talked about this on the Chinatalk podcast, like the guest podcast that we did with Chinatalk. 
And you can do this without GPUs, because it's just adding weights together, and dividing things, and doing like simple math, which is really interesting for the GPU ports.[00:29:42] Alessio: There's a lot of them.[00:29:44] Direction 5: Online LLMs (Gemini Pro, Exa)[00:29:44] Alessio: And just to wrap these up, online LLMs? Yeah,[00:29:48] swyx: I think that I ki I had to feature this because the, one of the top news of January was that Gemini Pro beat GPT-4 turbo on LM sis for the number two slot to GPT-4. And everyone was very surprised. Like, how does Gemini do that?[00:30:06] swyx: Surprise, surprise, they added Google search. Mm-hmm to the results. So it became an online quote unquote online LLM and not an offline LLM. Therefore, it's much better at answering recent questions, which people like. There's an emerging set of table stakes features after you pre train something.[00:30:21] swyx: So after you pre train something, you should have the chat tuned version of it, or the instruct tuned version of it, however you choose to call it. You should have the JSON and function calling version of it. Structured output, the term that you don't like. You should have the online version of it. These are all like table stakes variants, that you should do when you offer a base LLM, or you train a base LLM.[00:30:44] swyx: And I think online is just like, There, it's important. I think companies like Perplexity, and even Exa, formerly Metaphor, you know, are rising to offer that search needs. And it's kind of like, they're just necessary parts of a system. When you have RAG for internal knowledge, and then you have, you know, Online search for external knowledge, like things that you don't know yet?[00:31:06] swyx: Mm-Hmm. . And it seems like it's, it's one of many tools. I feel like I may be underestimating this, but I'm just gonna put it out there that I, I think it has some, some potential. One of the evidence points that it doesn't actually matter that much is that Perplexity has a, has had online LMS for three months now and it performs, doesn't perform great.[00:31:25] swyx: Mm-Hmm. on, on lms, it's like number 30 or something. So it's like, okay. You know, like. It's, it's, it helps, but it doesn't give you a giant, giant boost. I[00:31:34] Alessio: feel like a lot of stuff I do with LLMs doesn't need to be online. So I'm always wondering, again, going back to like state of the art, right? It's like state of the art for who and for what.[00:31:45] Alessio: It's really, I think online LLMs are going to be, State of the art for, you know, news related activity that you need to do. Like, you're like, you know, social media, right? It's like, you want to have all the latest stuff, but coding, science,[00:32:01] swyx: Yeah, but I think. Sometimes you don't know what is news, what is news affecting.[00:32:07] swyx: Like, the decision to use an offline LLM is already a decision that you might not be consciously making that might affect your results. Like, what if, like, just putting things on, being connected online means that you get to invalidate your knowledge. And when you're just using offline LLM, like it's never invalidated.[00:32:27] swyx: I[00:32:28] Alessio: agree, but I think going back to your point of like the standing the test of time, I think sometimes you can get swayed by the online stuff, which is like, hey, you ask a question about, yeah, maybe AI research direction, you know, and it's like, all the recent news are about this thing. 
So the LLM like focus on answering, bring it up, you know, these things.[00:32:50] swyx: Yeah, so yeah, I think, I think it's interesting, but I don't know if I can, I bet heavily on this.[00:32:56] Alessio: Cool. Was there one that you forgot to put, or, or like a, a new direction? Yeah,[00:33:01] swyx: so, so this brings us into sort of February. ish.[00:33:05] OpenAI Sora and why everyone underestimated videogen[00:33:05] swyx: So like I published this in like 15 came with Sora. And so like the one thing I did not mention here was anything about multimodality.[00:33:16] swyx: Right. And I have chronically underweighted this. I always wrestle. And, and my cop out is that I focused this piece or this research direction piece on LLMs because LLMs are the source of like AGI, quote unquote AGI. Everything else is kind of like. You know, related to that, like, generative, like, just because I can generate better images or generate better videos, it feels like it's not on the critical path to AGI, which is something that Nat Friedman also observed, like, the day before Sora, which is kind of interesting.[00:33:49] swyx: And so I was just kind of like trying to focus on like what is going to get us like superhuman reasoning that we can rely on to build agents that automate our lives and blah, blah, blah, you know, give us this utopian future. But I do think that I, everybody underestimated the, the sheer importance and cultural human impact of Sora.[00:34:10] swyx: And you know, really actually good text to video. Yeah. Yeah.[00:34:14] Alessio: And I saw Jim Fan at a, at a very good tweet about why it's so impressive. And I think when you have somebody leading the embodied research at NVIDIA and he said that something is impressive, you should probably listen. So yeah, there's basically like, I think you, you mentioned like impacting the world, you know, that we live in.[00:34:33] Alessio: I think that's kind of like the key, right? It's like the LLMs don't have, a world model and Jan Lekon. He can come on the podcast and talk all about what he thinks of that. But I think SORA was like the first time where people like, Oh, okay, you're not statically putting pixels of water on the screen, which you can kind of like, you know, project without understanding the physics of it.[00:34:57] Alessio: Now you're like, you have to understand how the water splashes when you have things. And even if you just learned it by watching video and not by actually studying the physics, You still know it, you know, so I, I think that's like a direction that yeah, before you didn't have, but now you can do things that you couldn't before, both in terms of generating, I think it always starts with generating, right?[00:35:19] Alessio: But like the interesting part is like understanding it. You know, it's like if you gave it, you know, there's the video of like the, the ship in the water that they generated with SORA, like if you gave it the video back and now it could tell you why the ship is like too rocky or like it could tell you why the ship is sinking, then that's like, you know, AGI for like all your rig deployments and like all this stuff, you know, so, but there's none, there's none of that yet, so.[00:35:44] Alessio: Hopefully they announce it and talk more about it. Maybe a Dev Day this year, who knows.[00:35:49] swyx: Yeah who knows, who knows. I'm talking with them about Dev Day as well. 
So I would say, like, the phrasing that Jim used, which resonated with me, he kind of called it a data driven world model. I somewhat agree with that.[00:36:04] Does Sora have a World Model? Yann LeCun vs Jim Fan[00:36:04] swyx: I am on more of a Yann LeCun side than I am on Jim's side, in the sense that I think that is the vision or the hope that these things can build world models. But you know, clearly even at the current SORA size, they don't have the idea of, you know, They don't have strong consistency yet. They have very good consistency, but fingers and arms and legs will appear and disappear and chairs will appear and disappear.[00:36:31] swyx: That definitely breaks physics. And it also makes me think about how we do deep learning versus world models in the sense of You know, in classic machine learning, when you have too many parameters, you will overfit, and actually that fails, that like, does not match reality, and therefore fails to generalize well.[00:36:50] swyx: And like, what scale of data do we need in order to world, learn world models from video? A lot. Yeah. So, so I, I And cautious about taking this interpretation too literally, obviously, you know, like, I get what he's going for, and he's like, obviously partially right, obviously, like, transformers and, and, you know, these, like, these sort of these, these neural networks are universal function approximators, theoretically could figure out world models, it's just like, how good are they, and how tolerant are we of hallucinations, we're not very tolerant, like, yeah, so It's, it's, it's gonna prior, it's gonna bias us for creating like very convincing things, but then not create like the, the, the useful role models that we want.[00:37:37] swyx: At the same time, what you just said, I think made me reflect a little bit like we just got done saying how important synthetic data is for Mm-Hmm. for training lms. And so like, if this is a way of, of synthetic, you know, vi video data for improving our video understanding. Then sure, by all means. Which we actually know, like, GPT 4, Vision, and Dolly were trained, kind of, co trained together.[00:38:02] swyx: And so, like, maybe this is on the critical path, and I just don't fully see the full picture yet.[00:38:08] Alessio: Yeah, I don't know. I think there's a lot of interesting stuff. It's like, imagine you go back, you have Sora, you go back in time, and Newton didn't figure out gravity yet. Would Sora help you figure it out?[00:38:21] Alessio: Because you start saying, okay, a man standing under a tree with, like, Apples falling, and it's like, oh, they're always falling at the same speed in the video. Why is that? I feel like sometimes these engines can like pick up things, like humans have a lot of intuition, but if you ask the average person, like the physics of like a fluid in a boat, they couldn't be able to tell you the physics, but they can like observe it, but humans can only observe this much, you know, versus like now you have these models to observe everything and then They generalize these things and maybe we can learn new things through the generalization that they pick up.[00:38:55] swyx: But again, And it might be more observant than us in some respects. In some ways we can scale it up a lot more than the number of physicists that we have available at Newton's time. So like, yeah, absolutely possible. That, that this can discover new science. 
I think we have a lot of work to do to formalize the science.[00:39:11] swyx: And then, I think the last part is, you know, how much do we cheat by generating data from Unreal Engine 5? Mm hmm. Which is what a lot of people are speculating, with very, very limited evidence, that OpenAI did. The strongest evidence that I saw was someone who works a lot with Unreal Engine 5 looking at the side characters in the videos and noticing that they all adopt Unreal Engine defaults[00:39:37] swyx: of, like, walking speed and, like, character creation choice. And I was like, okay, that's actually pretty convincing that they actually used Unreal Engine to bootstrap some synthetic data for this training set. Yeah,[00:39:52] Alessio: could very well be.[00:39:54] swyx: Because then you get the labels and the training side by side.[00:39:58] swyx: One thing that came up on the last day of February, which I should also mention, is EMO coming out of Alibaba, which is also a sort of video generation and space-time transformer that also involves probably a lot of synthetic data as well. And so this is of a kind, in the sense of, oh, really good generative video is here, and it is not just the one- or two-second clips that we saw from other people, like, you know, Pika and Runway. Cristobal Valenzuela from Runway was like, game on. Which, okay, but let's see your response, because we've heard a lot about Gen-1 and 2, but it's nothing on this level of Sora. So it remains to be seen how we can actually apply this, but I do think that the creative industry should start preparing.[00:40:50] swyx: I think the Sora technical blog post from OpenAI was really good. It was like a request for startups. It was so good in spelling out: here are the individual industries that this can impact.[00:41:00] swyx: And anyone who's interested in generative video should look at that. But also be mindful that probably when OpenAI releases a Sora API, right, the ways you can interact with it are very limited, just like the ways you can interact with DALL-E are very limited, and someone is gonna have to make an open Sora[00:41:19] swyx: Mm-Hmm for you to create ComfyUI pipelines.[00:41:24] Alessio: The Stability folks said they wanna build an open Sora competitor, but yeah, Stability. Their demo video was like so underwhelming. It was just like two people sitting on the beach[00:41:34] swyx: standing. Well, they don't have it yet, right? Yeah, yeah.[00:41:36] swyx: I mean, they just wanna train it. Everybody wants to, right? Yeah. I think what is confusing a lot of people about Stability is they're pushing a lot of things in Stable Code, Stable LM, and Stable Video Diffusion. But like, how much money do they have left? How many people do they have left?[00:41:51] swyx: Yeah. Imad spent two hours with me, reassuring me things are great. And I'm like, I do believe that they have really, really quality people. But it's just like, I also have a lot of very smart people on the other side telling me, like, hey man, you know, don't put too much faith in this thing.[00:42:11] swyx: So I don't know who to believe. Yeah.[00:42:14] Alessio: It's hard. Let's see. What else? We got a lot more stuff. I don't know if we can.
Yeah, Groq.[00:42:19] Groq Math[00:42:19] Alessio: We can[00:42:19] swyx: do a bit of Groq prep. We're about to go talk to Dylan Patel. Maybe it's the audio in here, I don't know, it depends what we get up to later. What do you as an investor think about Groq? Yeah, well, actually, can you recap, like, why is Groq interesting? So,[00:42:33] Alessio: Jonathan Ross, who's the founder of Groq, he's the person that created the TPU at Google. It's actually, it was one of his, like, 20 percent projects. He was just on the side, dooby doo, created the TPU.[00:42:46] Alessio: But yeah, basically, Groq, they had this demo that went viral, where they were running Mistral at, like, 500 tokens a second, which is like, the fastest of anything that you have out there. The question, you know, the memes were like, is NVIDIA dead? Like, people don't need H100s anymore. I think there's a lot of money that goes into building what Groq has built as far as the hardware goes.[00:43:11] Alessio: We're gonna put some of the notes from Dylan in here, but basically the cost of the Groq system is like 30 times the cost of the H100 equivalent. So, so[00:43:23] swyx: let me, I put some numbers, because me and Dylan were, I think, the two people who actually tried to do Groq math. Spreadsheet doors.[00:43:30] swyx: Spreadsheet doors. So, okay, oh boy. The equivalent H100 for Llama 2 is $300,000, for a system of 8 cards. And for Groq it's $2.3 million, because you have to buy 576 Groq cards. So that just gives people an idea. So if you depreciate both over a five-year lifespan, per year you're depreciating $460K for Groq, and $60K a year for H100.[00:43:59] swyx: So Groqs are just way more expensive per model that you're hosting. But then, you make it up in terms of volume. So I don't know if you want to[00:44:08] Alessio: cover that. I think one of the promises of Groq is like super high parallel inference on the same thing. So you're basically saying, okay, I'm putting in this upfront investment on the hardware, but then I get much better scaling once I have it installed.[00:44:24] Alessio: I think the big question is how much can you sustain the parallelism? You know, if you're going to get 100% utilization rate at all times on Groq, it's just much better, because at the end of the day, the tokens-per-second cost that you're getting is better than with the H100s. But if you get to like 50 percent utilization rate, you will be much better off running on NVIDIA.[00:44:49] Alessio: And if you look at most companies out there, who really gets 100 percent utilization rate? Probably OpenAI at peak times, but that's probably it. But yeah, curious to see more. I saw Jonathan was just at the Web Summit in Qatar. He just gave a talk there yesterday that I haven't listened to yet.[00:45:09] Alessio: I tweeted that he should come on the pod. He liked it. And then Groq followed me on Twitter. I don't know if that means that they're interested, but[00:45:16] swyx: hopefully Groq's social media person is just very friendly. They, yeah. Hopefully[00:45:20] Alessio: we can get them. Yeah, we, we gonna get him.
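To put concrete numbers on the hardware comparison above, here is a rough sketch of the depreciation math in Python, using only the figures quoted in the episode; the dollar amounts are the hosts' estimates rather than vendor pricing.

```python
# Depreciation math for the quoted hardware figures (hosts' estimates, not vendor pricing).
h100_system_cost = 300_000    # 8x H100 system quoted for hosting Llama 2
groq_system_cost = 2_300_000  # ~576 Groq cards quoted for the same model
lifespan_years = 5

h100_per_year = h100_system_cost / lifespan_years  # -> 60,000
groq_per_year = groq_system_cost / lifespan_years  # -> 460,000

print(f"H100 depreciation: ${h100_per_year:,.0f} per year")
print(f"Groq depreciation: ${groq_per_year:,.0f} per year")
```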
We[00:45:22] swyx: just call him out. And so basically the key question is, how sustainable is this, and how much[00:45:27] swyx: this is a loss leader. The entire Groq management team has been on Twitter and Hacker News saying they are very, very comfortable with the pricing of $0.27 per million tokens. This is the lowest that anyone has offered tokens, as far as Mixtral or Llama 2 goes. This matches DeepInfra, and, you know, I think that's about it in terms of that low.[00:45:47] swyx: And we think the break-even for H100s is 50 cents, at a normal utilization rate. To make this work, so in my spreadsheet I made this work, you have to have like a parallelism of 500 requests, all simultaneously, and you have model bandwidth utilization of 80%,[00:46:06] swyx: which is way high. I just gave them high marks for everything. Groq has two fundamental tech innovations that they hang their hats on in terms of, like, why we are better than everyone, even though it remains to be independently replicated. But one, you know, they have this sort of entire-model-on-the-chip idea, which is like, okay, get rid of HBM[00:46:30] swyx: and, like, put everything in SRAM. Like, okay, fine, but then you need a lot of cards and whatever. And that's all okay. And so, because you don't have to transfer between memory, you just save on that time, and that's why they're faster. So a lot of people buy that as, like, the reason that you're faster.[00:46:45] swyx: Then they have, like, some kind of crazy compiler, or, like, speculative routing magic using compilers, that they also attribute towards their higher utilization. So I give them 80 percent for that. And so that all works out to, like, okay, base costs, I think you can get down to maybe, like, 20-something cents per million tokens.[00:47:04] swyx: And therefore you actually are fine if you have that kind of utilization. But it's like, I have to make a lot of fearful assumptions for this to work.[00:47:12] Alessio: Yeah. Yeah, I'm curious to see what Dylan says later.[00:47:16] swyx: So he was like completely opposite of me. He's like, they're just burning money. Which is great.[00:47:22] Analyzing Gemini's 1m Context, Reddit deal, Imagegen politics, Gemma via the Four Wars[00:47:22] Alessio: Gemini, want to do a quick run-through, since this touches on all the four wars.[00:47:28] swyx: Yeah, and I think this is the mark of a useful framework, that when a new thing comes along, you can break it down in terms of the four wars and sort of slot it in, or analyze it in those four frameworks, and have nothing left.[00:47:41] swyx: So it's a MECE categorization. MECE is Mutually Exclusive and Collectively Exhaustive. And that's a really, really nice way to think about taxonomies and to create mental frameworks. So, what is Gemini 1.5 Pro? It is the newest model, which came out one week after Gemini 1.0. Which is very interesting.[00:48:01] swyx: They have not really commented on why.
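Continuing the Groq pricing thread above, here is a rough sketch of the per-token break-even arithmetic. The $0.27-per-million price, the 500-way parallelism, and the 80% bandwidth-utilization figures come from the episode; the per-stream throughput is an illustrative assumption, and the sketch counts depreciation only (no power, hosting, or margin).

```python
# Hardware-only cost per million tokens under the assumptions stated in the episode.
# Per-stream throughput is an illustrative assumption; power/hosting/margin are excluded.
groq_depreciation_per_year = 460_000   # from the five-year write-off above
posted_price_per_m_tokens = 0.27       # Groq's advertised price
h100_breakeven_per_m_tokens = 0.50     # host estimate for H100 break-even

tokens_per_second_per_stream = 500     # the viral Mistral demo speed (serving assumption)
parallel_streams = 500                 # episode assumption
bandwidth_utilization = 0.80           # episode assumption
seconds_per_year = 365 * 24 * 3600

tokens_per_year = (tokens_per_second_per_stream * parallel_streams
                   * bandwidth_utilization * seconds_per_year)
hardware_cost_per_m = groq_depreciation_per_year / (tokens_per_year / 1_000_000)

print(f"Hardware-only cost: ${hardware_cost_per_m:.3f} per million tokens")
print(f"Posted price:       ${posted_price_per_m_tokens:.2f} per million tokens")
print(f"H100 break-even:    ${h100_breakeven_per_m_tokens:.2f} per million tokens")
```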
They released this the headline feature is that it has a 1 million token context window that is multi modal which means that you can put all sorts of video and audio And PDFs natively in there alongside of text and, you know, it's, it's at least 10 times longer than anything that OpenAI offers which is interesting.[00:48:20] swyx: So it's great for prototyping and it has interesting discussions on whether it kills RAG.[00:48:25] Alessio: Yeah, no, I mean, we always talk about, you know, Long context is good, but you're getting charged per token. So, yeah, people love for you to use more tokens in the context. And RAG is better economics. But I think it all comes down to like how the price curves change, right?[00:48:42] Alessio: I think if anything, RAG's complexity goes up and up the more you use it, you know, because you have more data sources, more things you want to put in there. The token costs should go down over time, you know, if the model stays fixed. If people are happy with the model today. In two years, three years, it's just gonna cost a lot less, you know?[00:49:02] Alessio: So now it's like, why would I use RAG and like go through all of that? It's interesting. I think RAG is better cutting edge economics for LLMs. I think large context will be better long tail economics when you factor in the build cost of like managing a RAG pipeline. But yeah, the recall was like the most interesting thing because we've seen the, you know, You know, in the haystack things in the past, but apparently they have 100 percent recall on anything across the context window.[00:49:28] Alessio: At least they say nobody has used it. No, people[00:49:30] swyx: have. Yeah so as far as, so, so what this needle in a haystack thing for people who aren't following as closely as us is that someone, I forget his name now someone created this needle in a haystack problem where you feed in a whole bunch of generated junk not junk, but just like, Generate a data and ask it to specifically retrieve something in that data, like one line in like a hundred thousand lines where it like has a specific fact and if it, if you get it, you're, you're good.[00:49:57] swyx: And then he moves the needle around, like, you know, does it, does, does your ability to retrieve that vary if I put it at the start versus put it in the middle, put it at the end? And then you generate this like really nice chart. That, that kind of shows like it's recallability of a model. And he did that for GPT and, and Anthropic and showed that Anthropic did really, really poorly.[00:50:15] swyx: And then Anthropic came back and said it was a skill issue, just add this like four, four magic words, and then, then it's magically all fixed. And obviously everybody laughed at that. But what Gemini came out with was, was that, yeah, we, we reproduced their, you know, haystack issue you know, test for Gemini, and it's good across all, all languages.[00:50:30] swyx: All the one million token window, which is very interesting because usually for typical context extension methods like rope or yarn or, you know, anything like that, or alibi, it's lossy like by design it's lossy, usually for conversations that's fine because we are lossy when we talk to people but for superhuman intelligence, perfect memory across Very, very long context.[00:50:51] swyx: It's very, very interesting for picking things up. And so the people who have been given the beta test for Gemini have been testing this. 
So what you do is you upload, let's say, all of Harry Potter and you change one fact in one sentence, somewhere in there, and you ask it to pick it up, and it does. So this is legit.[00:51:08] swyx: We don't super know how, because this is, like, because it doesn't, yes, it's slow to inference, but it's not slow enough that it's, like, running. Five different systems in the background without telling you. Right. So it's something, it's something interesting that they haven't fully disclosed yet. The open source community has centered on this ring attention paper, which is created by your friend Matei Zaharia, and a couple other people.[00:51:36] swyx: And it's a form of distributing the compute. I don't super understand, like, why, you know, doing, calculating, like, the fee for networking and attention. In block wise fashion and distributing it makes it so good at recall. I don't think they have any answer to that. The only thing that Ring of Tension is really focused on is basically infinite context.[00:51:59] swyx: They said it was good for like 10 to 100 million tokens. Which is, it's just great. So yeah, using the four wars framework, what is this framework for Gemini? One is the sort of RAG and Ops war. Here we care less about RAG now, yes. Or, we still care as much about RAG, but like, now it's it's not important in prototyping.[00:52:21] swyx: And then, for data war I guess this is just part of the overall training dataset, but Google made a 60 million deal with Reddit and presumably they have deals with other companies. For the multi modality war, we can talk about the image generation, Crisis, or the fact that Gemini also has image generation, which we'll talk about in the next section.[00:52:42] swyx: But it also has video understanding, which is, I think, the top Gemini post came from our friend Simon Willison, who basically did a short video of him scanning over his bookshelf. And it would be able to convert that video into a JSON output of what's on that bookshelf. And I think that is very useful.[00:53:04] swyx: Actually ties into the conversation that we had with David Luan from Adept. In a sense of like, okay what if video was the main modality instead of text as the input? What if, what if everything was video in, because that's how we work. We, our eyes don't actually read, don't actually like get input, our brains don't get inputs as characters.[00:53:25] swyx: Our brains get the pixels shooting into our eyes, and then our vision system takes over first, and then we sort of mentally translate that into text later. And so it's kind of like what Adept is kind of doing, which is driving by vision model, instead of driving by raw text understanding of the DOM. And, and I, I, in that, that episode, which we haven't released I made the analogy to like self-driving by lidar versus self-driving by camera.[00:53:52] swyx: Mm-Hmm. , right? Like, it's like, I think it, what Gemini and any other super long context that model that is multimodal unlocks is what if you just drive everything by video. Which is[00:54:03] Alessio: cool. Yeah, and that's Joseph from Roboflow. It's like anything that can be seen can be programmable with these models.[00:54:12] Alessio: You mean[00:54:12] swyx: the computer vision guy is bullish on computer vision?[00:54:18] Alessio: It's like the rag people. The rag people are bullish on rag and not a lot of context. I'm very surprised. The, the fine tuning people love fine tuning instead of few shot. Yeah. Yeah. The, yeah, the, that's that. 
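For anyone who has not seen the needle-in-a-haystack test described above, here is a minimal sketch of how such an eval is usually assembled; `ask_model` stands in for whatever LLM API you use, and the filler text and needle are toy placeholders.

```python
import random

def build_haystack(filler_sentences, needle, depth, total_sentences=5_000):
    """Assemble a long context with `needle` placed at a relative depth (0.0 = start, 1.0 = end)."""
    body = [random.choice(filler_sentences) for _ in range(total_sentences)]
    body.insert(int(depth * total_sentences), needle)
    return " ".join(body)

def needle_recall(ask_model, filler_sentences, needle, question, expected,
                  depths=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """ask_model(prompt) -> str is a placeholder for a real LLM call."""
    results = {}
    for depth in depths:
        context = build_haystack(filler_sentences, needle, depth)
        reply = ask_model(f"{context}\n\nQuestion: {question}\nAnswer concisely.")
        results[depth] = expected.lower() in reply.lower()
    # e.g. {0.0: True, 0.25: True, ...}; plotted over depths this gives the familiar recall chart.
    return results
```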
Yeah, the, I, I think the ring attention thing, and it's how they did it, we don't know. And then they released the Gemma models, which are like a 2 billion and 7 billion open.[00:54:41] Alessio: Models, which people said are not, are not good based on my Twitter experience, which are the, the GPU poor crumbs. It's like, Hey, we did all this work for us because we're GPU rich and we're just going to run this whole thing. And

CacaoCast
Épisode 273 - Apple Vision Pro, Île dynamique, OpenAPI, DeckIt, Shaper, OSStatus, Proxy pour G4

CacaoCast

Play Episode Listen Later Jan 11, 2024 55:17


Welcome to the two hundred and seventy-third episode of CacaoCast! In this episode, Philippe Casgrain and Philippe Guitard discuss the following topics: Apple Vision Pro - Pre-orders on January 19 and available February 2! Dynamic Island - How to hide it in the simulator OpenAPI - Now integrated into the Swift language DeckIt - Build a card-game app with very little code Shaper - Convert SVG into SwiftUI OSStatus - All the macOS error codes, and more Proxy for G4 - If you have an old Mac and want to browse the Internet Listen to this episode

Agency Intelligence
Peter MacDonald: On Advancing Insurance Tech

Agency Intelligence

Play Episode Listen Later Jan 9, 2024 46:58


In this episode of the Agents Influence podcast, host Jason Cass interviews Peter MacDonald, Co-founder & CEO of Wunderite. Key Topics: Introduction of Peter MacDonald and the creation of Wunderite. Discussion on the independent agent landscape and the IndieTech event. Insights from CEOs of major industry players on the rate of industry change. Exploration of the customer journey in insurance and its impact on agency operations. The challenge of integrating technology and maintaining relevance in a rapidly evolving industry. The role of Open API in advancing industry connectivity. Perspectives on investment in technology by insurance agencies. Reach out to: Peter MacDonald Jason Cass Visit Website: Wunderite Agency Intelligence

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
(Part 3/N) Salesforce: Anypoint Design Center, Anypoint Code Builder IDE

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)

Play Episode Listen Later Dec 20, 2023 24:10


In this podcast episode, Krish continues the discussion on the Salesforce Anypoint Design Center. He starts by recapping the previous episode and addressing a timeout issue. He then explores the process of publishing API documentation and compares the RAML and OpenAPI specifications. Krish also demonstrates how to access and customize the public portal. He explains how to enable and disable the portal and shares assets with team members. Finally, he provides feedback on the Design Center's user interface and concludes the episode. #snowpal #salesforce #design Snowpal's Products: Backends as Services on ⁠⁠⁠⁠⁠AWS Marketplace⁠⁠⁠⁠⁠ Mobile Apps on ⁠⁠⁠⁠⁠App Store⁠⁠⁠⁠⁠ and ⁠⁠⁠⁠⁠Play Store⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠Web App⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠Education Platform⁠⁠⁠⁠⁠ for Learners and Course Creators

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
(Part 2/N) Salesforce: Anypoint Design Center, Anypoint Code Builder IDE

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)

Play Episode Listen Later Dec 19, 2023 48:03


In this podcast episode, Krish explores the Anypoint Design Center and walks through the process of creating an API specification using RAML. He compares RAML and OpenAPI and discusses the advantages and disadvantages of each. Krish also troubleshoots and debugs issues in the Design Center, highlighting some inconsistencies and potential bugs. Overall, the episode provides an overview of the Design Center and offers insights into working with RAML and API specifications. #snowpal #salesforce #design Snowpal's Products: Backends as Services on ⁠⁠⁠⁠⁠AWS Marketplace⁠⁠⁠⁠⁠ Mobile Apps on ⁠⁠⁠⁠⁠App Store⁠⁠⁠⁠⁠ and ⁠⁠⁠⁠⁠Play Store⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠Web App⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠Education Platform⁠⁠⁠⁠⁠ for Learners and Course Creators
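As a reference point for the RAML-versus-OpenAPI comparison covered in this episode, here is what a minimal OpenAPI 3.0 description of a single endpoint looks like, written as a Python dict so it can be dumped to JSON or YAML. The endpoint and fields are invented for illustration; RAML would express the same contract in its own YAML dialect.

```python
import json

# A minimal, hypothetical OpenAPI 3.0 document describing one endpoint.
openapi_doc = {
    "openapi": "3.0.3",
    "info": {"title": "Example API", "version": "1.0.0"},
    "paths": {
        "/accounts/{id}": {
            "get": {
                "summary": "Fetch a single account",
                "parameters": [
                    {"name": "id", "in": "path", "required": True, "schema": {"type": "string"}}
                ],
                "responses": {
                    "200": {
                        "description": "The account",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "id": {"type": "string"},
                                        "name": {"type": "string"},
                                    },
                                }
                            }
                        },
                    }
                },
            }
        }
    },
}

print(json.dumps(openapi_doc, indent=2))
```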

Slick Talk: The Hospitality Podcast
Future Perspectives on Hospitality Technology | Richard Valtr

Slick Talk: The Hospitality Podcast

Play Episode Listen Later Nov 22, 2023 56:36


In this episode of Slick Talk, we invited Richard Valtr of Mews, a cloud-based property management software company, for another enlightening conversation. The discussion centers around several key points, including the evolution of Mews since our last conversation in July 2020, the immense potential of AI in the hospitality sector, and how a shift in perspective from viewing hospitality as merely 'real estate' to a provider of a 24-hour lifestyle experience can enhance operations and customer satisfaction. The podcast also covers the recent acquisitions by Mews and how they align with the company's vision, notably the acquirement of Nomi Travel, an AI-focused app, and the launch of Mews Ventures. Richard also shares some thoughtful insights about the role of AI and data in personalizing and improving the hospitality experience and his vision for the industry's future. This episode is brought to you by our sponsors at: Minut – Minut has more than just security features! They monitor noise, movement, and occupancy all within one device, and all Slick Talk listeners get 2 months FREE when they sign up with this link! Vintory - Scaling your property management company? Vintory is giving Slick Talk listeners a free digital copy of their book "From 0 to 500 Properties In 5 Years" and a $50 Amazon Gift Card for those who sign-up to do a demo! Safely.com – The best STR insurance that covers guests, owners, and managers! Making Safely a no-brainer! Hostfully – Use code SLICKTALK for 3 months free of their digital guidebook or $100 off their property management platform! 00:00 Introduction and Welcome 00:14 Reflecting on Past Episodes and Guest Appearances 01:09 Overview of Mews and its Role in the Hospitality Industry 02:24 Journey of Mews through the Pandemic 02:36 The Growth and Expansion of Mews 03:28 The Importance of Being an Ideas Company 04:03 The Evolution of Mews and its Impact on the Hospitality Industry 04:59 The Role of Mews in Enhancing Guest Experience 07:39 The Shift to Mid-Market and Enterprise Segment 07:54 The Importance of Relevance in the Hospitality Industry 08:20 The Strategy of Building for Independence and Scaling to Enterprise 16:33 The Importance of Being Relevant and Useful in the Hospitality Industry 17:16 The Role of Mews in Enhancing the 24-Hour Guest Experience 21:40 The Strategy of Mews for Acquisitions and Partnerships 24:12 The Importance of a Single Platform and Open API in the Hospitality Industry 25:39 The Importance of a Vibrant Ecosystem in Hospitality 26:56 The Vision for Future Ventures and the Need for Innovation 27:52 The Role of Hospitality in a 365-Day Experience 28:33 Theological Thinking in Hospitality and the Need for Startups 30:10 The Future of Workspaces: Hotels as Mixed-Use Real Estate Spaces 37:30 The Potential of AI in Enhancing Hospitality Experiences 42:43 The Future of Hospitality: A Vision for the Next 20 Years 45:06 The Role of Data in Personalizing Hospitality Experiences ——– Thank you for tuning into our podcast! Slick Talk is a Hospitality.FM production and you can find more of our shows at Hospitality.FM or anywhere else you listen to your podcasts! Listen to more episodes on our website and take a look at our amazing podcast and network sponsors that make this all possible! You can also listen to our Monday morning podcast, Good Morning Hospitality, where we dive into the industry as a whole in a more casual setting! 
If you ever want to contact us for guest suggestions or anything else related to the podcast, please fill out our contact form and we will be in touch! Last but not least, we love to connect on LinkedIn! Let's connect there so you can see the daily content we post beyond the podcast! Learn more about your ad choices. Visit megaphone.fm/adchoices

Elm Radio
095: elm-open-api with Wolfgang Schuster

Elm Radio

Play Episode Listen Later Nov 20, 2023 78:10


Wolfgang Schuster (github) (twitter) elm-open-api (NPM package) (Elm package) Akita (now part of Postman) JSON Schema dillonkearns/elm-form Wolfgang's Effortless SDKs blog post GraphQL Custom Scalar Codecs feature in dillonkearns/elm-graphql elm-units package OpenAPI Generator swagger-elm elm-open-api Real World example

Smart Software with SmartLogic
HTTP Requests in Elixir vs. JavaScript with Yordis Prieto & Stephen Chudleigh

Smart Software with SmartLogic

Play Episode Listen Later Oct 26, 2023 50:29


In today's episode, Sundi and Owen are joined by Yordis Prieto and Stephen Chudleigh to compare notes on HTTP requests in Elixir vs. Ruby, JavaScript, Go, and Rust. They cover common pain points when working with APIs, best practices, and lessons that can be learned from other programming languages. Yordis maintains Elixir's popular Tesla HTTP client library and shares insights from building APIs and maintaining open-source projects. Stephen has experience with Rails and JavaScript, and now works primarily in Elixir. They offer perspectives on testing HTTP requests and working with different libraries. While Elixir has matured, there is room for improvement - especially around richer struct parsing from HTTP responses. The discussion highlights ongoing efforts to improve the developer experience for HTTP clients in Elixir and other ecosystems. Topics Discussed in this Episode HTTP is a protocol - but each language has different implementation methods Tesla represents requests as middleware that can be modified before sending Testing HTTP requests can be a challenge due to dependence on outside systems GraphQL, OpenAPI, and JSON API provide clear request/response formats Elixir could improve richer parsing from HTTP into structs Focus on contribution ergonomics lowers barriers for new participants Maintainers emphasize making contributions easy via templates and clear documentation APIs drive adoption of standards for client/server contracts They discuss GraphQL, JSON API, OpenAPI schemas, and other standards that provide clear request/response formats TypeScript brings types to APIs and helps to validate responses Yordis notes that Go and Rust make requests simple via tags for mapping JSON to structs Language collaboration shares strengths from different ecosystems and inspires new libraries and tools for improving the programming experience Links Mentioned Elixir-Tesla Library: https://github.com/elixir-tesla/tesla Yordis on Github: https://github.com/yordis Yordis on Twitter: https://twitter.com/alchemist_ubi Yordis on LinkedIn: https://www.linkedin.com/in/yordisprieto/ Yordis on YouTube: https://www.youtube.com/@alchemistubi Stephen on Twitter: https://twitter.com/stepchud Stephen's projects on consciousness: https://harmonicdevelopment.us Owen suggests: Http.cat HTTParty: https://github.com/jnunemaker/httparty Guardian Library: https://github.com/ueberauth/guardian Axios: https://axios-http.com/ Straw Hat Fetcher: https://github.com/straw-hat-team/nodejs-monorepo/tree/master/packages/%40straw-hat/fetcher Elixir Tesla Wiki: https://github.com/elixir-tesla/tesla/wiki HTTPoison: https://github.com/edgurgel/httpoison Tesla Testing: https://hexdocs.pm/tesla/readme.html#testing Tesla Mock: https://hexdocs.pm/tesla/Tesla.Mock.html Finch: https://hex.pm/packages/finch Mojito: https://github.com/appcues/mojito Erlang Libraries and Frameworks Working Group: https://github.com/erlef/libs-and-frameworks/ and https://erlef.org/wg/libs-and-frameworks Special Guests: Stephen Chudleigh and Yordis Prieto.
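Tesla's core idea, discussed in this episode, is that each request passes through a stack of small middleware functions that can adjust it before it is sent. Tesla itself is Elixir; as a language-neutral illustration of the same pattern, here is a rough Python sketch in which every name is invented for the example and nothing reflects Tesla's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    method: str
    url: str
    headers: dict = field(default_factory=dict)

def base_url(prefix):
    def middleware(request, next_call):
        request.url = prefix + request.url   # prepend the API host
        return next_call(request)
    return middleware

def bearer_auth(token):
    def middleware(request, next_call):
        request.headers["Authorization"] = f"Bearer {token}"
        return next_call(request)
    return middleware

def run(middlewares, request, send):
    """Thread the request through each middleware, ending at the real `send` function."""
    def call(index, req):
        if index == len(middlewares):
            return send(req)
        return middlewares[index](req, lambda r: call(index + 1, r))
    return call(0, request)

# Usage: `send` would be a real HTTP call; here it just echoes the final request.
stack = [base_url("https://api.example.com"), bearer_auth("secret-token")]
print(run(stack, Request("GET", "/users"), send=lambda r: r))
```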

North Meets South Web Podcast
The one with all the JSON API stuff with TJ Miller

North Meets South Web Podcast

Play Episode Listen Later Oct 12, 2023 46:36


Jake and Michael are joined by TJ Miller to try and untangle their confusion about JSON API, Open API, Swagger, and JSON Schema from last episode.This episode is brought to you by our friends at Workvivo - The leading employee communication app.Show links Generate API Documentation for Laravel with Scramble OpenAPI JSON Schema JSON:API Swagger Joe Tennanbaum going full Norton Commander with Laravel Prompts Remote Procedure Call (RPC) spatie/laravel-data Pact Stoplight Redoc SwaggerHub MuleSoft Apiary
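One way to keep the terms apart: JSON Schema describes the shape of a JSON document, OpenAPI (formerly Swagger) uses JSON Schema to describe whole HTTP APIs, and JSON:API is a separate convention for how response bodies are structured. A small illustrative example using the third-party jsonschema package to validate a JSON:API-style response body; the schema and payload are made up.

```python
from jsonschema import validate  # pip install jsonschema

# A JSON Schema describing a minimal JSON:API-style single-resource response.
schema = {
    "type": "object",
    "required": ["data"],
    "properties": {
        "data": {
            "type": "object",
            "required": ["type", "id", "attributes"],
            "properties": {
                "type": {"type": "string"},
                "id": {"type": "string"},
                "attributes": {"type": "object"},
            },
        }
    },
}

response_body = {
    "data": {"type": "articles", "id": "1", "attributes": {"title": "JSON:API vs OpenAPI"}}
}

validate(instance=response_body, schema=schema)  # raises ValidationError if the shape is wrong
print("response body matches the schema")
```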

North Meets South Web Podcast
DIY woodwork, React micro-frontends, and confusing OpenJSONAPISchema

North Meets South Web Podcast

Play Episode Listen Later Sep 28, 2023 40:23


In this episode, Jake and Michael discuss building your own monitor stand, the mysterious world of React micro-frontends, and get confused about JSON API, Open API, Swagger, and JSON Schema.This episode is brought to you by our friends at Workvivo - The leading employee communication app.Show links DIY monitor stand Micro-frontends Module federation JSON:API OpenAPI vs JSON:API JSON:API, OpenAPI, and JSON Schema working in harmony sixlive/json-schema-assertions

Laravel News Podcast
Streaming jets, wrapping words, and ACID-compliant operations

Laravel News Podcast

Play Episode Listen Later Sep 6, 2023 55:59


Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community. (02:17) - Laravel 10.20 released (14:04) - Laravel 10.21 released (19:04) - Laravel 10.22 released (25:53) - Livewire v3 has been released (28:18) - Laravel Jetstream 4.0 with Livewire 3 support (29:28) - Laravel String Wordwrap (30:05) - Laracon EU 2024 - Save the date! (31:50) - Pest Driven Laravel course is now on Laracasts (32:58) - Mary UI: Laravel Blade components for Livewire 3 (34:42) - An introduction to Sharp for Laravel, an open source content management framework (36:36) - Generate Saloon SDKs from Postman or OpenAPI (39:56) - Boost your Eloquent models with Laravel Lift (46:09) - MJML to HTML using PHP (50:11) - Run GitHub Actions locally with Act (51:55) - Laravel API toolkit (53:01) - firstOrCreate() vs createOrFirst() (53:17) - Using AWS S3 for Laravel storage (53:33) - Using Language Servers in Sublime Text (54:09) - Installing Xdebug with Laravel Herd

Data Engineering Podcast
Eliminate The Overhead In Your Data Integration With The Open Source dlt Library

Data Engineering Podcast

Play Episode Listen Later Sep 4, 2023 42:12


Summary Cloud data warehouses and the introduction of the ELT paradigm has led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to eliminate overhead and bring data integration into your full control as a library component of your overall data system. In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free! This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) Your host is Tobias Macey and today I'm interviewing Adrian Brudaru about dlt, an open source python library for data loading Interview Introduction How did you get involved in the area of data management? Can you describe what dlt is and the story behind it? What is the problem you want to solve with dlt? Who is the target audience? The obvious comparison is with systems like Singer/Meltano/Airbyte in the open source space, or Fivetran/Matillion/etc. in the commercial space. What are the complexities or limitations of those tools that leave an opening for dlt? Can you describe how dlt is implemented? What are the benefits of building it in Python? 
How have the design and goals of the project changed since you first started working on it? How does that language choice influence the performance and scaling characteristics? What problems do users solve with dlt? What are the interfaces available for extending/customizing/integrating with dlt? Can you talk through the process of adding a new source/destination? What is the workflow for someone building a pipeline with dlt? How does the experience scale when supporting multiple connections? Given the limited scope of extract and load, and the composable design of dlt it seems like a purpose built companion to dbt (down to the naming). What are the benefits of using those tools in combination? What are the most interesting, innovative, or unexpected ways that you have seen dlt used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on dlt? When is dlt the wrong choice? What do you have planned for the future of dlt? Contact Info LinkedIn (https://www.linkedin.com/in/data-team/?originalSubdomain=de) Join our community to discuss further (https://join.slack.com/t/dlthub-community/shared_invite/zt-1slox199h-HAE7EQoXmstkP_bTqal65g) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com)) with your story. 
To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links dlt (https://dlthub.com/) Harness Success Story (https://dlthub.com/success-stories/harness/) Our guiding product principles (https://dlthub.com/product/) Ecosystem support (https://dlthub.com/docs/dlt-ecosystem) From basic to complex, dlt has many capabilities (https://dlthub.com/docs/getting-started/build-a-data-pipeline) Singer (https://www.singer.io/) Airbyte (https://airbyte.com/) Podcast Episode (https://www.dataengineeringpodcast.com/airbyte-open-source-data-integration-episode-173/) Meltano (https://meltano.com/) Podcast Episode (https://www.dataengineeringpodcast.com/meltano-data-integration-episode-141/) Matillion (https://www.matillion.com/) Podcast Episode (https://www.dataengineeringpodcast.com/matillion-cloud-data-integration-episode-286/) Fivetran (https://www.fivetran.com/) Podcast Episode (https://www.dataengineeringpodcast.com/fivetran-data-replication-episode-93/) DuckDB (https://duckdb.org/) Podcast Episode (https://www.dataengineeringpodcast.com/duckdb-in-process-olap-database-episode-270/) OpenAPI (https://www.openapis.org/) Data Mesh (https://martinfowler.com/articles/data-monolith-to-mesh.html) Podcast Episode (https://www.dataengineeringpodcast.com/data-mesh-revisited-episode-250/) SQLMesh (https://sqlmesh.com/) Podcast Episode (https://www.dataengineeringpodcast.com/sqlmesh-open-source-dataops-episode-380) Airflow (https://airflow.apache.org/) Dagster (https://dagster.io/) Podcast Episode (https://www.dataengineeringpodcast.com/dagster-data-platform-big-complexity-episode-239/) Prefect (https://www.prefect.io/) Podcast Episode (https://www.dataengineeringpodcast.com/prefect-workflow-engine-episode-86/) Alto (https://github.com/z3z1ma/alto) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
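For readers who want a feel for what "data loading as a library" means in the episode above, here is a rough sketch along the lines of dlt's documented quick-start: declare a pipeline, hand it plain Python iterables, and let it infer the schema and load the rows. Treat the exact argument names as approximate and check the dlt documentation for the current API.

```python
# A rough sketch of the dlt quick-start pattern; verify argument names against the current docs.
import dlt

# Any iterable of dicts can act as a source; these rows are made up.
rows = [
    {"id": 1, "email": "ada@example.com", "plan": "pro"},
    {"id": 2, "email": "grace@example.com", "plan": "free"},
]

pipeline = dlt.pipeline(
    pipeline_name="quickstart",
    destination="duckdb",    # local DuckDB file; swap for BigQuery, Snowflake, etc.
    dataset_name="crm",
)

info = pipeline.run(rows, table_name="users")  # infers the schema and loads the rows
print(info)
```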

Python Bytes
#343 So Much Pydantic!

Python Bytes

Play Episode Listen Later Jul 11, 2023 35:51


Watch on YouTube About the show Sponsored by us! Support our work through: Our courses at Talk Python Training Test & Code Podcast Patreon Supporters Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.
Michael #1: Pydantic v2 released Pydantic V2 is compatible with Python 3.7 and above. There is a migration guide. Check out the bump-pydantic tool to auto upgrade your classes.
Brian #2: Two Ways to Turbo-Charge tox (Hynek) Not just tox run-parallel or tox -p or tox --parallel, but you should know about that also. The 2 ways: build one wheel instead of N sdists, and run pytest in parallel. tox builds source distributions, sdists, for each environment before running tests. That's not really what we want, especially if we have a test matrix. It'd be better to build a wheel once, and use that for all the environments. Add this to your tox.ini and now we get one wheel build:
[testenv]
package = wheel
wheel_build_env = .pkg
It will save time. And a lot if you have a lengthy build. Run pytest in parallel, instead of tox in parallel, with pytest -n auto. Requires the pytest-xdist plugin. Can slow down tests if your tests are pretty fast anyway. If you're using hypothesis, you probably want to try this. There are some gotchas and workarounds (like getting coverage to work) in the article.
Michael #3: Awesome Pydantic A curated list of awesome things related to Pydantic!
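For a quick taste of the v2 API mentioned in the first item, here is a small example using the renamed v2 methods (model_validate / model_dump); the model and values are made up for illustration.

```python
from pydantic import BaseModel, Field, ValidationError

class Episode(BaseModel):
    number: int = Field(gt=0)
    title: str
    duration_minutes: float

# v2 renames parse_obj -> model_validate and dict() -> model_dump()
ep = Episode.model_validate({"number": 343, "title": "So Much Pydantic!", "duration_minutes": 35.85})
print(ep.model_dump())

try:
    Episode.model_validate({"number": -1, "title": "bad", "duration_minutes": "n/a"})
except ValidationError as err:
    print(err.error_count(), "validation errors")
```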